Resource Verification of Lazy Evaluation and Memoization

- Ravichandhran Madhavan, Sumith Kulal, Viktor Kuncak

11th February, 2017

Static Analysis

Analysing the program without executing it

Why do you need it?

To find *hidden* bugs that might surface only months into production

Dev tools ftw

Motivation

Consider an example:

Side channel attacks

Image credits: https://www.tau.ac.il/~tromer/acoustic/img/nobody-listens3.jpg

A timing attack watches data movement into and out of the CPU or memory on the hardware running the cryptosystem or algorithm.

Motivation

Embedded systems - One wants to use hardware that is just good enough to accomplish a task in order to produce a large number of units at lowest possible cost.
Hard real-time systems - One needs to guarantee specific worst-case running times to ensure the safety of the system. [1]

[1]: Multivariate amortized resource analysis, ACM TOPLAS 2012

Introduction

We propose a system for specifying and verifying resource bounds

For functional programs that use recursive data-structures
Meant for verifying precise bounds

Specifying Resource Bounds

Natural to specify as templates: expressions with numerical holes

traverse(t: Tree): Int = {
     …
} ensuring(time <= a*size(t)+b &&
        parallel-time <= a*height(t)+b)

a and b are numerical holes
size and height are recursive functions
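To make "recursive functions" concrete, here is a minimal sketch of what size and height could look like. The Tree definition is an assumption (the slides do not show it), and bound is a hypothetical helper that evaluates a template once its holes a and b are filled in.

```scala
object TemplateSketch {
  // Hypothetical binary tree ADT; the slides do not show the definition of Tree.
  sealed trait Tree
  case object Leaf extends Tree
  case class Node(left: Tree, value: Int, right: Tree) extends Tree

  // Number of nodes in the tree.
  def size(t: Tree): Int = t match {
    case Leaf          => 0
    case Node(l, _, r) => size(l) + 1 + size(r)
  }

  // Number of nodes on the longest root-to-leaf path.
  def height(t: Tree): Int = t match {
    case Leaf          => 0
    case Node(l, _, r) => 1 + math.max(height(l), height(r))
  }

  // A template instance: a*measure + b with the holes a and b filled in.
  def bound(a: Int, b: Int, measure: Int): Int = a * measure + b
}
```

With a = 3 and b = 1, for instance, the sequential template evaluates to bound(3, 1, size(t)) for a given tree t.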

Resource Verification Problem

Specifying Resource Bounds

The Problem

Infer values for the numerical holes such that

The values yield a valid bound for the resource
The bound is as strong as possible for the given template

Our Tool

Orb: Big O resource bounds

Contributions

A system for solving resource bound templates

An algorithm for solving ∃∀ formulas with

Recursive functions
Algebraic data-types
Nonlinearity

Implementation and application to sequential and parallel execution time bounds

Crux is the instrumentation

traverse(t: Tree): Int = {
     body
} ensuring(time <= a*size(t)+b)

traverse(t: Tree): (Int, Int) = {
     (body, resource-usage)
} ensuring(res._2 <= a*size(t)+b)
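A hand-written sketch of what such an instrumented function might look like. The body of traverse (summing the node values), the Tree type, and the cost model (one step per call) are all assumptions made for illustration; the actual instrumentation performed by the tool is not shown in the slides.

```scala
object InstrumentationSketch {
  // Hypothetical binary tree ADT; not taken from the slides.
  sealed trait Tree
  case object Leaf extends Tree
  case class Node(left: Tree, value: Int, right: Tree) extends Tree

  def size(t: Tree): Int = t match {
    case Leaf          => 0
    case Node(l, _, r) => size(l) + 1 + size(r)
  }

  // Instrumented traverse: returns (result, steps).
  // Assumed cost model: every call to traverse costs one step.
  def traverse(t: Tree): (Int, Int) = t match {
    case Leaf => (0, 1)
    case Node(l, v, r) =>
      val (sumL, stepsL) = traverse(l)
      val (sumR, stepsR) = traverse(r)
      (sumL + v + sumR, stepsL + stepsR + 1)
  }
}
```

Under this cost model the second component equals 2*size(t) + 1, so the template a*size(t)+b is satisfied with a = 2 and b = 1.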

Instrumentation

Verification Condition (VC) Generation

f(x) = {
  require(pre)
     body
} ensuring(post)

∀x. ϕpre ∧ ϕbody ⟹ ϕpost

VCs with free variables

traverse(t: Tree) = {
   …
} ensuring(res._2 <= a*size(t)+b)

Postconditions contain numerical holes
They become free variables in the VCs

Goal: Solve for free variables in VCs
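Spelled out for the traverse example, and assuming the function has no precondition, the goal can be read as an ∃∀ problem over the holes a and b. This is a sketch of the shape of the formula only; the symbol for the body stands for whatever encoding the instrumented body receives.

```latex
\exists a, b.\ \forall t, \mathit{res}.\quad
  \phi_{\mathit{body}}(t, \mathit{res})
  \implies
  \mathit{res}.\_2 \le a \cdot \mathit{size}(t) + b
```

Equivalently, a and b are acceptable when the negated condition, the body formula together with res._2 > a*size(t)+b, has no satisfying assignment for t and res.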

Orb algorithm

[R. Madhavan & V. Kuncak, CAV ’14]

Bounds Inferred by the Tool

Benchmark	Bound inferred
AVL tree	145*height(t) + 19
Red-Black tree	178*blackHeight(t) + 96
Binomial heap - deleteMin	70*treenum(h1) + 31*minchildren(h2) + 22
Leftist heap - merge	22*rheight(h1) + 22*rheight(h2) + 1
Insertion sort	8 * size(l) * size(l) + 2

Wall clock time vs. steps

Lazy evaluation and memoization

The problem...

The model

Representing suspensions as ADTs: for every type () => B in the source program, we create an ADT denoted LazyB. For functions f1, f2, ... that return B, constructors C1, C2, ... are added to LazyB.

Cache encoding. We instrument the expressions of the source program to explicitly track the changes to the cache as the program undergoes evaluation.
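A minimal sketch of the two ideas together, under stated assumptions. The functions f1 and f2, the choice B = Int, the helper force, and the step costs are all hypothetical; the slides give only the naming scheme LazyB, C1, C2. The cache is modelled here as a set of already-forced suspensions that is passed in and returned explicitly.

```scala
object SuspensionSketch {
  // Hypothetical stand-in for the result type B.
  type B = Int

  // ADT replacing the function type () => B: one constructor per
  // function whose call is suspended, storing that call's arguments.
  sealed trait LazyB
  case class C1(x: Int) extends LazyB         // suspended call f1(x)
  case class C2(x: Int, y: Int) extends LazyB // suspended call f2(x, y)

  // Hypothetical source functions that return B.
  def f1(x: Int): B = x * x
  def f2(x: Int, y: Int): B = x + y

  // Assumed costs: forcing a cached suspension takes 1 step,
  // forcing an uncached one takes missCost steps.
  val missCost = 10

  // Forcing a suspension returns (value, updated cache, steps).
  // Since f1 and f2 are pure, the value is the same on a hit or a miss;
  // only the charged cost depends on the cache.
  def force(s: LazyB, cache: Set[LazyB]): (B, Set[LazyB], Int) = {
    val value = s match {
      case C1(x)    => f1(x)
      case C2(x, y) => f2(x, y)
    }
    val steps = if (cache.contains(s)) 1 else missCost
    (value, cache + s, steps)
  }
}
```

Forcing the same suspension twice shows the effect: the first call is charged missCost steps and adds the suspension to the cache, the second is charged a single step.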

Experimental evaluation

Compared the bounds inferred by Orb with the resource usage measured by running the instrumented code.
Reasons for inaccuracy:

Forcing to a template.
Inaccuracy due to the tool.

Runtime vs. Static estimates

Plots (not reproduced here) compare runtime and static estimates for these benchmarks:

Lazy numerical representation
Real-time queue
Cyclic Fibonacci Stream
Cyclic Hamming Stream
Levenshtein Distance
Lazy Bottom-up merge sort - O(k*log(l.size) + l.size)

Template Minimization

Orb inferred formula: 129*n + 4

Least value of coeff 0 is 4.
The formula that goes through is 129*n + 4.
Counter-example for 129*n + 3 is at the point 0

Least value of coeff 1 is 124.
The formula that goes through is 124*n + 4.
Counter-example for 123*n + 4 is at the point 8000

Minimization report ends here
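The report does not say how the least value of each coefficient is searched for. One plausible way, sketched below, is a binary search over one coefficient while the other is held fixed. Everything here is an assumption: holdsOn is a hypothetical oracle that only tests a candidate bound against measured costs on sample inputs (testing, not verification), and leastValid requires the predicate to be monotone and to hold at the upper end of the range.

```scala
object MinimizationSketch {
  // Hypothetical oracle: does a*n + b dominate cost(n) on every sampled input?
  // This checks samples only; it is not a proof that the bound holds.
  def holdsOn(samples: Seq[Int], cost: Int => Int)(a: Int, b: Int): Boolean =
    samples.forall(n => cost(n) <= a * n + b)

  // Least value in [lo, hi] for which `valid` is true, by binary search.
  // Assumes `valid` is monotone (false, ..., false, true, ..., true)
  // and that valid(hi) holds.
  def leastValid(lo: Int, hi: Int)(valid: Int => Boolean): Int = {
    var l = lo
    var h = hi
    while (l < h) {
      val mid = l + (h - l) / 2
      if (valid(mid)) h = mid else l = mid + 1
    }
    l
  }
}
```

To minimize the constant term, one would call leastValid on b with a fixed; to minimize the linear coefficient, on a with b fixed. The largest rejected candidate corresponds to the counter-example lines in the report.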

Report for Cyclic Fibs.

Formula: a*x + b

subject to a set of inputs

1. a*x + c (constant term minimized, a kept fixed)
2. d*x + b (linear coefficient minimized, b kept fixed)

Compare 1. and 2. with the dynamic count.

Bar graph highlighting % accuracy

Why the inaccuracy?

The intermediate functions are indeed accurate

More in the paper

Conclusions and related works.

Lazy evaluation and memoization successfully modelled with good results on real life case studies.
Related work:
Towards Automatic Resource Bound Analysis for OCaml. Jan Hoffmann, Ankush Das, and Shu-Chun Weng.
Type-Based Allocation Analysis for Co-recursion in Lazy Functional Languages. Pedro Baltazar Vasconcelos, Steffen Jost, Mario Florido, and Kevin Hammond.
Analysing the Complexity of Functional Programs: Higher-Order Meets First-Order. Martin Avanzini, Ugo Dal Lago, and Georg Moser.

Hope you enjoyed!

Catch me at

GitHub: Sumith1896

Twitter: @sumith1896

Email: sumith1896@gmail.com

Thank you and get in touch :)

"Essentially, all models are wrong, but some are useful."

- George E. P. Box

Resource Verification of Lazy Evaluation and Memoization

By Sumith Kulal

Presentation of the talk "Resource Verification of Lazy Evaluation and Memoization" delivered to team LARA at EPFL on July 13th, 2016.
