Scalable Synthesis with Symbolic Syntax Graphs

Rohin Shah, Sumith Kulal, Rastislav Bodik

A simple computation

max (1 2 8 4 3 7 6 5)  =  8

Incremental updates

max (1 2 8 4 3 7 6 5)  =  8

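max (1 2 3 4 3 7 6 5)  =  ?

delta: the element at index 2 changes from 8 to 3

update: compute the new result from the old result and the delta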

Complexity?

Non-incremental - O(n)

Incremental - O(1)

max (1 2 3 4 3 7 6 5)  =  7
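The difference between the two costs can be sketched in Python. This is an illustrative sketch, not the synthesized code: `incremental_max` and its parameters are hypothetical names. The constant-time paths apply when the new value is at least the old maximum, or when the overwritten element was not the maximum. When a maximal element is lowered, as in the example where 8 becomes 3, this sketch falls back to a rescan, because it keeps no auxiliary data.

```python
def incremental_max(values, old_max, index, new_value):
    """Return the max of `values` after setting values[index] = new_value.

    `values` is the list before the update and `old_max` is max(values).
    The list is updated in place.
    """
    old_value = values[index]
    values[index] = new_value
    if new_value >= old_max:
        return new_value      # O(1): the new element is the maximum
    if old_value < old_max:
        return old_max        # O(1): the maximum was not touched
    return max(values)        # O(n) fallback: a maximal element was lowered


data = [1, 2, 8, 4, 3, 7, 6, 5]
result = incremental_max(data, 8, 2, 3)   # the delta from the example
```

Avoiding the fallback would need extra state, which is one reason the synthesis of auxiliary data structures is relevant to this problem.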

Motivation

Incremental - efficient, asymptotically fast

Non-incremental - easier to write

Automatically synthesize incremental updates

Application Domains

  • Probabilistic Programming and Machine Learning Research

> Specializing an inference algorithm with respect to a model

> Sampling algorithms: a loop in which we modify a small part of a “possible world”

  • Data structures

> Have many data structures for fast lookup

> Automatically repair data structures when data is changed

Why synthesis?

  • Easy to extend to new domains
  • Specification is simple, with low annotation burden
  • Data structure and delta definitions can be arbitrary Rosette programs
  • Highly performant at runtime, comparable to handwritten code
  • No dependency analysis at runtime
  • Static types for each data structure and delta

Formal specification

\exists \Delta f \ \forall I, \Delta I \ : \ O + \Delta f(I, \Delta I, O) = f(I + \Delta I)

Here f is the function computed by the program P, O = f(I), I' = I + \Delta I, and O' = f(I').

[Diagram: P maps I to O and I' to O'; \Delta I takes I to I'; \Delta f takes O to O'.]
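To make the formula concrete, here is a minimal Python sketch that checks it on small inputs. It uses summation rather than max, because for a sum the "+" in the formula can be read literally. All names (`f`, `apply_delta`, `delta_f`, `spec_holds`) are hypothetical, a delta is assumed to be an (index, new value) pair, and the bounded check is a test, not a proof.

```python
from itertools import product


def f(xs):
    """Non-incremental program: sum of a list."""
    return sum(xs)


def apply_delta(xs, delta):
    """I + ΔI: overwrite one element, returning a new list."""
    index, new_value = delta
    ys = list(xs)
    ys[index] = new_value
    return ys


def delta_f(xs, delta, old_output):
    """Δf: the change in the output, computed from one element only."""
    index, new_value = delta
    return new_value - xs[index]


def spec_holds(max_len=3, values=range(-2, 3)):
    """Check O + Δf(I, ΔI, O) = f(I + ΔI) on all small inputs and deltas."""
    for n in range(1, max_len + 1):
        for xs in product(values, repeat=n):
            xs = list(xs)
            old_output = f(xs)
            for delta in product(range(n), values):
                lhs = old_output + delta_f(xs, delta, old_output)
                if lhs != f(apply_delta(xs, delta)):
                    return False
    return True
```

In this instance `delta_f` does not need `old_output`; it is kept as a parameter only to mirror the signature in the formula.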

Example: Inverse Permutation

Permutation
index  0  1  2  3  4  5  6  7
value  3  2  5  7  0  1  4  6

Inverse
index  0  1  2  3  4  5  6  7
value  4  5  1  0  6  2  7  3

Updates?

\Delta I: swap any two elements of the initial sequence, for example the elements at indices 3 and 5

Permutation after \Delta I
index  0  1  2  3  4  5  6  7
value  3  2  5  1  0  7  4  6

Inverse after \Delta I: ?

Specification

[Figure: the specification, annotated with the labels \Delta I, I, O, and P]
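The four parts of a specification for the inverse permutation example can be sketched in Python as follows. This is not the Rosette specification from the talk: `inverse` and `apply_swap` are hypothetical names, and the swap of indices 3 and 5 is an arbitrary choice of delta.

```python
def inverse(perm):
    """P: non-incremental inverse, so that inv[perm[k]] == k for every k."""
    inv = [0] * len(perm)
    for k, value in enumerate(perm):
        inv[value] = k
    return inv


def apply_swap(perm, i, j):
    """I + ΔI: return a copy of perm with positions i and j swapped."""
    updated = list(perm)
    updated[i], updated[j] = updated[j], updated[i]
    return updated


perm = [3, 2, 5, 7, 0, 1, 4, 6]       # I, the permutation from the example
old_inverse = inverse(perm)           # O = P(I)
new_perm = apply_swap(perm, 3, 5)     # I' = I + ΔI
new_inverse = inverse(new_perm)       # O' = P(I'), what an update must produce
```

The synthesis task is to find a program that turns `old_inverse` into `new_inverse` without calling `inverse` again.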

Solution

  • No runtime dependency analysis or other tracking
  • Similar to handwritten code
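For the inverse permutation example, a handwritten-style update might look like the sketch below. This is a plausible shape for such a solution, not the synthesizer's actual output. The name `update_inverse` is hypothetical, and it is an assumption here that the update sees the permutation after the swap has been applied.

```python
from itertools import permutations


def inverse(perm):
    """Reference (non-incremental) inverse: inv[perm[k]] == k."""
    inv = [0] * len(perm)
    for k, value in enumerate(perm):
        inv[value] = k
    return inv


def update_inverse(new_perm, i, j, inv):
    """Repair inv in place after positions i and j of the permutation were swapped."""
    inv[new_perm[i]] = i
    inv[new_perm[j]] = j
    return inv


def check_all(size=4):
    """Compare the update against recomputation for every permutation and swap."""
    for perm in permutations(range(size)):
        for i in range(size):
            for j in range(size):
                new_perm = list(perm)
                new_perm[i], new_perm[j] = new_perm[j], new_perm[i]
                if update_inverse(new_perm, i, j, inverse(perm)) != inverse(new_perm):
                    return False
    return True
```

The update performs two writes regardless of the permutation's length and does no bookkeeping at runtime.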

CEGIS

Goal: \exists \Delta f \ \forall I, \Delta I \ : \ O + \Delta f(I, \Delta I, O) = f(I + \Delta I)

Guesser: \exists \Delta f

Verifier: \forall I, \Delta I \ : \ O + \Delta f(I, \Delta I, O) = f(I + \Delta I)

CEGIS

Guesser: \exists \Delta f

Verifier: \forall I, \Delta I \ : \ O + \Delta f(I, \Delta I, O) = f(I + \Delta I)

Counterexample 1: perm: [1, 0, 2, 3] i: 0 j: 1

Guesser constraint: O_{1} + \Delta f(I_{1}, \Delta I_{1}, O_{1}) = f(I_{1} + \Delta I_{1})

CEGIS

Guesser: \exists \Delta f

Verifier: \forall I, \Delta I \ : \ O + \Delta f(I, \Delta I, O) = f(I + \Delta I)

Counterexample 2: perm: [3, 1, 0, 2] i: 2 j: 1

Guesser constraint: O_{1} + \Delta f(I_{1}, \Delta I_{1}, O_{1}) = f(I_{1} + \Delta I_{1}) \wedge O_{2} + \Delta f(I_{2}, \Delta I_{2}, O_{2}) = f(I_{2} + \Delta I_{2})
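The guess-and-verify loop can be sketched in plain Python. This is a toy illustration under strong assumptions: the candidate space is a hypothetical hand-written list of three update functions rather than a symbolic grammar, and the verifier is a bounded exhaustive check rather than a solver query.

```python
from itertools import permutations


def inverse(perm):
    inv = [0] * len(perm)
    for k, value in enumerate(perm):
        inv[value] = k
    return inv


def swap(perm, i, j):
    out = list(perm)
    out[i], out[j] = out[j], out[i]
    return out


# Hypothetical candidates: each maps (new_perm, i, j, old_inv) to a new inverse.
def keep(new_perm, i, j, inv):
    return list(inv)


def swap_slots(new_perm, i, j, inv):
    return swap(inv, i, j)


def set_both(new_perm, i, j, inv):
    out = list(inv)
    out[new_perm[i]] = i
    out[new_perm[j]] = j
    return out


CANDIDATES = [keep, swap_slots, set_both]


def satisfies(candidate, example):
    perm, i, j = example
    new_perm = swap(perm, i, j)
    return candidate(new_perm, i, j, inverse(perm)) == inverse(new_perm)


def guess(examples):
    """Guesser: first candidate consistent with every counterexample so far."""
    for candidate in CANDIDATES:
        if all(satisfies(candidate, ex) for ex in examples):
            return candidate
    return None


def verify(candidate, size=4):
    """Verifier: return a counterexample of the given size, or None."""
    for perm in permutations(range(size)):
        for i in range(size):
            for j in range(size):
                example = (list(perm), i, j)
                if not satisfies(candidate, example):
                    return example
    return None


def cegis():
    examples = []
    while True:
        candidate = guess(examples)
        if candidate is None:
            return None, examples
        counterexample = verify(candidate)
        if counterexample is None:
            return candidate, examples
        examples.append(counterexample)
```

Each failed candidate contributes one more conjunct to the guesser's constraint, which is why the loop can converge after only a few counterexamples.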

Symbolic grammar

[Figure: symbolic grammar for the inverse permutation example, drawn as a graph of typed nodes]

Statement: vector-set!, vector-inc!, vector-dec!

Integer: constant, i, j, vector-ref, +, *

Vector: perm, inverse
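To give a feel for the program space such a grammar describes, the Python sketch below enumerates a small fragment of it explicitly. It is an assumption-laden illustration: it covers only `vector-set!`, `vector-ref`, and the terminals `perm`, `inverse`, `i`, and `j`, it omits `constant`, `+`, `*`, `vector-inc!`, and `vector-dec!`, and explicit enumeration like this is only feasible at tiny depths.

```python
from itertools import product

VECTORS = ["perm", "inverse"]   # Vector terminals
INDEX_VARS = ["i", "j"]         # Integer terminals


def integers(depth):
    """All Integer expressions with at most `depth` nested vector-ref nodes."""
    exprs = list(INDEX_VARS)
    if depth > 0:
        for vec, idx in product(VECTORS, integers(depth - 1)):
            exprs.append(f"(vector-ref {vec} {idx})")
    return exprs


def statements(depth):
    """All (vector-set! Vector Integer Integer) statements at this depth."""
    ints = integers(depth)
    return [f"(vector-set! {vec} {idx} {val})"
            for vec, idx, val in product(VECTORS, ints, ints)]


programs = statements(1)
target = "(vector-set! inverse (vector-ref perm i) i)"
```

Even this fragment shows how quickly the space grows with depth, which motivates representing it symbolically instead of listing it.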

Implementation: Rosette

  • Rosette extends a subset of Racket with constructs for using a solver (Z3)
  • Symbolic values allow us to represent a set of values 
  • We can use symbolic values like regular values and make assertions

Algorithm

Inputs: I, \Delta I

Original output: P(I)

Updated input: I + \Delta I

Expected output: P(I + \Delta I)

Synthesized update: f_{o}(I, \Delta I, P(I))

\exists f_{o} \ \forall I, \Delta I \ : \ P(I + \Delta I) = f_{o}(I, \Delta I, P(I))

How do we scale?

  • Type analysis – only generate well-typed programs

> Greatly reduces the search space

  • Sharing of subtrees where possible
  • Mutability analysis – don’t mutate some of the values

> Needed for correctness, else you could just undo the delta

  • Use temporary variables to reduce search space size, without losing the type analysis
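The effect of type analysis can be illustrated with a small count in Python. This is a toy sketch: the four terminals and the argument signature assumed for `vector-set!` (a Vector, then two Integers) are illustrative choices, not taken from the implementation.

```python
from itertools import product

# Assumed types of four terminals from the inverse permutation example.
TYPES = {"perm": "Vector", "inverse": "Vector", "i": "Integer", "j": "Integer"}

# Assumed argument types of vector-set!: target vector, index, value.
SIGNATURE = ("Vector", "Integer", "Integer")

# Every way to fill the three argument slots, ignoring types.
all_calls = list(product(TYPES, repeat=3))

# Only the fillings whose argument types match the signature.
well_typed = [args for args in all_calls
              if tuple(TYPES[a] for a in args) == SIGNATURE]
```

Here 8 of the 64 possible calls are well typed, and the gap widens as expressions nest.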

Results

Column labels: Perm, LDA, MB; Set, Updates, Grades; Add, Remove, Size, Assign, Swap, Struct

Search space size: 2^{207}, 2^{207}, 2^{228}, 2^{194}, 2^{192}, 2^{222}, 2^{279}, 2^{389}, 2^{505}

Solution size: 3, 8, 10, 10, 20, 12, 12, 17, 23

Solver time (s): 3, 4, 30, 122, 89, 96, 62, 7, 75

Total runtime (s): 44, 124, 91, 98, 64, 12, 80 (seven of nine entries recovered)

Related work

  • Self-Adjusting Computation

> Build a dynamic dependence graph, incrementalize using change propagation

> Recent work done in Adapton

  • Static Approaches

> For a particular subset of programs, statically transform the program to get a new, incremental version

> Incremental view maintenance (databases), object-oriented programming, Datalog, invariant checkers, etc.

Future work

  • Improve scalability - try new encodings
  • Synthesis of auxiliary data structures
  • Explore new application domains

Thank you all!

@sumith1896

Symbolic grammar

[Figure: symbolic grammar for the topic-model example, drawn as a graph of typed nodes]

Statement: vector-set!, vector-inc!, vector-dec!

Integer: constant, vector-ref, +, *

Topic: new-topic, old-topic

Word: word

Vector: n-topic-text, word->text

Automatic Incrementalization through Synthesis

By Sumith Kulal

Presentation of the talk "Automatic Incrementalization through Synthesis" delivered to the Bodik group at UW on July 24th, 2017.
