Scalable Synthesis with Symbolic Syntax Graphs

        Rohin Shah       Sumith Kulal      Rastislav Bodik

A simple computation

max (1 2 8 4 3 7 6 5)  =  8

Incremental updates

max (1 2 8 4 3 7 6 5)  =  8

max (1 2 3 4 3 7 6 5)  =  7

[Figure: the delta to the input (8 replaced by 3) drives an update of the result]

Complexity?

Non-incremental - O(n)

Incremental   - O(1)
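A minimal Python sketch of the two strategies for the max example, written by hand for illustration. The function names `full_max` and `incremental_max` are hypothetical and do not come from the slides. The update is O(1) in the first two cases; when the old maximum is overwritten by a smaller value, this sketch simply rescans, because the slides do not say which auxiliary data would make that case constant-time.

```python
def full_max(values):
    """Non-incremental version: scan all n elements, O(n)."""
    best = values[0]
    for v in values[1:]:
        if v > best:
            best = v
    return best


def incremental_max(values, old_max, index, new_value):
    """Return the maximum after values[index] is replaced by new_value.

    `values` is the sequence *before* the update; it is not modified.
    """
    old_value = values[index]
    if new_value >= old_max:
        # The new element is the new maximum: O(1).
        return new_value
    if old_value < old_max:
        # The changed element was not the maximum: O(1).
        return old_max
    # The old maximum was overwritten by something smaller.
    # Assumption of this sketch: fall back to a full rescan, O(n).
    updated = list(values)
    updated[index] = new_value
    return full_max(updated)


values = [1, 2, 8, 4, 3, 7, 6, 5]
old_max = full_max(values)                      # 8
new_max = incremental_max(values, old_max, 2, 3)  # the update from the slide
```

The slide's own update (8 replaced by 3) hits the fallback branch here, which is one reason hand-written incremental code is harder to get right than the non-incremental version.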

Motivation

Incremental - efficient, asymptotically fast

Non-incremental - easier to write

Automatically synthesize incremental updates

Application Domains

  • Probabilistic Programming and Machine Learning Research
      > Specializing an inference algorithm with respect to a model
      > Sampling algorithms: a loop in which we modify a small part of a “possible world”
  • Data structures
      > Have many data structures for fast lookup
      > Automatically repair data structures when data is changed

Why synthesis?

  • Easy to extend to new domains
  • Specification is simple, with low annotation burden
  • Data structure and delta definitions can be arbitrary Rosette programs
  • Highly performant at runtime, comparable to handwritten code
  • No dependency analysis at runtime
  • Static types for each data structure and delta

Formal specification

\exists \Delta f \ \forall I, \Delta I : O + \Delta f(I, \Delta I, O) = f(I + \Delta I)

[Figure: diagram with nodes I, I', O, O' and arrows P, \Delta I, \Delta f]
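To make the specification concrete, here is a small Python sketch that instantiates it for one hand-picked case and checks it by brute force. Everything in it is an assumption chosen for illustration, not taken from the slides: f is `sum`, an input delta is a pair (index, new value), and "+" on outputs is ordinary integer addition.

```python
from itertools import product


def f(values):
    """The non-incremental program: here simply the sum of a list."""
    return sum(values)


def apply_input_delta(values, delta):
    """I + delta_I: overwrite one element of the input."""
    index, new_value = delta
    updated = list(values)
    updated[index] = new_value
    return updated


def delta_f(values, delta, output):
    """Candidate incremental update: the change to add to the old output."""
    index, new_value = delta
    return new_value - values[index]


def spec_holds(values, delta):
    """Check O + delta_f(I, delta_I, O) == f(I + delta_I) for one input."""
    output = f(values)
    lhs = output + delta_f(values, delta, output)
    rhs = f(apply_input_delta(values, delta))
    return lhs == rhs


def check_all(length=3, domain=range(-2, 3)):
    """Exhaustively check the spec on all small inputs and deltas."""
    return all(
        spec_holds(list(values), (index, new_value))
        for values in product(domain, repeat=length)
        for index in range(length)
        for new_value in domain
    )


print(check_all())
```

A synthesizer has to find `delta_f` automatically; the check above only plays the role of the universally quantified part of the formula on a bounded domain.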

Example: Inverse Permutation

Index:        0  1  2  3  4  5  6  7
Permutation:  3  2  5  7  0  1  4  6
Inverse:      4  5  1  0  6  2  7  3

Updates?

Index:        0  1  2  3  4  5  6  7
Permutation:  3  2  5  7  0  1  4  6
Inverse:      4  5  1  0  6  2  7  3

[Figure: the update \Delta I changes the permutation (row shown as "3  2  5  1  0  4  6"); the new inverse that P would produce (row shown as "5  1  0  6  2  7  3") is marked "?"]

Swap any two elements of the initial sequence

Specification

Parts of the specification: I, \Delta I, P, O

Solution

  • No runtime dependency analysis or other tracking
  • Similar to handwritten code
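The code shown on the Solution slide is not part of the text, so the following is a hand-written Python sketch of what an O(1) update for the swap delta can look like. The names `inverse` and `swap_update`, and the choice of swapping positions 3 and 5, are assumptions made for this example.

```python
def inverse(perm):
    """Non-incremental program P: recompute the whole inverse, O(n)."""
    inv = [0] * len(perm)
    for index, value in enumerate(perm):
        inv[value] = index
    return inv


def swap_update(perm, inv, i, j):
    """Apply the delta (swap perm[i] and perm[j]) and repair inv in O(1).

    Only the two inverse entries for the swapped values can change,
    so no dependency tracking is needed at runtime.
    """
    perm[i], perm[j] = perm[j], perm[i]
    inv[perm[i]] = i
    inv[perm[j]] = j


perm = [3, 2, 5, 7, 0, 1, 4, 6]   # permutation from the example slide
inv = inverse(perm)               # [4, 5, 1, 0, 6, 2, 7, 3]
swap_update(perm, inv, 3, 5)      # hypothetical choice of positions
```

After the call, `inv` equals the inverse recomputed from scratch, which is exactly what the specification demands of an incremental update.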

CEGIS

\exists \Delta f \ \forall I, \Delta I : O + \Delta f(I, \Delta I, O) = f(I + \Delta I)

Guesser: \exists \Delta f

Verifier: \forall I, \Delta I : O + \Delta f(I, \Delta I, O) = f(I + \Delta I)

Counterexample from the verifier: perm: [1, 0, 2, 3] i: 0 j: 1

Guesser constraint: O_{1} + \Delta f(I_{1}, \Delta I_{1}, O_{1}) = f(I_{1} + \Delta I_{1})

Counterexample from the verifier: perm: [3, 1, 0, 2] i: 2 j: 1

Guesser constraints: O_{1} + \Delta f(I_{1}, \Delta I_{1}, O_{1}) = f(I_{1} + \Delta I_{1}) \wedge O_{2} + \Delta f(I_{2}, \Delta I_{2}, O_{2}) = f(I_{2} + \Delta I_{2})
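The guesser/verifier loop can be sketched in plain Python for the inverse-permutation example. This is a toy under stated assumptions, not the Rosette implementation from the talk: the candidate space is all two-statement programs of the form `inv[target] = value` over four hypothetical expressions, the guesser is a linear search instead of a solver query, and the verifier checks all permutations of length 4 instead of reasoning symbolically.

```python
from itertools import permutations, product


def inverse(perm):
    inv = [0] * len(perm)
    for index, value in enumerate(perm):
        inv[value] = index
    return inv


# Expressions a statement may use; perm is the permutation BEFORE the swap.
EXPRS = {
    "i": lambda perm, i, j: i,
    "j": lambda perm, i, j: j,
    "perm[i]": lambda perm, i, j: perm[i],
    "perm[j]": lambda perm, i, j: perm[j],
}
STATEMENTS = list(product(EXPRS, EXPRS))          # (target, value) pairs
CANDIDATES = list(product(STATEMENTS, repeat=2))  # two-statement programs


def run(program, perm, i, j, inv):
    """Execute a candidate update on a copy of the old inverse."""
    out = list(inv)
    for target, value in program:
        out[EXPRS[target](perm, i, j)] = EXPRS[value](perm, i, j)
    return out


def expected(perm, i, j):
    """Reference result: recompute the inverse after swapping i and j."""
    new_perm = list(perm)
    new_perm[i], new_perm[j] = new_perm[j], new_perm[i]
    return inverse(new_perm)


def satisfies(program, example):
    perm, i, j = example
    return run(program, perm, i, j, inverse(perm)) == expected(perm, i, j)


def guess(examples):
    """Guesser: first candidate consistent with every counterexample so far."""
    for program in CANDIDATES:
        if all(satisfies(program, ex) for ex in examples):
            return program
    return None


def verify(program, n=4):
    """Verifier: return a counterexample (perm, i, j), or None if none exists."""
    for perm in permutations(range(n)):
        for i in range(n):
            for j in range(n):
                if not satisfies(program, (perm, i, j)):
                    return (perm, i, j)
    return None


def cegis():
    examples = []
    while True:
        program = guess(examples)
        if program is None:
            return None, examples      # candidate space exhausted
        counterexample = verify(program)
        if counterexample is None:
            return program, examples   # verified on the bounded domain
        examples.append(counterexample)


program, examples = cegis()
print(program, len(examples))
```

Each counterexample rules out at least the current guess, so the loop terminates on this finite candidate space; the number of iterations depends on the enumeration order.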


Symbolic grammar

[Figure: symbolic grammar graph for the permutation example. Types: Statement, Integer, Vector. Operations: vector-set!, vector-inc!, vector-dec!, vector-ref, +, *, constant. Terminals: inverse, perm, i, j, word->text]

Implementation: Rosette

  • Rosette extends a subset of Racket with constructs for using a solver (Z3)
  • Symbolic values allow us to represent a set of values 
  • We can use symbolic values like regular values and make assertions

Algorithm

[Figure: diagram relating I, \Delta I, I + \Delta I, P(I), P(I + \Delta I), and f_{o}(I, \Delta I, P(I))]

\exists f_{o} \ \forall I, \Delta I : P(I + \Delta I) = f_{o}(I, \Delta I, P(I))

How do we scale?

  • Type analysis – only generate well-typed programs
      > Greatly reduces the search space
  • Mutability analysis – don’t mutate some of the values
      > Needed for correctness, else you could just undo the delta
  • Sharing of subtrees where possible
  • Use temporary variables to reduce search space size, without losing the type analysis
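To give a feel for how much type analysis prunes, the Python sketch below counts expression trees up to a given depth with and without type checking. The tiny grammar (four leaves, three operators) is an assumption loosely modeled on the labels in the symbolic grammar figure; it is not the grammar used in the talk, and the counts say nothing about the reported search space sizes.

```python
# Hypothetical typed grammar: leaf name -> type, operator -> (argument types, result type).
LEAVES = {"i": "Integer", "j": "Integer", "perm": "Vector", "inverse": "Vector"}
OPERATORS = {
    "vector-ref": (("Vector", "Integer"), "Integer"),
    "+": (("Integer", "Integer"), "Integer"),
    "*": (("Integer", "Integer"), "Integer"),
}


def count_typed(result_type, depth):
    """Number of well-typed expressions of the given type with depth <= depth."""
    total = sum(1 for leaf_type in LEAVES.values() if leaf_type == result_type)
    if depth == 0:
        return total
    for arg_types, op_result in OPERATORS.values():
        if op_result == result_type:
            ways = 1
            for arg_type in arg_types:
                ways *= count_typed(arg_type, depth - 1)
            total += ways
    return total


def count_untyped(depth):
    """Number of expression trees with depth <= depth when types are ignored."""
    total = len(LEAVES)
    if depth == 0:
        return total
    for arg_types, _ in OPERATORS.values():
        total += count_untyped(depth - 1) ** len(arg_types)
    return total


typed = count_typed("Integer", 2)
untyped = count_untyped(2)
print(typed, untyped)
```

Even at depth 2 the well-typed Integer expressions are a small fraction of all trees, and the gap widens with depth.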

Results

Search space size 2^{207} 2^{207} 2^{228} 2^{194} 2^{192} 2^{222} 2^{279} 2^{389} 2^{505}
Solution size 3            8         10   10       20        12 12 17 23
Solver time (s) 3            4         30 122       89        96 62 7 75
Total runtime (s)              44            124       91        98 64 12 80
Perm LDA MB
Set Updates Grades
Add            Remove         Size Assign         Swap        Struct

Related work

  • Self-Adjusting Computation
      > Build a dynamic dependence graph, incrementalize using change propagation
      > Recent work done in Adapton
  • Static Approaches
      > For a particular subset of programs, statically transform the program to get a new, incremental version
      > Incremental view maintenance (databases), object oriented programming, Datalog, invariant checkers, etc.

Future work

  • Improve scalability - try new encodings
  • Synthesis of auxiliary data-structures
  • Explore new application domains


Thank you all!

@sumith1896

Symbolic grammar

[Figure: symbolic grammar graph for the topic-update example. Types: Statement, Integer, Topic, Word, Vector. Operations: vector-set!, vector-inc!, vector-dec!, vector-ref, +, *, constant. Terminals: n-topic-text, new-topic, old-topic, word, word->text]
