Scalable Synthesis with Symbolic Syntax Graphs

Rohin Shah Sumith Kulal Rastislav Bodik

A simple computation

max (1 2 8 4 3 7 6 5) = 8

Incremental updates

max (1 2 8 4 3 7 6 5) = 8

max (1 2 3 4 3 7 6 5) =

delta

update

Complexity?

Non-incremental - O(n)

Incremental - O(1)
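A minimal Python sketch of the contrast above, assuming the delta is "set one element to a new value". The name `update_max` is hypothetical. The O(1) path covers every update that does not lower the element holding the current maximum; when it does (as in the 8 to 3 change shown earlier), this sketch falls back to an O(n) rescan, since it keeps no auxiliary data structure.

```python
def update_max(xs, current_max, i, new_value):
    """Set xs[i] = new_value and return the new maximum.

    O(1) unless the element that held the maximum is lowered,
    in which case we rescan the whole list in O(n).
    """
    old_value = xs[i]
    xs[i] = new_value
    if new_value >= current_max:
        # The new element becomes (or ties) the maximum.
        return new_value
    if old_value < current_max:
        # The maximum lives elsewhere and is untouched.
        return current_max
    # The old maximum was overwritten with something smaller.
    return max(xs)


xs = [1, 2, 8, 4, 3, 7, 6, 5]
m = max(xs)                    # non-incremental: O(n)
m = update_max(xs, m, 0, 9)    # incremental fast path: O(1)
```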

Motivation

Incremental - efficient, asymptotically fast

Non-incremental - easier to write

Automatically synthesize incremental updates

Application Domains

Probabilistic Programming and Machine Learning Research

Specializing an inference algorithm with respect to a model

Sampling Algorithms: Loop in which we modify a small part of a “possible world”

Data structures

Have many data structures for fast lookup

Automatically repair data structures when data is changed

Why synthesis?

Easy to extend to new domains

Specification is simple, with low annotation burden

Data structure and delta definitions can be arbitrary Rosette programs

Highly performant at runtime, comparable to handwritten code

No dependency analysis at runtime

Static types for each data structure and delta

Formal specification

\exists \Delta f \ \forall I, \Delta I \ : \ O \ + \ \Delta f(I, \Delta I, O) = f(I + \Delta I)

Here I is the input, \Delta I is the change to the input, O = f(I) is the previously computed output, and \Delta f is the update function to be synthesized.
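To make the specification concrete, here is a small Python sketch that checks it by brute force on bounded inputs. Everything in it is an illustrative assumption rather than part of the talk: f is taken to be `sum`, a delta is a pair (index, new value), and "+" on outputs is integer addition. The helper names are hypothetical.

```python
from itertools import product


def f(xs):
    """The non-incremental computation (assumed here to be a sum)."""
    return sum(xs)


def apply_delta(xs, delta):
    """I + Delta I: overwrite one position with a new value."""
    i, v = delta
    ys = list(xs)
    ys[i] = v
    return ys


def delta_f(xs, delta, out):
    """Candidate update: the change to add to the old output."""
    i, v = delta
    return v - xs[i]


def spec_holds(candidate, n=3, values=range(-2, 3)):
    """Check O + candidate(I, dI, O) == f(I + dI) on all bounded I, dI."""
    for xs in product(values, repeat=n):
        out = f(xs)
        for i in range(n):
            for v in values:
                d = (i, v)
                if out + candidate(xs, d, out) != f(apply_delta(xs, d)):
                    return False
    return True
```

A bounded check like this only gives evidence for the small inputs it enumerates; a solver-backed verifier is what would establish the universally quantified statement.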

Example: Inverse Permutation

Permutation

Index: 0 1 2 3 4 5 6 7
Value: 3 2 5 7 0 1 4 6

Inverse

Index: 0 1 2 3 4 5 6 7
Value: 4 5 1 0 6 2 7 3

Updates?

Index: 0 1 2 3 4 5 6 7

Before \Delta I

Permutation: 3 2 5 7 0 1 4 6
Inverse: 4 5 1 0 6 2 7 3

After \Delta I (inverse not yet updated)

Permutation: 3 2 5 1 0 7 4 6
Inverse: 4 5 1 0 6 2 7 3

\Delta I: Swap any two elements of the initial sequence
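The kind of update being asked for can be written by hand in a few lines. The Python sketch below is an illustration, not output of the synthesizer; `invert` and `swap_and_repair` are hypothetical names. After swapping positions i and j of the permutation, only two entries of the inverse need to change.

```python
def invert(perm):
    """Non-incremental O(n) computation of the inverse permutation."""
    inverse = [0] * len(perm)
    for index, value in enumerate(perm):
        inverse[value] = index
    return inverse


def swap_and_repair(perm, inverse, i, j):
    """Apply the swap delta to perm, then repair inverse in O(1)."""
    perm[i], perm[j] = perm[j], perm[i]
    # Only the two moved values have a new position.
    inverse[perm[i]] = i
    inverse[perm[j]] = j


perm = [3, 2, 5, 7, 0, 1, 4, 6]
inverse = invert(perm)
swap_and_repair(perm, inverse, 3, 5)  # the swap shown on the slide
```

The repaired inverse agrees with recomputing `invert(perm)` from scratch, which is exactly the equality the formal specification demands.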

Specification

Figure: specification diagram with labels I, \Delta I, O, and P.

Solution

No runtime dependency analysis or other tracking
Similar to handwritten code

CEGIS

\exists \Delta f \ \forall I, \Delta I \ : \ O \ + \ \Delta f(I, \Delta I, O) = f(I + \Delta I)

Guesser: \exists \Delta f

Verifier: \forall I, \Delta I \ : \ O \ + \ \Delta f(I, \Delta I, O) = f(I + \Delta I)

CEGIS

\exists \Delta f \ \forall I, \Delta I \ : \ O \ + \ \Delta f(I, \Delta I, O) = f(I + \Delta I)

Guesser: \exists \Delta f

Verifier: \forall I, \Delta I \ : \ O \ + \ \Delta f(I, \Delta I, O) = f(I + \Delta I)

Counterexample: perm: [1, 0, 2, 3] i: 0 j: 1

Guesser constraint: O_{1} \ + \ \Delta f(I_{1}, \Delta I_{1}, O_{1}) = f(I_{1} + \Delta I_{1})

CEGIS

\exists \Delta f \ \forall I, \Delta I \ : \ O \ + \ \Delta f(I, \Delta I, O) = f(I + \Delta I)

Guesser: \exists \Delta f

Verifier: \forall I, \Delta I \ : \ O \ + \ \Delta f(I, \Delta I, O) = f(I + \Delta I)

Counterexample: perm: [3, 1, 0, 2] i: 2 j: 1

Guesser constraint: O_{1} \ + \ \Delta f(I_{1}, \Delta I_{1}, O_{1}) = f(I_{1} + \Delta I_{1}) \ \wedge \ O_{2} \ + \ \Delta f(I_{2}, \Delta I_{2}, O_{2}) = f(I_{2} + \Delta I_{2})
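The guesser/verifier loop can be sketched in plain Python for the inverse-permutation example. This is a toy under stated assumptions, not the talk's implementation: both roles are played by brute-force enumeration instead of a solver, the candidate space is a hypothetical set of 16 two-statement programs of the form inverse[perm[a]] = b with a, b in {i, j}, and perm refers to the permutation before the swap.

```python
from itertools import permutations, product


def invert(perm):
    inverse = [0] * len(perm)
    for index, value in enumerate(perm):
        inverse[value] = index
    return inverse


def swap(perm, i, j):
    new = list(perm)
    new[i], new[j] = new[j], new[i]
    return new


# A candidate is two statements; each (a, b) means inverse[perm[a]] = b.
CANDIDATES = list(product(product("ij", repeat=2), repeat=2))


def run(candidate, perm, i, j, inverse):
    env = {"i": i, "j": j}
    out = list(inverse)
    for a, b in candidate:
        out[perm[env[a]]] = env[b]
    return out


def consistent(candidate, example):
    """Does the candidate satisfy the spec on one concrete (I, dI)?"""
    perm, i, j = example
    expected = invert(swap(perm, i, j))
    return run(candidate, perm, i, j, invert(perm)) == expected


def guess(examples):
    """Guesser: any candidate that is right on all counterexamples so far."""
    for candidate in CANDIDATES:
        if all(consistent(candidate, e) for e in examples):
            return candidate
    return None


def verify(candidate, n=4):
    """Verifier: return a counterexample, or None if none exists up to size n."""
    for perm in permutations(range(n)):
        for i, j in product(range(n), repeat=2):
            if not consistent(candidate, (perm, i, j)):
                return (perm, i, j)
    return None


def cegis():
    examples = []
    while True:
        candidate = guess(examples)
        if candidate is None:
            return None, examples
        counterexample = verify(candidate)
        if counterexample is None:
            return candidate, examples
        examples.append(counterexample)
```

Each counterexample rules out at least the candidate that produced it, so the loop terminates after at most as many rounds as there are candidates. In practice the examples prune many candidates at once, which is why the guesser only ever sees a conjunction of a few concrete constraints.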

Symbolic grammar

Figure: symbolic grammar drawn as a graph. Surviving node labels: Statement, Vector, Integer, vector-set!, vector-inc!, vector-dec!, vector-ref, inverse, perm, word->text, constant.

Implementation: Rosette

Rosette extends a subset of Racket with constructs for using a solver (Z3)

Symbolic values allow us to represent a set of values

We can use symbolic values like regular values and make assertions

Algorithm

I \to P(I)

I + \Delta I \to P(I + \Delta I)

\exists f_{o} \ \forall I, \Delta I \ : \ P(I + \Delta I) = f_{o}(I, \Delta I, P(I))

How do we scale?

Type analysis – only generate well-typed programs

Sharing of subtrees where possible

> Needed for correctness, else you could just undo the delta

> Greatly reduces the search space

Mutability analysis – don’t mutate some of the values

Use temporary variables to reduce search space size, without losing the type analysis
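The effect of type analysis on the search space can be illustrated by counting terms. The Python sketch below uses a tiny hypothetical grammar: the operator and type names echo the grammar slide, but the signatures (mirroring Racket's `vector-ref` and `vector-set!`) and the terminals `i` and `j` are assumptions made for this example. It compares the number of well-typed statements with the number of terms when types are ignored.

```python
# Terminal symbols and their types (illustrative assumption).
TERMINALS = {
    "perm": "Vector",
    "inverse": "Vector",
    "i": "Integer",
    "j": "Integer",
}

# Operator name -> (argument types, result type).
OPERATORS = {
    "vector-ref": (("Vector", "Integer"), "Integer"),
    "vector-set!": (("Vector", "Integer", "Integer"), "Statement"),
}


def count_typed(ty, depth):
    """Number of well-typed terms of type `ty` with height <= depth."""
    total = sum(1 for t in TERMINALS.values() if t == ty)
    if depth > 0:
        for arg_types, result in OPERATORS.values():
            if result == ty:
                ways = 1
                for arg in arg_types:
                    ways *= count_typed(arg, depth - 1)
                total += ways
    return total


def count_untyped(depth):
    """Number of terms with height <= depth when types are ignored."""
    total = len(TERMINALS)
    if depth > 0:
        sub = count_untyped(depth - 1)
        for arg_types, _ in OPERATORS.values():
            total += sub ** len(arg_types)
    return total


typed_statements = count_typed("Statement", 2)   # 72
untyped_terms = count_untyped(2)                 # 599764
```

Even in this toy grammar the well-typed statements are a tiny fraction of the untyped terms at height two, and the gap widens with every extra level.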

Results

	Set Updates	Grades
	Add Remove Size	Assign Swap Struct
Search space size	2^{207} 2^{207} 2^{228}	2^{194} 2^{192} 2^{222}	2^{279}	2^{389}	2^{505}
Solution size	3 8 10	10 20 12	12	17	23
Solver time (s)	3 4 30	122 89 96	62	7	75
Total runtime (s)	44	124 91 98	64	12	80

Perm

LDA

Related work

Self-Adjusting Computation

> Build a dynamic dependence graph, incrementalize using change propagation

> Recent work done in Adapton

Static Approaches

> For a particular subset of programs, statically transform the program to get a new, incremental version

> Incremental view maintenance (databases), object oriented programming, Datalog, invariant checkers, etc.

Future work

Improve scalability - try new encodings

Synthesis of auxiliary data-structures

Explore new application domains

https://github.com/uwplse/syncro/issues

Go here

Thank you all!

@sumith1896

Symbolic grammar

Figure: symbolic grammar drawn as a graph. Surviving node labels: Statement, Topic, Word, Vector, Integer, vector-set!, vector-inc!, vector-dec!, vector-ref, n-topic-text, word->text, new-topic, old-topic, word, constant.

Automatic Incrementalization through Synthesis

By Sumith Kulal

Presentation of the talk "Automatic Incrementalization through Synthesis" delivered to the Bodik group at UW on July 24th, 2017.
