### Scalable Synthesis with Symbolic Syntax Graphs

Rohin Shah, Sumith Kulal, Rastislav Bodik

## A simple computation

max (1 2 8 4 3 7 6 5)  =  8

Delta: update the third element from 8 to 3.

max (1 2 3 4 3 7 6 5)  =  7

Complexity?

Non-incremental - O(n)

Incremental   - O(1)

## Motivation

Incremental - efficient, asymptotically fast

Non-incremental - easier to write

## Application Domains

• Probabilistic Programming and Machine Learning Research

> Specializing an inference algorithm with respect to a model

> Sampling algorithms: a loop in which we modify a small part of a “possible world”

• Data structures

> Have many data structures for fast lookup

> Automatically repair data structures when data is changed

## Why synthesis?

• Easy to extend to new domains
• Specification is simple, with low annotation burden
• Data structure and delta definitions can be arbitrary Rosette programs
• Highly performant at runtime, comparable to handwritten code
• No dependency analysis at runtime
• Static types for each data structure and delta

## Formal specification

$\exists \Delta f \ \forall I, \Delta I \ : \ O \ + \ \Delta f(I, \Delta I, O) = f(I + \Delta I)$

[Diagram: $P$ maps $I$ to $O$ and $I'$ to $O'$; $\Delta I$ takes $I$ to $I'$, and $\Delta f$ takes $O$ to $O'$.]

## Example: Inverse Permutation

| Index       | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|-------------|---|---|---|---|---|---|---|---|
| Permutation | 3 | 2 | 5 | 7 | 0 | 1 | 4 | 6 |
| Inverse     | 4 | 5 | 1 | 0 | 6 | 2 | 7 | 3 |

$P$ computes the inverse from the permutation.

$\Delta I$: swap any two elements of the initial sequence. How should the inverse be updated?

## Specification

[Code figure: the specification, annotated with $I$, $O$, $P$, and $\Delta I$.]
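The four parts of a specification for the inverse-permutation example might look as follows. This is a Python sketch for illustration only; the names `compute_inverse` and `swap` are hypothetical and are not taken from the talk, whose specifications are Rosette programs.

```python
def compute_inverse(perm):
    """P: the non-incremental program, rebuilding the inverse in O(n)."""
    inverse = [0] * len(perm)
    for index, value in enumerate(perm):
        inverse[value] = index
    return inverse

def swap(perm, i, j):
    """ΔI: swap the elements at positions i and j, in place."""
    perm[i], perm[j] = perm[j], perm[i]

perm = [3, 2, 5, 7, 0, 1, 4, 6]      # I: the input data structure
inverse = compute_inverse(perm)      # O: the output to be kept up to date
```

The synthesizer's job is then to find an update to `inverse` that is valid after any call to `swap`, without rerunning `compute_inverse`.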

## Solution

• No runtime dependency analysis or other tracking
• Similar to handwritten code
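For the swap delta, an update with these properties could look like the Python sketch below. It is a hand-written illustration, not the synthesizer's actual output; the names `swap_and_repair` and `check_all` are assumptions, and `compute_inverse` is included only as a reference to compare against.

```python
from itertools import permutations

def compute_inverse(perm):
    """Non-incremental reference: rebuild the inverse in O(n)."""
    inverse = [0] * len(perm)
    for index, value in enumerate(perm):
        inverse[value] = index
    return inverse

def swap_and_repair(perm, inverse, i, j):
    """Apply the swap delta to perm, then repair inverse with two writes."""
    perm[i], perm[j] = perm[j], perm[i]
    inverse[perm[i]] = i
    inverse[perm[j]] = j

def check_all(n):
    """Compare the update with recomputation on every permutation of size n."""
    for p in permutations(range(n)):
        for i in range(n):
            for j in range(n):
                perm = list(p)
                inverse = compute_inverse(perm)
                swap_and_repair(perm, inverse, i, j)
                if inverse != compute_inverse(perm):
                    return False
    return True
```

The repair touches only the two entries of `inverse` whose values moved, so it needs no dependency tracking at runtime.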

## CEGIS

$\exists \Delta f \ \forall I, \Delta I \ : \ O \ + \ \Delta f(I, \Delta I, O) = f(I + \Delta I)$

Guesser: $\exists \Delta f$

Verifier: $\forall I, \Delta I \ : \ O \ + \ \Delta f(I, \Delta I, O) = f(I + \Delta I)$

First counterexample from the verifier: perm: [1, 0, 2, 3], i: 0, j: 1

The guesser must now satisfy:

$O_{1} \ + \ \Delta f(I_{1}, \Delta I_{1}, O_{1}) = f(I_{1} + \Delta I_{1})$

Second counterexample from the verifier: perm: [3, 1, 0, 2], i: 2, j: 1

The guesser must now satisfy:

$O_{1} \ + \ \Delta f(I_{1}, \Delta I_{1}, O_{1}) = f(I_{1} + \Delta I_{1}) \ \wedge \ O_{2} \ + \ \Delta f(I_{2}, \Delta I_{2}, O_{2}) = f(I_{2} + \Delta I_{2})$


[Figure: symbolic syntax graph searched by the guesser. Node types: Statement, Integer, Vector. Leaves: `perm`, `inverse`, `i`, `j`, constant. Operations: `vector-set!`, `vector-ref`, `vector-inc!`, `vector-dec!`, `+`, `*`, `word->text`.]
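The guesser/verifier loop can be sketched in Python on the inverse-permutation example. Everything here is an illustrative assumption rather than the talk's implementation: the candidate space is a tiny hypothetical grammar of two statements `inverse[a] = b`, the guesser enumerates candidates instead of calling a solver, and the verifier checks all permutations of length 4 rather than proving the property for all inputs.

```python
from itertools import permutations, product

def inverse_of(perm):
    inv = [0] * len(perm)
    for idx, val in enumerate(perm):
        inv[val] = idx
    return inv

# Hypothetical terms, evaluated on the permutation after the swap.
TERMS = {
    "i": lambda p, i, j: i,
    "j": lambda p, i, j: j,
    "perm[i]": lambda p, i, j: p[i],
    "perm[j]": lambda p, i, j: p[j],
}
# A statement is (target, value), meaning inverse[target] = value.
STATEMENTS = list(product(TERMS, repeat=2))
# A candidate Δf is a sequence of two statements.
CANDIDATES = list(product(STATEMENTS, repeat=2))

def satisfies(candidate, example):
    """Does the candidate repair the old inverse correctly on this example?"""
    perm, i, j = example
    new_perm = list(perm)
    new_perm[i], new_perm[j] = new_perm[j], new_perm[i]
    out = inverse_of(perm)  # old output O
    for target, value in candidate:
        out[TERMS[target](new_perm, i, j)] = TERMS[value](new_perm, i, j)
    return out == inverse_of(new_perm)

def guess(examples):
    """Guesser: first candidate consistent with all counterexamples so far."""
    for cand in CANDIDATES:
        if all(satisfies(cand, ex) for ex in examples):
            return cand
    return None

def verify(candidate, n=4):
    """Verifier: return a counterexample (perm, i, j), or None if none found."""
    for perm in permutations(range(n)):
        for i in range(n):
            for j in range(n):
                if not satisfies(candidate, (perm, i, j)):
                    return (perm, i, j)
    return None

def cegis():
    examples = []
    while True:
        cand = guess(examples)
        if cand is None:
            return None, examples
        cex = verify(cand)
        if cex is None:
            return cand, examples
        examples.append(cex)  # each counterexample rules out the last guess
```

Each counterexample adds one conjunct to the guesser's constraint, as in the formulas above, and the loop ends when the verifier finds nothing to add.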

## Implementation: Rosette

• Rosette extends a subset of Racket with constructs for using a solver (Z3)
• Symbolic values allow us to represent a set of values
• We can use symbolic values like regular values and make assertions

## Algorithm

Inputs: $I$, $\Delta I$

Computed from the inputs: $P(I)$, $I + \Delta I$, $P(I + \Delta I)$

Synthesis query:

$\exists f_{o} \ \forall I, \Delta I \ : \ P(I + \Delta I) = f_{o}(I, \Delta I, P(I))$

## How do we scale?

• Type analysis – only generate well-typed programs

> Greatly reduces the search space

• Sharing of subtrees where possible
• Mutability analysis – don’t mutate some of the values

> Needed for correctness, else you could just undo the delta

• Use temporary variables to reduce search space size, without losing the type analysis
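The effect of type analysis on the search space can be illustrated with a small Python sketch. The grammar below is a hypothetical fragment: the operator names echo the talk's examples, but the type signatures and the functions `well_typed` and `untyped_count` are assumptions made for illustration.

```python
from itertools import product

# Hypothetical typed grammar: name -> (argument types, result type).
OPS = {
    "perm": ((), "Vector"),
    "inverse": ((), "Vector"),
    "i": ((), "Integer"),
    "j": ((), "Integer"),
    "vector-ref": (("Vector", "Integer"), "Integer"),
    "+": (("Integer", "Integer"), "Integer"),
}

def well_typed(depth):
    """Return {type: [expr, ...]} for all well-typed expressions up to depth."""
    by_type = {}
    for name, (args, result) in OPS.items():
        if not args:
            by_type.setdefault(result, []).append(name)
    for _ in range(depth):
        grown = {t: list(exprs) for t, exprs in by_type.items()}
        for name, (args, result) in OPS.items():
            if not args:
                continue
            # Only combine subexpressions whose types match the signature.
            for combo in product(*(by_type.get(t, []) for t in args)):
                expr = (name,) + combo
                if expr not in grown.setdefault(result, []):
                    grown[result].append(expr)
        by_type = grown
    return by_type

def untyped_count(depth):
    """Count expression trees up to depth when argument types are ignored."""
    leaves = sum(1 for args, _ in OPS.values() if not args)
    count = leaves
    for _ in range(depth):
        count = leaves + sum(
            count ** len(args) for args, _ in OPS.values() if args
        )
    return count
```

At depth 1 this fragment has 12 well-typed expressions against 36 untyped trees, and the gap widens as depth grows.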

## Results

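| # | Search space size | Solution size | Solver time (s) | Total runtime (s) |
|---|-------------------|---------------|-----------------|-------------------|
| 1 | $2^{207}$ | 3  | 3   |     |
| 2 | $2^{207}$ | 8  | 4   |     |
| 3 | $2^{228}$ | 10 | 30  | 44  |
| 4 | $2^{194}$ | 10 | 122 | 124 |
| 5 | $2^{192}$ | 20 | 89  | 91  |
| 6 | $2^{222}$ | 12 | 96  | 98  |
| 7 | $2^{279}$ | 12 | 62  | 64  |
| 8 | $2^{389}$ | 17 | 7   | 12  |
| 9 | $2^{505}$ | 23 | 75  | 80  |

Column labels on the slide: Perm, LDA, MB; Add, Remove, Size, Assign, Swap, Struct. Their mapping onto the nine benchmarks and the first two total-runtime values are not recoverable, so rows are numbered in slide order.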

## Related work

• Dynamic Approaches

> Build a dynamic dependence graph, incrementalize using change propagation

> Recent work done in Adapton

• Static Approaches

> For a particular subset of programs, statically transform the program to get a new, incremental version

> Incremental view maintenance (databases), object-oriented programming, Datalog, invariant checkers, etc.

## Future work

• Improve scalability - try new encodings
• Synthesis of auxiliary data structures
• Explore new application domains


@sumith1896

[Figure: symbolic syntax graph for a topic-model update. Node types: Statement, Integer, Topic, Word, Vector. Leaves: `n-topic-text`, `new-topic`, `old-topic`, `word`, constant. Operations: `vector-set!`, `vector-ref`, `vector-inc!`, `vector-dec!`, `+`, `*`, `word->text`.]

By Sumith Kulal

# Automatic Incrementalization through Synthesis

Presentation of the talk "Automatic Incrementalization through Synthesis" delivered to the Bodik group at UW on July 24th, 2017.
