Jiangyi Liu, Xingyu Xie
# Superoptimization
Enumerative search
Stochastic search
Synthesis-based
# Enumerative search
Basic idea: enumerate all programs with cost lower than the one to optimize, checking each for equivalence with the target.
Many techniques exist to speed up the search:
(ASPLOS'16) Phitchaya Mangpo Phothilimthana et al. Scaling up Superoptimization
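The basic idea above can be sketched in a few lines. This is a minimal illustrative enumerator over a made-up toy ISA (the instruction names, cost model, and target are all assumptions for the example, not from any real superoptimizer); equivalence is checked by exhaustive testing over the full 8-bit input space, so it is sound for this toy domain.

```python
from itertools import product

# Toy ISA: each instruction transforms a single 8-bit register value.
INSTRS = {
    "add1": lambda x: (x + 1) & 0xFF,
    "sub1": lambda x: (x - 1) & 0xFF,
    "shl1": lambda x: (x << 1) & 0xFF,
    "neg":  lambda x: (-x) & 0xFF,
}

def run(program, x):
    for op in program:
        x = INSTRS[op](x)
    return x

def enumerative_search(target, max_cost, tests):
    """Enumerate every program cheaper than `max_cost` (cost = length)
    and return the first one agreeing with `target` on all tests."""
    for cost in range(1, max_cost):          # only strictly cheaper programs
        for program in product(INSTRS, repeat=cost):
            if all(run(program, x) == target(x) for x in tests):
                return list(program)
    return None

# Target: 2*x + 2, written naively as three instructions (shl1; add1; add1).
target = lambda x: (2 * x + 2) & 0xFF
print(enumerative_search(target, max_cost=3, tests=range(256)))  # → ['add1', 'shl1']
```

The search rediscovers the classic strength-reduction trick `(x + 1) << 1`, one instruction shorter than the naive version; the speedup techniques cited above exist precisely because this exhaustive enumeration explodes beyond a handful of instructions.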
# Stochastic search
[Figure: the program space, with the blue region marking programs equivalent to the target; the cost function is defined over this space]
Basic idea (Metropolis-Hastings algorithm): mutate the current program, and accept the mutation with a probability determined by the cost function.
(ASPLOS'13) Eric Schkufza et al. Stochastic Superoptimization
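The accept/reject step can be sketched as follows. This is a toy in the spirit of STOKE, not its actual implementation: the ISA, mutation kinds, cost function, and parameters are all illustrative assumptions.

```python
import math, random

# Toy ISA over one 8-bit register.
INSTRS = {
    "add1": lambda x: (x + 1) & 0xFF,
    "sub1": lambda x: (x - 1) & 0xFF,
    "shl1": lambda x: (x << 1) & 0xFF,
}
TESTS = range(16)
target = lambda x: (2 * x + 2) & 0xFF   # the function to match

def run(program, x):
    for op in program:
        x = INSTRS[op](x)
    return x

def cost(program):
    # Correctness error on the tests plus a small penalty per instruction,
    # so the search prefers shorter equivalent programs.
    err = sum(run(program, x) != target(x) for x in TESTS)
    return err + 0.1 * len(program)

def mutate(program):
    p, op = list(program), random.choice(list(INSTRS))
    kind = random.random()
    if kind < 0.4 and p:                  # replace a random instruction
        p[random.randrange(len(p))] = op
    elif kind < 0.7 and len(p) < 6:       # insert an instruction
        p.insert(random.randrange(len(p) + 1), op)
    elif p:                               # delete an instruction
        del p[random.randrange(len(p))]
    return p

def mh_search(steps=20000, beta=2.0, seed=0):
    random.seed(seed)
    current = best = []
    for _ in range(steps):
        candidate = mutate(current)
        delta = cost(candidate) - cost(current)
        # Metropolis-Hastings: always accept an improvement; accept a
        # worse candidate with probability exp(-beta * delta).
        if delta <= 0 or random.random() < math.exp(-beta * delta):
            current = candidate
        if cost(current) < cost(best):
            best = current
    return best

best = mh_search()
print(best)
```

Accepting occasional cost-increasing mutations is what lets the walk escape local minima; the blue region of equivalent programs is reached by first wandering through incorrect ones.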
# Equivalence checker
Data-driven, non-deterministic approaches:
(OOPSLA'13) Rahul Sharma et al. Data-Driven Equivalence Checking
(PLDI'19) Berkeley Churchill et al. Semantic Program Alignment for Equivalence Checking
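A minimal, deliberately unsound sketch of the data-driven starting point: run both programs on many concrete inputs and compare outputs. The checkers cited above go much further, guessing invariants or alignments from the observed traces and then discharging them with an SMT solver; this sketch (all names illustrative) shows only the testing core.

```python
import random

def probably_equivalent(f, g, trials=10000, seed=0):
    """Unsound equivalence filter: sample random 32-bit inputs.
    A mismatch is a definite counterexample; agreement on all
    trials means the programs are only *probably* equivalent."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.randrange(2 ** 32)
        if f(x) != g(x):
            return False
    return True

# Candidate rewrite: x * 8 versus x << 3 on 32-bit values.
f = lambda x: (x * 8) & 0xFFFFFFFF
g = lambda x: (x << 3) & 0xFFFFFFFF
print(probably_equivalent(f, g))   # → True
```

In a superoptimizer this filter cheaply rejects most wrong candidates before the expensive sound check runs.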
# Testcases
(ASPLOS'17) Berkeley Churchill et al. Sound Loop Superoptimization for Google Native Client
# Reinforcement learning
Mutations:
Use RL to learn the probability distribution over mutations, which is uniform in plain stochastic search.
(ICLR'17) Rudy Bunel et al. Learning to superoptimize programs
(DL4C workshop at ICLR'22) Alex Shypula et al. Learning to superoptimize real-world programs
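The core idea, stripped to a toy, is to replace the uniform choice of mutation kind with a learned softmax distribution. Bunel et al. train a neural policy with REINFORCE; the bandit-style sketch below keeps only one logit per mutation kind, and the reward signal is simulated (all names and numbers are illustrative assumptions).

```python
import math, random

MUTATION_KINDS = ["replace", "insert", "delete", "swap"]
logits = {k: 0.0 for k in MUTATION_KINDS}   # uniform before learning

def probs():
    total = sum(math.exp(v) for v in logits.values())
    return {k: math.exp(v) / total for k, v in logits.items()}

def sample_kind(rng):
    p = probs()
    return rng.choices(MUTATION_KINDS, weights=[p[k] for k in MUTATION_KINDS])[0]

def update(kind, reward, lr=0.1):
    # REINFORCE gradient for a softmax policy:
    # d log p(kind) / d logit_k = 1[k == kind] - p_k, scaled by the reward.
    p = probs()
    for k in MUTATION_KINDS:
        logits[k] += lr * reward * ((1.0 if k == kind else 0.0) - p[k])

rng = random.Random(0)
for _ in range(500):
    kind = sample_kind(rng)
    # Simulated environment: pretend "replace" mutations tend to lower cost.
    reward = 1.0 if kind == "replace" else 0.0
    update(kind, reward)

print(max(logits, key=logits.get))   # → replace
```

After training, the sampler proposes cost-reducing mutation kinds more often, which is exactly the speedup the RL papers report over the uniform proposal.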
# SYNTHESIS
Souper: A Synthesizing Superoptimizer
- Component-based Synthesis; CEGIS
- Target: loop-free subset of LLVM IR
Raimondas Sasnauskas et al. A Synthesizing Superoptimizer
From Software Analysis (2021), instructed by Yingfei Xiong, Peking University
What is component-based synthesis?
The user provides a library of components:
\( \{\langle\vec I_i, O_i, \phi_i(\vec I_i, O_i)\rangle \mid i = 1 \dots N \}\)
- Here \(\phi_i\) is the constraint on a component
- to allow multiple uses of a component, put more than one copy in the library
In addition, a specification of the desired program must be given:
\( \langle \vec I, O, \phi_{spec}(\vec I, O)\rangle \)
GOAL: Find a straight-line program that only uses components given in the library
- A mental model: components are connected by input/output relations
- \(f_{impl}\) should satisfy the following formula
i.e., for every combination of input values and temporary-variable values, if the component specifications hold, then \(f_{impl}\) meets \(\phi_{spec}\)
Encode connections in SMT formulas.
- Divide I/O vars into sets:
- \(\mathbf{P} = \bigcup_{1 \le i \le N} \vec I_i\), \(\mathbf{R} = \{O_1, \dots, O_N\}\)
- Location = line number; the program inputs occupy locations \(0\) to \(|\vec I| - 1\)
- Locations \(|\vec I|\) to \(M-1\): one per component output, i.e. one line of the program (temporaries and the final output)
- \(M = |\vec I| + N\)
- Consistency: output locations are distinct (each component output occupies its own line)
\[\psi_{cons} := \bigwedge_{x,y\in\mathbf{R},x\ne y} l_x \ne l_y\]
- Acyclicity: connections must not form a cycle, i.e. each component's inputs are defined at earlier locations than its output
\[\psi_{acyc} := \bigwedge_{1 \le i \le N} \left(\bigwedge_{x \in \vec I_i} l_x < l_{O_i}\right)\]
Encode connections in SMT formulas (cont.)
- \(\psi_{wfp}\) = consistency + acyclicity + location bounds
\[\psi_{\mathrm{wfp}}(L):=\bigwedge_{x \in \mathbf{P}}\left(0 \leq l_{x} \leq M-1\right) \wedge \bigwedge_{x \in \mathbf{R}}\left(|\vec{I}| \leq l_{x} \leq M-1\right) \wedge\\ \psi_{cons}(\mathbf{L}) \wedge \psi_{acyc}(\mathbf{L})\]
where \(L\) stands for the set of locations
- \(\phi_{lib}\): library specs
\[\phi_{lib} := \bigwedge_{1 \le i \le N} \phi_i(\vec I_i, O_i)\]
- \(\psi_{conn}\): connections (variables mapped to the same location must be equal)
\[\psi_{conn} := \bigwedge_{x,y \in \mathbf{P} \cup \mathbf{R} \cup \vec I \cup \{O\}} (l_x = l_y \to x = y)\]
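To make the encoding concrete, the sketch below decodes a location assignment into a straight-line program and checks the well-formedness conditions above directly in Python. The components and the example target (`x & (x - 1)`, clearing the lowest set bit) are illustrative; a real synthesizer would obtain the locations from an SMT solver rather than write them down by hand.

```python
# Library: N = 2 components, each a (name, semantics) pair.
COMPONENTS = [
    ("sub1", lambda a: a - 1),       # O_1 = I_1 - 1
    ("and",  lambda a, b: a & b),    # O_2 = I_2,1 & I_2,2
]
NUM_INPUTS = 1                        # |I| = 1: the single program input x
M = NUM_INPUTS + len(COMPONENTS)      # M = |I| + N = 3

def well_formed(out_locs, in_locs):
    """Check psi_wfp: location bounds, consistency, and acyclicity."""
    bounds = all(0 <= l <= M - 1 for ls in in_locs for l in ls) and \
             all(NUM_INPUTS <= l <= M - 1 for l in out_locs)
    cons = len(set(out_locs)) == len(out_locs)                     # psi_cons
    acyc = all(l < out_locs[i]                                     # psi_acyc
               for i, ls in enumerate(in_locs) for l in ls)
    return bounds and cons and acyc

def run(x, out_locs, in_locs):
    """Execute the straight-line program denoted by the locations."""
    values = {0: x}                                   # input locations
    for loc in range(NUM_INPUTS, M):                  # lines in order
        i = out_locs.index(loc)
        _, fn = COMPONENTS[i]
        values[loc] = fn(*(values[l] for l in in_locs[i]))
    return values[M - 1]                              # final output O

# Location assignment for `x & (x - 1)`:
# sub1 reads location 0 (x) and writes location 1;
# and reads locations 0 and 1 and writes location 2.
out_locs, in_locs = [1, 2], [[0], [0, 1]]
assert well_formed(out_locs, in_locs)
print(run(0b1100, out_locs, in_locs))   # → 8, i.e. 0b1000
```

The connection constraint \(\psi_{conn}\) is implicit here: sharing a location in `values` is exactly what forces the connected variables to be equal.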
Encode connections in SMT formulas (cont.)
Synthesis Constraint
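Combining the pieces above, the synthesis constraint (as in Jha et al.'s oracle-guided component-based synthesis, on which this encoding is based) asks for a well-formed location assignment under which the component specifications and connections force the output to meet the spec:

\[\exists \mathbf{L}.\; \psi_{wfp}(\mathbf{L}) \wedge \forall \vec I, O, \mathbf{P}, \mathbf{R}.\; \big(\phi_{lib}(\mathbf{P}, \mathbf{R}) \wedge \psi_{conn}(\mathbf{L}, \vec I, O, \mathbf{P}, \mathbf{R})\big) \rightarrow \phi_{spec}(\vec I, O)\]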
Counterexample Guided Inductive Synthesis
- solves \(\exists L \forall \vec I: \phi(L, \vec I)\)
- \(\mathcal{S}\): finite set of valuations of \(\vec I\)
- Loop:
- find \(L\) that satisfies \(\phi(L, \vec I_i)\) for every \(\vec I_i \in \mathcal{S}\)
- check whether \(L\) satisfies the \(\forall\)-clause;
- if not, add the counterexample to \(\mathcal{S}\)
- else, \(L\) corresponds to the synthesized program
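The loop above can be sketched end to end. In this toy, brute-force search over a small candidate space stands in for the two SMT queries (synthesize and verify), and the ISA and spec are illustrative assumptions; the structure of the loop is the CEGIS structure itself.

```python
from itertools import product

DOMAIN = range(256)                        # all 8-bit inputs
spec = lambda x: (2 * x + 2) & 0xFF        # phi_spec: output == 2*x + 2

INSTRS = {
    "add1": lambda x: (x + 1) & 0xFF,
    "shl1": lambda x: (x << 1) & 0xFF,
    "dbl":  lambda x: (2 * x) & 0xFF,
}

def run(program, x):
    for op in program:
        x = INSTRS[op](x)
    return x

# Candidate space: programs of one or two instructions.
CANDIDATES = [list(p) for n in (1, 2) for p in product(INSTRS, repeat=n)]

def cegis():
    examples = [0]                         # S: finite set of input valuations
    while True:
        # "Synthesize": find a candidate consistent with every example in S.
        candidate = next((c for c in CANDIDATES
                          if all(run(c, x) == spec(x) for x in examples)), None)
        if candidate is None:
            return None                    # no program in the space works
        # "Verify": check the forall-clause over the whole domain.
        cex = next((x for x in DOMAIN if run(candidate, x) != spec(x)), None)
        if cex is None:
            return candidate               # verified: synthesis succeeded
        examples.append(cex)               # learn from the counterexample

print(cegis())   # → ['add1', 'shl1']
```

With the initial example `x = 0`, the first round proposes `add1; add1`, verification returns the counterexample `x = 1`, and the second round converges to the correct `add1; shl1`: a two-iteration CEGIS run in miniature.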
Souper is a superoptimizer based on CEGIS and component-based synthesis.
`infer` marks the entry point of the superoptimizer.
The CEGIS process is wrapped in an outer loop that bounds the cost of candidate programs; thus the first result yielded is always the optimized version.
# Expansions
(PLDI'14) Eric Schkufza et al. Stochastic Optimization of Floating-Point Programs with Tunable Precision
(OOPSLA'15) Rahul Sharma et al. Conditionally Correct Superoptimization
(ASPLOS'16) Phitchaya Mangpo Phothilimthana et al. Scaling up Superoptimization
# Comparison
Tool | Approach | Language | Size | Loop? |
---|---|---|---|---|
STOKE | Stochastic search | x86-64 | ~100 inst, <=200 inst | Y |
Souper | Synthesis | Souper IR (from LLVM IR) | 1KB | N |
GreenThumb | Enumerative search | ARMv7-A, GreenArrays | ~10 inst, <=30 inst | N |
# Plans
Reproduce STOKE on ARM64
# Research possibilities