Automatic Generation of Peephole Superoptimizers
Sorav Bansal and Alex Aiken
Gokulan R
CS15B033
29 April 2020
Summary of the paper
- Peephole optimization: replace one instruction sequence with an equivalent but cheaper (faster or smaller) instruction sequence
- Goal: automatically discover replacement sequences instead of relying on hand-written optimization rules
- Three-part optimizer:
  - harvester: extracts the target instruction sequences to optimize from training programs
  - enumerator: exhaustively enumerates all possible sequences up to a certain length, checking whether each candidate is a valid replacement for a target sequence
  - optimization database: index of all discovered optimizations, used to look up replacements when optimizing applications
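To see why the enumerator is bounded to "a certain length", the size of the search space can be estimated with a little arithmetic. The sketch below is only illustrative: the 16-instruction set is a hypothetical toy figure, not a number from the paper, and `search_space` is a helper name introduced here.

```python
def search_space(num_instructions, max_len):
    """Count every sequence of length 1..max_len over num_instructions choices."""
    return sum(num_instructions ** k for k in range(1, max_len + 1))

# Hypothetical toy ISA with 16 distinct instructions.
for max_len in (1, 2, 3, 4):
    print(max_len, search_space(16, max_len))
# 1 16
# 2 272
# 3 4368
# 4 69904
```

The count grows geometrically with the length bound, which is why a real instruction set with many opcodes and operands can only be enumerated for short sequences.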
Key Algorithm
offline training phase: // input: training program
    harvester: extract target instruction sequences
    canonicalizer: rename registers to reduce the search space
    fingerprinter: compute a fingerprint for each target sequence
                   and store it in the fingerprint table
    enumerator: generate each possible sequence of assembly instructions
        fingerprint the generated sequence
        search the fingerprint table: check whether any sequence from the
                   training program has the same fingerprint
        if yes:
            boolean check: confirm the two sequences are equivalent and their contexts match
            if yes:
                store the pair in the optimization table
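The offline phase above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the two-register, 4-bit toy ISA, the names `run`, `fingerprint`, `equivalent` and `train`, and the use of sequence length as the cost are all hypothetical. Targets are assumed to be canonicalized already, and an exhaustive check over all register states stands in for the boolean check.

```python
import itertools
import random

REGS = ("r0", "r1")
MASK = 0xF  # 4-bit registers keep the exhaustive check small

# Hypothetical toy ISA: each instruction is (opcode, dst, src).
OPS = {
    "mov": lambda d, s: s,
    "add": lambda d, s: (d + s) & MASK,
    "sub": lambda d, s: (d - s) & MASK,
    "xor": lambda d, s: d ^ s,
}
INSTRS = [(op, d, s) for op in OPS for d in REGS for s in REGS]

def run(seq, state):
    """Execute a sequence on a register state and return the final state."""
    state = dict(state)
    for op, dst, src in seq:
        state[dst] = OPS[op](state[dst], state[src])
    return state

_rng = random.Random(0)
TEST_VECTORS = [{r: _rng.randrange(MASK + 1) for r in REGS} for _ in range(4)]

def fingerprint(seq):
    """Outputs on a few fixed test vectors; equivalent sequences always agree."""
    return tuple(tuple(run(seq, tv)[r] for r in REGS) for tv in TEST_VECTORS)

def equivalent(a, b):
    """Exhaustive comparison over all states (stand-in for the boolean check)."""
    for vals in itertools.product(range(MASK + 1), repeat=len(REGS)):
        state = dict(zip(REGS, vals))
        if run(a, state) != run(b, state):
            return False
    return True

def train(targets, max_len=2):
    """Build {target: shorter equivalent sequence} by exhaustive enumeration."""
    table = {}  # fingerprint -> target sequences with that fingerprint
    for t in targets:
        table.setdefault(fingerprint(t), []).append(t)
    found = {}
    for length in range(1, max_len + 1):  # shortest candidates first
        for cand in itertools.product(INSTRS, repeat=length):
            for t in table.get(fingerprint(cand), []):
                if t not in found and len(cand) < len(t) and equivalent(cand, t):
                    found[t] = cand
    return found

target = (("sub", "r0", "r0"), ("add", "r0", "r1"))  # r0 = 0; r0 += r1
optimizations = train([target])
print(optimizations[target])  # (('mov', 'r0', 'r1'),)
```

The fingerprint acts as a cheap filter: only candidates whose outputs agree with a target on the test vectors reach the expensive equivalence check.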
online optimization phase: // input: program to compile
    repeat until no further optimization is found:
        harvest an instruction sequence
        canonicalize it
        fingerprint it
        search the optimization table
        replace the sequence if an optimization is found
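The online phase could look roughly like the sketch below. It is a simplified illustration under several assumptions: the `(opcode, dst, src)` instruction tuples, the two-register toy machine, the single hand-written table entry, and the names `canonicalize` and `optimize` are all hypothetical. For brevity the table is keyed by the canonical sequence itself rather than by a fingerprint, and the program is scanned in a single left-to-right pass with a fixed window.

```python
REGS = ("r0", "r1")

# Hypothetical optimization table: canonical target -> cheaper replacement.
OPTIMIZATIONS = {
    (("sub", "r0", "r0"), ("add", "r0", "r1")): (("mov", "r0", "r1"),),
}

def canonicalize(seq):
    """Rename registers in order of first use; also return the inverse renaming."""
    order = []
    for _, dst, src in seq:
        for reg in (dst, src):
            if reg not in order:
                order.append(reg)
    order += [r for r in REGS if r not in order]  # registers the sequence never uses
    rename = dict(zip(order, REGS))
    canon = tuple((op, rename[d], rename[s]) for op, d, s in seq)
    inverse = {new: old for old, new in rename.items()}
    return canon, inverse

def optimize(program, table, window=2):
    """Slide a window over the program and replace sequences found in the table."""
    out, i = [], 0
    while i < len(program):
        chunk = program[i:i + window]
        canon, inverse = canonicalize(chunk)
        repl = table.get(canon)
        if repl is None:
            out.append(program[i])  # no match: keep the instruction, move on
            i += 1
        else:
            # Map the replacement back to the registers the program actually used.
            out.extend((op, inverse[d], inverse[s]) for op, d, s in repl)
            i += len(chunk)
    return out

program = [("add", "r0", "r1"), ("sub", "r1", "r1"), ("add", "r1", "r0")]
print(optimize(program, OPTIMIZATIONS))
# [('add', 'r0', 'r1'), ('mov', 'r1', 'r0')]
```

Because of canonicalization, the single table entry written in terms of `r0` and `r1` also matches the same pattern with the registers swapped, which is the point of renaming registers before the lookup.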
Future Work
- Increase the maximum length of enumerated instruction sequences as more compute resources become available
- Extend the approach to PTX assembly, used on CUDA-capable GPUs
- The current work optimizes CISC (x86) assembly; it can be extended to RISC instruction sets such as MIPS and RISC-V
- An efficient compression scheme for storing entries in the optimization database
- Sequence-specific test-vector generation to verify correctness
By Gokulan Ravi