Automatic Generation of Peephole Superoptimizers

Sorav Bansal and Alex Aiken

Gokulan R

CS15B033
29 April 2020

Summary of the paper

  • Peephole optimization: replace one instruction sequence with an equivalent but better (faster or smaller) instruction sequence
  • Goal: automatically discover replacement sequences instead of relying on hand-tuned optimizations
  • Three-part optimizer:
    • harvester: extract instruction sequences to optimize
    • enumerator: exhaustively enumerates all possible sequences up to a certain length, checking whether each candidate can replace a harvested sequence
    • optimization database: index of all discovered optimizations, used to apply them to applications
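
To make the enumerator idea concrete, here is a minimal sketch on a made-up 8-bit, single-register machine. The instruction set `OPS` and the function names are hypothetical and chosen only for illustration. Equivalence is checked by brute force over all 256 inputs, which works only because the toy state is tiny; it is not how equivalence is established for real machine code.

```python
from itertools import product

MASK = 0xFF  # toy machine: one 8-bit register

# Hypothetical toy instruction set: name -> effect on the register value.
OPS = {
    "inc": lambda x: (x + 1) & MASK,
    "dec": lambda x: (x - 1) & MASK,
    "neg": lambda x: (-x) & MASK,
    "not": lambda x: (~x) & MASK,
    "shl": lambda x: (x << 1) & MASK,
}

def run(seq, x):
    """Execute a sequence of toy instructions on the input value x."""
    for name in seq:
        x = OPS[name](x)
    return x

def equivalent(a, b):
    """Brute-force equivalence: compare results on every 8-bit input."""
    return all(run(a, x) == run(b, x) for x in range(MASK + 1))

def superoptimize(target):
    """Return the first strictly shorter equivalent sequence, or None."""
    for length in range(len(target)):            # shortest candidates first
        for cand in product(OPS, repeat=length):  # exhaustive enumeration
            if equivalent(cand, target):
                return list(cand)
    return None
```

For example, `superoptimize(["not", "inc"])` returns `["neg"]` (two's-complement negation), and `superoptimize(["inc", "dec"])` returns `[]`, meaning the pair can simply be deleted. The number of candidates grows exponentially with the length, which is why the sequence length has to be bounded.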

Key Algorithm

offline training phase: // input: training program
    harvester: extract target instruction sequences
    canonicalizer: rename registers to reduce the search space
    fingerprinter: compute a fingerprint for each harvested sequence
        and store it in the fingerprint table

    enumerator: generate every candidate sequence of assembly instructions up to the length limit
        fingerprinting: compute the fingerprint of each generated sequence
        search the fingerprint table: check whether a possibly equivalent
            sequence was harvested from the training program
        if yes:
            boolean check: verify that the sequences are equivalent and their contexts match
            if yes:
                store the pair in the optimization table
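
The canonicalize and fingerprint steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: instructions are hypothetical `(op, dst, src)` tuples over a four-opcode toy ISA, a sequence uses at most three registers, the test vectors are arbitrary, and the fingerprint is a CRC32 of the register state after running the sequence on each vector. Matching fingerprints only mark a candidate as possibly equivalent, so a real equivalence check must still follow.

```python
import zlib

MASK = 0xFFFFFFFF  # 32-bit registers

# Arbitrary fixed initial values for r0, r1, r2 (an assumption of this sketch).
TEST_VECTORS = [(0, 0, 0), (1, 2, 3), (0xFFFFFFFF, 1, 0x80000000), (12345, 678, 9)]

# Hypothetical toy semantics: new value of dst given old dst (d) and src (s).
SEMANTICS = {
    "mov": lambda d, s: s,
    "add": lambda d, s: (d + s) & MASK,
    "sub": lambda d, s: (d - s) & MASK,
    "xor": lambda d, s: d ^ s,
}

def canonicalize(seq):
    """Rename registers to r0, r1, ... in order of first appearance."""
    mapping = {}
    out = []
    for op, dst, src in seq:
        for reg in (dst, src):
            mapping.setdefault(reg, f"r{len(mapping)}")
        out.append((op, mapping[dst], mapping[src]))
    return out

def execute(seq, regs):
    """Run a sequence on a copy of the register state and return the result."""
    state = dict(regs)
    for op, dst, src in seq:
        state[dst] = SEMANTICS[op](state[dst], state[src])
    return state

def fingerprint(seq):
    """Hash the final register state over all test vectors."""
    canon = canonicalize(seq)
    h = 0
    for vec in TEST_VECTORS:
        state = execute(canon, {f"r{i}": v for i, v in enumerate(vec)})
        data = ",".join(str(state[f"r{i}"]) for i in range(len(vec)))
        h = zlib.crc32(data.encode(), h)  # chain the running checksum
    return h

def build_table(harvested):
    """Fingerprint table: fingerprint -> canonical harvested sequences."""
    table = {}
    for seq in harvested:
        table.setdefault(fingerprint(seq), []).append(canonicalize(seq))
    return table
```

In this sketch `xor eax, eax` and `sub ebx, ebx` canonicalize to sequences over `r0` that both zero the register, so they land in the same bucket of the fingerprint table, and the enumerator would then hand that pair to the more expensive boolean check.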

online optimization phase: // input: program to compile
    repeat until no further optimization is found:
        harvest an instruction sequence
        canonicalize it
        fingerprint it
        search the optimization table
        replace the sequence if an optimization is found
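
The online loop above can be sketched as a sliding-window rewrite that repeats until nothing changes. This is a minimal illustration under assumptions: instructions are hypothetical `(op, dst, src)` tuples, the hand-written `OPT_TABLE` stands in for a learned optimization table, and the lookup is keyed directly on the canonical sequence rather than on a fingerprint. Flags and register liveness are ignored here, although a real optimizer has to account for them.

```python
def canonicalize(seq):
    """Rename registers to r0, r1, ...; also return the renaming used."""
    mapping = {}
    out = []
    for op, dst, src in seq:
        for reg in (dst, src):
            mapping.setdefault(reg, f"r{len(mapping)}")
        out.append((op, mapping[dst], mapping[src]))
    return tuple(out), mapping

# Hypothetical optimization table: canonical sequence -> shorter replacement.
# Every replacement is strictly shorter, which guarantees termination.
OPT_TABLE = {
    (("mov", "r0", "r1"), ("mov", "r0", "r2")): (("mov", "r0", "r2"),),  # dead store
    (("add", "r0", "r1"), ("sub", "r0", "r1")): (),                      # cancels out
    (("mov", "r0", "r0"),): (),                                          # self move
}

def optimize(program, table=OPT_TABLE, max_window=2):
    """Apply table rewrites until no window of the program matches."""
    program = list(program)
    changed = True
    while changed:
        changed = False
        for size in range(max_window, 0, -1):        # try longer windows first
            for i in range(len(program) - size + 1):
                key, mapping = canonicalize(program[i:i + size])
                if key in table:
                    # Map canonical names back to the program's registers.
                    inverse = {canon: reg for reg, canon in mapping.items()}
                    program[i:i + size] = [
                        (op, inverse[d], inverse[s]) for op, d, s in table[key]
                    ]
                    changed = True
                    break
            if changed:
                break                                # rescan from the start
    return program
```

Rescanning after every replacement lets one rewrite expose another. For instance, the five-instruction program `mov eax, ebx; mov eax, ecx; add eax, edx; sub eax, edx; mov esi, esi` (written destination-first) reduces to the single instruction `mov eax, ecx` in three rewrites.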

Future Work

  • Increasing the maximum length of enumerated instruction sequences as more compute resources become available
  • Extending the approach to PTX, the assembly-like code used on CUDA-capable GPUs
  • The current work optimizes CISC assembly; it can be extended to RISC instruction sets such as MIPS and RISC-V
  • An efficient compression scheme for storing entries in the optimization database
  • Sequence-specific testvector generation to verify correctness

By Gokulan Ravi
