SinScheme

A Compiler For No One

Presented By: DAVIS SILVERMAN

What is a Compiler?

A Text Processor

Source Language Code

Destination Language Code

??????

LISP

LISt Processing

Lots of Irritating, Silly Parens

(define bar 5)

(define (square n) (* n n))

(define values
  (let ([xs '(1 2 3 4 5)])
    (map (lambda (e) (+ (square e) 1)) xs)))

Lisp (Racket)

(* 2 (+ 3 4))

Compiling Lisp

  • Super easy to parse!
    • Only have to worry about actual code generation
  • Lisp is well understood academically
  • Lisp is easily understood by beginners
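
To see why parsing is close to free: Racket's built-in `read` already turns program text into nested lists and symbols. A minimal sketch, where the helper name `parse-string` is made up for illustration and is not part of SinScheme:

```racket
#lang racket

;; parse-string (hypothetical helper): read one S-expression from a string.
(define (parse-string str)
  (read (open-input-string str)))

;; The "AST" is just nested lists and symbols.
(define ast (parse-string "(* 2 (+ 3 4))"))

(first ast)   ; the operator symbol
(third ast)   ; the nested sub-expression
```

Every later phase can then pattern-match on these lists directly.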

LLVM

Generating an Executable is HARD

  • Register allocation + other academic problems
  • Platform dependencies
  • Assembly is not a great language

LLVM Makes Generating Executables Easy!

  • Typed, assembly-like language (LLVM IR)
  • Many supported platforms
  • Handles all 'real' code generation

Enter SinScheme

Scheme -> LLVM IR Compiler

https://github.com/sinistersnare/sinscheme

SinScheme

A Tour of the various compilation phases

Functional Compilers

  • Much like functional programming, a functional compiler breaks the work into many smaller problems and solves each one individually
  • Many phases of compilation
    • Lots of PL theory here!!

e ::= (define x e)

    | (define (x x ... defaultparam ...) e ...+)

    | (define (x x ... . x) e ...+)

    | (letrec* ([x e] ...) e ...+)

    | (letrec ([x e] ...) e ...+)

    | (let* ([x e] ...) e ...+)

    | (let ([x e] ...) e ...+)

    | (let x ([x e] ...) e ...+)

    | (lambda (x ... defaultparam ...) e ...+)

    | (lambda x e ...+)

    | (lambda (x ...+ . x) e ...+)

    | (dynamic-wind e e e)

    | (guard (x cond-clause ...) e ...+)

    | (raise e)

    | (delay e)

    | (force e)

    | (and e ...)

    | (or e ...)

    | (match e match-clause ...)

    | (cond cond-clause ...)

    | (case e case-clause ...)

    | (if e e e)

    | (when e e ...+)

    | (unless e e ...+)

    | (set! x e)

    | (begin e ...+)

    | (call/cc e)

    | (apply e e)

    | (e e ...)

    | x

    | op

    | (quasiquote qq)

    | (quote dat)

    | nat | string | #t | #f

 

 

   

cond-clause ::= (e) | (e e e ...) | (else e e ...)

case-clause ::= ((dat ...) e e ...) | (else e e ...)

match-clause ::= (pat e e ...) | (else e e ...)

; in all cases, else clauses must come last

dat is a datum satisfying datum? from utils.rkt

x is a variable (satisfies symbol?)

defaultparam ::= (x e)

op is a symbol satisfying prim? from utils.rkt (if not otherwise in scope)

op ::= promise? | null? | cons | car | + | ...  (see utils.rkt)

qq ::= e | dat | (unquote qq) | (unquote e) | (quasiquote qq)

     | (qq ...+) | (qq ...+ . qq)

;; (quasiquote has the same semantics as in Racket)

pat ::= nat | string | #t | #f | (quote dat) | x | (? e pat) | (cons pat pat) | (quasiquote qqpat)

qqpat ::= e | dat | (unquote qqpat) | (unquote pat) | (quasiquote qq)

        | (qq ...+) | (qq ...+ . qq)

;; (same semantics as Racket match for this subset of patterns)

Top Level Translations

  • Removes pattern matching
  • Quotes all datums
  • Removes all defines
    • *creating one big giant expression*
      • Reminiscent of the Lambda Calculus

 

e ::= (letrec* ([x e] ...) e)

    | (letrec ([x e] ...) e)

    | (let* ([x e] ...) e)

    | (let ([x e] ...) e)

    | (let x ([x e] ...) e)

    | (lambda (x ...) e)

    | (lambda x e)

    | (lambda (x ...+ . x) e)

    | (dynamic-wind e e e)

    | (guard (x cond-clause ...) e)

    | (raise e)

    | (delay e)

    | (force e)

    | (and e ...)

    | (or e ...)

    | (cond cond-clause ...)

    | (case e case-clause ...)

    | (if e e e)

    | (when e e)

    | (unless e e)

    | (set! x e)

    | (begin e ...+)

    | (call/cc e)

    | (apply e e)

    | (e e ...)

    | x

    | op

    | (quote dat)

 

cond-clause ::= (e) | (e e) | (else e)

case-clause ::= ((dat ...) e) | (else e)

; in all test cases, else clauses always come last

dat is a datum satisfying datum? from utils.rkt

x is a variable (satisfies symbol?)

op is a symbol satisfying prim? from utils.rkt (if not otherwise in scope)

op ::= promise? | null? | cons | car | + | ...  (see utils.rkt)
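
The "one big giant expression" idea can be sketched as folding every top-level `define` into a single `letrec*`. This is an illustration under simplifying assumptions, not SinScheme's actual pass: the name `top-level-sketch` is hypothetical, and it skips datum quoting, pattern matching, default parameters, and rest arguments.

```racket
#lang racket

;; top-level-sketch (hypothetical name): fold a list of top-level forms
;; into one letrec* expression. Handles only (define x e) and
;; (define (f x ...) e ...+); all other forms become the body.
(define (top-level-sketch forms)
  (define-values (defs body)
    (partition (lambda (f) (and (pair? f) (eq? (car f) 'define))) forms))
  (define bindings
    (for/list ([d defs])
      (match d
        ;; function shorthand first, so it is not mistaken for (define x e)
        [`(define (,f ,params ...) ,es ...)
         `[,f (lambda ,params (begin ,@es))]]
        [`(define ,x ,e)
         `[,x ,e]])))
  `(letrec* ,bindings (begin ,@body)))
```

For example, `(top-level-sketch '((define bar 5) (define (square n) (* n n)) (square bar)))` produces a single `letrec*` that binds `bar` and `square` and has `(square bar)` as its body.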

Desugaring

  • Removes some unneeded sugar
  • Turns all bindings into let bindings
  • Desugars promises and exception handling

e ::= (let ([x e] ...) e)

    | (lambda (x ...) e)

    | (lambda x e)

    | (apply e e)

    | (e e ...)

    | (prim op e ...)

    | (apply-prim op e)

    | (if e e e)

    | (set! x e)

    | (call/cc e)

    | x

    | (quote dat)

 

 

dat is a datum satisfying datum? from utils.rkt

x is a variable (satisfies symbol?)

op is a symbol satisfying prim? from utils.rkt (if not otherwise in scope)

op ::= promise? | null? | cons | car | + | ...  (see utils.rkt)
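
A rough sketch of what desugaring looks like for a handful of forms. The name `desugar-sketch` is hypothetical, the use of a `void` primitive is an assumption about what utils.rkt provides, and most of the grammar (promises, exceptions, `letrec`, `cond`, and so on) is left out:

```racket
#lang racket

;; desugar-sketch (hypothetical name): rewrite a few sugary forms into
;; the core forms let and if. Anything it does not recognize is
;; returned unchanged, so this is far from a complete pass.
(define (desugar-sketch e)
  (match e
    [`(and) ''#t]
    [`(and ,e0) (desugar-sketch e0)]
    [`(and ,e0 ,es ...)
     `(if ,(desugar-sketch e0) ,(desugar-sketch `(and ,@es)) '#f)]
    ;; (prim void) assumes void is among the primitives
    [`(when ,c ,body)
     `(if ,(desugar-sketch c) ,(desugar-sketch body) (prim void))]
    [`(unless ,c ,body)
     `(if ,(desugar-sketch c) (prim void) ,(desugar-sketch body))]
    ;; let* becomes a chain of single-binding lets
    [`(let* () ,body) (desugar-sketch body)]
    [`(let* ([,x ,rhs] ,rest ...) ,body)
     `(let ([,x ,(desugar-sketch rhs)])
        ,(desugar-sketch `(let* ,rest ,body)))]
    [`(if ,c ,t ,f)
     `(if ,(desugar-sketch c) ,(desugar-sketch t) ,(desugar-sketch f))]
    [_ e]))
```

Each clause maps one form of the input grammar onto forms that survive into the output grammar, which is why the language gets smaller after every phase.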

Assignment Conversion

  • Removes `set!` (mutation of variables) from the language
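
A hand-converted example of the idea, assuming mutable variables are boxed in one-slot vectors. That is a common way to do assignment conversion; the slides do not say which representation SinScheme uses, and the function names here are made up.

```racket
#lang racket

;; Before: a counter that mutates its captured variable with set!.
(define (make-counter/set!)
  (let ([n 0])
    (lambda ()
      (set! n (+ n 1))
      n)))

;; After (hand-converted): n names a one-slot vector that is never
;; reassigned. Reads become vector-ref, writes become vector-set!.
(define (make-counter/boxed)
  (let ([n (make-vector 1 0)])
    (lambda ()
      (vector-set! n 0 (+ (vector-ref n 0) 1))
      (vector-ref n 0))))
```

Both counters behave identically, but in the second one no variable is ever reassigned, so later phases can treat variables as immutable names.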

Alphatization

  • Ensures that there is no variable shadowing
  • This allows us to de-nest all `let` forms, as there will be no ambiguity

e ::= (let ([x e] ...) e)

    | (lambda (x ...) e)

    | (lambda x e)

    | (apply e e)

    | (e e ...)

    | (prim op e ...)

    | (apply-prim op e)

    | (if e e e)

    | (call/cc e)

    | x

    | (quote dat)

 

 

dat is a datum satisfying datum? from utils.rkt

x is a variable (satisfies symbol?)

op is a symbol satisfying prim? from utils.rkt (if not otherwise in scope)

op ::= promise? | null? | cons | car | + | ...  (see utils.rkt)
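
Alphatization can be sketched as a renaming walk that carries an environment from original names to fresh ones. The names `alphatize-sketch`, `fresh`, and `extend` are hypothetical, and the sketch assumes no program variable shadows a keyword such as `let` or `prim`:

```racket
#lang racket

;; fresh: make a unique name by appending a global counter.
(define counter 0)
(define (fresh x)
  (set! counter (+ counter 1))
  (string->symbol (format "~a~a" x counter)))

;; extend: record that each x in xs is now called the matching x+.
(define (extend env xs xs+)
  (for/fold ([env env]) ([x xs] [x+ xs+])
    (hash-set env x x+)))

;; alphatize-sketch (hypothetical name): give every bound variable a
;; unique name. env maps original names to their current fresh names.
(define (alphatize-sketch e [env (hash)])
  (define (rename e) (alphatize-sketch e env))
  (match e
    [`(quote ,_) e]
    [`(let ([,xs ,rhss] ...) ,body)
     (let ([xs+ (map fresh xs)])
       ;; right-hand sides are renamed in the OUTER environment
       `(let ,(map list xs+ (map rename rhss))
          ,(alphatize-sketch body (extend env xs xs+))))]
    [`(lambda (,xs ...) ,body)
     (let ([xs+ (map fresh xs)])
       `(lambda ,xs+ ,(alphatize-sketch body (extend env xs xs+))))]
    [`(if ,c ,t ,f) `(if ,(rename c) ,(rename t) ,(rename f))]
    [(? symbol? x) (hash-ref env x x)]   ; free names are left alone
    [`(,f ,args ...) (map rename e)]
    [_ e]))
```

Calling it on `'(let ([x '1]) (let ([x (f x)]) x))` gives the two bindings of `x` distinct names, so the nested `let` forms can later be flattened without ambiguity.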

Administrative Normal Form

  • Lifts all 'complex' expressions into let bindings
  • This forces an evaluation order for all expressions

e ::= (let ([x e]) e)

    | (apply ae ae)

    | (ae ae ...)

    | (prim op ae ...)

    | (apply-prim op ae)

    | (if ae e e)

    | (call/cc ae)

    | ae

ae ::= (lambda (x ...) e)

     | (lambda x e)

     | x

     | (quote dat)
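
A minimal sketch of ANF conversion, restricted to atoms and `prim` applications so it stays short. It follows the usual continuation-based normalization style; the function names are hypothetical and this is not SinScheme's implementation.

```racket
#lang racket

;; fresh: generate temporaries t1, t2, ...
(define counter 0)
(define (fresh)
  (set! counter (+ counter 1))
  (string->symbol (format "t~a" counter)))

;; atomic?: variables and quoted data need no further evaluation.
(define (atomic? e)
  (match e
    [(? symbol?) #t]
    [`(quote ,_) #t]
    [_ #f]))

;; normalize: k receives the normalized form of e and builds the
;; rest of the program around it. Only atoms and prim are handled.
(define (normalize e k)
  (match e
    [(? atomic?) (k e)]
    [`(prim ,op ,args ...)
     (normalize-names args (lambda (args+) (k `(prim ,op ,@args+))))]))

;; normalize-name: like normalize, but k always receives an atom;
;; complex results are first bound to a fresh temporary with let.
(define (normalize-name e k)
  (normalize e
             (lambda (e+)
               (if (atomic? e+)
                   (k e+)
                   (let ([t (fresh)])
                     `(let ([,t ,e+]) ,(k t)))))))

;; normalize-names: normalize a list of arguments left to right.
(define (normalize-names es k)
  (if (null? es)
      (k '())
      (normalize-name (car es)
                      (lambda (a)
                        (normalize-names (cdr es)
                                         (lambda (as) (k (cons a as))))))))

(define (anf-sketch e) (normalize e (lambda (x) x)))
```

Applied to `'(prim * '2 (prim + '3 '4))`, the inner addition is lifted into its own `let`, so the order of evaluation is spelled out in the program text.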

Continuation Passing Style

  • Turns every call into a tail call
  • The program never returns; it only calls continuations
  • At the end of the chain, we halt
  • This lets us apply tail-recursion elimination (TRE) to every call

e ::= (let ([x (apply-prim op ae)]) e)

    | (let ([x (prim op ae ...)]) e)

    | (let ([x (lambda (x ...) e)]) e)

    | (let ([x (lambda x e)]) e)

    | (let ([x (quote dat)]) e)

    | (apply ae ae)

    | (ae ae ...)

    | (if ae e e)

ae ::= (lambda (x ...) e)

     | (lambda x e)

     | x

     | (quote dat)
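
To make the CPS idea concrete, here is factorial written by hand in direct style and in continuation-passing style. This is an illustration of the style, not output from SinScheme; the name `halt` for the final continuation is an assumption.

```racket
#lang racket

;; Direct style: the multiplication happens after the recursive call
;; returns, so that call is not a tail call.
(define (fact n)
  (if (= n 0)
      1
      (* n (fact (- n 1)))))

;; CPS: every function takes an extra continuation k and "returns"
;; by calling it. Every call is now a tail call.
(define (fact-cps n k)
  (if (= n 0)
      (k 1)
      (fact-cps (- n 1)
                (lambda (v) (k (* n v))))))

;; halt (hypothetical name): the final continuation at the end of the chain.
(define (halt v) v)

(fact-cps 5 halt)
```

The pending multiplications that used to live on the call stack now live in the chain of continuation lambdas.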

Closure Conversion

  • Turns Lisp into a more imperative, procedure-based language
  • Lifts all lambdas to the top level again, with explicit environments
  • Executable code simply calls the various lambdas

p ::= ((proc (x x ...) e) ...)

e ::= (let ([x (apply-prim op x)]) e)

    | (let ([x (prim op x ...)]) e)

    | (let ([x (make-closure x x ...)]) e)

    | (let ([x (env-ref x nat)]) e)

    | (let ([x (quote dat)]) e)

    | (clo-app x x ...)

    | (if x e e)

dat is a datum satisfying datum? from utils.rkt

x is a variable (satisfies symbol?)

op is a symbol satisfying prim? from utils.rkt (if not already removed)

nat is a natural number satisfying natural? or integer?
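
A hand-converted example of closure conversion. The layout (a vector whose slot 0 holds the procedure and whose remaining slots hold captured values) is an assumption made for illustration. The forms `make-closure`, `env-ref`, and `clo-app` from the grammar are imitated here with ordinary Racket vectors.

```racket
#lang racket

;; Before: the inner lambda has a free variable, n.
(define (make-adder n)
  (lambda (x) (+ x n)))

;; After (hand-converted): the lambda body becomes a top-level
;; procedure that receives its environment explicitly.
(define (adder-proc env x)
  (let ([n (vector-ref env 1)])   ; plays the role of (env-ref env 1)
    (+ x n)))

;; A closure is a vector: slot 0 is the procedure, the rest are
;; the captured values.
(define (make-adder/converted n)
  (vector adder-proc n))          ; plays the role of (make-closure adder-proc n)

;; Applying a closure passes the closure itself as the environment.
(define (clo-app clo . args)
  (apply (vector-ref clo 0) clo args))

(clo-app (make-adder/converted 10) 5)
```

After this step no procedure refers to a variable it did not receive as a parameter or bind itself, which is what makes the program look like a list of plain procedures.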

LLVM Code Emission

  • Proc-language -> LLVM IR is easy
  • Each form maps to LLVM IR through a simple transformation
  • Also ensures all variables are stack allocated for garbage collection***
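
A sketch of how simple the emission step can be for one form, `(let ([x (prim op a ...)]) e)`. Everything specific here is an assumption for illustration: the name `emit-prim-sketch`, the choice to represent every value as an `i64`, and runtime functions named `@prim_<name>`. The real SinScheme runtime may use different names and types.

```racket
#lang racket

;; emit-prim-sketch (hypothetical): emit one LLVM IR call instruction
;; that binds x to the result of a primitive. Assumes all values are
;; i64 and that the runtime exports @prim_<name> for each primitive.
(define (emit-prim-sketch x name args)
  (format "%~a = call i64 @prim_~a(~a)"
          x
          name
          (string-join (for/list ([a args]) (format "i64 %~a" a)) ", ")))

(emit-prim-sketch 'x "add" '(a b))
```

Because the proc-language already names every intermediate value, each `let` binding corresponds to one instruction like this.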

SinScheme Runtime

  • C++ code
  • Generates code that is used by the language
  • Primitives written here
  • Garbage collection + Object layout

Code Review?!?!?

  • Anyone interested in any part in particular?
  • Runtime is super fun!
    • Would love to talk about garbage collection if anyone is interested

Thanks!

Final Questions?

Find me @Sinistersnare
