Better, Faster, Stronger

with macros & compiler plugins

 

What's up?

 

  • Intro to macros (the good parts)
  • Better (looking): macro-based DSLs
  • Faster: ScalaCL, Scalaxy, Reified
  • Stronger: Parano, JSON
  • Gotchas

 

Awful slides full of code, sorry :-S

Macros?

What macros look like

import scala.reflect.macros.blackbox.Context
import scala.language.experimental.macros

object EmailChecker {
  def check[A](a: A): A = macro checkImpl[A]

  def checkImpl[A: c.WeakTypeTag](c: Context)(a: c.Expr[A]): c.Expr[A] = {
    import c.universe._
    
    new Traverser {
        override def traverse(tree: Tree) = tree match {
            case Literal(Constant(s: String)) if s.contains("@gmail.com") =>
                c.warning(tree.pos, "Wait, did you leave an email here?")
            case _ =>
                super.traverse(tree)
        }
    } traverse (a.tree)

    a
  }
}
val x = EmailChecker.check {
  val email = "olivier.chafik@gmail.com"
}             |
              |
              ^ WARNING: Wait, did you leave an email here?

Macros:

 

How can it make our code look Better?

implicit class FooExt(foo: Foo) {
  def bar = s"$foo bar"
}
new Foo().bar
          ⬇
new FooExt(new Foo()).bar
implicit class FooExt(val foo: Foo)
    extends AnyVal {
  def bar = s"$foo bar"
}
new Foo().bar
          ⬇
FooExt.bar$extension(new Foo())
  • Value class trick doesn't work with implicit params
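
A minimal runnable sketch of the value-class extension above. The `Foo` class and its `toString` are stand-ins invented for illustration; the slides do not define `Foo`.

```scala
object FooSyntax {
  // Hypothetical stand-in for the slide's Foo.
  class Foo { override def toString = "foo" }

  // Value-class extension: the parameter must be a `val`,
  // and an implicit class cannot be a case class.
  implicit class FooExt(val foo: Foo) extends AnyVal {
    def bar: String = s"$foo bar"
  }

  // Looks like a method on Foo; resolved through the implicit class.
  def demo(): String = new Foo().bar
}
```

Because `FooExt` extends `AnyVal`, the call is compiled to a static-style extension method, so the wrapper is normally not allocated.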

Macro equivalent

implicit class FooExt(foo: Foo) {
  def bar: String = macro barImpl
}

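def barImpl(c: Context): c.Expr[String] = {
    import c.universe._

    // c.prefix is an Expr, so match on its tree; the implicit conversion
    // appears as an application of the wrapper to `foo`.
    val q"$_($foo)" = c.prefix.tree

    // Same tree the s"..." interpolator desugars to.
    c.Expr[String](q"""StringContext("", " bar").s($foo)""")
}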
new Foo().bar

          ⬇

s"${new Foo()} bar"
implicit class FooExt(val foo: Foo)
    extends AnyVal {
  def bar = s"$foo bar"
}
new Foo().bar

          ⬇

FooExt.bar$extension(new Foo())
  • Guaranteed inlining
  • No runtime dep

rehabilitate 💕structural types!

import javafx.beans.property.BooleanProperty

implicit class AnimatedGetterAndSetter(o: { def animatedProperty(): BooleanProperty }) {
  def animated: Boolean = macro impl.get[Boolean]
  def animated_=(value: Boolean): Unit = macro impl.set[Boolean]
}
val axis: Axis = ...

axis.animated = true

          ⬇ typer

new AnimatedGetterAndSetter(axis).animated = true

          ⬇ macro expansion

axis.animatedProperty.setValue(true)

Implicit resolution with macros

trait HasAnnotation[T, A]

implicit def hasAnnotation[T, A]: HasAnnotation[T, A] = macro impl[T, A]
  
def impl[T : c.WeakTypeTag, A : c.WeakTypeTag]
        (c: Context): c.Expr[HasAnnotation[T, A]] = {
  import c.universe._
   
  val found = weakTypeOf[T].typeSymbol.asType.annotations
    .exists(a => a.tree.tpe <:< weakTypeOf[A])
   
  if (!found) c.error(c.enclosingPosition, "annotation not found") 

  c.Expr[HasAnnotation[T, A]](q"null")
}
type IsEntity[T] = HasAnnotation[T, javax.persistence.Entity]

type IsNotDeprecated[T] = HasNoAnnotation[T, Deprecated]

def serialize[T <: Serializable : IsEntity : IsNotDeprecated](entity: T) = ???
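
For readers new to context bounds, here is a macro-free sketch of how such evidence is consumed. The instance is written by hand here, whereas the slides materialize it with a macro; `User`, `userIsEntity` and this simplified `IsEntity` trait are hypothetical names used only for illustration.

```scala
object EvidenceSketch {
  // Hypothetical marker typeclass, standing in for HasAnnotation[T, Entity].
  trait IsEntity[T]

  case class User(name: String)

  // Hand-written evidence; the macro version would produce (or reject) this at compile time.
  implicit val userIsEntity: IsEntity[User] = new IsEntity[User] {}

  // `T: IsEntity` is sugar for an extra `(implicit ev: IsEntity[T])` parameter list.
  def serialize[T: IsEntity](entity: T): String = entity.toString

  def demo(): String = serialize(User("ann"))
}
```

Calling `serialize` on a type with no `IsEntity` instance in scope fails to compile, which is the behaviour the macro-generated evidence relies on.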

Dynamic + macros = 💘

implicit def beansExtensions[T](bean: T) = new {
  def set = new Dynamic {
    def applyDynamicNamed(name: String)(args: (String, Any)*): T =
      macro impl.applyDynamicNamedImpl[T]
  }
}
new Button().set(text = "Ok", scaleX = 0.5)

          ⬇ typer

beansExtensions(new Button()).set
  .apply(text = "Ok", scaleX = 0.5)

          ⬇ "apply does not exist, but called on a Dynamic"

beansExtensions(new Button()).set
  .applyDynamicNamed("apply")("text" -> "Ok",
                              "scaleX" -> 0.5)

          ⬇ macro expansion
{
  val b = new Button()
  b.setText("Ok")
  b.setScaleX(0.5)
  b
}
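
The Dynamic rewrite step can be observed without any macro. This sketch assumes a hypothetical `Recorder` class that simply returns the call it received; `configure` is an arbitrary method name that does not exist on the class.

```scala
import scala.language.dynamics

object DynamicSketch {
  // Hypothetical recorder: reports the rewritten call instead of expanding a macro.
  class Recorder extends Dynamic {
    def applyDynamicNamed(name: String)(args: (String, Any)*): (String, List[(String, Any)]) =
      (name, args.toList)
  }

  // `configure` is not a member of Recorder, so the compiler rewrites this to
  // applyDynamicNamed("configure")(("text", "Ok"), ("scaleX", 0.5)).
  def demo(): (String, List[(String, Any)]) =
    new Recorder().configure(text = "Ok", scaleX = 0.5)
}
```

A macro implementation receives the same name and argument pairs as trees, which is what lets it emit direct setter calls.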

magic bindings: AST transforms

val moo = newProperty(10)
val foo = bind { Math.sqrt(moo()) }
val button = new Button().set(
  text = bind { s"Foo is ${foo()}" },
  cancelButton = true
)
val moo = new SimpleIntegerProperty(10)
val foo = new DoubleBinding() {
  super.bind(moo)
  override def computeValue() =
    Math.sqrt(moo.get)
}
val button = new Button()
button.textProperty.bind(
  new StringBinding() {
    super.bind(foo)
    override def computeValue() =
      s"Foo is ${foo.get}"
  }
)
button.setCancelButton(true)

Another example of crazy DSL

val future = async {
  val f1 = async { ...; true }
  val f2 = async { ...; 42 }
  if (await(f1)) await(f2) else 0
}

(heir to the defunct cps plugin)
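
For comparison, a sketch of the same control flow written by hand with standard-library Futures and no `async` / `await`. The elided `...` bodies from the slide are left out here, so the two inner computations just return their final values.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureSketch {
  def compute(): Int = {
    val f1 = Future { true }
    val f2 = Future { 42 }
    // Same branching as `if (await(f1)) await(f2) else 0`, expressed with flatMap.
    val result = f1.flatMap(ok => if (ok) f2 else Future.successful(0))
    // Blocking only to make the sketch observable.
    Await.result(result, 5.seconds)
  }
}
```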

Better looking it is!

 

  • Macro extensions (structural types-friendly)
     
  • Syntactic sugar with macro impl of Dynamic
     
  • Flexible typeclass evidences (incl. negative)
     
  • Crazy DSLs (data binding, async...)

What about Faster?

For loops desugaring

for (i <- 0 to n)
  println(i)
 
     ⬇

scala.Predef.intWrapper(0).to(n).foreach(i => {
 println(i)
})

simple?


  map,

  flatMap,

  filter ?
for (i <- 0 to n;
     ii = i * i;
     j <- i to n;
     jj = j * j;
     if (ii - jj) % 2 == 0;
     k <- (i + j) to n)
  yield ii * jj + k

simple...
ಠ_ಠ

 

(0 to n).map(i => {
  val ii = i * i
  (i, ii)
}).flatMap(_ match {
  case (i, ii) =>
    (i to n).map(j => {
      val jj = j * j
      (j, jj)
    }).withFilter(_ match {
      case (j, jj) =>
        (ii - jj) % 2 == 0
    }).flatMap(_ match {
      case (j, jj) =>
        ((i + j) to n)
          .map(k => ii * jj + k)
   })
})

while loops : 10x faster
ಡ_ಡ

val out = Vector.newBuilder[Int]
var i = 0
while (i <= n) {
 val ii = i * i
 var j = i
 while (j <= n) {
   val jj = j * j
   if ((ii - jj) % 2 == 0) {
     var k = i + j
     while (k <= n) {
       out += (ii * jj + k)
       k += 1
     }
   }
   j += 1
 }
 i += 1
}
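
A small sketch to check that the for comprehension and the hand-written while loops produce the same elements in the same order. The object and method names are made up for this example.

```scala
object LoopEquivalence {
  // The for comprehension from the slides.
  def viaFor(n: Int): Vector[Int] =
    (for (i <- 0 to n;
          ii = i * i;
          j <- i to n;
          jj = j * j;
          if (ii - jj) % 2 == 0;
          k <- (i + j) to n)
      yield ii * jj + k).toVector

  // The equivalent nested while loops.
  def viaWhile(n: Int): Vector[Int] = {
    val out = Vector.newBuilder[Int]
    var i = 0
    while (i <= n) {
      val ii = i * i
      var j = i
      while (j <= n) {
        val jj = j * j
        if ((ii - jj) % 2 == 0) {
          var k = i + j
          while (k <= n) {
            out += (ii * jj + k)
            k += 1
          }
        }
        j += 1
      }
      i += 1
    }
    out.result()
  }
}
```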

iterators only help so much

(0 until n).toIterator
    .filter(v => (v % 2) == 0)
    .map(_ * 2)
    .forall(x => x < n / 2)

While loop equivalent:

  • 5-20x faster than the .toIterator version
  • 50x faster than the same chain without .toIterator
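
The while loop equivalent is not shown on the slide; one possible hand-written version is sketched below, next to the iterator chain. It uses `.iterator`, which behaves like `.toIterator` here. No timings are claimed by this sketch, it only checks that both forms agree.

```scala
object IteratorSketch {
  // The iterator chain from the slide.
  def viaIterator(n: Int): Boolean =
    (0 until n).iterator
      .filter(v => (v % 2) == 0)
      .map(_ * 2)
      .forall(x => x < n / 2)

  // One possible fused while loop: no Range, no closures, early exit like forall.
  def viaWhile(n: Int): Boolean = {
    var v = 0
    var ok = true
    while (ok && v < n) {
      if (v % 2 == 0) {
        val x = v * 2
        if (!(x < n / 2)) ok = false
      }
      v += 1
    }
    ok
  }
}
```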

Why are Scala loops slow?

 

  • Frequent bad pattern: intermediate collections
  • Megamorphic foreach, map...
    • The JIT has other worries
  • Closure elimination → not enough
  • Inlining → stops when too big, still allocates Range

 

Scala team is not into ad-hoc optimizations

scala benchmarks != idiomatic

Scalaxy/Streams
macros to the rescue!

 

  • Loop fusion (incl. nested)
  • Drop unnecessary tuples
  • Detect side-effects

faster code + happier GC
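
Why side-effect detection matters for loop fusion: fusing a strict `map` with the operation that follows it interleaves effects that originally ran in two separate passes. The sketch below logs the order in both forms; it is an illustration written for this point, not code from Scalaxy/Streams.

```scala
object SideEffectOrder {
  // Strict collections: every `map` effect runs before any `foreach` effect.
  def strict(): List[String] = {
    val log = List.newBuilder[String]
    List(1, 2)
      .map { x => log += s"map $x"; x * 2 }
      .foreach { y => log += s"use $y" }
    log.result()
  }

  // Naively fused loop: effects are interleaved element by element.
  def fused(): List[String] = {
    val log = List.newBuilder[String]
    val it = List(1, 2).iterator
    while (it.hasNext) {
      val x = it.next()
      log += s"map $x"
      val y = x * 2
      log += s"use $y"
    }
    log.result()
  }
}
```

An optimizer that cannot prove the closures are pure has to keep the original order, or skip the rewrite.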

array.zipWithIndex
     .filter(_._2 % 2 == 0)
     .map({ case (v, i) => v + i })
     .sum

                ⬇

{
  val length = array.length
  var i = 0
  var sum = 0
  while (i < length) {
    val item = array(i)
    if (i % 2 == 0) {
      val mapped = item + i
      sum += mapped
    }
    i += 1
  }
  sum
}
  • no intermediate collections
     
  • no intermediate tuples
    (unless they're materialized)
     
  • fewer closures / classes

Streams = macro || plugin

  • Local optimize { ... } macro
     
  • Ubiquitous plugin
     
  • Shared code / cake pattern
     
  • Tests run against the runtime universe
trait SomeFunctionality
{
  val global: scala.reflect.api.Universe
  import global._

  ...
}

class MyComponent(val global: Global)
    extends PluginComponent
    with SomeFunctionality {
  import global._

  ...
}
def optimize[A](a: A): A = macro optimizeImpl[A]

def optimizeImpl[A](c: Context)(...) = {
  object Runner extends SomeFunctionality {
    val global = c.universe
    import global._

    val result = ...
  }
  Runner.result.asInstanceOf[c.universe.Tree]
}

Composing code at runtime?

  • Retain AST at runtime
     
  • Compose / inline captures
     
  • Compile final composition
    only if needed (overhead)
     
  • Small code → big speedups
    (examples: x5)
val f = reified { (x: Int) => x * 0.15 }
val g = reified { (x: Int) => x + f(x) }

                  ⬇ g.compile()()

(x: Int) => {
  @inline def f(x: Int) = x * 0.15
  x + f(x)
}
case class Reified[A](value: A, expr: Expr[A]) {
  def compile(): () => A = ...
}

implicit def reified[A](value: A): Reified[A] =
    macro ...

  • Numeric on steroids
    (any method, any class!)
     
  • Erased away by Reified
     
  • C++ templates-style specialization / speed

    (can still constrain types with magic typeclasses)
def algo[A : Generic : TypeTag]: Reified[A => Double] =
  reified {
    (value: A) => {
      var a = value + one[A]
      a = a + number[A](10)
      a = a * number[A](2)
      a = a / number[A](3)
      a.toDouble
    }
  }

           ⬇ algo[Int].compile()()

      (value: Int) => {
        var a = value + 1
        a = a + 10
        a = a * 2
        a = a / 3
        a.toDouble
      }

ScalaCL = Scala on GPUs
(& CPUs !!!)

 

  • OpenCL: dynamic compilation / execution of C
    • on GPU + VRAM or CPU + RAM
    • Parallelism: explicit (1D, 2D...) & implicit (SIMD)
       
  • Scala → C transpilation
    • Small executable memory (collection → while)
    • Tuples flattening
    • Store arrays of tuploids as fibers
    • Code reuse / composition
      • Mix of compile-time / runtime

Canonical example

import scalacl._

case class Matrix(data: CLArray[Float], rows: Int, columns: Int)
                 (implicit context: Context)
{
  def putProduct(a: Matrix, b: Matrix): Unit = {
    kernel {
      for (i <- 0 until rows;
           j <- 0 until columns) {
        data(i * columns + j) =
          (0 until a.columns)
            .map(k => a.data(i * a.columns + k) * b.data(k * b.columns + j))
            .sum
      }
    }
  }
}

→ OpenCL kernel

kernel void putProduct(
    global float *data, int columns,
    global const float *a, int a_columns,
    global const float *b, int b_columns) {

  int i = get_global_id(0);
  int j = get_global_id(1);
  
  float sum = 0;
  int k = 0;
  while (k < a_columns) {
    sum += a[i * a_columns + k] * b[k * b_columns + j];
    k++;
  }
  data[i * columns + j] = sum;
}

Can use Scalaxy/Generic to abstract Float away
→ generate kernel at runtime
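
A plain-Scala CPU reference for the same row-major product can help when checking kernel output. This is a sketch with made-up names (`MatrixReference`, `product`) and does not use the ScalaCL API.

```scala
object MatrixReference {
  // Row-major product of an (aRows x aColumns) matrix by an (aColumns x bColumns) matrix.
  def product(a: Array[Float], aRows: Int, aColumns: Int,
              b: Array[Float], bColumns: Int): Array[Float] = {
    val data = new Array[Float](aRows * bColumns)
    var i = 0
    while (i < aRows) {
      var j = 0
      while (j < bColumns) {
        var sum = 0f
        var k = 0
        while (k < aColumns) {
          // Same indexing as the kernel body.
          sum += a(i * aColumns + k) * b(k * bColumns + j)
          k += 1
        }
        data(i * bColumns + j) = sum
        j += 1
      }
      i += 1
    }
    data
  }
}
```

In the kernel, the two outer loops disappear because each work item handles one `(i, j)` pair.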

 Not only
Better & Faster...
Also Stronger?

case class Foo(bar: Int, baz: Int)

Foo(1, 2)               // Ambiguous: should name 1-2 args

Foo(baz, bar)           // Swapped args?

val Foo(baz, bar) = foo // Swapped aliases?

(current impl = POC quality)

// Renormalizing DSL, no runtime parsing
json"{ x: $a, y: $b }" → new JSONObject("x" -> a, "y" -> b)

// Parsing errors retrofitted as Scala errors:
json"{ x: $x, y': $y }"
               |
               |
               error: Unexpected character ('''): was expecting a colon...

// JSON extractors (still to optimize)
val json"{ x: $x, y: $y }" = obj

Similar approach in latest ScalaCL:
compile inline kernel source & report errors
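
For reference, this is the hook a custom interpolator plugs into. The sketch is runtime-only and returns a String, so it does none of the compile-time parsing or error reporting described above; in the macro version the body of `json` would be a macro. All names here are hypothetical.

```scala
object JsonSketch {
  // Any method `json` on a wrapper of StringContext enables the json"..." syntax.
  implicit class JsonInterpolator(val sc: StringContext) extends AnyVal {
    def json(args: Any*): String = {
      // Naive rendering: quote strings, leave everything else as-is (no escaping).
      val rendered = args.map {
        case s: String => "\"" + s + "\""
        case other     => other.toString
      }
      sc.s(rendered: _*)
    }
  }

  def demo(): String = {
    val a = 1
    val b = "two"
    json"{ x: $a, y: $b }"
  }
}
```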

<insert your own static checks and policy enforcement>

So...

  • More eye-candy / magic
  • More performance
  • More typo-safety

your own macros:
what could go wrong?

  • Can't use macros defined in same build
  • Matching Tree shape vs. Types / Symbols
  • Altering Symbol hierarchy
  • Untyping = dangerous; retype / pretype instead
  • Synthetic anonymous classes vs. sharing
  • Coexisting with misbehaved macros
  • Future of macros? (scala.meta, paulp/policy)
    • Stick to what's the least "experimental"
    • Compiler plugins haven't changed much in 5 years :-)

What will your next macro be about?

If you're unsure, I could use your help :-D

Links

My medium-term roadmap

  • Scala in my day job
     
  • General 1. release of Scalaxy/Streams (current = 0.3.4)
    • Already compiles Scala mostly fine
       
  • Productionize Scalaxy/Reified (no untyping!)
     
  • Update & optimize ScalaCL
     
  • Factor flat arrays from ScalaCL (Scalaxy/Records)

Some Possibilities

  • Create types
    • Serializers
    • Database table objects (whitebox macros)
  • Resolve / fail implicits
  • Alter ASTs at compile-time & runtime
    • Optimizations
    • Debug instrumentation
  • Crazy DSLs
    • Host languages in String interpolations
    • applyDynamicNamed tricks

ScalaMacros

By Olivier Chafik