Better, Faster, Stronger
with macros & compiler plugins
What's up?
- Intro to macros (the good parts)
- Better (looking): macro-based DSLs
- Faster: ScalaCL, Scalaxy, Reified
- Stronger: Parano, JSON
- Gotchas
Awful slides full of code, sorry :-S
Macros?
What macros look like
import scala.reflect.macros.blackbox.Context
import scala.language.experimental.macros

object EmailChecker {
  def check[A](a: A): A = macro checkImpl[A]

  def checkImpl[A: c.WeakTypeTag](c: Context)(a: c.Expr[A]): c.Expr[A] = {
    import c.universe._
    new Traverser {
      override def traverse(tree: Tree) = tree match {
        case Literal(Constant(s: String)) if s.contains("@gmail.com") =>
          c.warning(tree.pos, "Wait, did you leave an email here?")
        case _ =>
          super.traverse(tree)
      }
    } traverse (a.tree)
    a
  }
}
val x = EmailChecker.check {
  val email = "olivier.chafik@gmail.com"
}
// ^ WARNING: Wait, did you leave an email here?
Macros:
- Traverse / transform AST with Quasiquotes (sketch below)
- Def macros → experimental (see scala.meta)
- Macro annotations → too experimental
- But compiler plugins: omnipotent + same APIs
- Blackbox vs. Whitebox: fixed / open contract
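A minimal sketch of that quasiquote style (Debug / debugImpl are made-up names, not from the talk): a blackbox def macro that rebuilds its argument instead of only traversing it.

import scala.language.experimental.macros
import scala.reflect.macros.blackbox.Context

object Debug {
  // debug(expr) prints the expression's source next to its value, then returns the value
  def debug[A](a: A): A = macro debugImpl[A]

  def debugImpl[A: c.WeakTypeTag](c: Context)(a: c.Expr[A]): c.Expr[A] = {
    import c.universe._
    // The argument's source code, captured as a string literal tree
    val repr = Literal(Constant(showCode(a.tree)))
    val result = TermName(c.freshName("result"))
    // Quasiquotes build the replacement tree; $-holes splice trees back in
    c.Expr[A](q"""
      val $result = ${a.tree}
      println($repr + " = " + $result)
      $result
    """)
  }
}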
How can it make our code look Better?
implicit class FooExt(foo: Foo) {
def bar = s"$foo bar"
}
new Foo().bar
⬇
new FooExt(new Foo()).bar
implicit class FooExt(val foo: Foo)
extends AnyVal {
def bar = s"$foo bar"
}
new Foo().bar
⬇
FooExt.bar$extension(new Foo())
- Value class trick doesn't work with implicit params
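What that limitation looks like in practice (NumOps is a made-up example): as soon as the extension needs implicit evidence in its constructor, it can no longer be a value class, so the wrapper gets allocated again.

// Legal as a plain implicit class, but adding `extends AnyVal` is rejected:
// a value class must have exactly one constructor parameter.
implicit class NumOps[T](val value: T)(implicit num: Numeric[T]) {
  def twice: T = num.plus(value, value)
}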
Macro equivalent
implicit class FooExt(foo: Foo) {
def bar: String = macro barImpl
}
def barImpl(c: Context): c.Expr[String] = {
import c.universe._
val q"new FooExt($foo)" = c.prefix.tree
c.Expr[String](q""" s"$foo bar" """)
}
new Foo().bar
⬇
s"${new Foo()} bar"
implicit class FooExt(val foo: Foo)
extends AnyVal {
def bar = s"$foo bar"
}
new Foo().bar
⬇
FooExt.bar$extension(new Foo())
- Guaranteed inlining
- No runtime dep
rehabilitate 💕 structural types!
import javafx.beans.property.BooleanProperty
implicit class AnimatedGetterAndSetter(o: { def animatedProperty(): BooleanProperty }) {
def animated: Boolean = macro impl.get[Boolean]
def animated_=(value: Boolean): Unit = macro impl.set[Boolean]
}
val axis: Axis = ...
axis.animated = true
⬇ typer
new AnimatedGetterAndSetter(axis).animated = true
⬇ macro expansion
axis.animatedProperty.setValue(true)
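The impl.get / impl.set macros referenced above aren't shown on the slide; a rough sketch of what they could look like (the real implementation presumably derives the property name from the macro application instead of hard-coding animatedProperty):

import scala.reflect.macros.blackbox.Context

object impl {
  def get[T: c.WeakTypeTag](c: Context): c.Expr[T] = {
    import c.universe._
    // c.prefix is the wrapper call the typer inserted around the receiver (axis)
    val Apply(_, List(target)) = c.prefix.tree
    c.Expr[T](q"$target.animatedProperty().get()")
  }

  def set[T: c.WeakTypeTag](c: Context)(value: c.Expr[T]): c.Expr[Unit] = {
    import c.universe._
    val Apply(_, List(target)) = c.prefix.tree
    c.Expr[Unit](q"$target.animatedProperty().setValue(${value.tree})")
  }
}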
Implicit resolution with macros
trait HasAnnotation[T, A]
implicit def hasAnnotation[T, A]: HasAnnotation[T, A] = macro impl[T, A]
def impl[T : c.WeakTypeTag, A : c.WeakTypeTag]
(c: Context): c.Expr[HasAnnotation[T, A]] = {
import c.universe._
val found = weakTypeOf[T].typeSymbol.asType.annotations
.exists(a => a.tree.tpe <:< weakTypeOf[A])
if (!found) c.error(c.enclosingPosition, "annotation not found")
c.Expr[HasAnnotation[T, A]](q"null")
}
type IsEntity[T] = HasAnnotation[T, javax.persistence.Entity]
type IsNotDeprecated[T] = HasNoAnnotation[T, Deprecated]
def serialize[T <: Serializable : IsEntity : IsNotDeprecated](entity: T) = ???
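With those evidences in scope, a use site could look like this (Account / LegacyAccount are made-up classes); a missing @Entity annotation becomes a compile-time error instead of a runtime surprise.

@javax.persistence.Entity
class Account(val id: Long) extends Serializable

class LegacyAccount(val id: Long) extends Serializable

serialize(new Account(1))       // compiles: @Entity present, not @Deprecated
serialize(new LegacyAccount(1)) // rejected at compile time: no @Entity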
implicit def beansExtensions[T](bean: T) = new {
  def set = new Dynamic {
    def applyDynamicNamed(name: String)(args: (String, Any)*): T =
      macro impl.applyDynamicNamedImpl[T]
  }
}
new Button().set(text = "Ok", scaleX = 0.5)
⬇ typer
beansExtensions(new Button()).set
.apply(text = "Ok", scaleX = 0.5)
⬇ "apply does not exist, but called on a Dynamic"
beansExtensions(new Button()).set
.applyDynamicNamed("apply")("text" -> "Ok",
"scaleX" -> 0.5)
⬇ macro expansion
{
val b = new Button()
b.setText("Ok")
b.setScaleX(0.5)
b
}
Scalaxy/Beans
(see post)
magic bindings: AST transforms
val moo = newProperty(10)
val foo = bind { Math.sqrt(moo()) }
val button = new Button().set(
text = bind { s"Foo is ${foo()}" },
cancelButton = true
)
val moo = new SimpleIntegerProperty(10)
val foo = new DoubleBinding() {
  super.bind(moo)
  override def computeValue() =
    Math.sqrt(moo.get)
}
val button = new Button()
button.textProperty.bind(
  new StringBinding() {
    super.bind(foo)
    override def computeValue() =
      s"Foo is ${foo.get}"
  }
)
button.setCancelButton(true)
- Scalaxy/Fx (this example)
- lihaoyi/scala.rx
Another example of crazy DSL
val future = async {
val f1 = async { ...; true }
val f2 = async { ...; 42 }
if (await(f1)) await(f2) else 0
}
(heir to the defunct cps plugin)
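For completeness, the snippet above relies on the scala-async library and an ExecutionContext in scope:

import scala.async.Async.{async, await}
import scala.concurrent.ExecutionContext.Implicits.global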
Better looking it is!
- Macro extensions (structural types-friendly)
- Syntactic sugar with macro impl of Dynamic
- Flexible typeclass evidences (incl. negative)
- Crazy DSLs (data binding, async...)
What about Faster?
For loops desugaring
for loop desugaring
for (i <- 0 to n)
println(i)
⬇
scala.Predef.intWrapper(0).to(n).foreach(i => {
println(i)
})
simple?
map, flatMap, filter?
for (i <- 0 to n;
ii = i * i;
j <- i to n;
jj = j * j;
if (ii - jj) % 2 == 0;
k <- (i + j) to n)
yield ii * jj + k
simple...
ಠ_ಠ
(0 to n).map(i => {
  val ii = i * i
  (i, ii)
}).flatMap(_ match {
  case (i, ii) =>
    (i to n).map(j => {
      val jj = j * j
      (j, jj)
    }).withFilter(_ match {
      case (j, jj) =>
        (ii - jj) % 2 == 0
    }).flatMap(_ match {
      case (j, jj) =>
        ((i + j) to n)
          .map(k => ii * jj + k)
    })
})
for (i <- 0 to n;
ii = i * i;
j <- i to n;
jj = j * j;
if (ii - jj) % 2 == 0;
k <- (i + j) to n)
yield ii * jj + k
while loops: 10x faster
ಡ_ಡ
(0 to n).map(i => {
  val ii = i * i
  (i, ii)
}).flatMap(_ match {
  case (i, ii) =>
    (i to n).map(j => {
      val jj = j * j
      (j, jj)
    }).withFilter(_ match {
      case (j, jj) =>
        (ii - jj) % 2 == 0
    }).flatMap(_ match {
      case (j, jj) =>
        ((i + j) to n)
          .map(k => ii * jj + k)
    })
})
val out = new VectorBuilder[Int]()
var i = 0
while (i <= n) {
  val ii = i * i
  var j = i
  while (j <= n) {
    val jj = j * j
    if ((ii - jj) % 2 == 0) {
      var k = i + j
      while (k <= n) {
        out += (ii * jj + k)
        k += 1
      }
    }
    j += 1
  }
  i += 1
}
out.result()
iterators only help so much
(0 until n).toIterator
.filter(v => (v % 2) == 0)
.map(_ * 2)
.forall(x => x < n / 2)
A hand-written while loop equivalent (sketched below) is:
- 5-20x faster than this pipeline with .toIterator
- 50x faster than the same pipeline without .toIterator
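A sketch of the kind of hand-written baseline that comparison refers to (not the exact benchmark code):

// equivalent of (0 until n).toIterator.filter(_ % 2 == 0).map(_ * 2).forall(_ < n / 2)
var v = 0
var all = true
while (all && v < n) {
  if (v % 2 == 0)
    all = v * 2 < n / 2 // forall short-circuits on the first false
  v += 1
}
all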
Why are Scala loops slow?
- Frequent bad pattern: intermediate collections
- Megamorphic foreach, map...
- The JIT has other worries
- Closure elimination → not enough
- Inlining → stops when too big, still allocates Range
Scala benchmarks ≠ idiomatic Scala code
Scalaxy/Streams
macros to the rescue!
- Loop fusion (incl. nested)
- Drop unnecessary tuples
- Detect side-effects
faster code + happier GC
array.zipWithIndex
  .filter(_._2 % 2 == 0)
  .map({ case (v, i) => v + i })
  .sum
⬇
{
  val length = array.length
  var i = 0
  var sum = 0
  while (i < length) {
    val item = array(i)
    if (i % 2 == 0) {
      val mapped = item + i
      sum += mapped
    }
    i += 1
  }
  sum
}
- no intermediate collections
- no intermediate tuples
(unless they're materialized)
- fewer closures / classes
Streams = macro || plugin
- Local optimize { ... } macro
- Ubiquitous plugin
- Shared code / cake pattern
- Tests → runtime universe
trait SomeFunctionality {
  val global: scala.reflect.api.Universe
  import global._
  ...
}

class MyComponent(val global: Global)
  extends PluginComponent
  with SomeFunctionality {
  import global._
  ...
}

def optimize[A](a: A): A = macro optimizeImpl[A]

def optimizeImpl[A](c: Context)(...) = {
  object Runner extends SomeFunctionality {
    val global = c.universe
    import global._
    val result = ...
  }
  Runner.result.asInstanceOf[c.universe.Tree]
}
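For the "ubiquitous plugin" flavour, the shared component still has to be registered through the standard scala.tools.nsc plugin API; a minimal sketch (names and phase ordering are illustrative):

import scala.tools.nsc.{Global, Phase}
import scala.tools.nsc.plugins.{Plugin, PluginComponent}

class OptimizePlugin(val global: Global) extends Plugin {
  import global._

  val name = "optimize-sketch"
  val description = "runs the shared rewrite on every compilation unit"
  val components = List[PluginComponent](Component)

  private object Component extends PluginComponent {
    val global: OptimizePlugin.this.global.type = OptimizePlugin.this.global
    val phaseName = OptimizePlugin.this.name
    val runsAfter = List("typer")

    def newPhase(prev: Phase): Phase = new StdPhase(prev) {
      def apply(unit: CompilationUnit): Unit = {
        // hand unit.body to the shared SomeFunctionality code here
      }
    }
  }
}

The plugin then gets picked up via the usual scalac-plugin.xml descriptor and the -Xplugin compiler flag.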
Composing code at runtime?
- Retain AST at runtime
- Compose / inline captures
- Compile final composition
only if needed (overhead)
- Small code → big speedups
(examples: x5)
val f = reified { (x: Int) => x * 0.15 }
val g = reified { (x: Int) => x + f(x) }
⬇ g.compile()()
(x: Int) => {
@inline def f(x: Int) = x * 0.15
x + f(x)
}
case class Reified[A](value: A, expr: Expr[A]) {
def compile(): () => A = ...
}
implicit def reified[A](value: A): Reified[A] =
macro ...
- Numeric on steroids
(any method, any class!)
- Erased away by Reified
- C++ templates-style specialization / speed
(can still constrain types with magic typeclasses)
def algo[A : Generic : TypeTag]: Reified[A => Double] =
  reified {
    (value: A) => {
      var a = value + one[A]
      a = a + number[A](10)
      a = a * number[A](2)
      a = a / number[A](3)
      a.toDouble
    }
  }
⬇ algo[Int].compile()()
(value: Int) => {
  var a = value + 1
  a = a + 10
  a = a * 2
  a = a / 3
  a.toDouble
}
ScalaCL = Scala on GPUs
(& CPUs !!!)
- OpenCL: dynamic compilation / execution of C
- on GPU + VRAM or CPU + RAM
- Parallelism: explicit (1D, 2D...) & implicit (SIMD)
- Scala → C transpilation
- Small executable memory (collection → while)
- Tuples flattening
- Store arrays of tuploids as fibers
- Code reuse / composition
- Mix of compile-time / runtime
Canonical example
import scalacl._

case class Matrix(data: CLArray[Float], rows: Int, columns: Int)
                 (implicit context: Context) {

  def putProduct(a: Matrix, b: Matrix): Unit = {
    kernel {
      for (i <- 0 until rows;
           j <- 0 until columns) {
        data(i * columns + j) =
          (0 until a.columns)
            .map(k => a.data(i * a.columns + k) * b.data(k * b.columns + j))
            .sum
      }
    }
  }
}
→ OpenCL kernel
kernel void putProduct(
    global float *data, int columns,
    global const float *a, int a_columns,
    global const float *b, int b_columns) {
  int i = get_global_id(0);
  int j = get_global_id(1);
  float sum = 0;
  int k = 0;
  while (k < a_columns) {
    sum += a[i * a_columns + k] * b[k * b_columns + j];
    k++;
  }
  data[i * columns + j] = sum;
}
Can use Scalaxy/Generic to abstract Float away
→ generate kernel at runtime
Not only
Better & Faster...
Also Stronger?
case class Foo(bar: Int, baz: Int)
Foo(1, 2) // Ambiguous: should name 1-2 args
Foo(baz, bar) // Swapped args?
val Foo(baz, bar) = foo // Swapped aliases?
(current impl = POC quality)
// Renormalizing DSL, no runtime parsing
json"{ x: $a, y: $b }" → new JSONObject("x" -> a, "y" -> b)
// Parsing errors retrofitted as Scala errors:
json"{ x: $x, y': $y }"
// error: Unexpected character ('''): was expecting a colon...
// JSON extractors (still to optimize)
val json"{ x: $x, y: $y }" = obj
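The interpolator's implementation isn't shown here; a minimal sketch of its general shape (hypothetical names, returning a plain String instead of a real JSON type, and with a token check standing in for an actual JSON parser):

import scala.language.experimental.macros
import scala.reflect.macros.blackbox.Context

object JsonMacros {
  implicit class JsonContext(val sc: StringContext) {
    // json"{ x: $a }" goes through the macro at compile time
    def json(args: Any*): String = macro JsonMacros.jsonImpl
  }

  def jsonImpl(c: Context)(args: c.Expr[Any]*): c.Expr[String] = {
    import c.universe._
    // The literal chunks of the interpolation sit in the prefix tree as string constants
    val parts = c.prefix.tree.collect { case Literal(Constant(part: String)) => part }
    // Stand-in for a real JSON parse of the chunks: turn problems into compile errors
    if (parts.exists(_.contains("'")))
      c.error(c.enclosingPosition, "Unexpected character (''')")
    // Trivial "renormalization": splice everything back through StringContext.s
    c.Expr[String](q"${c.prefix.tree}.sc.s(..${args.map(_.tree).toList})")
  }
}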
Similar approach in latest ScalaCL:
compile inline kernel source & report errors
<insert your own static checks and policy enforcement>
So...
- More eye-candy / magic
- More performance
- More typo-safety
your own macros:
what could go wrong?
- Can't use macros in the same compilation run that defines them (separate sub-project; sbt sketch below)
- Matching Tree shape vs. Types / Symbols
- Altering Symbol hierarchy
- Untyping = dangerous; retype / pretype instead
- Synthetic anonymous classes vs. sharing
- Coexisting with misbehaved macros
- Future of macros? (scala.meta, paulp/policy)
- Stick to what's the least "experimental"
- Compiler plugins haven't changed much in 5 years :-)
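The standard workaround for the first gotcha is to compile the macros in a separate sbt sub-project that the rest of the build depends on (a sketch; project names are placeholders):

// build.sbt (sketch)
lazy val macros = (project in file("macros"))
  .settings(
    libraryDependencies += "org.scala-lang" % "scala-reflect" % scalaVersion.value
  )

lazy val core = (project in file("core"))
  .dependsOn(macros) // macros compile first, so core can expand them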
What will your next macro be about?
If you're unsure, I could use your help :-D
Links
My medium-term roadmap
- Scala in my day job
- General 1.0 release of Scalaxy/Streams (current = 0.3.4)
- Already compiles Scala mostly fine
- Productionize Scalaxy/Reified (no untyping!)
- Update & optimize ScalaCL
- Factor flat arrays from ScalaCL (Scalaxy/Records)
- Cache-aware arrays for fast processing
- LMAX Disruptor-style queues
Some Possibilities
- Create types
- Serializers
- Database table objects (whitebox macros)
- Resolve / fail implicits
- Alter ASTs at compile-time & runtime
- Optimizations
- Debug instrumentation
- Crazy DSLs
- Host languages in String interpolations
- applyDynamicNamed tricks
ScalaMacros
By Olivier Chafik