2017 @phenomnominal

I'm Craig

I do JavaScript at

You can find me on the Twitters

@phenomnominal

First, a STORY...

you're a wizard!

nz.js(con) attendee

You went to Hogwarts!

YOU KNOW Dumbledore!

AND HARRY POTTER!

But there's a problem!

VOLDEMORT HAS PUT A CURSE ON HARRY!

(this is 100% canon, don't look it up)

But! Harry speaks Parseltongue.

Which just so happens to be a Turing-complete programming language!

(again, there's definitely no need to look this up)

PARSELTONGUE 🐍🐍🐍:

Parseltongue is a pretty simple language, but very hard to read!

Variables:

Parseltongue:

sssHelloWorld <~ 'hello world'
sssFive <~ 5
sssBool <~ true

JavaScript:

var helloWorld = 'hello world';
var five = 5;
var bool = true;

Functions:

Parseltongue:

sssMultiply [sssA, sssB, sssC]
    <~ sssA * sssB * sssC

JavaScript:

var multiply = function (a, b, c) {
    return a * b * c;
};

Calls:

Parseltongue:

sssProduct <~ sssMultiply <~ [3, 4, 5]

JavaScript:

var product = multiply(3, 4, 5);

PARSELTONGUE 🐍🐍🐍:

Control flow:

Parseltongue:

ss something
    sssDoSomething
ssssss somethingElse
    sssDoSomethingElse
ssss
    sssDoDefault

JavaScript:

if (something) {
    doSomething();
} else if (somethingElse) {
    doSomethingElse();
} else {
    doDefault();
}

Loops:

Parseltongue:

sssStart <~ 5
sssEnd <~ 10
sssss sssI <~ sssStart ~> sssEnd
    sssDoSomething sssI

JavaScript:

var start = 5;
var end = 10;
for (var i = start; i < end; i++) {
    doSomething(i);
}

Thankfully, Parseltongue has a type system that is identical to JavaScript's

PARSELTONGUE 🐍🐍🐍:

Harry uses Parseltongue to write complex spells to try to break Voldemort's curse:

Parseltongue:

sssSpell <~ 'Expecto Patronum'

sssMagic [sssSpell, sssIntensity]
    sssIntense <~ ''
    sssss sssI <~ 0 ~> sssIntensity
        sssIntense <~ sssIntense + '!'
    <~ sssSpell + sssIntense

sssss sssI <~ 0 ~> 10
    sssMagic <~ [sssSpell, sssI]

JavaScript:

var spell = 'Expecto Patronum';

var magic = function (spell, intensity) {
    var intense = '';
    for (var i = 0; i < intensity; i++) {
        intense = intense + '!';
    }
    return spell + intense;
};

for (var i = 0; i < 10; i++) {
    magic(spell, i);
}

WE NEED TO COME UP WITH A WAY TO TRANSLATE Harry's Parseltongue into JavaScript so he can escape the internet!

Parseltongue -> ??? -> JavaScript

Me too Hermione, me too.

Parseltongue -> REGEX! -> JavaScript

First ATTEMPT:

Let's just try converting a line at a time:

let script = await fs.readFileAsync(scriptPath);
let lines = script.toString().split(/\n/);

That gives us:

[
    'sssSpell <~ \'Expecto Patronum\'',
    '',
    'sssMagic [sssSpell, sssIntensity]',
    '    sssIntense <~ \'\'',
    '    sssss sssI <~ 0 ~> sssIntensity',
    '        sssIntense <~ sssIntense + \'!\'',
    '    <~ sssSpell + sssIntense',
    '',
    'sssss sssI <~ 0 ~> 10',
    '    sssMagic <~ [sssSpell, sssI]'
]

A Naive Approach:

Converting Variables:

function findVariableAssignment (line) {
    // Match on groups of 4 spaces (the indentation)
    // And `sss` followed by any alphabet characters (the name)
    // And `<~`
    // And then anything after it (the value)...
    let result = line.match(/^((?: {4})*)sss([a-zA-Z]+) <~ (.*)/);

    // If we get a match...
    if (result) {
        let [, indents, name, value] = result;
        // Manually write out the equivalent JS, keeping the indentation.
        // `lowerCaseFirst` is a helper (not shown) that lower-cases
        // the first letter of the name...
        return `${indents}var ${lowerCaseFirst(name)} = ${value};`;
    }
    return null;
}

That's not tooooo bad...
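
The converter above leans on a `lowerCaseFirst` helper that the slides never show. A minimal sketch of what it might look like, with the variable converter restated so the block runs on its own. The helper body and the three sample lines are assumptions for illustration, not code from the talk:

```javascript
// Hypothetical helper: the slides call `lowerCaseFirst` without defining it.
function lowerCaseFirst (name) {
    return name.charAt(0).toLowerCase() + name.slice(1);
}

// Same regex idea as the slide: indentation, `sss` + name, `<~`, then the value.
function findVariableAssignment (line) {
    const result = line.match(/^((?: {4})*)sss([a-zA-Z]+) <~ (.*)/);
    if (!result) {
        return null;
    }
    const [, indents, name, value] = result;
    return `${indents}var ${lowerCaseFirst(name)} = ${value};`;
}

const converted = [
    "sssSpell <~ 'Expecto Patronum'",
    "    sssIntense <~ ''",
    'sssss sssI <~ 0 ~> 10' // a loop, so the variable matcher returns null
].map(findVariableAssignment);

console.log(converted);
```

Each line is handled in isolation, which is why a loop header simply falls through as `null` and needs its own matcher.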

A Naive Approach:

Converting LOOPS:

function findForLoop (line) {
    // Match on groups of 4 spaces
    // And `sssss`
    // And `sss` followed by any alphabet characters (the indexer)
    // And `<~`
    // And anything (the from)
    // And `~>`
    // And anything (the to)
    let forLoop = /^( {4})*sssss sss([a-zA-Z]+) <~ (.*) ~> (.*)/;
    let result = line.match(forLoop);

    // If we get a match...
    if (result) {
        let [, indents, indexer, from, to] = result;
        let i = lowerCaseFirst(indexer);
        // Manually write out the equivalent JS...
        return `for (var ${i} = ${from}; ${i} < ${to}; ${i} += 1) {`;
    }
    return null;
}

Hmmm...

A Naive Approach:

Converting Functions:

function findFunctionDeclaration (line) {
    // Match on groups of 4 spaces
    // And `sss` followed by any alphabet characters (the name)
    // And [
    // And comma-separated `sss` followed by alphabet characters
    // And ]...
    let fdr = /^( {4})*sss([a-zA-Z]+) \[(?:sss([a-zA-Z]+), )*sss([a-zA-Z]+)\]/;
    let result = line.match(fdr);

    // If we get a match...
    if (result) {
        let [, indents, name, ...parameters] = result;
        // A repeated group only keeps its last match, so with three or
        // more parameters the earlier ones are lost. With a single
        // parameter the repeated group is undefined, so filter it out.
        let params = parameters.filter(Boolean).map(lowerCaseFirst).join(', ');
        // Manually write out the equivalent JS...
        return `function ${lowerCaseFirst(name)} (${params}) {`;
    }
    return null;
}

Uh oh...
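
One concrete reason for the "uh oh": a capture group inside a repetition only remembers its last iteration. A small sketch that runs the function-declaration pattern against the three-parameter `sssMultiply` example from earlier (the character classes are written as `a-zA-Z` here):

```javascript
// Function-declaration pattern: name, then a bracketed, comma-separated parameter list.
const fdr = /^( {4})*sss([a-zA-Z]+) \[(?:sss([a-zA-Z]+), )*sss([a-zA-Z]+)\]/;

const match = 'sssMultiply [sssA, sssB, sssC]'.match(fdr);

// Skip the full match and the indentation group, then take the name
// and whatever parameter groups the regex managed to capture.
const [, , name, ...parameters] = match;

console.log(name);       // Multiply
console.log(parameters); // [ 'B', 'C' ]
```

The first parameter, `A`, never makes it out of the regex, so the generated JavaScript would silently drop it.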

What are we really trying to do?

Translate from one language to another

TRANSFIGURATION!

AKA TRANSPILING

The first result on Google says:

"Transpiling is a specific term for taking source code written in one language and transforming into another language that has a similar level of abstraction"

Let's work backwards...

magic('Wingardium leviosa');

String literal: 'Wingardium leviosa'
Function call: magic(...)
Identifier: magic
Expression: the whole statement

ESTREE

JavaScript:

magic('Wingardium leviosa');

ESTree structure:

{
    "body": [{
        "type": "ExpressionStatement",
        "expression": {
            "type": "CallExpression",
            "callee": {
                "type": "Identifier",
                "name": "magic"
            },
            "arguments": [{
                "type": "Literal",
                "value": "Wingardium leviosa"
            }]
        }
    }]
}

Esprima

import { parse } from 'esprima';

let AST = parse(`
    magic('Wingardium leviosa');
`);

What is an AST?

Abstract: disassociated from any specific instance

Syntax: the way in which linguistic elements are put together

Tree: a data structure made up of vertices and edges without having any cycles

WaT.

An AST is...

a data structure that represents the structure of code, without any actual syntax.

Code:

magic('Wingardium leviosa');
(magic "Wingardium leviosa")
sssMagic <~ 'Wingardium leviosa'

AST:

{
    "body": [{
        "type": "ExpressionStatement",
        "expression": {
            "type": "CallExpression",
            "callee": {
                "type": "Identifier",
                "name": "magic"
            },
            "arguments": [{
                "type": "Literal",
                "value": "Wingardium leviosa"
            }]
        }
    }]
}
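
Because an AST is just nested data, it can be explored with ordinary code. A minimal sketch, assuming the node shape from the JSON on this slide, that lists every node type in depth-first order. `collectTypes` is a hypothetical name, not part of any library:

```javascript
// The AST for `magic('Wingardium leviosa');`, as shown on the slide.
const ast = {
    body: [{
        type: 'ExpressionStatement',
        expression: {
            type: 'CallExpression',
            callee: { type: 'Identifier', name: 'magic' },
            arguments: [{ type: 'Literal', value: 'Wingardium leviosa' }]
        }
    }]
};

// Walk arrays and objects recursively, recording the `type` of each node.
function collectTypes (node, types = []) {
    if (Array.isArray(node)) {
        node.forEach(child => collectTypes(child, types));
    } else if (node && typeof node === 'object') {
        if (node.type) {
            types.push(node.type);
        }
        Object.values(node).forEach(child => collectTypes(child, types));
    }
    return types;
}

const types = collectTypes(ast);
console.log(types);
// [ 'ExpressionStatement', 'CallExpression', 'Identifier', 'Literal' ]
```

Nothing in that list mentions quotes, brackets or `<~`: the syntax is gone and only the structure is left.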

Which means...

IF WE CAN GET FROM PARSELTONGUE TO AN AST...

THEN WE CAN GO FROM THAT AST TO JAVASCRIPT!

PARSELTONGUE -> ??? -> AST -> ??? -> JavaScript

HOW DO WE DO THAT?

Our naive approach was a little too naive...

Let's try something a bit more robust

PARSELTONGUE -> LEXING -> TOKENS -> ??? -> AST -> ??? -> JavaScript

lexing/HEXING

Lexing is the process of breaking down source code into words that are relevant to the language, which are called tokens.

sssSpell <~ 'Expecto Patronum'

Identifier: sssSpell
Whitespace
Punctuator: <~
Whitespace
String literal: 'Expecto Patronum'
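
A lexer for this can be surprisingly small. What follows is a minimal sketch, not the talk's implementation: the token type names mirror the ones used in these slides, but the regular expressions are assumptions about Parseltongue's lexical rules (for example, that keywords are runs of `s` and identifiers are `sss` followed by a capitalised name).

```javascript
// Token rules, tried in order. Sticky (`y`) regexes only match at `lastIndex`.
const RULES = [
    ['lineTerminator', /\n+/y],
    ['indent', / {4}/y],                  // only considered at the start of a line
    ['space', / +/y],
    ['keyword', /s{2,}(?![a-zA-Z])/y],    // ss, ssss, sssss, ssssss
    ['identifier', /sss[A-Z][a-zA-Z]*/y],
    ['stringLiteral', /'[^'\n]*'/y],
    ['numericLiteral', /\d+/y],
    ['punctuator', /<~|~>|[\[\],+*]/y]
];

function lex (source) {
    const tokens = [];
    let position = 0;
    while (position < source.length) {
        const previous = tokens[tokens.length - 1];
        const atLineStart = !previous ||
            previous.type === 'lineTerminator' ||
            previous.type === 'indent';
        // Try each applicable rule at the current position; keep the first hit.
        const token = RULES
            .filter(([type]) => type !== 'indent' || atLineStart)
            .map(([type, regex]) => {
                regex.lastIndex = position;
                const result = regex.exec(source);
                return result && {
                    type,
                    value: result[0],
                    from: position,
                    to: position + result[0].length
                };
            })
            .find(Boolean);
        if (!token) {
            throw new SyntaxError(`Unexpected character at ${position}`);
        }
        tokens.push(token);
        position = token.to;
    }
    return tokens;
}

const tokens = lex("sssSpell <~ 'Expecto Patronum'");
console.log(tokens.map(token => token.type));
// [ 'identifier', 'space', 'punctuator', 'space', 'stringLiteral' ]
```

Recording `from` and `to` on every token costs almost nothing here and is what later makes error messages and source maps possible.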

sssSpell <~ 'Expecto Patronum'

sssMagic [sssSpell, sssIntensity]
    sssIntense <~ ''
    sssss sssI <~ 0 ~> sssIntensity
        sssIntense <~ sssIntense + '!'
    <~ sssSpell + sssIntense

sssss sssI <~ 0 ~> 10
    sssMagic <~ [sssSpell, sssI]

[
 { type: 'identifier', value: 'sssSpell', from: 0, to: 8 },
 { type: 'space', value: ' ', from: 8, to: 9 },
 { type: 'punctuator', value: '<~', from: 9, to: 11 },
 { type: 'space', value: ' ', from: 11, to: 12 },
 { type: 'stringLiteral', value: '\'Expecto Patronum\'', from: 12, to: 30 },
 { type: 'lineTerminator', value: '\n\n', from: 30, to: 32 },
 { type: 'identifier', value: 'sssMagic', from: 32, to: 40 },
 { type: 'space', value: ' ', from: 40, to: 41 },
 { type: 'punctuator', value: '[', from: 41, to: 42 },
 { type: 'identifier', value: 'sssSpell', from: 42, to: 50 },
 { type: 'punctuator', value: ',', from: 50, to: 51 },
 { type: 'space', value: ' ', from: 51, to: 52 },
 { type: 'identifier', value: 'sssIntensity', from: 52, to: 64 },
 { type: 'punctuator', value: ']', from: 64, to: 65 },
 { type: 'lineTerminator', value: '\n', from: 65, to: 66 },
 { type: 'indent', value: '    ', from: 66, to: 70 },
 { type: 'identifier', value: 'sssIntense', from: 70, to: 80 },
 { type: 'space', value: ' ', from: 80, to: 81 },
 { type: 'punctuator', value: '<~', from: 81, to: 83 },
 { type: 'space', value: ' ', from: 83, to: 84 },
 { type: 'stringLiteral', value: '\'\'', from: 84, to: 86 },
 { type: 'lineTerminator', value: '\n', from: 86, to: 87 },
 { type: 'indent', value: '    ', from: 87, to: 91 },
 { type: 'keyword', value: 'sssss', from: 91, to: 96 },
 { type: 'space', value: ' ', from: 96, to: 97 },
 { type: 'identifier', value: 'sssI', from: 97, to: 101 },
 { type: 'space', value: ' ', from: 101, to: 102 },
 { type: 'punctuator', value: '<~', from: 102, to: 104 },
 { type: 'space', value: ' ', from: 104, to: 105 },
 { type: 'numericLiteral', value: '0', from: 105, to: 106 },
 { type: 'space', value: ' ', from: 106, to: 107 },
 // ...
]

PARSELTONGUE -> LEXING -> TOKENS -> PARSING -> AST -> ??? -> JavaScript

Parsing

Parsing is the process of taking the lexical tokens and applying the grammar of a language to them.

Variable Declaration

sssSpell <~ 'Expecto patronum'

Identifier: sssSpell
Whitespace
Punctuator: <~
Whitespace
String literal: 'Expecto patronum'

{
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'var',
    declarations: [{
      type: 'VariableDeclarator',
      id: {
        type: 'Identifier',
        name: 'spell'
      },
      init: {
        type: 'Literal',
        value: 'Expecto patronum'
      }
    }]
  }]
}
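
To make the grammar step concrete, here is a sketch of a parser for just this one statement shape, `identifier <~ stringLiteral`. It is an illustration under assumptions rather than the talk's parser: `toJsName` and `parseVariableDeclaration` are hypothetical names, and the tokens are written out by hand in the shape shown earlier.

```javascript
// Hypothetical helper: drop the `sss` prefix and lower-case the first letter.
function toJsName (identifier) {
    const name = identifier.slice(3);
    return name.charAt(0).toLowerCase() + name.slice(1);
}

// Apply one grammar rule: identifier, `<~`, string literal.
function parseVariableDeclaration (tokens) {
    // Whitespace carries no meaning inside this statement, so drop it first.
    const [id, arrow, init] = tokens.filter(token => token.type !== 'space');
    const valid = id && id.type === 'identifier' &&
        arrow && arrow.value === '<~' &&
        init && init.type === 'stringLiteral';
    if (!valid) {
        throw new SyntaxError('Expected: identifier <~ stringLiteral');
    }
    return {
        type: 'VariableDeclaration',
        kind: 'var',
        declarations: [{
            type: 'VariableDeclarator',
            id: { type: 'Identifier', name: toJsName(id.value) },
            // Strip the surrounding quotes to get the literal's value.
            init: { type: 'Literal', value: init.value.slice(1, -1) }
        }]
    };
}

const tokens = [
    { type: 'identifier', value: 'sssSpell' },
    { type: 'space', value: ' ' },
    { type: 'punctuator', value: '<~' },
    { type: 'space', value: ' ' },
    { type: 'stringLiteral', value: "'Expecto patronum'" }
];

const program = { type: 'Program', body: [parseVariableDeclaration(tokens)] };
console.log(JSON.stringify(program, null, 2));
```

A full parser would have one such rule per construct (functions, calls, loops, conditionals) and would use the indent tokens to decide where each block ends.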

Function Declaration

sssMagic [sssSpell, sssIntensity]
    ...

Identifier: sssMagic
Whitespace
Punctuator: [
Identifier: sssSpell
Punctuator: ,
Whitespace
Identifier: sssIntensity
Punctuator: ]

{
  type: 'VariableDeclarator',
  id: {
    type: 'Identifier',
    name: 'magic'
  },
  init: {
    type: 'FunctionExpression',
    params: [{
      type: 'Identifier',
      name: 'spell'
    }, {
      type: 'Identifier',
      name: 'intensity'
    }],
    body: {
      type: 'BlockStatement',
      body: [{ /* ... */ }]
    }
  }
}

'{"type":"Program","body":[{"type":"VariableDeclaration","declarations":[{"type":"VariableDeclarator","id":{"type":"Identifier","name" :"spell"},"init":{"type":"Literal","value":"Expecto Patronum","raw":"\'Expecto Patronum\'"}}],"kind":"var"},{"type":"VariableDeclaration" ,"declarations":[{"type":"VariableDeclarator","id":{"type":"Identifier","name":"magic"},"init":{"type":"FunctionExpression","id":null, "params":[{"type":"Identifier","name":"spell"},{"type":"Identifier","name":"intensity"}],"body":{"type":"BlockStatement","body":[{"type" :"VariableDeclaration","declarations":[{"type":"VariableDeclarator","id":{"type":"Identifier","name":"intense"},"init":{"type":"Literal" ,"value":"","raw":"\'\'"}}],"kind":"var"},{"type":"ForStatement","init":{"type":"VariableDeclaration","declarations":[{"type": "VariableDeclarator","id":{"type":"Identifier","name":"i"},"init":{"type":"Literal","value":0,"raw":"0"}}],"kind":"var"},"test":{"type" :"BinaryExpression","operator":"<","left":{"type":"Identifier","name":"i"},"right":{"type":"Identifier","name":"intensity"}},"update":{"type":"UpdateExpression","operator":"++","argument":{"type":"Identifier","name":"i"},"prefix":false},"body":{"type":"BlockStatement" ,"body":[{"type":"ExpressionStatement","expression":{"type":"AssignmentExpression","operator":"=","left":{"type":"Identifier","name": "intense"},"right":{"type":"BinaryExpression","operator":"+","left":{"type":"Identifier","name":"intense"},"right":{"type":"Literal", "value":"!","raw":"\'!\'"}}}}]}},{"type":"ReturnStatement","argument":{"type":"BinaryExpression","operator":"+","left":{"type": "Identifier","name":"spell"},"right":{"type":"Identifier","name":"intense"}}}]},"generator":false,"expression":false}}],"kind":"var"},{"type":"ForStatement","init":{"type":"VariableDeclaration","declarations":[{"type":"VariableDeclarator","id":{"type":"Identifier", 
"name":"i"},"init":{"type":"Literal","value":0,"raw":"0"}}],"kind":"var"},"test":{"type":"BinaryExpression","operator":"<","left":{"type":"Identifier","name":"i"},"right":{"type":"Literal","value":10,"raw":"10"}},"update":{"type":"UpdateExpression","operator":"++" ,"argument":{"type":"Identifier","name":"i"},"prefix":false},"body":{"type":"BlockStatement","body":[{"type":"ExpressionStatement", "expression":{"type":"CallExpression","callee":{"type":"Identifier","name":"magic"},"arguments":[{"type":"Identifier","name":"spell"},{"type":"Identifier","name":"i"}]}}]}}],"sourceType":"script"}'

Now we just need to go from AST to JavaScript

Which sounds pretty hard...

ESCODEGEN

import * as escodegen from 'escodegen';

let javascript = escodegen.generate(ast);
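
To get a feel for what a code generator does, here is a tiny hand-rolled one. This is not how escodegen is implemented, and it covers only a handful of ESTree node types, but it shows the basic idea: one case per node type, each turning itself back into text by recursing into its children. The function name `generate` is just borrowed for familiarity.

```javascript
// A toy generator for a small subset of ESTree node types.
function generate (node) {
    switch (node.type) {
        case 'Program':
            return node.body.map(generate).join('\n');
        case 'VariableDeclaration':
            return `${node.kind} ${node.declarations.map(generate).join(', ')};`;
        case 'VariableDeclarator':
            return node.init
                ? `${generate(node.id)} = ${generate(node.init)}`
                : generate(node.id);
        case 'ExpressionStatement':
            return `${generate(node.expression)};`;
        case 'CallExpression':
            return `${generate(node.callee)}(${node.arguments.map(generate).join(', ')})`;
        case 'Identifier':
            return node.name;
        case 'Literal':
            // JSON.stringify quotes strings and leaves numbers and booleans alone.
            return JSON.stringify(node.value);
        default:
            throw new Error(`Unsupported node type: ${node.type}`);
    }
}

const ast = {
    type: 'Program',
    body: [{
        type: 'VariableDeclaration',
        kind: 'var',
        declarations: [{
            type: 'VariableDeclarator',
            id: { type: 'Identifier', name: 'spell' },
            init: { type: 'Literal', value: 'Expecto Patronum' }
        }]
    }, {
        type: 'ExpressionStatement',
        expression: {
            type: 'CallExpression',
            callee: { type: 'Identifier', name: 'magic' },
            arguments: [{ type: 'Identifier', name: 'spell' }]
        }
    }]
};

const javascript = generate(ast);
console.log(javascript);
// var spell = "Expecto Patronum";
// magic(spell);
```

A real generator also has to handle operator precedence, indentation and every other node type, which is a good reason to reach for an existing library.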

PARSELTONGUE -> LEXING -> TOKENS -> PARSING -> AST -> CODEGEN -> JavaScript

It's almost time to try and see if we can save Harry!

But OH NO!

Sometimes, when a witch or wizard gets stuck in the internet too long, they go a bit delirious from all the gifs and kittens, and they may try to use one of the...

Unforgivable FUNCTIONS

imperio()
crucio()
avadakedavra()

Let's see if we can come up with a way to stop that from happening!

Inspecting the AST

There are a number of reasons you might want to do this:

  • Code transforming/formatting
  • Linting
  • Minifying
  • Mutating

ESQuery

import esquery from 'esquery';

let nodes = esquery.query(ast, myQuery);

Linting for the unforgivable functions:

CallExpression[callee.name="imperio"] Identifier
CallExpression[callee.name="crucio"] Identifier
CallExpression[callee.name="avadakedavra"] Identifier

let nodes = esquery.query(ast, UNFORGIVABLE_QUERY);
nodes.forEach(identifier => {
    identifier.name = 'alert';
});
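
The same check can be sketched without a query library by walking the tree by hand. This is a stand-in for illustration, not esquery: `findUnforgivableCalls` is a hypothetical name, and unlike a descendant selector it only collects the callee identifier, so identifiers passed as arguments are left untouched.

```javascript
const UNFORGIVABLE = new Set(['imperio', 'crucio', 'avadakedavra']);

// Recursively visit every node and collect the callee of each forbidden call.
function findUnforgivableCalls (node, found = []) {
    if (Array.isArray(node)) {
        node.forEach(child => findUnforgivableCalls(child, found));
    } else if (node && typeof node === 'object') {
        if (node.type === 'CallExpression' &&
            node.callee.type === 'Identifier' &&
            UNFORGIVABLE.has(node.callee.name)) {
            found.push(node.callee);
        }
        Object.values(node).forEach(child => findUnforgivableCalls(child, found));
    }
    return found;
}

// AST for: crucio(harry); magic(spell);
const ast = {
    type: 'Program',
    body: [{
        type: 'ExpressionStatement',
        expression: {
            type: 'CallExpression',
            callee: { type: 'Identifier', name: 'crucio' },
            arguments: [{ type: 'Identifier', name: 'harry' }]
        }
    }, {
        type: 'ExpressionStatement',
        expression: {
            type: 'CallExpression',
            callee: { type: 'Identifier', name: 'magic' },
            arguments: [{ type: 'Identifier', name: 'spell' }]
        }
    }]
};

const nodes = findUnforgivableCalls(ast);
nodes.forEach(identifier => {
    identifier.name = 'alert';
});

console.log(ast.body[0].expression.callee.name); // alert
```

Because the nodes are mutated in place, running a code generator over `ast` afterwards would emit `alert(harry);` in place of the curse.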

PARSELTONGUE -> LEXING -> TOKENS -> PARSING -> AST -> INSPECTION -> CODEGEN -> JavaScript

Resources:

https://github.com/dannysu/ecmascript1

https://github.com/estree/estree

https://github.com/jquery/esprima

https://github.com/estools/escodegen

http://estools.github.io/esquery/

https://medium.com/@kosamari/how-to-be-a-compiler-make-a-compiler-with-javascript-4a8a13d473b4#.jb7hdpm7z

https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Parser_API

https://github.com/mozilla/source-map#generating-a-source-map

https://hacks.mozilla.org/2013/05/compiling-to-javascript-and-debugging-with-source-maps/

https://astexplorer.net

https://www.buzzfeed.com/zgalehouse/300-harry-potter-gifsthe-magic-never-ends-7sat?utm_term=.qk4pZBa22#.ktRkWQMPP

Questions?

Fantastic ASTs and Where To Find Them 2017

By Craig Spence
