LET'S MAKE A DSL!

https://slides.com/davesters/lets-make-a-dsl

What is a DSL?

"A domain specific language is a computer language specialized to a particular application domain."

                                                                   - Wikipedia

I ain't never heard of no DSL

  • HTML
  • XML
  • CSS
  • SQL
  • Regular Expressions
  • Dockerfile
  • Bash scripts?
  • Make files
  • Excel spreadsheet formulas
  • PostScript

WRONG*

* Because you program

All good things come in pairs

External

Internal

= the codes

Internal DSLs

Written in a host language

  • Design Pattern

  • Fluent Interfaces (LINQ)

var myNewDo = faves.Where(f => f == "Justin Bieber")
    .Select(jb => jb.HairStyle);

  • Language Extensions (sweet.js)

  • Language Reductions (JSON)

  • Unit test assertions

goodSodas.should.not.contain('Pepsi');
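A fluent assertion such as the `should.not.contain` line in this slide can be built from plain chained objects. The following is a minimal sketch, assuming a hypothetical `expectThat` helper; it is not the API of any real assertion library.

```javascript
// Hypothetical helper: wraps a list so that assertions read like a sentence.
function expectThat(list) {
    // Builds an object with a `contain` check, optionally negated.
    const makeChecker = (negated) => ({
        contain(item) {
            const found = list.includes(item);
            if (found === negated) {
                throw new Error(`Expected list ${negated ? 'not ' : ''}to contain ${item}`);
            }
            return true;
        }
    });

    const should = makeChecker(false);
    should.not = makeChecker(true); // `.not` flips the meaning of the check
    return { should };
}

const goodSodas = ['Coke', 'Dr Pepper'];
expectThat(goodSodas).should.not.contain('Pepsi'); // passes without throwing
```

The point of the pattern is that each property access returns another object, so the host language's own syntax ends up reading like a small language.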

Why do I need one?

  • Can't express in existing language

  • Needs to be written by newbs

  • You're a special snowflake

  • You got a hankerin'

I made CSS*

* It was actually another person, but for a second you thought you were in the presence of a genius

Let's make something!

YOU HAD ME AT HTML

THE PROBLEM

Your company needs to test a large number of JSON API responses, and the people who need to write the tests are not coders.

Testing JSON API responses

Sample Code

test "https://api.github.com/users/davesters" {

  should be object
  size should equal 30

  fields {
    "login" should be string
    id should equal 439674
    name should equal "David Corona"
    site_admin should not equal true
  }
}

1. Lexical Analysis

A fancy name for the thing that converts your code into things that mean something

00110111 00110100 00110110 00110101 00110111 00110011 00110111
00110100 00110010 00110000 00110010 00110010 00110110 00111000
00110111 00110100 00110111 00110100 00110111 00110000 00110111
00110011 00110011 01100001 00110010 01100110 00110010 01100110
00110110 00110001 00110111 00110000 00110110 00111001 00110010
01100101 00110110 00110111 00110110 00111001 00110111 00110100
00110110 00111000 00110111 00110101 00110110 00110010 00110010
01100101 00110110 00110011 00110110 01100110 00110110 01100100
00110010 01100110 00110111 00110101 00110111 00110011 00110110
00110101 00110111 00110010 00110111 00110011 00110010 01100110
00110110 00110100 00110110 00110001 00110111 00110110 00110110
00110101 00110111 00110011 00110111 00110100 00110110 00110101
00110111 00110010 00110111 00110011 00110010 00110010 00110010
00110000 00110111 01100010 01100001 01100001 00110010 00110000 

Imagine how the computer sees our code

1. Lexical Analysis

Convert a stream of characters into "tokens"

count should equal 938

A token is sort of like a record of a meaningful entity found in the code

{
    "type": "id",
    "literal": "count",
    "line": 5,
    "char": 0
}
{
    "type": "kwd",
    "literal": "should",
    "line": 5,
    "char": 6
}
{
    "type": "kwd",
    "literal": "equal",
    "line": 5,
    "char": 13
}
{
    "type": "num",
    "literal": 938,
    "line": 5,
    "char": 19
}
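To make the token shape concrete, here is a rough sketch that produces the same kind of objects for a single line. It simply splits on whitespace and uses an assumed subset of the keywords, so it is a stand-in for illustration and not the state-table lexer this talk builds.

```javascript
// Assumed subset of keywords, for illustration only.
const KEYWORDS = new Set(['should', 'not', 'be', 'equal']);

// Splits one line of source into tokens shaped like the ones above.
function tokenizeLine(text, line) {
    const tokens = [];
    const pattern = /\S+/g; // each run of non-whitespace characters is one token
    let match;

    while ((match = pattern.exec(text)) !== null) {
        const word = match[0];
        let type = 'id';
        let literal = word;

        if (KEYWORDS.has(word)) {
            type = 'kwd';
        } else if (/^-?\d+(\.\d+)?$/.test(word)) {
            type = 'num';
            literal = Number(word); // numbers are stored as numbers, not text
        }

        // `char` is the zero-based column where the token starts.
        tokens.push({ type, literal, line, char: match.index });
    }
    return tokens;
}

console.log(tokenizeLine('count should equal 938', 5));
```

Splitting on whitespace breaks down as soon as strings contain spaces, which is one reason a character-by-character state machine is the usual choice.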

Define the Language

Keywords

test
fields
size
each
object
array
string
number
boolean
true
false
should
not
be
equal

Productions

Z => S
S => test t { T }
T => fields F { T }
T => each F { T }
T => A should BCE
T => e
A => size | i | t | e
B => not | e
C => be | equal
E => object | array | string | number | boolean | true | false | t
F => i | t | e

i = identifier ,   t = literal ,   e = null

Making a Lexer

Basically like a big state machine


//[ i   d   {   }   "   .   -   sp  \n ]
  [ 2,  4,  12, 13, 6,  0,  11, 1,  1  ], // 1:  Starting State
  [ 2,  2,  3,  3,  3,  2,  2,  3,  3  ], // 2:  In Identifier
  [ 1,  1,  1,  1,  1,  1,  1,  1,  1  ], // 3:  End Identifier *
  [ 5,  4,  5,  5,  5,  8,  5,  5,  5  ], // 4:  In Number
  [ 1,  1,  1,  1,  1,  1,  1,  1,  1  ], // 5:  End Number *
  [ 6,  6,  6,  6,  7,  6,  6,  6,  6  ], // 6:  In String
  [ 1,  1,  1,  1,  1,  1,  1,  1,  1  ], // 7:  End String *
  [ 0,  9,  0,  0,  0,  0,  0,  0,  0  ], // 8:  Found Decimal Point
  [ 10, 9,  10, 10, 10, 10, 10, 10, 10 ], // 9:  In Decimal
  [ 1,  1,  1,  1,  1,  1,  1,  1,  1  ], // 10: End Decimal *
  [ 0,  4,  0,  0,  0,  0,  0,  0,  0  ], // 11: Found minus sign
  [ 1,  1,  1,  1,  1,  1,  1,  1,  1  ], // 12: Found Start Block *
  [ 1,  1,  1,  1,  1,  1,  1,  1,  1  ], // 13: Found End Block *
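One way to read the table: the row is the current state, the column is the class of the incoming character, and the cell is the next state (0 appears where the input is not allowed). The sketch below shows the two lookups. The helper names match the ones used in the lexer code in this deck, but their bodies are assumptions, including the choice to treat letters and underscores as identifier characters.

```javascript
const STATE_TABLE = [
//  i   d   {   }   "   .   -   sp  \n
  [ 2,  4,  12, 13, 6,  0,  11, 1,  1  ], // 1:  Starting State
  [ 2,  2,  3,  3,  3,  2,  2,  3,  3  ], // 2:  In Identifier
  [ 1,  1,  1,  1,  1,  1,  1,  1,  1  ], // 3:  End Identifier
  [ 5,  4,  5,  5,  5,  8,  5,  5,  5  ], // 4:  In Number
  [ 1,  1,  1,  1,  1,  1,  1,  1,  1  ], // 5:  End Number
  [ 6,  6,  6,  6,  7,  6,  6,  6,  6  ], // 6:  In String
  [ 1,  1,  1,  1,  1,  1,  1,  1,  1  ], // 7:  End String
  [ 0,  9,  0,  0,  0,  0,  0,  0,  0  ], // 8:  Found Decimal Point
  [ 10, 9,  10, 10, 10, 10, 10, 10, 10 ], // 9:  In Decimal
  [ 1,  1,  1,  1,  1,  1,  1,  1,  1  ], // 10: End Decimal
  [ 0,  4,  0,  0,  0,  0,  0,  0,  0  ], // 11: Found minus sign
  [ 1,  1,  1,  1,  1,  1,  1,  1,  1  ], // 12: Found Start Block
  [ 1,  1,  1,  1,  1,  1,  1,  1,  1  ], // 13: Found End Block
];

// Maps a character to its column in the table, or -1 if it is not allowed.
function getStateTableColumn(ch) {
    if (/[A-Za-z_]/.test(ch)) return 0; // assumption: identifier characters
    if (/[0-9]/.test(ch)) return 1;
    const index = ['{', '}', '"', '.', '-', ' ', '\n'].indexOf(ch);
    return index < 0 ? -1 : index + 2;
}

// States are numbered from 1 in the comments, so subtract 1 for the row index.
function getState(currentState, column) {
    return STATE_TABLE[currentState - 1][column];
}

// Feed "938 " through the table, starting in state 1.
let state = 1;
for (const ch of '938 ') {
    state = getState(state, getStateTableColumn(ch));
}
console.log(state); // 5, the "End Number" state
```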

State machine?

A machine that is in exactly one of a finite number of states at any time and moves to its next state depending on its current state and its input

Soda machine

[Diagram: soda machine as a state machine. States: idle, 5¢, 10¢, Over. Transition labels: 5¢, 10¢, Coin, Soda]
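The soda machine can be written down as a small transition table. The arrows of the original diagram are not recoverable from this export, so the transitions below are an assumed reading: a soda costs 10¢, the machine takes 5¢ and 10¢ coins, the `soda` input dispenses a drink, and the `coin` input hands back an overpaid coin.

```javascript
// Assumed transitions; the state names come from the slide, the arrows are a guess.
const TRANSITIONS = {
    'idle': { '5¢': '5¢', '10¢': '10¢' },
    '5¢':   { '5¢': '10¢', '10¢': 'Over' },
    '10¢':  { 'soda': 'idle' },  // enough money: dispense and start over
    'Over': { 'coin': '10¢' }    // too much money: return a coin first
};

// Returns the next state, or throws if the input is not allowed in this state.
function step(state, input) {
    const next = TRANSITIONS[state][input];
    if (next === undefined) {
        throw new Error(`No transition from ${state} on ${input}`);
    }
    return next;
}

// Runs a whole sequence of inputs, starting from `idle`.
function run(inputs) {
    return inputs.reduce(step, 'idle');
}

console.log(run(['5¢', '5¢']));                 // 10¢
console.log(run(['5¢', '10¢', 'coin', 'soda'])); // idle
```

The lexer's state table works the same way, just with more states and with character classes as the inputs.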

The Lexer code

function lexer(input) {
    while (pointer < input.length) {
        currentChar = input[pointer];
        column = getStateTableColumn(currentChar);

        if (column < 0) throw new LexerException('invalid input');

        state = getState(currentState, column);

        switch (state) {
            case 5: // Found a number; backtrack over the character that ended it
                addToken('num', currentToken, { backtrack: true });
                break;
            case 7: // Found a string
                addToken('str', currentToken);
                break;
            ...
            default:
                currentToken += currentChar;
        }
        pointer++;
    }
    return tokens;
}

2. Syntactic Analysis

A fancy name for the thing that makes sure the tokens conform to the rules of your language grammar

Lots of different types of Parsers

  • Top-down Parsers
    • LL Parser
    • Recursive-Descent Parser (simplest?)
  • Bottom-up Parsers
    • LR parser
    • LALR parser (most common)
  • Other fancy ones

Recursive-descent Parser

A set of recursive functions where each function closely models a production of the grammar

Basically, it keeps trying productions until the input either fails to match, in which case it throws an error, or it reaches the end of the input

Z => S
S => test t { T }
T => fields F { T }
T => each F { T }
T => A should BCE

S => test t { T }

function S(tokens) {
    let token = tokens.next();
    if (!token.isKeyword('test')) throw new Error('Expected keyword `test`');

    token = tokens.next();
    if (!token.isString()) throw new Error('Expected string');
    const target = token.literal; // keep the string before advancing past it

    token = tokens.next();
    if (!token.isOpeningBrace()) throw new Error('Expected token `{`');

    let node = {
        type: 'test',
        left: { type: 'str', literal: target },
        right: { type: 'block', children: [] }
    };

    // Parse statements until the closing brace is found
    while (true) {
        node.right.children.push( T(tokens) );

        if (tokens.next().isClosingBrace()) break;
        tokens.backup();
    }

    return node;
}

T => A should BCE

function T(tokens) {
    // Code to handle other T productions...

    let node = {
        type: 'should',
        right: { children: [] }
    };

    // A is optional: a bare `should ...` has no left-hand side
    let token = tokens.next();
    let a = A(token);
    if (a) {
        node.left = a;
        token = tokens.next();
    }

    if (!token.isKeyword('should')) throw new Error('Expected `should`');

    // B is optional (`not` or nothing), so back up if the next token is not `not`
    let b = B(tokens.next());
    if (b) node.right.children.push(b);
    else tokens.backup();

    node.right.children.push( C(tokens) );
    node.right.children.push( E(tokens) );

    return node;
}

C => be | equal

function C(tokens) {
    let token = tokens.next();

    if (token.isKeyword('be')) {
        return {
            type: 'kwd',
            literal: 'be'
        };
    }

    if (token.isKeyword('equal')) {
        return {
            type: 'kwd',
            literal: 'equal'
        };
    }

    throw new Error('Expected keyword `be` or `equal`');
}
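The parser functions lean on a token stream with `next()` and `backup()`, and on token helpers such as `isKeyword()`. The deck does not show them, so the following is only a guess at their shape; the `open` and `close` token types in particular are assumed names.

```javascript
// Hypothetical wrapper that adds the helper methods the parser calls on a token.
function wrapToken(raw) {
    return {
        ...raw,
        isKeyword: (word) => raw.type === 'kwd' && raw.literal === word,
        isString: () => raw.type === 'str',
        isOpeningBrace: () => raw.type === 'open',  // assumed type name
        isClosingBrace: () => raw.type === 'close'  // assumed type name
    };
}

// Hypothetical stream: hands out tokens one at a time and can step back.
class TokenStream {
    constructor(rawTokens) {
        this.tokens = rawTokens.map(wrapToken);
        this.position = 0;
    }

    // Returns the current token and advances past it.
    next() {
        if (this.position >= this.tokens.length) {
            throw new Error('Unexpected end of input');
        }
        return this.tokens[this.position++];
    }

    // Un-reads the most recent token so the next call to next() returns it again.
    backup() {
        if (this.position > 0) this.position--;
    }
}

const demoStream = new TokenStream([
    { type: 'kwd', literal: 'be' },
    { type: 'kwd', literal: 'string' }
]);
console.log(demoStream.next().isKeyword('be')); // true
```

Being able to back up by one token is what lets a function peek at an optional element, such as `not`, and leave it for the next function when it is absent.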

Parse Tree

Now we have a complete parse tree or kind of an "AST"

[Diagram: parse tree. Node labels: Test, Should (three times), name, be, string, bob]

Pro tip: you can save the parse tree to a file as JSON
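Because the parse tree is made of plain objects and arrays, saving and reloading it takes a couple of lines in Node.js. A small sketch, assuming a hand-written tree for `name should be string` and a file name chosen here for illustration:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Assumed node shape, modeled on the parser functions in this deck.
const parseTree = {
    type: 'should',
    left: { type: 'id', literal: 'name' },
    right: {
        children: [
            { type: 'kwd', literal: 'be' },
            { type: 'kwd', literal: 'string' }
        ]
    }
};

// Write the tree as indented JSON, then read it back.
const treeFile = path.join(os.tmpdir(), 'parse-tree.json');
fs.writeFileSync(treeFile, JSON.stringify(parseTree, null, 2));

const restoredTree = JSON.parse(fs.readFileSync(treeFile, 'utf8'));
console.log(restoredTree.right.children.length); // 2
```

A saved tree can be handed straight to the interpreter later without lexing and parsing the source again.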

3. Interpreter Code

The completely ugly non-sexy part of the code

  • Recursively iterate over parse tree
  • May have set of functions similar to parser
  • Do something based on the nodes
  • We can assume all is good with the tree
  • Only ~200 lines of code for mine
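The points above can be sketched as a small recursive walk. This is not the code from the sample repository; it assumes the node shapes produced by the parser functions in this deck, handles only field assertions with `be` and `equal`, and does not cover `array`, `size`, or `each`.

```javascript
// Evaluates one `should` node against the JSON data and reports the outcome.
function evaluateShould(node, data) {
    const field = node.left.literal;
    const value = data[field];
    const parts = node.right.children; // [not?, be|equal, expected]
    const negated = parts[0].literal === 'not';
    const [verb, expected] = negated ? parts.slice(1) : parts;

    const matched = verb.literal === 'be'
        ? typeof value === expected.literal  // e.g. `be string`
        : value === expected.literal;        // e.g. `equal 439674`

    return { field, passed: matched !== negated };
}

// Recursively visits the tree and collects one result per `should` node.
function interpret(node, data) {
    if (node.type === 'should') return [evaluateShould(node, data)];
    return node.right.children.flatMap((child) => interpret(child, data));
}

// Builds a `should` node by hand, standing in for the parser's output.
const should = (field, ...parts) => ({
    type: 'should',
    left: { type: 'id', literal: field },
    right: { children: parts }
});
const kwd = (literal) => ({ type: 'kwd', literal });

const tree = {
    type: 'test',
    left: { type: 'str', literal: 'https://api.github.com/users/davesters' },
    right: {
        type: 'block',
        children: [
            should('login', kwd('be'), kwd('string')),
            should('id', kwd('equal'), { type: 'num', literal: 439674 }),
            should('site_admin', kwd('not'), kwd('equal'), kwd(true))
        ]
    }
};

const results = interpret(tree, { login: 'davesters', id: 439674, site_admin: false });
console.log(results);
```

Because the parser already rejected anything malformed, the interpreter can index into the nodes without defensive checks.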

DEMO

Now I can make one!

or, Why you should not make a new DSL...

Chances are your problem has already been solved

However, they can be fun to make and useful

Thanks

Sample code:

https://github.com/davesters/js-dsl-example

 

Slides:

https://slides.com/davesters/lets-make-a-dsl

 

Contact:

Twitter - @davesters

Github - davesters

Website - http://www.lovesmesomecode.com

Let's make a DSL!

By David Corona