Parsing out a Good Parser

Parsing Parsers Outline

  • Why Parse and What is a Parser

  • Ways to Parse and What we will Parse

  • Attempt #1, Regroup and Try Again #2

  • Try #3 has got to work

  • Never Give Up the Ghost #4

  • What Works and What to Watch Out for

Why Parse

  • people cannot understand binary data
  • computers cannot understand people
  • a parser translates from 'people' to 'computer'
  • people are less than perfect in writing text
  • computers require structural perfection

What is a Parser

  • a program that converts text into structured (binary) data
  • but not just any text: the text has to follow a grammar
  • a grammar is a set of rules (we'll code these up soon)
  • the resulting data is usually called an Abstract Syntax Tree (AST)

Parts of a Parser

  • Lexer (optional)

  • A parser generator (grammar compiler) OR a runtime library

  • Actions: functions that translate matched text into the Abstract Syntax Tree (AST) elements of your choice
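
The lexer and action parts above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the demo repository: the token names, the `#` comment syntax, and the `tokenize` and `ACTIONS` names are all assumptions made for the example.

```python
import re

# Hypothetical token set for a small list language.
# Assumption: comments run from '#' to the end of the line.
TOKEN_SPEC = [
    ("COMMENT", r"#[^\n]*"),
    ("STRING", r'"[^"\n]*"'),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("LBRACKET", r"\["),
    ("RBRACKET", r"\]"),
    ("COMMA", r","),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

# Actions: turn the raw text of a token into an AST element.
ACTIONS = {
    "IDENT": lambda raw: {"type": "identifier", "name": raw},
    "STRING": lambda raw: {"type": "string", "value": raw[1:-1]},
}


def tokenize(text):
    """Split text into (kind, value) pairs, dropping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind = match.lastgroup
        if kind not in ("SKIP", "COMMENT"):
            tokens.append((kind, match.group()))
        pos = match.end()
    return tokens
```

A parser would then consume the token list and call the matching action for each identifier or string token it accepts.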

All the Ways to Parse

  • regular expressions (v)
  • parser generators using PEG files
  • hand coded (recursive descent) parsers
  • parser combinators
  • others

(techniques)
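
As a rough illustration of the first technique, regular expressions handle flat tokens well but run out of steam on nesting. The sketch below uses Python's standard `re` module; the pattern and function names are assumptions made for the example.

```python
import re

ID = r"[A-Za-z_][A-Za-z0-9_]*"
IDENTIFIER = re.compile(ID)
# A double-quoted string that allows backslash escapes.
QUOTED_STRING = re.compile(r'"([^"\\]|\\.)*"')
# One level of brackets around comma-separated identifiers.
FLAT_LIST = re.compile(r"\[\s*(?:" + ID + r"(?:\s*,\s*" + ID + r")*)?\s*\]")


def is_identifier(text):
    return IDENTIFIER.fullmatch(text) is not None


def is_quoted_string(text):
    return QUOTED_STRING.fullmatch(text) is not None


def is_flat_list(text):
    # This pattern cannot follow arbitrarily deep nesting,
    # so a list inside a list is rejected.
    return FLAT_LIST.fullmatch(text) is not None
```

Here `is_flat_list("[a, b, c]")` succeeds while `is_flat_list("[a, [b, c]]")` fails, which is the usual reason to move on to one of the other techniques once lists can nest.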

What to Parse

  • identifiers
  • strings "in quotes"
  • lists (of identifiers, strings and of course nested lists)
  • comments

a modest wish list
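
To make the wish list concrete, here is a minimal hand-written (recursive descent) sketch in Python that accepts identifiers, quoted strings, nested lists, and comments. The concrete syntax is an assumption, since the slides do not fix one: lists use `[` `]` with commas, comments start with `#`, and strings have no escapes. The AST shape (dicts with a `type` key) is likewise only illustrative.

```python
def parse(text):
    """Parse one value (identifier, string, or list) and return its AST."""
    pos = skip_ignored(text, 0)
    node, pos = parse_value(text, pos)
    pos = skip_ignored(text, pos)
    if pos != len(text):
        raise SyntaxError(f"unexpected text at offset {pos}")
    return node


def skip_ignored(text, pos):
    """Advance past whitespace and '#' comments (assumed comment syntax)."""
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
        elif text[pos] == "#":
            while pos < len(text) and text[pos] != "\n":
                pos += 1
        else:
            break
    return pos


def parse_value(text, pos):
    if pos >= len(text):
        raise SyntaxError("unexpected end of input")
    ch = text[pos]
    if ch == "[":
        return parse_list(text, pos)
    if ch == '"':
        return parse_string(text, pos)
    if ch.isalpha() or ch == "_":
        return parse_identifier(text, pos)
    raise SyntaxError(f"unexpected character {ch!r} at offset {pos}")


def parse_identifier(text, pos):
    start = pos
    while pos < len(text) and (text[pos].isalnum() or text[pos] == "_"):
        pos += 1
    return {"type": "identifier", "name": text[start:pos]}, pos


def parse_string(text, pos):
    end = text.find('"', pos + 1)
    if end == -1:
        raise SyntaxError(f"unterminated string starting at offset {pos}")
    return {"type": "string", "value": text[pos + 1:end]}, end + 1


def parse_list(text, pos):
    items = []
    pos = skip_ignored(text, pos + 1)  # step over '['
    while pos < len(text) and text[pos] != "]":
        item, pos = parse_value(text, pos)  # recursion handles nested lists
        items.append(item)
        pos = skip_ignored(text, pos)
        if pos < len(text) and text[pos] == ",":
            pos = skip_ignored(text, pos + 1)
        elif pos < len(text) and text[pos] != "]":
            raise SyntaxError(f"expected ',' or ']' at offset {pos}")
    if pos >= len(text):
        raise SyntaxError("unterminated list")
    return {"type": "list", "items": items}, pos + 1  # step over ']'
```

Each `parse_*` function returns the AST node together with the position where it stopped, which keeps the functions free of shared state and easy to test one at a time.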

Test Driven Demo

https://github.com/nmorse/set-parsers-to-stun

canopy

a parser generator that targets Python, Java, JavaScript, and Ruby

nearley

generates JavaScript

also generates railroad (RR) diagrams

a hand coded parser

with the help of XState (a finite state machine library)
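
The state machine idea can be sketched without any library. The Python example below is not XState and does not mirror its API; it is a plain transition table, with hypothetical state and event names, that scans one quoted string and reports where it went wrong.

```python
# (state, event) -> next state. Missing entries are errors.
TRANSITIONS = {
    ("start", "quote"): "in_string",
    ("in_string", "other"): "in_string",
    ("in_string", "backslash"): "escape",
    ("in_string", "quote"): "done",
    ("escape", "quote"): "in_string",
    ("escape", "backslash"): "in_string",
    ("escape", "other"): "in_string",
}


def classify(ch):
    """Map a character to the event name used in the transition table."""
    if ch == '"':
        return "quote"
    if ch == "\\":
        return "backslash"
    return "other"


def scan_string(text):
    """Return the unescaped contents of a quoted string, or raise SyntaxError."""
    state = "start"
    chars = []
    for offset, ch in enumerate(text):
        event = classify(ch)
        next_state = TRANSITIONS.get((state, event))
        if next_state is None:
            raise SyntaxError(
                f"unexpected {ch!r} in state {state!r} at offset {offset}"
            )
        # Keep escaped characters and ordinary characters of the string body.
        if state == "escape" or (state == "in_string" and event == "other"):
            chars.append(ch)
        state = next_state
    if state != "done":
        raise SyntaxError(f"input ended in state {state!r}")
    return "".join(chars)
```

Because every failure knows the current state and offset, a hand-coded parser built this way can produce fairly specific error messages.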

arcsecond

A set of parser "combinators"

(functions that take other functions as arguments and return (yes) new functions)


Compose them (combine them) into a parser
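
The combinator idea can be sketched in plain Python. These helpers are hypothetical and are not arcsecond's API: here a parser is any function that takes `(text, pos)` and returns `(value, new_pos)` on success or `None` on failure, and each combinator returns a new function of that shape.

```python
import re


def pattern(regex):
    """Parser that matches a regular expression at the current position."""
    compiled = re.compile(regex)

    def parser(text, pos):
        match = compiled.match(text, pos)
        return (match.group(), match.end()) if match else None

    return parser


def sequence(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parser(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos

    return parser


def choice(*parsers):
    """Return the result of the first parser that succeeds."""
    def parser(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None

    return parser


def many(p):
    """Apply p zero or more times, stopping when it fails or stops advancing."""
    def parser(text, pos):
        values = []
        while True:
            result = p(text, pos)
            if result is None or result[1] == pos:
                return values, pos
            value, pos = result
            values.append(value)

    return parser


def mapped(p, action):
    """Apply an action to the value produced by p (builds AST elements)."""
    def parser(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return action(value), new_pos

    return parser


# Compose small parsers into a parser for a run of identifiers and strings.
identifier = mapped(pattern(r"[A-Za-z_][A-Za-z0-9_]*"),
                    lambda name: {"type": "identifier", "name": name})
string = mapped(pattern(r'"[^"]*"'),
                lambda raw: {"type": "string", "value": raw[1:-1]})
spaces = pattern(r"\s*")
item = mapped(sequence(spaces, choice(identifier, string), spaces),
              lambda parts: parts[1])
items = many(item)
```

Calling `items('foo "bar" baz', 0)` yields a list of three AST elements and the final position. Nested lists would need a parser that refers to itself, which is left out of this sketch.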

                   canopy   nearley   hand code   arcsecond
learning curve       +1       +3         -2          -1
features             +1       +3          0          +3
friendly errors       0        0         +1          +1
following            -1       +3          0          +2
bottom line          +1       +9         -1          +5

What Works and What to Watch Out for

Thank You

https://github.com/nmorse/set-parsers-to-stun

https://nearley.js.org/

https://xstate.js.org/viz/

https://github.com/francisrstokes/arcsecond

Parsing all the

By Nate Morse
