Parsing with Derivatives

A general way to parse context-free grammars

Folkert de Vries

November 27, 2018

The Brzozowski derivative

for regular languages

L = \{ foo, bar, baz \}

\begin{aligned} D_b(L) &= \{ ar, az \} \\ D_f(L) &= \{ oo \} \\ D_q(L) &= \emptyset \\ D_{foo}(L) &= \{ \epsilon \} \end{aligned}
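
For a finite language the derivative can be computed directly on the set of words. The following is a minimal sketch; the function name `derivative` is an illustrative choice, not something from the talk, and the empty string stands in for \(\epsilon\).

```python
def derivative(lang, prefix):
    """Keep the words that start with `prefix` and strip that prefix off."""
    return {word[len(prefix):] for word in lang if word.startswith(prefix)}


L = {"foo", "bar", "baz"}

d_b = derivative(L, "b")      # words that started with "b", minus the "b"
d_q = derivative(L, "q")      # no word starts with "q", so the result is empty
d_foo = derivative(L, "foo")  # only the empty word is left
```

A word is in the language exactly when deriving by that whole word leaves a set containing the empty string.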

The Brzozowski derivative

on regular expressions

S \rightarrow (ab)^{*}

\begin{aligned} D_c(\emptyset) &= \emptyset \\ D_c(\epsilon) &= \emptyset \\ D_c(c) &= \epsilon \\ D_c(c') &= \emptyset \quad (c' \neq c) \\ D_c(P^{*}) &= D_c(P) \circ P^{*} \\ D_c(P \cup S) &= D_c(P) \cup D_c(S) \\ D_c(P \circ S) &= (D_c(P) \circ S) \cup (\delta(P) \circ D_c(S)) \end{aligned}

\(\delta\) is the nullability function: it checks whether \(\epsilon\) is in a language

The Brzozowski derivative

on regular expressions

\begin{aligned} \delta(\emptyset) &= \emptyset \\ \delta(\epsilon) &= \{ \epsilon \}\\ \delta(c) &= \emptyset\\ \delta(P \cup S) &= \delta(P) \cup \delta(S)\\ \delta(P \circ S) &= \delta(P) \cap \delta(S)\\ \delta(P^*) &= \{ \epsilon \}\\ \end{aligned}

\(\delta(L)\) is \(\{ \epsilon \}\) when \(\epsilon\) is in \(L\), and \(\emptyset\) otherwise

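The concatenation rule for \(D_c\) relies on \(\emptyset \circ L = \emptyset\) and \(\epsilon \circ L = L\): the term \(\delta(P) \circ D_c(S)\) vanishes unless \(P\) is nullable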
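
The \(\delta\) rules can be transcribed almost literally. This is a sketch under assumptions: the class names (`Empty`, `Eps`, `Char`, `Alt`, `Cat`, `Star`) are a hypothetical encoding of regular expressions, and \(\{ \epsilon \}\) is represented as a set holding the empty string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Empty: pass        # the empty language

@dataclass(frozen=True)
class Eps: pass          # the language containing only the empty word

@dataclass(frozen=True)
class Char:
    c: str               # a single character

@dataclass(frozen=True)
class Alt:
    p: object            # union of p and s
    s: object

@dataclass(frozen=True)
class Cat:
    p: object            # p followed by s
    s: object

@dataclass(frozen=True)
class Star:
    p: object            # zero or more repetitions of p


EPSILON = frozenset({""})   # stands for { epsilon }
NOTHING = frozenset()       # stands for the empty set


def delta(r):
    """Return EPSILON if `r` accepts the empty word, NOTHING otherwise."""
    if isinstance(r, (Eps, Star)):
        return EPSILON
    if isinstance(r, (Empty, Char)):
        return NOTHING
    if isinstance(r, Alt):
        return delta(r.p) | delta(r.s)   # union
    if isinstance(r, Cat):
        return delta(r.p) & delta(r.s)   # intersection
    raise TypeError(f"not a regular expression: {r!r}")
```

Because the result is always one of two values, an implementation would normally return a boolean; the sets are kept here only to mirror the notation.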

The Brzozowski derivative

on regular expressions

\begin{aligned} D_a(L) &= D_a((a \circ b)^{*}) \\ &= D_a(a \circ b) \circ (a \circ b)^{*} \\ &= (D_a(a) \circ b \cup (\delta(a) \circ D_a(b))) \circ (a \circ b)^{*} \\ &= (\epsilon \circ b \cup (\emptyset \circ \emptyset)) \circ (a \circ b)^{*} \\ &= b \circ (a \circ b)^{*} \end{aligned}

check that \(ab\) is in \(L = (ab)^*\)

i.e. \(\epsilon \in \delta(D_{ab}(L))\)

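\begin{aligned} D_{ab}(L) &= D_b(D_a(L)) \\ &= D_b(b \circ (a \circ b)^{*}) \\ &= D_b(b) \circ (a \circ b)^{*} \cup (\delta(b) \circ D_b((a \circ b)^{*})) \\ &= \epsilon \circ (a \circ b)^{*} \cup \emptyset \\ &= (a \circ b)^{*} \end{aligned}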

Kleene star is always nullable, so \(ab\) is accepted
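
The whole check can be sketched as a small matcher: derive once per input character, then test nullability. The class and function names below (`derive`, `nullable`, `matches`, `AB_STAR`) are illustrative assumptions, and no simplification is performed, so the derivative by \(a\) comes out as \(\epsilon \circ b \circ (ab)^{*}\) rather than the tidied \(b \circ (ab)^{*}\).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Empty: pass

@dataclass(frozen=True)
class Eps: pass

@dataclass(frozen=True)
class Char:
    c: str

@dataclass(frozen=True)
class Alt:
    p: object
    s: object

@dataclass(frozen=True)
class Cat:
    p: object
    s: object

@dataclass(frozen=True)
class Star:
    p: object


def nullable(r):
    """True when `r` accepts the empty word (delta as a boolean)."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, Alt):
        return nullable(r.p) or nullable(r.s)
    if isinstance(r, Cat):
        return nullable(r.p) and nullable(r.s)
    return False  # Empty and Char


def derive(r, c):
    """Derivative of `r` with respect to the character `c`."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Char):
        return Eps() if r.c == c else Empty()
    if isinstance(r, Star):
        return Cat(derive(r.p, c), r)
    if isinstance(r, Alt):
        return Alt(derive(r.p, c), derive(r.s, c))
    if isinstance(r, Cat):
        left = Cat(derive(r.p, c), r.s)
        # delta(P) . D_c(S) only contributes when P is nullable
        return Alt(left, derive(r.s, c)) if nullable(r.p) else left
    raise TypeError(f"not a regular expression: {r!r}")


def matches(r, word):
    """Derive by each character in turn, then check for the empty word."""
    for c in word:
        r = derive(r, c)
    return nullable(r)


AB_STAR = Star(Cat(Char("a"), Char("b")))   # (ab)*
accepted = matches(AB_STAR, "ab")
rejected = matches(AB_STAR, "ba")
```

Without simplification the expression grows with every step, which is one reason a practical implementation needs more care than this sketch takes.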

The Brzozowski derivative

on context-free grammars

\(S \rightarrow aSb \ |\ \epsilon\)

Works the same as on regular languages, but is now harder to compute because of recursion: the derivative of a recursive rule is defined in terms of itself

\begin{aligned} D_x(L) &= D_x(L) \circ \{x\} \cup \epsilon \end{aligned}

\begin{aligned} L = L \circ \{x\} \cup \epsilon \end{aligned}
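
The equation for \(D_x(L)\) follows from applying the union and concatenation rules to this definition of \(L\), using that \(L\) is nullable (so \(\delta(L)\) contributes \(\epsilon\)) and that \(D_x(\{x\}) = \epsilon\):

\begin{aligned} D_x(L) &= D_x(L \circ \{x\} \cup \epsilon) \\ &= D_x(L \circ \{x\}) \cup D_x(\epsilon) \\ &= D_x(L) \circ \{x\} \cup (\delta(L) \circ D_x(\{x\})) \cup \emptyset \\ &= D_x(L) \circ \{x\} \cup (\epsilon \circ \epsilon) \\ &= D_x(L) \circ \{x\} \cup \epsilon \end{aligned}

\(D_x(L)\) appears on both sides, so a naive recursive implementation never terminates.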

The Brzozowski derivative

on context-free grammars

Step 1: Laziness

\begin{aligned} D_x(L) &= D_x(L) \circ \{x\} \cup \epsilon \end{aligned}

\begin{aligned} L = L \circ \{x\} \cup \epsilon \end{aligned}

Unfold only when needed

The Brzozowski derivative

on context-free grammars

Step 2: Memoization of \(\delta\) in \(D_c\)

\begin{aligned} D_x(L) &= D_x(L) \circ \{x\} \cup \epsilon \end{aligned}

\begin{aligned} L = L \circ \{x\} \cup \epsilon \end{aligned}

Don't repeat work
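
A rough sketch of how laziness and memoization combine in a recognizer for recursive grammars. It is not the implementation from the talk: Python has no lazy fields, so an unfinished node is registered in the memo table before recursing and filled in afterwards, which plays the role of laziness here. The names `Node`, `MEMO`, `derive`, `nullable` and `matches` are all hypothetical, and nullability is computed by plain iteration to a fixed point.

```python
class Node:
    """A grammar node. Children may be assigned after construction,
    which is what allows cyclic (recursive) grammars."""
    def __init__(self, kind, left=None, right=None, char=None):
        self.kind, self.left, self.right, self.char = kind, left, right, char


EMPTY = Node("empty")   # the empty language
EPS = Node("eps")       # the language containing only the empty word
MEMO = {}               # (node, character) -> derivative node


def nullable(root):
    """Start with only epsilon nullable, then iterate until nothing changes."""
    nodes, stack = set(), [root]
    while stack:                        # collect every node reachable from root
        n = stack.pop()
        if n not in nodes:
            nodes.add(n)
            if n.kind in ("alt", "cat"):
                stack += [n.left, n.right]
    known = {n for n in nodes if n.kind == "eps"}
    changed = True
    while changed:
        changed = False
        for n in nodes - known:
            if (n.kind == "alt" and (n.left in known or n.right in known)) or \
               (n.kind == "cat" and n.left in known and n.right in known):
                known.add(n)
                changed = True
    return root in known


def derive(node, c):
    """Derivative of `node` with respect to the character `c`."""
    key = (node, c)
    if key in MEMO:                     # memoization: never derive a node twice
        return MEMO[key]
    if node.kind in ("empty", "eps"):
        result = EMPTY
    elif node.kind == "char":
        result = EPS if node.char == c else EMPTY
    else:
        # Register an unfinished node before recursing, so that a recursive
        # reference finds it in MEMO instead of looping forever.
        result = MEMO[key] = Node("alt")
        if node.kind == "alt":
            result.left = derive(node.left, c)
            result.right = derive(node.right, c)
        else:  # cat: D_c(P . S) = D_c(P) . S  |  delta(P) . D_c(S)
            result.left = Node("cat", derive(node.left, c), node.right)
            result.right = derive(node.right, c) if nullable(node.left) else EMPTY
    MEMO[key] = result
    return result


def matches(grammar, word):
    for c in word:
        grammar = derive(grammar, c)
    return nullable(grammar)


# S -> a S b | epsilon
S = Node("alt")
S.left = Node("cat", Node("char", char="a"), Node("cat", S, Node("char", char="b")))
S.right = EPS

balanced = matches(S, "aabb")
unbalanced = matches(S, "aab")
```

The sketch never simplifies the derived grammar, so it grows with the input; it is meant to show the knot-tying idea, not to be fast.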

The Brzozowski derivative

on context-free grammars

Step 3: Calculation of \(\delta\) as a least fixed point

\begin{aligned} L = L \circ \{x\} \cup \epsilon \end{aligned}

1. Start by assuming that only \(\epsilon\) is nullable

2. For every production, check whether some alternative on the right-hand side consists only of values already known to be nullable, without recursing.

If there is at least one such alternative, add the left-hand side to the set of nullable values

3. Repeat step 2 until \(Nullable_{n} = Nullable_{n + 1}\)
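
The three steps can be sketched as a loop over the productions. The grammar encoding is an assumption made for this example: a dict from nonterminal to a list of alternatives, where each alternative is a list of symbols and the empty list stands for \(\epsilon\). The function name `nullable_nonterminals` is hypothetical.

```python
def nullable_nonterminals(grammar):
    """Return the set of nonterminals that can derive the empty word."""
    nullable = set()                    # step 1: nothing but epsilon is nullable
    while True:
        # Step 2: a nonterminal is nullable if some alternative consists
        # only of symbols already known to be nullable. The empty
        # alternative (epsilon) qualifies trivially.
        next_nullable = {
            head
            for head, alternatives in grammar.items()
            if any(all(sym in nullable for sym in alt) for alt in alternatives)
        }
        if next_nullable == nullable:   # step 3: stop at the fixed point
            return nullable
        nullable = next_nullable


# L -> L x | epsilon
left_recursive = {"L": [["L", "x"], []]}
result = nullable_nonterminals(left_recursive)
```

Each round only looks at the previous round's set, so recursion in the grammar never causes recursion in the computation.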

The Brzozowski derivative

on context-free grammars

Step 3: Calculation of \(\delta\) as a least fixed point

\begin{aligned} P &\rightarrow S\\ S &\rightarrow TS \ |\ a\\ T &\rightarrow \epsilon \end{aligned}

\begin{aligned} P &\rightarrow S\\ S &\rightarrow TS \ |\ \epsilon\\ T &\rightarrow \epsilon \end{aligned}
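
As a worked trace of the iteration on these two grammars (assuming every round is computed from the previous round's set, and writing \(Nullable_0 = \emptyset\) for "no nonterminal is known to be nullable yet"):

\begin{aligned} \text{with } S \rightarrow TS \ |\ a\text{:}\quad & Nullable_0 = \emptyset,\ Nullable_1 = \{ T \},\ Nullable_2 = \{ T \} \\ \text{with } S \rightarrow TS \ |\ \epsilon\text{:}\quad & Nullable_0 = \emptyset,\ Nullable_1 = \{ S, T \},\ Nullable_2 = \{ P, S, T \},\ Nullable_3 = \{ P, S, T \} \end{aligned}

In the first grammar \(S \rightarrow TS\) can only be nullable if \(S\) already is, so the least fixed point leaves \(S\) and \(P\) out. In the second grammar the \(\epsilon\) alternative makes \(S\) nullable, and \(P\) follows one round later.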

Building Parse Trees

We've only done validation so far; let's actually parse something

The key insight is that \(D_a(a)\) reduces to \(\epsilon \downarrow \{ a \}\)

And that we can let epsilons in our grammar reduce similarly, e.g.

\(S \rightarrow aSb \ |\ \epsilon \downarrow \{ s \}\)

this gives enough information to retrace our steps later

Note: this is like monadic return or applicative pure

D_{aabb}(S) = D_a(a) \circ (D_a(a) \circ (\epsilon \downarrow \{ s \} \circ (D_b(b) \circ D_b(b))))

\begin{aligned} & = \{ a \} \times (D_a(a) \circ (\epsilon \downarrow \{ s \} \circ (D_b(b) \circ D_b(b)))) \\ & = \{ a \} \times ( \{ a \} \times (\epsilon \downarrow \{ s \} \circ (D_b(b) \circ D_b(b)))) \\ & = \{ a \} \times ( \{ a \} \times (\{ s \} \times (D_b(b) \circ D_b(b)))) \\ & = \{ a \} \times ( \{ a \} \times (\{ s \} \times ( \{ b \} \times D_b(b)))) \\ & = \{ a \} \times ( \{ a \} \times (\{ s \} \times ( \{ b \} \times \{ b \}))) \\ & = \{ a \} \times ( \{ a \} \times (\{ s \} \times \{ (b, b) \})) \\ & = \{ a \} \times ( \{ a \} \times \{ (s, (b, b)) \}) \\ & = \{ a \} \times \{ (a, (s, (b, b))) \} \\ & = \{ (a, (a, (s, (b, b)))) \} \end{aligned}
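
The \(\times\) steps of this reduction can be mimicked with sets of nested tuples. This sketch only replays the final pairing, not the derivative computation itself; `product` is a hypothetical helper and the strings stand for the leaves \(a\), \(s\) and \(b\).

```python
def product(left, right):
    """Pair every parse tree in `left` with every parse tree in `right`."""
    return {(l, r) for l in left for r in right}


# { a } x ( { a } x ( { s } x ( { b } x { b } ) ) )
trees = product({"a"}, product({"a"}, product({"s"}, product({"b"}, {"b"}))))
```

Each operand is a set because an ambiguous grammar can yield several parse trees for the same input; here every set holds exactly one tree.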

Practicality

Pros

Simple
Short
Quite Fast (with more tricks)

Practicality

Cons

Very difficult to get usable parse output
Hard to design a good API
You can probably do better if you know the specific grammar that you need to parse

(in strongly-typed languages)

Conclusion

Derivatives can be used for parsing
It is really elegant
but needs substantial work to become practical
