# On Recursive Data Types

Alexander Konovalov, Compellon Inc

alex.knvl@gmail.com @alexknvl

# The goal

A day-to-day toolkit for working with recursive data types:

• Show the connection between final tagless, "initial encoding", and recursion schemes.
• Fix, Mu, Nu
• Laziness and Graphs
(Traversable + Recursive)
• Derivatives and Indices

## Recursive Data

• Json
• Xml
• Expression trees!
• Real-life abstract syntax trees

But really, it helps everywhere,
because the core message is...

• Different representations
• The idea behind your data matters,
the exact runtime representation matters less.
• Programming language independent
• A great tool for reasoning
data ListOfPairs a b = ListOfPairs [(a, b)]
data PairOfListsOfSameSize a b = PairOfListsOfSameSize [a] [b]

makeList :: [a] -> [b] -> Maybe (PairOfListsOfSameSize a b)

makeList :: (xs :: [a]) -> (ys :: [b]) -> (length xs = length ys) ->
            PairOfListsOfSameSize a b

## Isomorphisms

A and B are isomorphic when there exist two morphisms,

to: A → B and from: B → A, such that both

from (to a) = a

and

to (from b) = b

## Type Isomorphisms

Type isomorphisms can't be faithfully implemented except in dependently typed languages (Idris, Agda, Coq).

trait Iso[A, B] {
def to(a: A): B
def from(b: B): A
def fromTo(x: A): (from(to(x))).type =:= x.type
def toFrom(x: B): (to(from(x))).type =:= x.type
}

data Iso a b = {
  to   :: a -> b
  from :: b -> a
  fromTo :: (x :: a) -> from (to x) = x
  toFrom :: (x :: b) -> to (from x) = x
}
trait Iso[A, B] {
def to(a: A): B
def from(b: B): A

// make sure that (to andThen from) == id
// and (from andThen to) == id
}

So let's just manually check the necessary laws.

data Iso a b = {
to   :: a -> b
from :: b -> a
-- check from (to x) = x
-- check to (from x) = x
}
type A = Either[Boolean, Unit]

sealed trait B
case object One extends B
case object Two extends B
case object Three extends B

type A = Either Bool ()

data B = One | Two | Three

Are these isomorphic?

Yes, here is an isomorphism:

def to(a: A): B = a match {
  case Left(true)  => One
  case Left(false) => Two
  case Right(())   => Three
}

def from(b: B): A = b match {
  case One   => Left(true)
  case Two   => Left(false)
  case Three => Right(())
}
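Since both types are finite, the two laws can be checked by brute force. The following is a minimal Haskell sketch of that check; `allA`, `allB`, and `lawsHold` are names introduced here for illustration and do not come from the talk.

```haskell
-- Haskell counterparts of the two types discussed above.
type A = Either Bool ()

data B = One | Two | Three
  deriving (Eq, Show, Enum, Bounded)

to :: A -> B
to (Left True)  = One
to (Left False) = Two
to (Right ())   = Three

from :: B -> A
from One   = Left True
from Two   = Left False
from Three = Right ()

-- Every inhabitant of A, listed by hand (illustrative helper).
allA :: [A]
allA = [Left True, Left False, Right ()]

-- Every inhabitant of B, enumerated through Enum/Bounded.
allB :: [B]
allB = [minBound .. maxBound]

-- Both round trips must be the identity for to/from to form an isomorphism.
lawsHold :: Bool
lawsHold =
  all (\a -> from (to a) == a) allA &&
  all (\b -> to (from b) == b) allB
```

Exhaustive checking only works because A and B each have three inhabitants; for infinite types the laws have to be argued by hand or proved in a dependently typed language.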

data A x y = A x y x

data B x y = B x y y

Are these isomorphic?

They could be isomorphic, if we specialize x and y to ().

But in a polymorphic context, where we do not know either, they are NOT isomorphic.

data A x y = A x x y

data B x y = B y x x

foo :: forall x y. Iso (A x y) (B x y)
boo :: Iso (A Int Bool) (B Int Bool)
boo = foo

Are these isomorphic?

A polymorphic isomorphism implies isomorphism of specializations.

The opposite is generally false.

## Type Algebra

A few more important isomorphisms:

(a, b) ≈ (b, a)
(a, (b, c)) ≈ ((a, b), c)
(a, Either[b, c]) ≈ Either[(a, b), (a, c)]
Either[a, b] ≈ Either[b, a]
Either[Nothing, a] ≈ a
(Unit, a) ≈ a

What if we replace Either with +, (,) with *, Unit with 1, and Nothing with 0?..

a * b ≈ b * a
a * (b * c) ≈ (a * b) * c
a * (b + c) ≈ a * b + a * c
a + b ≈ b + a
0 + a ≈ a
1 * a ≈ a

Amusingly, we get something very similar to the regular algebra of natural numbers...

A couple of very important results about functions:

(a + b) → c ≈ (a → c, b → c)
a → b → c ≈ (a, b) → c

def uncurry[A, B, C](f: A => B => C): ((A, B)) => C =
  { case (a, b) => f(a)(b) }

def curry[A, B, C](f: ((A, B)) => C): A => B => C =
  a => b => f((a, b))

def undiag[A, B, C](f: Either[A, B] => C): (A => C, B => C) =
  (a => f(Left(a)), b => f(Right(b)))

def diag[A, B, C](f: (A => C, B => C)): Either[A, B] => C =
  { case Left(a)  => f._1(a)
    case Right(b) => f._2(b) }

Turns out that similarly to how Either is + and (,) is *, functions correspond to exponentiation:

c^(a + b) ≈ c^a * c^b
(c^b)^a ≈ c^(a * b)

Summing it all up:

Either[a, b] is a + b

Tuple2[a, b] or (a, b) is a * b

Function1[a, b] or (a → b) is b ^ a

Nothing is 0

Unit is 1

Bool is 2

One more important theorem that we will need.

If a ≈ b, then f a ≈ f b for any functor f,

including phantom, covariant, contravariant, or invariant functors.

For a covariant functor, lift both halves of the isomorphism with map:

to' = map to

from' = map from

map to . map from = map (to . from) = map id = id

and symmetrically map from . map to = id.

## Let's talk recursion

Say we have a data type

data List a = Nil | Cons a (List a)

and we would like to fold over it:

foldr :: forall a z. z -> (a -> z -> z) -> List a -> z
foldr seed _       Nil              = seed
foldr seed combine (Cons head tail) = combine head (foldr seed combine tail)

Why do we want fold?

Turns out that every function on lists can be written in terms of foldr!

sum :: List Int -> Int
sum = foldr 0 (+)

map :: (a -> b) -> List a -> List b
map f = foldr Nil (\h t -> Cons (f h) t)

isEmpty :: List a -> Bool
isEmpty = foldr True (\_ _ -> False)
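The list fold and the three functions built on it can be put into one compilable file. This is a minimal sketch; it hides the Prelude versions of `foldr`, `map`, and `sum` so the talk's names can be reused, and `fromList` is a helper added here purely for building test values.

```haskell
import Prelude hiding (foldr, map, sum)

data List a = Nil | Cons a (List a)
  deriving (Eq, Show)

-- The fold: replace Nil with seed and Cons with combine.
foldr :: z -> (a -> z -> z) -> List a -> z
foldr seed _       Nil        = seed
foldr seed combine (Cons h t) = combine h (foldr seed combine t)

sum :: List Int -> Int
sum = foldr 0 (+)

map :: (a -> b) -> List a -> List b
map f = foldr Nil (\h t -> Cons (f h) t)

isEmpty :: List a -> Bool
isEmpty = foldr True (\_ _ -> False)

-- Illustrative helper (not from the talk): convert a built-in list.
fromList :: [a] -> List a
fromList []       = Nil
fromList (x : xs) = Cons x (fromList xs)
```

For example, `sum (fromList [1, 2, 3])` evaluates to 6 and `isEmpty Nil` to True.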

Let's look at a slightly more realistic example:

data Expr = Lit Int | Add Expr Expr | Mul Expr Expr

foldr :: forall z. (lit :: Int -> z) ->
                   (add :: z -> z -> z) ->
                   (mul :: z -> z -> z) -> Expr -> z
foldr lit add mul (Lit i)   = lit i
foldr lit add mul (Add l r) = add (foldr lit add mul l) (foldr lit add mul r)
foldr lit add mul (Mul l r) = mul (foldr lit add mul l) (foldr lit add mul r)

We can reinterpret our expressions in different ways:

expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))

eval :: Expr -> Int
eval = foldr id (+) (*)

res = eval expr -- 3 * (2 + 3) = 15

print :: Expr -> String
print = foldr show (\l r -> "(" <> l <> " + " <> r <> ")")
                   (\l r -> "(" <> l <> " * " <> r <> ")")
str = print expr -- "(3 * (2 + 3))"
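The expression fold can be written as ordinary Haskell as follows. This is a sketch: the fold is named `foldExpr` and the printer `render` (both names chosen here) to avoid clashing with the Prelude's `foldr` and `print`.

```haskell
data Expr = Lit Int | Add Expr Expr | Mul Expr Expr

-- One handler per constructor; recursive positions are already folded to z.
foldExpr :: (Int -> z) -> (z -> z -> z) -> (z -> z -> z) -> Expr -> z
foldExpr lit add mul = go
  where
    go (Lit i)   = lit i
    go (Add l r) = add (go l) (go r)
    go (Mul l r) = mul (go l) (go r)

expr :: Expr
expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))

-- Interpretation 1: evaluate to a number.
eval :: Expr -> Int
eval = foldExpr id (+) (*)

-- Interpretation 2: pretty-print with full parenthesization.
render :: Expr -> String
render = foldExpr show (binary " + ") (binary " * ")
  where
    binary op l r = "(" <> l <> op <> r <> ")"
```

Here `eval expr` gives 15 and `render expr` gives "(3 * (2 + 3))". Note that the handlers receive already-folded results, so they never call `eval` or `render` recursively.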

Let's look at the type of foldr:

foldr :: forall z. (lit :: Int -> z) ->
                   (add :: z -> z -> z) ->
                   (mul :: z -> z -> z) -> Expr -> z

\forall z. \text{Expr} \rightarrow (Int \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow z

Let's combine the arguments together:

\forall z. \text{Expr} \rightarrow ((Int + (z, z) + (z, z)) \rightarrow z) \rightarrow z

There are two important concepts here: an algebra, and a pattern (or signature) functor.

## Let's talk algebras

In mathematics, and more specifically in abstract algebra, an algebraic structure on a set A (called carrier set or underlying set) is a collection of finitary operations on A; the set A with this structure is also called an algebra.

In mathematics, specifically in category theory, F-algebras generalize algebraic structure. Rewriting the algebraic laws in terms of morphisms eliminates all references to quantified elements from the axioms, and these algebraic laws may then be glued together in terms of a single functor F, the signature.

\mathfrak{F} c \rightarrow c ~~~ - ~~~ \text{an }\mathfrak{F}\text{-algebra}
\mathfrak{F} - \text{is called a signature, or pattern functor} \\ c - \text{is called a carrier}
type Algebra (f :: * -> *) (c :: *) = f c -> c

Let's define the signature functor.

data ExprF z = LitF Int | AddF z z | MulF z z
\forall z. \text{Expr} \rightarrow ((Int + (z, z) + (z, z)) \rightarrow z) \rightarrow z
\forall z. \text{Expr} \rightarrow (\text{ExprF}~z \rightarrow z) \rightarrow z

foldr  :: forall z. (lit :: Int -> z) ->
                    (add :: z -> z -> z) ->
                    (mul :: z -> z -> z) -> Expr -> z
foldRS :: forall z. ((Int + (z, z) + (z, z)) -> z) -> Expr -> z
foldRS :: forall z. (ExprF z -> z) -> Expr -> z
foldRS :: forall z. Algebra ExprF z -> Expr -> z

foldRS algebra (Lit i)   = algebra (LitF i)
foldRS algebra (Add l r) = algebra (AddF (foldRS algebra l) (foldRS algebra r))
foldRS algebra (Mul l r) = algebra (MulF (foldRS algebra l) (foldRS algebra r))

We have arrived at Recursion Schemes.

## Let's talk Recursion Schemes

\begin{aligned} \text{Expr} &= \text{Lit}~\text{Int} | &\text{Add}~ &\text{Expr} &\text{Expr} &| &\text{Mul}~ &\text{Expr} &\text{Expr} \\ \text{ExprF} \bullet &= \text{Lit}~\text{Int} | &\text{Add} &~~~\bullet &\bullet~~~ &| &\text{Mul} &~~~\bullet &\bullet~~~ \end{aligned}

ExprF represents one layer of the Expr tree.

Here's another way to look at signature functors:

project :: Expr -> ExprF Expr
project (Lit i)   = LitF i
project (Add x y) = AddF x y
project (Mul x y) = MulF x y

embed :: ExprF Expr -> Expr
embed (LitF i)   = Lit i
embed (AddF x y) = Add x y
embed (MulF x y) = Mul x y

We can use these to rewrite

foldRS :: forall z. Algebra ExprF z -> Expr -> z
foldRS algebra (Lit i)   = algebra (LitF i)
foldRS algebra (Add l r) = algebra (AddF (foldRS algebra l) (foldRS algebra r))
foldRS algebra (Mul l r) = algebra (MulF (foldRS algebra l) (foldRS algebra r))

as

foldRS algebra tree = algebra $ fmap (foldRS algebra) $ project tree
foldRS algebra      = algebra . fmap (foldRS algebra) . project

We can further generalize this idea of extracting one layer of a recursive data structure at a time.

class Functor f => Recursive t f | t -> f where
  project :: t -> f t
  cata    :: (f z -> z) -> t -> z

instance Recursive Expr ExprF where
  ...

This is the basis of Recursion Schemes.

We can use the new foldRS to reinterpret expressions the same way we could with foldr.

expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))

evalAlg :: ExprF Int -> Int
evalAlg (LitF i)   = i
evalAlg (AddF l r) = l + r
evalAlg (MulF l r) = l * r

res = foldRS evalAlg expr -- 3 * (2 + 3) = 15

printAlg :: ExprF String -> String
printAlg (LitF i)   = show i
printAlg (AddF l r) = "(" <> l <> " + " <> r <> ")"
printAlg (MulF l r) = "(" <> l <> " * " <> r <> ")"

str = foldRS printAlg expr -- "(3 * (2 + 3))"
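Putting the pattern functor, `project`, and the one-line `foldRS` together gives a complete program. This is a minimal sketch that derives the `Functor` instance with the `DeriveFunctor` extension (an implementation choice made here, not something the talk specifies).

```haskell
{-# LANGUAGE DeriveFunctor #-}

data Expr = Lit Int | Add Expr Expr | Mul Expr Expr

-- One layer of Expr, with recursive positions replaced by z.
data ExprF z = LitF Int | AddF z z | MulF z z
  deriving Functor

type Algebra f c = f c -> c

-- Peel off exactly one layer.
project :: Expr -> ExprF Expr
project (Lit i)   = LitF i
project (Add x y) = AddF x y
project (Mul x y) = MulF x y

-- Fold: project one layer, fold the children, then apply the algebra.
foldRS :: Algebra ExprF z -> Expr -> z
foldRS algebra = algebra . fmap (foldRS algebra) . project

evalAlg :: Algebra ExprF Int
evalAlg (LitF i)   = i
evalAlg (AddF l r) = l + r
evalAlg (MulF l r) = l * r

printAlg :: Algebra ExprF String
printAlg (LitF i)   = show i
printAlg (AddF l r) = "(" <> l <> " + " <> r <> ")"
printAlg (MulF l r) = "(" <> l <> " * " <> r <> ")"

expr :: Expr
expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))
```

The recursion lives only in `foldRS`; the algebras themselves are non-recursive functions over a single layer.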

Alright, let's go back to our types again:

\forall z. (Int \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow \text{Expr} \rightarrow z
\forall z. ((Int + (z, z) + (z, z)) \rightarrow z) \rightarrow \text{Expr} \rightarrow z
\forall z. (Int \rightarrow z, (z, z) \rightarrow z, (z, z) \rightarrow z) \rightarrow \text{Expr} \rightarrow z

What if instead of extracting a pattern functor, we extract the algebra?

data ExprA z = ExprA {
  lit :: Int -> z,
  add :: z -> z -> z,
  mul :: z -> z -> z
}
\forall z. (Int \rightarrow z, (z, z) \rightarrow z, (z, z) \rightarrow z) \rightarrow \text{Expr} \rightarrow z
\forall z. \text{ExprA}~z \rightarrow \text{Expr} \rightarrow z
foldAlg :: ExprA z -> Expr -> z
foldAlg algebra (Lit i)   = lit algebra i
foldAlg algebra (Add l r) = add algebra (foldAlg algebra l) (foldAlg algebra r)
foldAlg algebra (Mul l r) = mul algebra (foldAlg algebra l) (foldAlg algebra r)

We can use the new foldAlg to reinterpret expressions the same way we could with foldr and foldRS.

expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))

evalAlg :: ExprA Int
evalAlg = ExprA {
  lit = \i -> i,
  add = \l r -> l + r,
  mul = \l r -> l * r
}
res = foldAlg evalAlg expr -- 3 * (2 + 3) = 15

printAlg :: ExprA String
printAlg = ExprA {
  lit = \i -> show i,
  add = \l r -> "(" <> l <> " + " <> r <> ")",
  mul = \l r -> "(" <> l <> " * " <> r <> ")"
}
str = foldAlg printAlg expr -- "(3 * (2 + 3))"

This is the basis of Final Tagless, except usually ExprA is a typeclass.
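The record-of-functions version compiles as plain Haskell 2010. The sketch below uses the talk's names; only the record field separators and the explicit lambdas are filled in.

```haskell
data Expr = Lit Int | Add Expr Expr | Mul Expr Expr

-- The algebra as a record: one field per constructor of Expr.
data ExprA z = ExprA
  { lit :: Int -> z
  , add :: z -> z -> z
  , mul :: z -> z -> z
  }

-- Field selectors take the record first: `lit algebra` is the Lit handler.
foldAlg :: ExprA z -> Expr -> z
foldAlg algebra (Lit i)   = lit algebra i
foldAlg algebra (Add l r) = add algebra (foldAlg algebra l) (foldAlg algebra r)
foldAlg algebra (Mul l r) = mul algebra (foldAlg algebra l) (foldAlg algebra r)

evalAlg :: ExprA Int
evalAlg = ExprA { lit = id, add = (+), mul = (*) }

printAlg :: ExprA String
printAlg = ExprA
  { lit = show
  , add = \l r -> "(" <> l <> " + " <> r <> ")"
  , mul = \l r -> "(" <> l <> " * " <> r <> ")"
  }

expr :: Expr
expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))
```

Compared with `ExprF z -> z`, the record is the same information split by constructor: a function out of a sum is a product of functions, which is the c^(a + b) ≈ c^a * c^b law from the type algebra section.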

Rewriting foldAlg using a typeclass:

class ExprC z where
  lit :: Int -> z
  add :: z -> z -> z
  mul :: z -> z -> z

foldFT :: ExprC z => Expr -> z
foldFT (Lit i)   = lit i
foldFT (Add l r) = add (foldFT l) (foldFT r)
foldFT (Mul l r) = mul (foldFT l) (foldFT r)

We can use the new foldFT to reinterpret expressions the same way we could with foldr, foldRS, and foldAlg.

expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))

instance ExprC Int where
  lit i = i
  add l r = l + r
  mul l r = l * r

res = foldFT expr :: Int -- 3 * (2 + 3) = 15

instance ExprC String where
  lit i = show i
  add l r = "(" <> l <> " + " <> r <> ")"
  mul l r = "(" <> l <> " * " <> r <> ")"

str = foldFT expr :: String -- "(3 * (2 + 3))"
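A compilable version of the typeclass encoding is sketched below. It assumes the `FlexibleInstances` extension, which GHC needs for an instance on `String`. The value `term` is an addition made here to show the usual Final Tagless style, where the expression is written directly against the class and no `Expr` tree is built at all.

```haskell
{-# LANGUAGE FlexibleInstances #-}

data Expr = Lit Int | Add Expr Expr | Mul Expr Expr

class ExprC z where
  lit :: Int -> z
  add :: z -> z -> z
  mul :: z -> z -> z

instance ExprC Int where
  lit = id
  add = (+)
  mul = (*)

-- String is a synonym for [Char], hence FlexibleInstances.
instance ExprC String where
  lit     = show
  add l r = "(" <> l <> " + " <> r <> ")"
  mul l r = "(" <> l <> " * " <> r <> ")"

-- Interpret an Expr tree in whichever instance the caller asks for.
foldFT :: ExprC z => Expr -> z
foldFT (Lit i)   = lit i
foldFT (Add l r) = add (foldFT l) (foldFT r)
foldFT (Mul l r) = mul (foldFT l) (foldFT r)

expr :: Expr
expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))

-- Illustrative: the same expression with no intermediate data type.
term :: ExprC z => z
term = mul (lit 3) (add (lit 2) (lit 3))
```

The interpretation is selected by the type annotation at the use site: `foldFT expr :: Int` evaluates, `foldFT expr :: String` prints, and `term` behaves the same way without ever going through `Expr`.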

Let's see all of the derived functions

foldr :: forall z. (lit :: Int -> z) ->
                   (add :: z -> z -> z) ->
                   (mul :: z -> z -> z) -> Expr -> z

foldRS  :: forall z. (ExprF z -> z) -> Expr -> z
foldAlg :: forall z. (ExprA z)      -> Expr -> z
foldFT  :: forall z. (ExprC z)      => Expr -> z

Simply by transforming a function type we have arrived at Recursion Schemes and Final Tagless.

type Algebra f c = f c -> c

data ExprF z  = LitF Int | AddF z z | MulF z z
type ExprA1 z = Algebra ExprF z

data ExprA z = ExprA {
lit :: Int -> z
add :: z -> z -> z
mul :: z -> z -> z
}

class ExprC z where
lit :: Int -> z
add :: z -> z -> z
mul :: z -> z -> z

These are just different ways to express the same idea: abstracting over recursion.

## One more thing

I mentioned before that every function of a recursive type can be expressed in terms of foldr.

Turns out that every type is isomorphic to its Boehm-Berarducci (also known as Church-) encoding.

\forall z. \text{Expr} \rightarrow (Int \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow z
\forall z. \text{Expr} \rightarrow ((Int + (z, z) + (z, z)) \rightarrow z) \rightarrow z
\forall z. \text{Expr} \rightarrow (Int \rightarrow z, (z, z) \rightarrow z, (z, z) \rightarrow z) \rightarrow z

This is almost an isomorphism; we just need to move the quantifier.

\text{Expr} \leftrightarrow \forall z. (Int \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow z
\text{Expr} \leftrightarrow \forall z. ((Int + (z, z) + (z, z)) \rightarrow z) \rightarrow z
\text{Expr} \leftrightarrow \forall z. (Int \rightarrow z, (z, z) \rightarrow z, (z, z) \rightarrow z) \rightarrow z

This is an isomorphism: the Boehm-Berarducci encoding.

We can explicitly construct the isomorphism:

data ExprA z = ExprA {
  lit :: Int -> z,
  add :: z -> z -> z,
  mul :: z -> z -> z
}

data MuExpr = MuExpr (forall z. ExprA z -> z)

to :: Expr -> MuExpr
to expr = MuExpr $ \algebra -> foldAlg algebra expr

from :: MuExpr -> Expr
from (MuExpr run) = run $ ExprA {
  lit = Lit,
  add = Add,
  mul = Mul
}
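The isomorphism between `Expr` and its Boehm-Berarducci encoding can be exercised in GHC with the `RankNTypes` extension. This is a sketch: `encoded` is an example value introduced here, and the lambdas are parenthesized instead of using `$` so that the code does not depend on how a particular GHC version treats `$` with polymorphic arguments.

```haskell
{-# LANGUAGE RankNTypes #-}

data Expr = Lit Int | Add Expr Expr | Mul Expr Expr
  deriving (Eq, Show)

data ExprA z = ExprA
  { lit :: Int -> z
  , add :: z -> z -> z
  , mul :: z -> z -> z
  }

foldAlg :: ExprA z -> Expr -> z
foldAlg algebra (Lit i)   = lit algebra i
foldAlg algebra (Add l r) = add algebra (foldAlg algebra l) (foldAlg algebra r)
foldAlg algebra (Mul l r) = mul algebra (foldAlg algebra l) (foldAlg algebra r)

-- An expression is "whatever can be folded with any algebra".
newtype MuExpr = MuExpr (forall z. ExprA z -> z)

to :: Expr -> MuExpr
to expr = MuExpr (\algebra -> foldAlg algebra expr)

-- Run the encoded value with the algebra made of Expr's own constructors.
from :: MuExpr -> Expr
from (MuExpr run) = run ExprA { lit = Lit, add = Add, mul = Mul }

expr :: Expr
expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))

-- Illustrative: the same expression written directly in the encoding.
encoded :: MuExpr
encoded = MuExpr (\a -> mul a (lit a 3) (add a (lit a 2) (lit a 3)))
```

Here `from (to expr)` gives back `expr`, and `from encoded` produces the same tree even though no `Expr` constructor appears in the definition of `encoded`.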

In mathematics, an initial algebra is an initial object in the category of F-algebras for a given endofunctor F.

We won't go into detail, but in our case, Expr is the initial algebra carrier of ExprA.

The important takeaway is: a recursive type is isomorphic to ∀ z. Alg z → z,

where Alg z = PatF z → z.

Another, more general way to think about this: every type T is isomorphic to the set of all observations we can make about T.

If you know the Yoneda lemma, you should get some pretty interesting ideas right about now!

Remember the other isomorphism that we found, Expr ≈ ExprF Expr:

type FixExpr = ExprF Expr

project :: Expr -> ExprF Expr
project (Lit i)   = LitF i
project (Add x y) = AddF x y
project (Mul x y) = MulF x y

embed :: ExprF Expr -> Expr
embed (LitF i)   = Lit i
embed (AddF x y) = Add x y
embed (MulF x y) = Mul x y

## Let's talk Fix, Mu, Nu

type FixExpr = ExprF Expr
project :: Expr -> ExprF Expr
embed   :: ExprF Expr -> Expr

If we want to express FixExpr entirely in terms of ExprF, we need to expand Expr recursively:

type FixExpr = ExprF (ExprF (ExprF ...))
project :: Expr -> FixExpr
embed   :: FixExpr -> Expr

Introduce a new data type:

data Fix f = Fix (f (Fix f))

project :: Expr -> Fix ExprF
project (Lit i)   = Fix (LitF i)
project (Add x y) = Fix (AddF (project x) (project y))
project (Mul x y) = Fix (MulF (project x) (project y))

embed :: Fix ExprF -> Expr
embed (Fix (LitF i))   = Lit i
embed (Fix (AddF x y)) = Add (embed x) (embed y)
embed (Fix (MulF x y)) = Mul (embed x) (embed y)

Instead of having a tree with layers consisting of Expr constructors, we have a tree of ExprF constructors.

So the isomorphism Expr ≈ ExprF Expr
led us to Fix.
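A runnable sketch of `Fix` follows. The conversions are named `toFix` and `fromFix` here to keep them apart from the one-layer `project` and `embed`, and `cataFix` is an illustrative generic fold that works for any `Functor`, not only `ExprF`.

```haskell
{-# LANGUAGE DeriveFunctor #-}

data Expr = Lit Int | Add Expr Expr | Mul Expr Expr
  deriving (Eq, Show)

data ExprF z = LitF Int | AddF z z | MulF z z
  deriving Functor

-- Ties the recursive knot for any pattern functor f.
newtype Fix f = Fix (f (Fix f))

toFix :: Expr -> Fix ExprF
toFix (Lit i)   = Fix (LitF i)
toFix (Add x y) = Fix (AddF (toFix x) (toFix y))
toFix (Mul x y) = Fix (MulF (toFix x) (toFix y))

fromFix :: Fix ExprF -> Expr
fromFix (Fix (LitF i))   = Lit i
fromFix (Fix (AddF x y)) = Add (fromFix x) (fromFix y)
fromFix (Fix (MulF x y)) = Mul (fromFix x) (fromFix y)

-- Generic fold: unwrap one layer, fold the children, apply the algebra.
cataFix :: Functor f => (f z -> z) -> Fix f -> z
cataFix alg (Fix layer) = alg (fmap (cataFix alg) layer)

evalAlg :: ExprF Int -> Int
evalAlg (LitF i)   = i
evalAlg (AddF l r) = l + r
evalAlg (MulF l r) = l * r

expr :: Expr
expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))
```

With `Fix` the hand-written `Expr` type becomes optional: `Fix ExprF` carries the same trees, and `cataFix` is written once for every pattern functor.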

Let's take a look at another isomorphism we found, Expr ≈ ∀ z. (ExprF z → z) → z

data ExprF z = LitF Int | AddF z z | MulF z z

data MuExpr = MuExpr (forall z. (ExprF z -> z) -> z)

to :: Expr -> MuExpr
to expr = MuExpr $ \algebra -> foldRS algebra expr

from :: MuExpr -> Expr
from (MuExpr run) = run $ \case
  LitF i   -> Lit i
  AddF x y -> Add x y
  MulF x y -> Mul x y

Hmm, MuExpr looks like a pattern that can be generalized.

data Mu f = Mu (forall z. (f z -> z) -> z)

-- data MuExpr = MuExpr (forall z. (ExprF z -> z) -> z)
type MuExpr = Mu ExprF

to :: Expr -> MuExpr
to expr = Mu $ \algebra -> foldRS algebra expr

from :: MuExpr -> Expr
from (Mu run) = run $ \case
  LitF i   -> Lit i
  AddF x y -> Add x y
  MulF x y -> Mul x y

Mu = least fixed point
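The generic `Mu` can be used on its own, without going through `Expr`. The sketch below assumes `RankNTypes` and `DeriveFunctor`; `cataMu`, `embedMu`, and the smart constructors `litM`, `addM`, `mulM` are names introduced here for illustration.

```haskell
{-# LANGUAGE RankNTypes, DeriveFunctor #-}

data ExprF z = LitF Int | AddF z z | MulF z z
  deriving Functor

-- A value of Mu f is its own fold.
newtype Mu f = Mu (forall z. (f z -> z) -> z)

-- Folding is just function application.
cataMu :: (f z -> z) -> Mu f -> z
cataMu alg (Mu run) = run alg

-- Wrap one more layer around already-encoded children.
embedMu :: Functor f => f (Mu f) -> Mu f
embedMu layer = Mu (\alg -> alg (fmap (cataMu alg) layer))

litM :: Int -> Mu ExprF
litM i = embedMu (LitF i)

addM, mulM :: Mu ExprF -> Mu ExprF -> Mu ExprF
addM l r = embedMu (AddF l r)
mulM l r = embedMu (MulF l r)

evalAlg :: ExprF Int -> Int
evalAlg (LitF i)   = i
evalAlg (AddF l r) = l + r
evalAlg (MulF l r) = l * r

exprM :: Mu ExprF
exprM = mulM (litM 3) (addM (litM 2) (litM 3))
```

`cataMu evalAlg exprM` evaluates to 15. Every `Mu f` value can only be consumed by folding, which is one way to see why it describes finite trees.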

Why least fixed point?

A fixed point of a functor f is a type z such that
f z ≈ z

Expr ≈ ∀ z. (ExprF z → z) → z

Mu f ≈ f (Mu f)

f (Mu f) ≈ Mu f

Mu f ≈ f ???

In a total language, we can easily construct
f : ExprF Expr → Expr for Expr with strict data constructors.

Passing it to Mu, we will get back a tree of type Expr, which must be finite.

However, the equation f (F f) ≈ F f also admits solutions that represent potentially infinite trees.

## Let's talk infinite data and graphs

In a language with closures and recursive expressions, or a language with lazy data, we can define infinite trees.

Consider a data type for context-free grammars:

data Cfg    = Eps  | Sym Char  | Alt Cfg Cfg | Seq Cfg Cfg
data CfgF z = EpsF | SymF Char | AltF z z    | SeqF z z

We can use this type to define a simple context-free grammar:

binaryDigit  = Alt (Sym '0') (Sym '1')
binaryNumber = Alt Eps (Seq binaryDigit binaryNumber)

How do we check if a grammar accepts empty strings?

nullable :: Cfg -> Bool
nullable Eps       = True
nullable (Sym c)   = False
nullable (Alt l r) = nullable l || nullable r
nullable (Seq l r) = nullable l && nullable r

nullableAlg :: CfgF Bool -> Bool
nullableAlg EpsF       = True
nullableAlg (SymF c)   = False
nullableAlg (AltF l r) = l || r
nullableAlg (SeqF l r) = l && r

nullable binaryDigit =
  nullable (Alt (Sym '0') (Sym '1')) =
  (||) False False =
  False

nullable binaryNumber =
  nullable $ Alt Eps (Seq binaryDigit binaryNumber) =
  (||) True ((&&) False (nullable binaryNumber))

Suppose we could number our nodes.

Checking for nullability is still non-trivial, but at least it is decidable for valid grammars!

data Graph f = Graph Int [f Int]

nullable :: Graph CfgF -> Bool
nullable graph = ...

So the question is, how do we go between the two representations:

to   :: Expr -> Graph ExprF
from :: Graph ExprF -> Expr

Since observing sharing implies violating referential transparency:

to   :: Expr -> IO (Graph ExprF)
from :: Graph ExprF -> Expr

1. Traverse nodes in depth first order
2. Collect unique references and number them
3. Project each node into its pattern functor
4. Replace nodes with their indices

toGraph :: (Recursive t f, Traversable f) => t -> IO (Graph f)

Going in the opposite direction is much easier:

class Corecursive t f | t -> f where
  embed :: f t -> t
  ana   :: forall z. (z -> f z) -> z -> t

fromGraph :: (Corecursive t f, Functor f) => Graph f -> t
fromGraph (Graph m) = go 0 where
  go i = case M.lookup i m of
    Just fa -> embed $ fmap go fa
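One possible way to decide nullability on the numbered representation is a least-fixed-point iteration; this is an assumption made for illustration, not necessarily the approach the talk has in mind. The sketch reads `Graph Int [f Int]` as "root index plus a list of nodes whose children are list indices" (also an assumption), and `nullableG`, `binaryNumberG`, and `binaryDigitG` are names introduced here.

```haskell
{-# LANGUAGE DeriveFunctor #-}

data CfgF z = EpsF | SymF Char | AltF z z | SeqF z z
  deriving Functor

-- Assumed reading: root index, then nodes addressed by list position.
data Graph f = Graph Int [f Int]

nullableAlg :: CfgF Bool -> Bool
nullableAlg EpsF       = True
nullableAlg (SymF _)   = False
nullableAlg (AltF l r) = l || r
nullableAlg (SeqF l r) = l && r

-- Start with "nothing is nullable" and re-evaluate every node against the
-- previous round until no answer changes. The algebra is monotone and the
-- graph is finite, so the iteration terminates.
nullableG :: Graph CfgF -> Bool
nullableG (Graph root nodes) = fixpoint (map (const False) nodes) !! root
  where
    fixpoint env
      | env' == env = env
      | otherwise   = fixpoint env'
      where
        env' = map (nullableAlg . fmap (env !!)) nodes

binaryNumberG :: Graph CfgF
binaryNumberG = Graph 0
  [ AltF 1 2   -- 0: binaryNumber = Eps | Seq binaryDigit binaryNumber
  , EpsF       -- 1
  , SeqF 3 0   -- 2: the second child points back to node 0
  , AltF 4 5   -- 3: binaryDigit
  , SymF '0'   -- 4
  , SymF '1'   -- 5
  ]

binaryDigitG :: Graph CfgF
binaryDigitG = Graph 0 [AltF 1 2, SymF '0', SymF '1']
```

On these graphs `nullableG binaryNumberG` is True and `nullableG binaryDigitG` is False, and the cycle through node 0 causes no trouble because each round only reads the previous round's answers.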

What is this Corecursive typeclass?

class Functor f => Corecursive t f | t -> f where
  embed :: f t -> t
  ana   :: forall z. (z -> f z) -> z -> t

instance Corecursive Cfg CfgF where
  embed EpsF       = Eps
  embed (SymF c)   = Sym c
  embed (AltF x y) = Alt x y
  embed (SeqF x y) = Seq x y

  ana unfold seed = embed $ fmap (ana unfold) $ unfold seed
  -- or, point-free:
  -- ana unfold = embed . fmap (ana unfold) . unfold
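To see `ana` at work, here is a small sketch specialized to `Cfg`, without the typeclass. The coalgebra `literal`, which unfolds a string into the grammar that matches exactly that string, is a hypothetical example introduced here.

```haskell
{-# LANGUAGE DeriveFunctor #-}

data Cfg = Eps | Sym Char | Alt Cfg Cfg | Seq Cfg Cfg
  deriving (Eq, Show)

data CfgF z = EpsF | SymF Char | AltF z z | SeqF z z
  deriving Functor

embed :: CfgF Cfg -> Cfg
embed EpsF       = Eps
embed (SymF c)   = Sym c
embed (AltF x y) = Alt x y
embed (SeqF x y) = Seq x y

-- Unfold: produce one layer from the seed, then unfold the new seeds in it.
ana :: (z -> CfgF z) -> z -> Cfg
ana unfold = embed . fmap (ana unfold) . unfold

-- Illustrative coalgebra: the seed is the part of the string still to match.
literal :: String -> CfgF String
literal []       = EpsF
literal [c]      = SymF c
literal (c : cs) = SeqF [c] cs
```

For instance, `ana literal "ab"` builds `Seq (Sym 'a') (Sym 'b')`. Where a fold consumes a tree one layer at a time, an unfold produces it one layer at a time.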

ana  :: forall z. (z -> f z) -> z -> t

You might have noticed a duality between cata and ana. Let's transform the type of ana:

\forall z. (z \rightarrow f~z) \rightarrow z \rightarrow t
\forall z. (z, z \rightarrow f~z) \rightarrow t
(\exists z. (z, z \rightarrow f~z)) \rightarrow t

It turns out that this is an isomorphism if our type is lazy.

(\exists z. (z, z \rightarrow f~z)) \leftrightarrow t
data Nu f = forall z. Nu z (z -> f z)

We can show that it is an isomorphism by constructing the to and from functions:

to :: Cfg -> Nu CfgF
to expr = Nu expr (\t -> project t)

from :: Nu CfgF -> Cfg
from (Nu seed unfold) = go seed where
  go s = embed $ fmap go $ unfold s
  -- or, point-free:
  -- go = embed . fmap go . unfold
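`Nu` can represent the infinite `binaryNumber` grammar with a finite seed and a step function. In the sketch below, the `Node` type, `step`, `projectNu`, and `countNodes` are all names introduced for illustration; the extensions `ExistentialQuantification`, `DeriveFunctor`, and `DeriveFoldable` are assumed.

```haskell
{-# LANGUAGE ExistentialQuantification, DeriveFunctor, DeriveFoldable #-}

data CfgF z = EpsF | SymF Char | AltF z z | SeqF z z
  deriving (Functor, Foldable)

-- A hidden seed type together with a way to produce one layer from a seed.
data Nu f = forall z. Nu z (z -> f z)

-- Observe one layer; the children are again Nu values with new seeds.
projectNu :: Functor f => Nu f -> f (Nu f)
projectNu (Nu seed unfold) = fmap (\s -> Nu s unfold) (unfold seed)

-- Illustrative seed type: one name per distinct node of the grammar.
data Node = Number | Empty | Rest | Digit | Zero | One

step :: Node -> CfgF Node
step Number = AltF Empty Rest
step Empty  = EpsF
step Rest   = SeqF Digit Number   -- refers back to Number: an infinite tree
step Digit  = AltF Zero One
step Zero   = SymF '0'
step One    = SymF '1'

binaryNumber :: Nu CfgF
binaryNumber = Nu Number step

-- Count the nodes in the first n levels of the (possibly infinite) tree.
countNodes :: Int -> Nu CfgF -> Int
countNodes n nu
  | n <= 0    = 0
  | otherwise = 1 + sum (fmap (countNodes (n - 1)) (projectNu nu))
```

Since the tree is infinite, it can only be observed to a bounded depth; `countNodes 3 binaryNumber` counts the root, its two children, and the two children of the `Seq` node, giving 5.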

## Let's talk indexing

Consider this simplified version of Json:

data Tree = Lit Int | Arr [Tree]
data TreeF t = LitF Int | ArrF [t]

Here's one possible value of this type:

tree :: Tree
tree = Arr [
  Arr [
    Lit 3,
    Lit 4
  ],
  Lit 5
]

Suppose I want to identify a particular element within that tree.

We can represent indices into our data using:

data Index = Index [Int]

-- partial!
index :: Index -> Tree -> Int
index (Index []) (Lit i) = i
index (Index (x : xs)) (Arr list) = index (Index xs) (list !! x)

A bit more generally, we can return a node instead of assuming that it must be a literal:

-- partial!
index :: Index -> Tree -> Tree
index (Index [])        tree      = tree
index (Index (x : xs)) (Arr list) = index (Index xs) (list !! x)

How can we generalize this pattern? Now remember the Tree ≈ TreeF Tree isomorphism.

data IndexF = ...?

data TreeIndex = TreeIndex [IndexF]

-- partial!
index :: TreeIndex -> Tree -> Tree
index (TreeIndex [])       tree = tree
index (TreeIndex (x : xs)) tree =
  index (TreeIndex xs) subtree
  where subtree = indexF x $ project tree

Let's first simplify our structure a bit:

data Tree = Lit Int | Arr Tree Tree
data TreeF t = LitF Int | ArrF t t

How do we find IndexF?

data TreeF t = LitF Int | ArrF t t
data IndexF  = ...?

IndexF represents a location of a t inside of TreeF t:

data TreeF t = LitF Int | ArrF t t
data IndexF  = Arr1 | Arr2

-- partial!
indexF :: IndexF -> TreeF z -> z
indexF Arr1 (ArrF x _) = x
indexF Arr2 (ArrF _ x) = x

How do we deal with errors?

data TreeF t     = LitF Int | ArrF t t
data IndexF      = Arr1 | Arr2
data IndexErrorF = FoundLit

indexF :: IndexF -> TreeF z -> Either IndexErrorF z
indexF _    (LitF i)   = Left FoundLit
indexF Arr1 (ArrF x _) = Right x
indexF Arr2 (ArrF _ x) = Right x

index :: [IndexF] -> Tree -> Either ([IndexF], IndexErrorF) Tree
index = go [] where
  go path []       t = Right t
  go path (x : xs) t =
    case indexF x $ project t of
      -- report the path walked so far, including the failing step
      Left e  -> Left (path ++ [x], e)
      Right z -> go (path ++ [x]) xs z
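The total version of indexing can be assembled into a runnable sketch. It uses the simplified binary `Tree`, reports the path up to and including the failing step (one reasonable reading of the error type, chosen here), and accumulates the path in reverse for efficiency.

```haskell
data Tree = Lit Int | Arr Tree Tree
  deriving (Eq, Show)

data TreeF t = LitF Int | ArrF t t

data IndexF      = Arr1 | Arr2 deriving (Eq, Show)
data IndexErrorF = FoundLit    deriving (Eq, Show)

project :: Tree -> TreeF Tree
project (Lit i)   = LitF i
project (Arr x y) = ArrF x y

-- Select a child inside one layer, or explain why there is none.
indexF :: IndexF -> TreeF z -> Either IndexErrorF z
indexF _    (LitF _)   = Left FoundLit
indexF Arr1 (ArrF x _) = Right x
indexF Arr2 (ArrF _ x) = Right x

-- Walk the whole path, one layer per step.
index :: [IndexF] -> Tree -> Either ([IndexF], IndexErrorF) Tree
index = go []
  where
    go _    []       t = Right t
    go path (x : xs) t =
      case indexF x (project t) of
        Left e    -> Left (reverse (x : path), e)  -- path incl. failing step
        Right sub -> go (x : path) xs sub

tree :: Tree
tree = Arr (Arr (Lit 3) (Lit 4)) (Lit 5)
```

With this tree, `index [Arr1, Arr2] tree` returns `Right (Lit 4)`, while `index [Arr2, Arr1] tree` fails with `Left ([Arr2, Arr1], FoundLit)` because the second step tries to descend into the literal 5.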



Can we mechanically derive IndexF from TreeF?

data TreeF t     = LitF Int | ArrF t t
data IndexF      = Arr1 | Arr2
data IndexErrorF = FoundLit

Yes!

i[ (f x) * (g x) ] = i[f x] + i[g x]

i[ (f x) + (g x) ] = i[f x] + i[g x]

i[ x ] = 1

i[ y ] = 0 where x != y

i[ TreeF x ] = i[ Int + x * x ] = i[Int] + i[x * x] = 0 + 1 + 1 = 2

Can we mechanically derive IndexErrorF from TreeF?

data ExprF t     = Lit Int | Add t t | Mul t t
data IndexErrorF = LitNotAdd | LitNotMul | AddNotMul | MulNotAdd

Yes. The question is how to define it in a nice algebraic form.

## Let's talk derivatives

data List a = [] | (:) a (List a)

list = [1, 2, 3, 4, 5, 6]

data Zipper a = NonEmpty [a] a [a]

start (x : xs) = NonEmpty [] x xs

left  (NonEmpty (x : xs) c ys) = NonEmpty  xs      x (c : ys)

right (NonEmpty xs c (y : ys)) = NonEmpty (c : xs) y ys

current (NonEmpty xs c ys)     = c

replace (NonEmpty xs _ ys) n   = NonEmpty xs n ys
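The list zipper operations above are partial (they fail at the ends of the list). A total variant can be sketched as follows; returning `Maybe` and the helper `toList` are choices made here, not part of the talk.

```haskell
-- Elements to the left (nearest first), the focus, elements to the right.
data Zipper a = NonEmpty [a] a [a]
  deriving (Eq, Show)

start :: [a] -> Maybe (Zipper a)
start []       = Nothing
start (x : xs) = Just (NonEmpty [] x xs)

left :: Zipper a -> Maybe (Zipper a)
left (NonEmpty (x : xs) c ys) = Just (NonEmpty xs x (c : ys))
left _                        = Nothing

right :: Zipper a -> Maybe (Zipper a)
right (NonEmpty xs c (y : ys)) = Just (NonEmpty (c : xs) y ys)
right _                        = Nothing

current :: Zipper a -> a
current (NonEmpty _ c _) = c

replace :: Zipper a -> a -> Zipper a
replace (NonEmpty xs _ ys) n = NonEmpty xs n ys

-- Illustrative helper: rebuild the plain list from the zipper.
toList :: Zipper a -> [a]
toList (NonEmpty xs c ys) = reverse xs ++ c : ys
```

Moving right twice from the start of `[1, 2, 3]` focuses on 3, and moving right once and then left returns to the starting position. The left part is stored nearest-first, which is what makes both moves constant time.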


How do we generalize this structure to an arbitrary recursive data type?

Conceptually, zippers (a.k.a. derivatives) select a single value from a data structure within a certain context:

(x, df x) → f x

If we have a tuple, (x, x), there are two ways to select an element of type x:

f x = (x, x)

df x = SelectFirst (second :: x) | SelectSecond (first :: x)

df x = Either x x

df x = x + x

If we have a tuple, (f1 x, f2 x), then:

f x = (f1 x, f2 x)

df x = (df1 x, f2 x) + (f1 x, df2 x)

If we have an either, (f1 x) + (f2 x), then:

f x = (f1 x) + (f2 x)

df x = (df1 x) + (df2 x)

Similar to our derivation of indices, we define derivatives purely algebraically:

Δ[ (f x) * (g x) ] = Δ[f x] * (g x) + (f x) * Δ[g x]

Δ[ (f x) + (g x) ] = Δ[f x] + Δ[g x]

Δ[ x ] = 1

Δ[ y ] = 0 where y != x

data List x = [] | (:) x (List x)

Δ[List x] = Δ[1 + x * (List x)] =
          = Δ[1] + Δ[x] * (List x) + x * Δ[List x] =
          = 0 + 1 * (List x) + x * Δ[List x]

Δ[List x] (1 - x) = List x

Δ[List x] = List x / (1 - x)

List x = 1 + x * (List x)

(List x) * (1 - x) = 1

List x = 1/(1-x)

Δ[List x] = List x * List x

data Zipper a = NonEmpty [a] a [a]

(x, Δ[List x]) = (x, List x * List x)

## But why do we care?

Type derivatives are amazing, but what I am interested in is how they interact with pattern functors

class (Functor f, Functor d) => Recursive t f d | t -> f, t -> d where
project :: t -> f t
projectD :: t -> f (d t, t)
cata    :: (f z -> z) -> t -> z

cataH :: ([d t] -> f (z, t) -> z) -> [d t] -> t -> z
cataH alg = go where
go hist = alg hist . fmap (\(d, e) -> (go (d : hist) e, e)) . projectD

## Summing it all up

folds ≈ Recursion Schemes ≈ Final Tagless (simple algebras)

Type algebra

Isomorphism to the initial algebra carrier

Isomorphism to the product of all observable properties

Fix, Mu, Nu

Graph representation

Indexing

Derivatives

## Q&A

Alexander Konovalov, Compellon Inc

alex.knvl@gmail.com @alexknvl

alexknvl.com
