On Recursive Data Types

Alexander Konovalov, Compellon Inc

alex.knvl@gmail.com @alexknvl

The goal

A day-to-day toolkit for working with recursive data types:

  • Show the connection between final tagless, "initial encoding", and recursion schemes.
  • Fix, Mu, Nu
  • Laziness and Graphs
    (Traversable + Recursive)
  • Derivatives and Indices

Recursive Data

  • Json
  • Xml
  • Expression trees!
  • Real-life abstract syntax trees

 

But really, it helps everywhere,
because the core message is...

Thinking about shapes

  • Different representations
  • The idea behind your data matters,
    exact runtime repr. matters less.
  • Programming language independent
  • A great tool for reasoning
data ListOfPairs a b = ListOfPairs [(a, b)]
data PairOfListsOfSameSize = PairOfListsOfSameSize [a] [b]

makeList :: [a] -> [b] -> Maybe PairOfListsOfSameSize

makeList :: (xs :: [a]) -> (ys :: [b]) -> (length xs = length ys) ->
            PairOfListsOfSameSize a b

Isomorphisms

​A and B are isomorphic when there exist two morphisms,

to: A → B and from: B → A, such that both

 

from (to a) = a

and

to (from b) = b

Type Isomorphisms

Can't be faithfully implemented except in Dependently typed languages (Idris, Agda, Coq).

trait Iso[A, B] {
  def to(a: A): B
  def from(b: B): A
  def fromTo(x: A): (from(to(x))).type =:= x.type
  def toFrom(x: B): (to(from(x))).type =:= x.type
}

Iso を忠実に型で表現するには依存型言語が要る

data Iso a b = {
  to   :: a -> b
  from :: b -> a
  fromTo :: (x :: A) -> from (to x) = x
  toFrom :: (x :: b) -> to (from x) = x
}
trait Iso[A, B] {
  def to(a: A): B
  def from(b: B): A
 
  // make sure that (to andThen from) == id
  // and (from andThen to) == id
}

So let's just manually check the necessary laws.

仕方が無いので手動で公理をチェックする

data Iso a b = {
  to   :: a -> b
  from :: b -> a
  -- check from (to x) = x
  -- check to (from x) = x
}
type A = Either[Boolean, Unit]

sealed trait B
case object One extends B
case object Two extends B
case object Three extends B

Are these isomorphic?

同型写像なのか?

type A = Either Boolean ()

data B = One | Two | Three
def to(a: A): B = a match {
  case Left(true)  => One
  case Left(false) => Two
  case Right(())   => Three
}

def from(b: B): A = b match {
  case One   => Left(true)
  case Two   => Left(false)
  case Three => Right(())
}

Yes, here is an isomorphism:

同型写像だ

data A x y = A x y x

data B x y = B x y y

Are these isomorphic?

They could be isomorphic, if we specialize x and y to ().

But in a polymorphic context, where we do not know either, they are NOT isomorphic.

data A x y = A x x y

data B x y = B y x x

foo :: forall x y. Iso (A x y) (B x y)
boo :: Iso (A Int Bool) (B Int Bool)
boo = foo

Are these isomorphic?

A polymorphic isomorphism implies isomorphism of specializations.

The opposite is generally false.

Type Algebra

A few more important isomorphisms:

 

(a, b) ≈ (b, a)
(a, (b, c)) ≈ ((a, b), c)
(a, Either[b, c]) ≈ Either[(a, b), (a, c)]
Either[a, b] ≈ Either[b, a]
Either[Nothing, a] ≈ a
(Unit, a) ≈ a

重要な型同型

(a, b) ≈ (b, a)
(a, (b, c)) ≈ ((a, b), c)
(a, Either[b, c]) ≈ Either[(a, b), (a, c)]
Either[a, b] ≈ Either[b, a]
Either[Nothing, a] ≈ a
(Unit, a) ≈ a

What if we replace Either with +, (,) with *, Unit with 1, and Nothing with 0?..

Either を和、タプルを積、Unit を 1、Nothing を 0 と置き換える

(a, b) ≈ (b, a)
(a, (b, c)) ≈ ((a, b), c)
(a, Either[b, c]) ≈ Either[(a, b), (a, c)]
Either[a, b] ≈ Either[b, a]
Either[Nothing, a] ≈ a
(Unit, a) ≈ a

a * b ≈ b * a
a * (b * c) ≈ (a * b) * c
a * (b + c) ≈ a * b + a * c
a + b ≈ b + a
0 + a ≈ a
1 * a ≈ a

Amusingly, we get something very similar to the regular algebra of natural numbers...

面白いことに、自然数の代数系と似通っている

(a + b) c ≈ (a c, b c)
a b c ≈ (a, b) c

def curry[A, B, C](f: A => B => C): ((A, B)) => C =
    { case (a, b) => f(a)(b) }

def uncurry[A, B, C](f: ((A, B)) => C): A => B => C =
    a => b => f((a, b))

def undiag[A, B, C](f: Either[A, B] => C): (A => C, B => C) =
    (a => f(Left(a)), b => f(Right(b)))

def diag[A, B, C](f: (A => C, B => C)): Either[A, B] => C =
    { case Left(a)  => f._1(a)
      case Right(b) => f._2(b) }

A couple very important results about functions:

関数の重要な結果

(a + b) c ≈ (a c, b c)
a b c ≈ (a, b) c

c^(a + b) ≈ c^a * c^b
(c^b)^a ≈ c ^ (a * b)

Turns out that similarly to how Either is + and (,) is *, functions correspond to exponentiation:

関数は指数に対応する

Type Algebra

Summing it all up:

Either[a, b] is +

Tuple2[a, b] or (a, b) is A * B

Function1[a, b] or (a → b) is b ^ a

Nothing is 0

Unit is 1

Bool is 2

型代数のまとめ

One more important theorem that we will need.

a ≈ b, then f a ≈ f b for any functor f

Including phantom, covariant, contravariant, or invariant functors.

必要であろうもう一つの重要な定理

全てのファンクターについて、a ≈ b のとき f a ≈ f b が成り立つ

phantom, covariant など変化球ファンクターなどでも成り立つ

a ≈ b, then f a ≈ f b for any functor f

to' = map to

from' = map from

map to . map from =

map (to . from) = map id = id

Let's talk recursion

Say we have a data type

data List a = Nil | Cons a (List a)
foldr :: forall a z. z -> (a -> z -> z) -> List a -> z
foldr seed _       Nil              = seed
foldr seed combine (Cons head tail) = head `combine` (foldr seed combine tail)

and we would like to fold over it

再帰について話そう

data List a = Nil | Cons a (List a)

foldr :: forall a z. z -> (a -> z -> z) -> List a -> z
foldr seed _       Nil              = seed
foldr seed combine (Cons head tail) = head `combine` (foldr seed combine tail)

Why do we want fold?

Turns out that every function on lists can be written in terms of foldr!

sum :: List Int -> Int
sum = foldr 0 (+)

map :: List a -> List b
map = foldr Nil (\h, t -> Cons (f h) t)

isEmpty :: List a -> Bool
isEmpty = foldr True (\_, _ -> False)

リストの全ての関数は foldr で表現できる!

Let's look at a slightly more realistic example:

data Expr = Lit Int | Add Expr Expr | Mul Expr Expr

foldr :: forall z. (lit :: Int -> z) -> 
                   (add :: Int -> Int -> z) -> 
                   (mul :: Int -> Int -> z) -> Expr -> z
foldr lit add mul (Lit i)   = lit i
foldr lit add mul (Add l r) = add (foldr lit add mul l) (foldr lit add mul r)
foldr lit add mul (Mul l r) = mul (foldr lit add mul r) (foldr lit add mul r)

We can reinterpret our expressions in different ways:

expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))

eval :: Expr -> Int
eval = foldr id (+) (*)

res = eval expr -- 3 * (2 + 3) = 15

print :: Expr -> String
print = foldr show (\l, r -> "(" <> print l <> " + " <> print r <> ")")
                   (\l, r -> "(" <> print l <> " * " <> print r <> ")")
str = print expr -- "(3 * (2 + 3))"

もう少し現実的な例を見てみよう

\forall z. \text{Expr} \rightarrow (Int \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow z
foldr :: forall z. (lit :: Int -> z) -> 
                   (add :: Int -> Int -> z) -> 
                   (mul :: Int -> Int -> z) -> Expr -> z

Let's look at the type of foldr:

\forall z. \text{Expr} \rightarrow ((Int + (z, z) + (z, z)) \rightarrow z) \rightarrow z

Let's combine the arguments together:

There are two important concepts here, an algebra, and a pattern/signature functor.

foldrの型について

Let's talk algebras

In mathematics, and more specifically in abstract algebra, an algebraic structure on a set A (called carrier set or underlying set) is a collection of finitary operations on A; the set A with this structure is also called an algebra.

代数について話そう

In mathematics, specifically in category theory, F-algebras generalize algebraic structure. Rewriting the algebraic laws in terms of morphisms eliminates all references to quantified elements from the axioms, and these algebraic laws may then be glued together in terms of a single functor F, the signature.

Let's talk algebras

代数について話そう

\mathfrak{F} c \rightarrow c ~~~ - ~~~ \text{an }\mathfrak{F}\text{-algebra}
\mathfrak{F} - \text{is called a signature, or pattern functor} \\ c - \text{is called a carrier}
type Algebra (f :: * -> *) (c :: *) = f c -> c
foldr :: forall z. (lit :: Int -> z) -> 
                   (add :: Int -> Int -> z) -> 
                   (mul :: Int -> Int -> z) -> Expr -> z
\forall z. \text{Expr} \rightarrow ((Int + (z, z) + (z, z)) \rightarrow z) \rightarrow z
data ExprF z = LitF Int | AddF z z | MulF z z
\forall z. \text{Expr} \rightarrow (\text{ExprF}~z \rightarrow z) \rightarrow z

Let's define the signature functor.

ファンクターのシグネチャを定義しよう

foldr :: forall z. (lit :: Int -> z) -> 
                   (add :: Int -> Int -> z) -> 
                   (mul :: Int -> Int -> z) -> Expr -> z
foldRS :: forall z. ((Int + (Int, Int) + (Int, Int)) -> z) -> Expr -> z
foldRS :: forall z. (ExprF z -> z) -> Expr -> z
foldRS :: forall z. Algebra ExprF z -> Expr -> z

foldRS algebra (Lit i)   = algebra (LitF i)
foldRS algebra (Add l r) = algebra (AddF (foldRS algebra l) (foldRS algebra r))
foldRS algebra (Mul l r) = algebra (MulF (foldRS algebra l) (foldRS algebra r))
data ExprF z = LitF Int | AddF z z | MulF z z
\forall z. \text{Expr} \rightarrow (\text{ExprF}~z \rightarrow z) \rightarrow z

We have arrived at Recursion Schemes.

再帰スキームに到達した

Let's talk Recursion Schemes

\begin{aligned} \text{Expr} &= \text{Lit}~\text{Int} | &\text{Add}~ &\text{Expr} &\text{Expr} &| &\text{Mul}~ &\text{Expr} &\text{Expr} \\ \text{ExprF} \bullet &= \text{Lit}~\text{Int} | &\text{Add} &~~~\bullet &\bullet~~~ &| &\text{Mul} &~~~\bullet &\bullet~~~ \end{aligned}

ExprF represents one layer of the Expr tree.

Here's another way to look at signature functors:

再帰スキームについて話そう

project :: Expr -> ExprF Expr 
project (Lit i) = (LitF i)
project (Add x y) = AddF x y
project (Mul x y) = MulF x y

embed :: ExprF Expr -> Expr
embed (LitF i)   = Lit i
embed (AddF x y) = Add x y
embed (MulF x y) = Mul x y
foldRS :: forall z. Algebra ExprF z -> Expr -> z
foldRS algebra (Lit i)   = algebra (LitF i)
foldRS algebra (Add l r) = algebra (AddF (foldRS algebra l) (foldRS algebra r))
foldRS algebra (Mul l r) = algebra (MulF (foldRS algebra l) (foldRS algebra r))

We can use these to rewrite

as

foldRS algebra tree = algebra $ fmap (foldRS algebra) $ project tree
foldRS algebra      = algebra . fmap (foldRS algebra) . project

書き直す

とこうなる

project :: Expr -> ExprF Expr
foldRS :: forall z. (ExprF z -> z) -> Expr -> z
foldRS algebra tree = algebra $ fmap (foldRS algebra) $ project tree
foldRS algebra      = algebra . fmap (foldRS algebra) . project

expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))
project :: Expr -> ExprF Expr
foldRS :: forall z. (ExprF z -> z) -> Expr -> z
foldRS algebra tree = algebra $ fmap (foldRS algebra) $ project tree
foldRS algebra      = algebra . fmap (foldRS algebra) . project

expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))
project :: Expr -> ExprF Expr
foldRS :: forall z. (ExprF z -> z) -> Expr -> z
foldRS algebra tree = algebra $ fmap (foldRS algebra) $ project tree
foldRS algebra      = algebra . fmap (foldRS algebra) . project

expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))
project :: Expr -> ExprF Expr
foldRS :: forall z. (ExprF z -> z) -> Expr -> z
foldRS algebra tree = algebra $ fmap (foldRS algebra) $ project tree
foldRS algebra      = algebra . fmap (foldRS algebra) . project

expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))
class Functor f => Recursive t f | t -> f where
  project :: t -> f t
  cata    :: (f z -> z) -> t -> z

instance Recursive Expr ExprF where
  ...

We can further generalize this idea of extracting one layer of a recursive data structure at a time.

This is the basis of Recursion Schemes.

ファンクターは再帰的なデータ構造を一度に一つの層に抽出する考えに一般化出来る

再帰スキームの基本

expr = Mul (Lit3) (Add (Lit 2) (Lit 3))

evalAlg :: ExprF Int -> Int
evalAlg (LitF i)   = i
evalAlg (AddF l r) = l + r
evalAlg (MulF l r) = l * r

res = foldRS evalAlg expr -- 3 * (2 + 3) = 15

printAlg :: ExprF String -> String
printAlg (LitF i)   = show i
printAlg (AddF l r) = "(" <> l <> " + " <> r <> ")"
printAlg (AddF l r) = "(" <> l <> " * " <> r <> ")"

str = foldRS printAlg expr -- "(3 * (2 + 3))"

We can use the new foldRS to reinterpret expressions the same way we could with foldr.

foldrと同じようにfoldRSを再解釈式に利用できる

Alright, let's go back to our types again:

What if instead of extracting a pattern functor, we extract the algebra?

\forall z. (Int \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow \text{Expr} \rightarrow z
\forall z. ((Int + (z, z) + (z, z)) \rightarrow z) \rightarrow \text{Expr} \rightarrow z
\forall z. (Int \rightarrow z, (z, z) \rightarrow z, (z, z) \rightarrow z) \rightarrow \text{Expr} \rightarrow z

戻ってみよう

ファンクターの抽出パターンに置き換えるなら、代数を抽出出来るか

What if instead of extracting a pattern functor, we extract the algebra?

data ExprA z = ExprA {
  lit :: Int -> z
  add :: z -> z -> z
  mul :: z -> z -> z
}
\forall z. (Int \rightarrow z, (z, z) \rightarrow z, (z, z) \rightarrow z) \rightarrow \text{Expr} \rightarrow z
∀ z. \text{ExprA}~z \rightarrow \text{Expr} \rightarrow z
foldAlg :: ExprA z -> Expr -> z
foldAlg algebra (Lit i)   = algebra `lit` i
foldAlg algebra (Add l r) = (algebra `add`) (foldAlg algebra l) (foldAlg algebra r)
foldAlg algebra (Mul l r) = (algebra `mul`) (foldAlg algebra l) (foldAlg algebra r)

ファンクターの抽出パターンに置き換えるなら、代数を抽出出来るか

expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))

evalAlg :: ExprA Int
evalAlg = ExprA {
  lit i = i
  add l r = l + r
  mul l r = l * r
}
res = foldAlg evalAlg expr -- 3 * (2 + 3) = 15

printAlg :: ExprA String 
printAlg = ExprA  {
  lit i = i
  add l r = "(" <> l <> " + " <> r <> ")"
  mul l r = "(" <> l <> " * " <> r <> ")"
}
str = foldAlg printAlg expr -- "(3 * (2 + 3))"

We can use the new foldAlg to reinterpret expressions the same way we could with foldr and foldRS

This is the basis of Final Tagless, except usually ExprA is a typeclass.

foldrやfoldRSと同じようにfoldAlgを再解釈式に利用できる

class ExprC z where
  lit :: Int -> z
  add :: z -> z -> z
  mul :: z -> z -> z
  
foldFT :: ExprC z => Expr -> z
foldFT (Lit i)   = lit i
foldFT (Add l r) = add (foldAlg l) (foldAlg r)
foldFT (Mul l r) = mul (foldAlg l) (foldAlg r)

Rewriting foldAlg using a typeclass

型クラスを使ってfoldAlgに書き直す

expr = Mul (Lit 3) (Add (Lit 2) (Lit 3))

instance ExprC Int where
  lit i = i
  add l r = l + r
  mul l r = l * r
  
res = foldFT expr -- 3 * (2 + 3) = 15

instance ExprC String where
  lit i = i
  add l r = "(" <> l <> " + " <> r <> ")"
  mul l r = "(" <> l <> " * " <> r <> ")"

str = foldFT expr -- "(3 * (2 + 3))"

We can use the new foldFT to reinterpret expressions the same way we could with foldr, foldRS, and foldAlg.

foldrやfoldRS、foldAlgと同じようにfoldFTを再解釈式に利用できる

Let's see all of the derived functions

foldr :: forall z. (lit :: Int -> z) -> 
                   (add :: Int -> Int -> z) -> 
                   (mul :: Int -> Int -> z) -> Expr -> z

foldRS  :: forall z. (ExprF z -> z) -> Expr -> z
foldAlg :: forall z. (ExprA z)      -> Expr -> z
foldFT  :: forall z. (ExprC z)      => Expr -> z

Simply by transforming a function type we have arrived at Recursion Schemes and Final Tagless.

派生した関数を見てみよう

type Algebra f c = f c -> c

data ExprF z  = LitF Int | AddF z z | MulF z z
type ExprA1 Z = Algebra ExprF z

data ExprA z = ExprA {
  lit :: Int -> z
  add :: z -> z -> z
  mul :: z -> z -> z
}

class ExprC z where
  lit :: Int -> z
  add :: z -> z -> z
  mul :: z -> z -> z

These are just different ways to express the same idea: abstracting over recursion.

再帰を超えた抽象化

One more thing

I mentioned before that every function of a recursive type can be expressed in terms of foldr.

Turns out that every type is isomorphic to its Boehm-Berarducci (also known as Church-) encoding.

再帰型の全ての関数はfoldrの用語で表せる

Boehm-Berarducciエンコーディングによって全ての型は同一型に表せる

One more thing

\forall z. \text{Expr} \rightarrow (Int \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow z
\forall z. \text{Expr} \rightarrow ((Int + (z, z) + (z, z)) \rightarrow z) \rightarrow z
\forall z. \text{Expr} \rightarrow (Int \rightarrow z, (z, z) \rightarrow z, (z, z) \rightarrow z) \rightarrow z

This is almost an an isomorphism, we just need to move the quantifier.

ほとんどの同型写像。量化子へ移るのに必要

One more thing

\text{Expr} \leftrightarrow \forall z. (Int \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow ((z, z) \rightarrow z) \rightarrow z
\text{Expr} \leftrightarrow \forall z. ((Int + (z, z) + (z, z)) \rightarrow z) \rightarrow z
\text{Expr} \leftrightarrow \forall z. (Int \rightarrow z, (z, z) \rightarrow z, (z, z) \rightarrow z) \rightarrow z

This is an isomorphism!

Boehm-Berarducci encoding

Boehm-Berarducciエンコーディング

data ExprA z = ExprA {
  lit :: Int -> z
  add :: z -> z -> z
  mul :: z -> z -> z
}

data MuExpr = MuExpr (forall z. ExprA z -> z)

to :: Expr -> MuExpr 
to expr = MuExpr $ \algebra -> foldAlg algebra expr

from :: MuExpr -> Expr
from (MuExpr run) = run $ ExprA {
  lit = Lit
  add = Add
  mul = Mul
}

We can explicitly construct the isomorphism:

同型写像を明示的に構築

In mathematics, an initial algebra is an initial object in the category of F-algebras for a given endofunctor F.

We won't go into detail, but in our case, Expr is the initial algebra carrier of ExprA.

The important takeaway is: a recursive type is isomorphic to ∀ z. Alg z → z,

where Alg z = PatF z → z.

数学では、始代数は、与えられた自己関手Fに対するF-代数の圏における始対象である

ExprはExprAの始対象の集合である

重要な取り除き

Another, more general way of thinking about this:

Every type T is isomorphic to the set of all observations we can make about T.

If you know Yoneda Lemma, you should get some pretty interesting ideas right about now!

これらの考えられる一般的な別の方法

type FixExpr = ExprF Expr

project :: Expr -> ExprF Expr 
project (Lit i) = (LitF i)
project (Add x y) = AddF x y
project (Mul x y) = MulF x y

embed :: ExprF Expr -> Expr
embed (LitF i)   = Lit i
embed (AddF x y) = Add x y
embed (MulF x y) = Mul x y

Remember the other isomorphism that we found, Expr ≈ ExprF Expr

Let's talk Fix, Mu, Nu

type FixExpr = ExprF (ExprF (ExprF ...))
project :: Expr -> FixExpr
embed :: FixExpr -> Expr
type FixExpr = ExprF Expr
project :: Expr -> ExprF Expr
embed :: ExprF Expr -> Expr

If we want to express FixExpr entirely in terms of ExprF, we need to expand Expr recursively:

data Fix f = Fix (f (Fix f))

project :: Expr -> Fix ExprF 
project (Lit i) = Fix (LitF i)
project (Add x y) = Fix (AddF (project x) (project y))
project (Mul x y) = Fix (MulF (project x) (project y))

embed :: Fix ExprF -> Expr
embed (Fix (LitF i))   = Lit i
embed (Fix (AddF x y)) = Add (embed x) (embed y)
embed (Fix (MulF x y)) = Mul (embed x) (embed y)
type FixExpr = ExprF (ExprF (ExprF ...))
project :: Expr -> FixExpr
embed :: FixExpr -> Expr

Introduce a new data type:

Instead of having a tree with layers consisting of Expr constructors, we have a tree of ExprF constructors.

So an isomorphism Expr ≈ ExprF Expr
led us to Fix.

Let's take a look at another isomorphism we found, Expr ≈ ∀ z. (ExprF z → z) → z

data ExprF z = LitF Int | AddF z z | MulF z z

data MuExpr = MuExpr (forall z. (ExprF z -> z) -> z)

to :: Expr -> MuExpr 
to expr = MuExpr $ \algebra -> foldRS algebra expr

from :: MuExpr -> Expr
from (MuExpr run) = run $ 
    \case (LitF i)   -> Lit i
          (AddF x y) -> Add x y
          (MulF x y) -> Mul x y

Hmm, MuExpr looks like a pretty general pattern that can be generalized.

Let's take a look at another isomorphism we found, Expr ≈ ∀ z. (ExprF z → z) → z

data Mu f = Mu (forall z. (f z -> z) -> z)

-- data MuExpr = MuExpr (forall z. (ExprF z -> z) -> z)
type MuExpr = Mu Expr

to :: Expr -> MuExpr 
to expr = Mu $ \algebra -> foldRS algebra expr

from :: MuExpr -> Expr
from (Mu run) = run $ 
    \case (LitF i)   -> Lit i
          (AddF x y) -> Add x y
          (MulF x y) -> Mul x y

Mu = Boehm-Beraducci encoding
= least fixed point

Why least fixed point?

Fixed point of a function is defined as
f z ≈ z

Expr ≈ ∀ z. (ExprF z → z) → z

Mu f ≈ f (Mu f)

f (Mu f) ≈ Mu f

Mu f ≈ f ???

Why least fixed point?

In a total language, we can easily construct
f : ExprF Expr → Expr for Expr with strict data constructors

Expr ≈ ∀ z. (ExprF z → z) → z

However, the equation f (F f) ≈ F f also admits solutions that represent potentially infinite trees

Passing it to Mu, we will get back a tree of type Expr, which must be finite.

Let's talk infinite data and graphs

In a language with closures and recursive expressions, or a language with lazy data, we can define infinite trees.

data Cfg    = Eps  | Sym Char  | Alt Cfg Cfg | Seq Cfg Cfg
data CfgF z = EpsF | SymF Char | AltF z z    | SeqF z z

Consider a data type for context-free grammars

We can use this type to define a simple context-free grammar:

data Cfg    = Eps  | Sym Char  | Alt Cfg Cfg | Seq Cfg Cfg
data CfgF z = EpsF | SymF Char | AltF z z    | SeqF z z
binaryDigit  = Alt (Sym '0') (Sym '1')
binaryNumber = Alt Eps (Seq binaryDigit binaryNumber)
binaryDigit  = Alt (Sym '0') (Sym '1')
binaryNumber = Alt Eps (Seq binaryDigit binaryNumber)

How do we check if a grammar accepts empty strings?

How do we check if a grammar accepts empty strings?

nullable :: Cfg -> Bool
nullable Eps       = True
nullable (Sym c)   = False
nullable (Alt l r) = nullable l || nullable r
nullable (Seq l r) = nullable l && nullable r

nullableAlg :: CfgF Bool -> Bool
nullableAlg EpsF   = True
nullable (Sym c)   = False
nullable (Alt l r) = l || r
nullable (Seq l r) = l && r

How do we check if a grammar accepts empty strings?

binaryDigit  = Alt (Sym '0') (Sym '1')
binaryNumber = Alt Eps (Seq binaryDigit binaryNumber)

nullable :: Cfg -> Bool
nullable Eps       = True
nullable (Sym c)   = False
nullable (Alt l r) = nullable l || nullable r
nullable (Seq l r) = nullable l && nullable r
binaryDigit  = Alt (Sym '0') (Sym '1')
binaryNumber = Alt Eps (Seq binaryDigit binaryNumber)

nullable binaryDigit = 
	nullable (Alt (Sym '0') (Sym '1')) =
	         (||) (False)   (False) =
                        False
                        

nullable binaryNumber = 
	nullable $ Alt Eps (Seq binaryDigit binaryNumber) =
               (||) True ((&&) False (nullable binaryNumber))

How do we check if a grammar accepts empty strings?

Suppose we could number our nodes

Checking for nullability is still non-trivial, but at least it is decidable for valid grammars!

Checking for nullability is still non-trivial, but at least it is decidable for valid grammars!

data Graph f = Graph Int [f Int]

nullable :: Graph CfgF -> Bool
nullable graph = ...
to :: Expr -> Graph ExprF
from :: Graph ExprF -> Expr

So the question is, how do we go in between two representations:

to :: Expr -> Graph ExprF
from :: Graph ExprF -> Expr

So the question is, how do we go in between two representations:

to :: Expr -> IO (Graph ExprF)
from :: Graph ExprF -> Expr

Since observing sharing implies violating referential transparency:

to :: Expr -> IO (Graph ExprF)
from :: Graph ExprF -> Expr
to :: Expr -> IO (Graph ExprF)
from :: Graph ExprF -> Expr

1. Traverse nodes in depth first order

2. Collect unique references and number them

3. Project each node into its pattern functor

4. Replace nodes with their indices

toGraph   :: (Recursive t f, Traversable f) => t -> IO (Graph f)
class Corecursive t f | t -> f where
  embed :: f t -> t
  ana :: forall z. (z -> f z) -> z -> t

fromGraph :: (Corecursive t f, Functor f) => Graph f -> t
fromGraph (Graph m) = go 0 where
    go i = case M.lookup i m of
                Just fa -> embed $ fmap go fa

Going in the opposite direction is

much easier:

What is this Corecursive typeclass?

class Functor f => Corecursive t f | t -> f where
  embed :: f t -> t
  ana :: forall z. (z -> f z) -> z -> t
  
instance Corecursive Cfg CfgF where
  embed EpsF       = Eps
  embed (SymF c)   = Sym c
  embed (AltF x y) = Alt x y
  embed (SeqF x y) = Seq x y
  
  ana unfold seed = embed $ fmap unfold $ unfold seed
  ana unfold = embed . fmap unfold . unfold

What is this Corecursive typeclass?

class Functor f => Corecursive t f | t -> f where
  embed :: f t -> t
  ana :: forall z. (z -> f z) -> z -> t
  
instance Corecursive Cfg CfgF where
  embed EpsF       = Eps
  embed (SymF c)   = Sym c
  embed (AltF x y) = Alt x y
  embed (SeqF x y) = Seq x y
  
  ana unfold seed = embed $ fmap unfold $ unfold seed
  ana unfold = embed . fmap unfold . unfold

What is this Corecursive typeclass?

ana  :: forall z. (z -> f z) -> z -> t

You might have noticed a duality between

\forall z. (z \rightarrow f~z) \rightarrow z \rightarrow t
\forall z. (z, z \rightarrow f~z) \rightarrow t
(\exists z. (z, z \rightarrow f~z)) \rightarrow t

It turns out that this is an isomorphism if our type is lazy.

(\exists z. (z, z \rightarrow f~z)) \leftrightarrow t
(\exists z. (z, z \rightarrow f~z)) \leftrightarrow t
data Nu f = forall z. Nu z (z -> f z)

to :: Cfg -> Nu CfgF 
to expr = Nu expr (\t -> project t)

from :: Nu CfgF -> Cfg
from (Nu seed unfold) = go s where 
	go s = embed $ fmap go $ unfold s
	go   = embed . fmap go . unfold

We can show that it is an isomorphism by constructing the to and from functions

Let's talk indexing

data Tree = Lit Int | Arr [Tree]
data TreeF t = Lit Int | Arr [t]

Consider this simplified version of Json

tree :: Tree
tree = Arr [
	Arr [
    	Lit 3,
        Lit 4
    ],
    Lit 5
]

Suppose I want to identify a particular element within that tree...

Here's one possible value of this type:

data Tree = Lit Int | Arr [Tree]
data TreeF t = LitF Int | ArrF [t]

tree :: Tree
tree = Arr [
	Arr [
    	Lit 3,
        Lit 4
    ],
    Lit 5
]

Suppose I want to identify a particular element within that tree:

data Index = Index [Int]

-- partial!
index :: Index -> Tree -> Int
index (Index []) (Lit i) = i
index (Index (x : xs)) (Arr list) = index (Index xs) (list !! x)

We can represent indices into our data using:

data Index = Index [Int]

-- partial!
index :: Index -> Tree -> Tree
index (Index [])        tree      = tree
index (Index (x : xs)) (Arr list) = index (Index xs) (list !! x)

A bit more general, we can return a node instead of assuming that it must be a literal

data IndexF = ...?

data TreeIndex = TreeIndex [IndexF]

-- partial!
index :: Index -> Tree -> Tree
index (Index [])        tree = tree
index (Index (x : xs))  tree = 
    index (Index xs) subtree
    where subtree = indexF x $ project list

How can we generalize this pattern?

Now remember the Tree ≈ TreeF Tree isomorphism

data Tree = Lit Int | Arr Tree Tree
data TreeF t = LitF Int | ArrF t t

Let's first simplify our structure a bit:

data IndexF = ...?

data TreeIndex = TreeIndex [IndexF]

-- partial!
index :: Index -> Tree -> Tree
index (Index [])        tree = tree
index (Index (x : xs))  tree = 
    index (Index xs) subtree
    where subtree = indexF x $ project list

How do we find IndexF?

data TreeF t = Lit Int | Arr t t
data IndexF  = ...?

IndexF represents a location of a t inside of TreeF t

data TreeF t = Lit Int | Arr t t
data IndexF  = Arr1 | Arr2

indexF :: IndexF -> TreeF z -> z
indexF Arr1 (Arr x _) = x
indexF Arr2 (Arr _ x) = x

How do we deal with errors?

data TreeF t     = Lit Int | Arr t t
data IndexF      = Arr1 | Arr2
data IndexErrorF = FoundLit

indexF :: IndexF -> TreeF z -> Either IndexErrorF z
indexF _    (Lit i)   = Left FoundLit
indexF Arr1 (Arr x _) = Right x
indexF Arr2 (Arr _ x) = Right x

index :: [IndexF] -> Tree -> Either ([IndexF], IndexErrorF) z
index = go [] where
    go path []       t = t
    go path (x : xs) t = 
        case (indexF x $ project t) of
            Left (path1, e) -> Left (path ++ path1, e)
            Right z         -> Right z

        

Can we mechanically derive IndexF from TreeF?

data TreeF t     = Lit Int | Arr t t
data IndexF      = Arr1 | Arr2
data IndexErrorF = FoundLit

Yes!

i[ (f x) * (g x) ] = i[f x] + i[g x]

i[ (f x) + (g x) ] = i[f x] + i[g x]

i[ x ] = 1

i[ y ] = 0 where x != y

i[ TreeF x ] = i[ Int + x * x ] = i[Int] + i[x * x] = 0 + 1 + 1 = 2

Can we mechanically derive IndexErrorF from TreeF?

data ExprF t     = Lit Int | Add t t | Mul t t
data IndexF      = Add1 | Add2 | Mul1 | Mul2
data IndexErrorF = LitNotAdd | LitNotMul | AddNotMul | MulNotAdd

Yes. The question is how to define it in a nice algebraic form.

Let's talk derivatives

data List a = [] | (:) a (List a)

list = [1, 2, 3, 4, 5, 6]

data Zipper a = NonEmpty [a] a [a]

start (x : xs) = NonEmpty [] x xs

left  (NonEmpty (x : xs) c ys) = NonEmpty  xs      x (c : ys)

right (NonEmpty xs c (y : ys)) = NonEmpty (c : xs) y (y : ys)

current (NonEmpty xs c ys)     = c

replace (NonEmpty xs _ ys) n   = NonEmpty xs n ys

How do we generalize this structure to arbitrary recursive data type?

How do we generalize this structure to arbitrary recursive data type?

Conceptually, zippers (a.k.a. derivatives) select a single value from a data structure within a certain context

(x, df x) f x 

If we have a tuple, (x, x), there are two ways to select an element of type x:

f x = (x, x)

df x = SelectFirst (second :: x) | SelectSecond (first :: x)

df x = Either x x

df x = x + x

Conceptually, zippers (a.k.a. derivatives) select a single value from a data structure within a certain context

(x, df x) f x 

If we have a tuple, (f1 x, f2 x), then:

f x = (f1 x, f2 x)

df x = (df1 x, f2 x) + (f1 x, df2 x)

If we have an either, (f1 x) + (f2 x), then:

f x = (f1 x) + (f2 x)

df x = (df1 x) + (df2 x)

Similar to our derivation of indices, we define derivatives purely algebraically

Δ[ (f x) * (g x) ] = Δ[f x] * (g x) + (f x) * Δ[g x]

Δ[ (f x) + (g x) ] = Δ[f x] + d[g x]

Δ[ x ] = 1

Δ[ y ] = 0 where y != x

data List x = [] | (:) x (List x)

Δ[ (f x) * (g x) ] = Δ[f x] * (g x) + (f x) * Δ[g x]

Δ[ (f x) + (g x) ] = Δ[f x] + Δ[g x]

Δ[ x ] = 1

Δ[ y ] = 0 where x != y

Δ[List x] = Δ[1 + x * (List x)] =
=1 * (List x) + x *  Δ[List x]

Δ[List x] (1 - x) = List x

Δ[List x] = List x / (1 - x)

Δ[List x] (1 - x) = List x

Δ[List x] = List x / (1 - x)

List x = 1 + x * (List x)

(List x) * (1 - x) = 1

List x = 1/(1-x)

Δ[List x] = List x * List x

data Zipper a = NonEmpty [a] a [a]

(x, Δ[List x]) = (x, List x * List x)

But why do we care?

Type derivatives are amazing, but what I am interested in is how they interact with pattern functors

class (Functor f, Functor d) => Recursive t f d | t -> f, t -> d where
  project :: t -> f t
  projectD :: t -> f (d t, t)
  cata    :: (f z -> z) -> t -> z
  
  cataH :: ([d t] -> f (z, t) -> z) -> [d t] -> t -> z
  cataH alg = go where
    go hist = alg hist . fmap (\(d, e) -> (go (d : hist) e, e)) . projectD

Summing it all up

folds ≈ Recursion Schemes ≈ Final Tagless (simple algebras)

Type algebra

まとめ

Summing it all up

Isomorphism to the Initial algebra carrier

Isomorphism to the product of all observable properties

まとめ

Summing it all up

Fix, Mu, Nu

Graph representation

まとめ

Indexing

Derivatives

Q&A

Alexander Konovalov, Compellon Inc

alex.knvl@gmail.com @alexknvl

alexknvl.com

On Recursive Data Types

By Alexander Konovalov

On Recursive Data Types

  • 349
Loading comments...

More from Alexander Konovalov