Datatype-Generic Programming (in Agda)
Lin Tzu-Chi
What do we mean by generic?*
- A single unified program
- Abstraction from the differences in similar programs
- Usually parametrization
* Jeremy Gibbons (2007): Datatype-generic Programming. In: Proceedings of the 2006 International Conference on Datatype-generic Programming, SSDGP'06, Springer-Verlag, Berlin, Heidelberg, pp. 1–71, doi:10.1007/978-3-540-76786-2_1. Available at http://dl.acm.org/citation.cfm?id=1782894.1782895.
Examples of Generic Programs
Genericity by Value
- For loops and functions are generic.
#include <stdio.h>

/* Print a triangle of n rows; row i contains i asterisks. */
void printn(int n) {
    for (int i = 1; i <= n; i++) {
        for (int j = 0; j < i; j++)
            printf("*");
        printf("\n");
    }
}

int main(void) {
    printn(5);
    printn(10);
    return 0;
}
Function in C
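Without the function, the two calls would have to be written out by hand:
/* printn(5) */
printf("*\n");
printf("**\n");
printf("***\n");
printf("****\n");
printf("*****\n");
/* printn(10) */
printf("*\n");
printf("**\n");
printf("***\n");
printf("****\n");
printf("*****\n");
printf("******\n");
printf("*******\n");
printf("********\n");
printf("*********\n");
printf("**********\n");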
Genericity by Type
- Non-generic definition: separate datatypes and functions for lists of ℕ and lists of Char.
data ListN : Type where
  []  : ListN
  _∷_ : ℕ → ListN → ListN
append : ListN → ListN → ListN
append [] ys = ys
append (x ∷ xs) ys = x ∷ append xs ys
xs : ListN
xs = 0 ∷ 1 ∷ []
Datatype and function definition in Agda (Haskell/OCaml/...)
data ListChar : Type where
  []  : ListChar
  _∷_ : Char → ListChar → ListChar
append : ListChar → ListChar → ListChar
append [] ys = ys
append (x ∷ xs) ys = x ∷ append xs ys
Genericity by Type
- Generic definition: abstract the element type (ℕ or Char) out of the two list types.
data List (A : Type) : Type where
  []  : List A
  _∷_ : A → List A → List A
append : ∀ {A} → List A → List A → List A
append [] ys = ys
append (x ∷ xs) ys = x ∷ append xs ys
xs : List String
xs = "str1" ∷ "str2" ∷ []
Parametrized Type in Agda (Haskell/OCaml/...)
Honorable Mentions
- Generics in Java are a form of genericity by type
- Genericity by structure
  - C++ Standard Template Library
- Genericity by stage
  - Metaprogramming
    - C++ Templates
    - Template Haskell
- Genericity by ...
Common Pattern of Generic Programming
- Identify a family of similar programs
- Parametrize the parts that differ between them
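The two steps above can be sketched in Haskell, the language the deck already uses for its type class slide. The names sumList, productList and foldList are hypothetical and chosen only for illustration; they do not come from the talk.

```haskell
-- Step 1: a family of similar programs. They differ only in the
-- base value and in the operator used to combine elements.
sumList :: [Int] -> Int
sumList []       = 0
sumList (x : xs) = x + sumList xs

productList :: [Int] -> Int
productList []       = 1
productList (x : xs) = x * productList xs

-- Step 2: the differing parts become parameters (hypothetical name).
foldList :: (a -> b -> b) -> b -> [a] -> b
foldList _    base []       = base
foldList step base (x : xs) = step x (foldList step base xs)

-- Both original programs are recovered as instances of the generic one.
sumList', productList' :: [Int] -> Int
sumList'     = foldList (+) 0
productList' = foldList (*) 1
```

Loaded in GHCi, sumList' [1, 2, 3, 4] and sumList [1, 2, 3, 4] both evaluate to 10.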
Datatype-generic Programming (DGP)
We want similar functions on similar datatypes.
Take map, for example:
data List (A : Type) : Type where
  []  : List A
  _∷_ : A → List A → List A
mapList : (A → B) → List A → List B
mapList f [] = []
mapList f (x ∷ xs) = f x ∷ mapList f xs
mapList applies f to each element of a list.
- We can find similar map functions on similar types
- E.g. map on binary trees:
data Tree (A : Type) : Type where
  leaf : Tree A
  node : A → Tree A → Tree A → Tree A
mapTree : (A → B) → Tree A → Tree B
mapTree f leaf = leaf
mapTree f (node x t₁ t₂) = node (f x) (mapTree f t₁) (mapTree f t₂)
Duplication Bad!
data List (A : Type) : Type where
  []  : List A
  _∷_ : A → List A → List A
mapList : (A → B) → List A → List B
mapList f [] = []
mapList f (x ∷ xs) = f x ∷ mapList f xs
data Tree (A : Type) : Type where
  leaf : Tree A
  node : A → Tree A → Tree A → Tree A
mapTree : (A → B) → Tree A → Tree B
mapTree f leaf = leaf
mapTree f (node x t₁ t₂) = node (f x) (mapTree f t₁) (mapTree f t₂)
data RBTree (A : Type) : Color → Type where
  leaf  : RBTree A Black
  nodeR : A → RBTree A Black → RBTree A Black → RBTree A Red
  nodeB : {c1 c2 : Color}
        → A → RBTree A c1 → RBTree A c2 → RBTree A Black
mapRBTree : (A → B) → RBTree A c → RBTree B c
mapRBTree f leaf = leaf
mapRBTree f (nodeR x t₁ t₂) = nodeR (f x) (mapRBTree f t₁) (mapRBTree f t₂)
mapRBTree f (nodeB x t₁ t₂) = nodeB (f x) (mapRBTree f t₁) (mapRBTree f t₂)
1. Similarities between Tree and List
- Parametrized by an element type A
- First constructor takes no arguments
- Second constructor takes values of the parameter type A and/or of the type being defined
2. Both mapList and mapTree share the type
(A → B) → T A → T B, where T is List or Tree
3. Similarities between map definitions:
- One clause for each constructor
- The result of each clause is built from the same constructor and the values bound in that clause
- f is applied to the values of the parameter type
- map is applied recursively to values of the datatype itself
Virtue of DGP
- 'Similarity' is captured formally
- Establish shared properties (proof reuse), e.g.
mapList : (A → B) → List A → List B
mapList f [] = []
mapList f (x ∷ xs) = f x ∷ mapList f xs
mapTree : (A → B) → Tree A → Tree B
mapTree f leaf = leaf
mapTree f (node x t₁ t₂) = node (f x) (mapTree f t₁) (mapTree f t₂)
Proof reuse enables better reasoning and optimization.
Type classes do not (satisfactorily) solve our problem!
class Functor f where
  fmap :: (a -> b) -> f a -> f b
instance Functor List where
  fmap f Nil = Nil
  fmap f (Cons x xs) = Cons (f x) (fmap f xs)
instance Functor Tree where
  fmap f Leaf = Leaf
  fmap f (Node x t1 t2) = Node (f x) (fmap f t1) (fmap f t2)
- Type classes provide ad-hoc polymorphism rather than parametric polymorphism
- Similarities between the instance definitions are not utilized
Continuing point 2: are type classes useful here?
map :: Functor f => (a -> b) -> f a -> f b
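A self-contained version of the type class slide may make the limitation easier to see. This is only a sketch: the List and Tree declarations are assumed (the slide shows just the instances), and increment is a hypothetical client function.

```haskell
-- Assumed datatype declarations matching the instances on the slide.
data List a = Nil | Cons a (List a) deriving (Eq, Show)
data Tree a = Leaf | Node a (Tree a) (Tree a) deriving (Eq, Show)

-- Each instance is still written by hand, clause by clause,
-- even though both follow the same recipe.
instance Functor List where
  fmap _ Nil         = Nil
  fmap f (Cons x xs) = Cons (f x) (fmap f xs)

instance Functor Tree where
  fmap _ Leaf           = Leaf
  fmap f (Node x t1 t2) = Node (f x) (fmap f t1) (fmap f t2)

-- A client can be written once for every Functor (hypothetical name),
-- but that does nothing to remove the duplication between the instances.
increment :: Functor f => f Int -> f Int
increment = fmap (+ 1)
```

The class gives the shared type a name, so increment works on both lists and trees. What it does not give is a single definition of fmap from which both instances follow.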
Datatype-Generic Programming
(in Agda)
Requirements for DGP
- A generic representation for a family of datatypes
  - for datatypes that support map (List, Tree, ...)
- Corresponding definition for generic functions
A Datatype Family Representation
Some of the previously mentioned datatypes can be represented uniformly.
Let's call this the polynomial representation,
which can be seen as describing the shapes of constructors.
data List (A : Type) : Type where
  []  : List A
  _∷_ : A → List A → List A
data Tree (A : Type) : Type where
  leaf : Tree A
  node : A → Tree A → Tree A → Tree A
Generic Definitions by Polynomials
Thanks to Agda's expressiveness, we can get a taste of the polynomial representation:
data Mono : Type where
  ∅   : Mono                -- no field
  I   : Mono                -- a recursive occurrence of the datatype
  E   : Mono                -- an element of the parameter type
  _⊗_ : Mono → Mono → Mono  -- a pair of fields
Poly : Type
Poly = List Mono
ListRep : Poly
ListRep = ∅ ∷ E ⊗ I ∷ []
We can define μ, which turns a representation into a datatype:
μ : Poly → Type → Type
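The slides do not show how μ is defined. One way to get a feel for it is the following Haskell sketch, which approximates the dependently typed universe with promoted datatypes and GADTs. Everything here is an assumption made for illustration: the names U, Fields, Choice, Mu, and the encoding itself are not taken from the talk, and the Agda definition may well differ.

```haskell
{-# LANGUAGE DataKinds, EmptyCase, GADTs, KindSignatures, TypeOperators #-}

-- Codes for constructor fields, mirroring Mono (U plays the role of ∅).
data Mono = U | I | E | Mono :*: Mono

-- Values described by one monomial: a is the element type,
-- x is the type used at recursive positions.
data Fields (m :: Mono) a x where
  None :: Fields 'U a x
  Rec  :: x -> Fields 'I a x
  Elem :: a -> Fields 'E a x
  Pair :: Fields m a x -> Fields n a x -> Fields (m ':*: n) a x

-- A polynomial is a type-level list of monomials;
-- a value selects one of them (Here) or skips ahead (There).
data Choice (p :: [Mono]) a x where
  Here  :: Fields m a x -> Choice (m ': p) a x
  There :: Choice p a x -> Choice (m ': p) a x

-- Tie the recursive knot, as μ does.
newtype Mu (p :: [Mono]) a = Con (Choice p a (Mu p a))

-- The counterpart of ListRep = ∅ ∷ E ⊗ I ∷ [].
type ListRep = '[ 'U, 'E ':*: 'I ]

fromList :: [a] -> Mu ListRep a
fromList []       = Con (Here None)
fromList (x : xs) = Con (There (Here (Pair (Elem x) (Rec (fromList xs)))))

toList :: Mu ListRep a -> [a]
toList (Con (Here None))                             = []
toList (Con (There (Here (Pair (Elem x) (Rec xs))))) = x : toList xs
toList (Con (There (There v)))                       = case v of {}  -- no third constructor
```

In this encoding, Here and There play the role that inj₁ and inj₂ play in the Agda patterns, and the empty polynomial has no values at all.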
Generic Definitions by Polynomials
Datatypes denoted by μ should behave the same as their native counterparts:
length′ : {A : Type} → μ ListRep A → ℕ
length′ (con (inj₁ tt)) = 0
length′ (con (inj₂ (inj₁ (x , xs)))) = suc (length′ xs)
length : {A : Type} → List A → ℕ
length [] = 0
length (x ∷ xs) = suc (length xs)
append′ : {A : Type} → μ ListRep A → μ ListRep A → μ ListRep A
append′ (con (inj₁ tt)) ys = ys
append′ (con (inj₂ (inj₁ (x , xs)))) ys =
  con (inj₂ (inj₁ (x , (append′ xs ys))))
append : {A : Type} → List A → List A → List A
append [] ys = ys
append (x ∷ xs) ys = x ∷ append xs ys
Generic Definitions by Polynomials
map : (F : Poly) → {A B : Type} → (A → B) → μ F A → μ F B
map F {A} {B} f (con xs) = con (mapᴾ F xs)
  where
    mapᴹ : (M : Mono) → ⟦ M ⟧ᴹ (A , μ F A) → ⟦ M ⟧ᴹ (B , μ F B)
    mapᴹ ∅ tt = tt
    mapᴹ E a = f a        -- apply f to an element
    mapᴹ I x = map F f x  -- recursive call
    mapᴹ (M ⊗ N) (xs , ys) = mapᴹ M xs , mapᴹ N ys
    mapᴾ : (G : Poly) → ⟦ G ⟧ (A , μ F A) → ⟦ G ⟧ (B , μ F B)
    mapᴾ (M ∷ G) (inj₁ xs) = inj₁ (mapᴹ M xs)  -- preserving the
    mapᴾ (M ∷ G) (inj₂ xs) = inj₂ (mapᴾ G xs)  -- constructor choice
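The same three-layer structure (fields, constructor choice, recursive knot) can be sketched in Haskell. This is an approximation under stated assumptions: the GADT encoding, the names mapFields, mapChoice, gmap, TreeRep, leaf, node and elems are all hypothetical, and because each value already carries its shape, the code argument F of the Agda version is not needed here.

```haskell
{-# LANGUAGE DataKinds, EmptyCase, GADTs, KindSignatures, TypeOperators #-}

data Mono = U | I | E | Mono :*: Mono

data Fields (m :: Mono) a x where
  None :: Fields 'U a x
  Rec  :: x -> Fields 'I a x
  Elem :: a -> Fields 'E a x
  Pair :: Fields m a x -> Fields n a x -> Fields (m ':*: n) a x

data Choice (p :: [Mono]) a x where
  Here  :: Fields m a x -> Choice (m ': p) a x
  There :: Choice p a x -> Choice (m ': p) a x

newtype Mu (p :: [Mono]) a = Con (Choice p a (Mu p a))

-- Counterpart of mapᴹ: f handles elements, g handles recursive positions.
mapFields :: (a -> b) -> (x -> y) -> Fields m a x -> Fields m b y
mapFields _ _ None       = None
mapFields f _ (Elem a)   = Elem (f a)
mapFields _ g (Rec x)    = Rec (g x)
mapFields f g (Pair l r) = Pair (mapFields f g l) (mapFields f g r)

-- Counterpart of mapᴾ: the chosen constructor is preserved.
mapChoice :: (a -> b) -> (x -> y) -> Choice p a x -> Choice p b y
mapChoice f g (Here fields) = Here (mapFields f g fields)
mapChoice f g (There rest)  = There (mapChoice f g rest)

-- One map for every datatype described by a polynomial.
gmap :: (a -> b) -> Mu p a -> Mu p b
gmap f (Con xs) = Con (mapChoice f (gmap f) xs)

-- A hypothetical code for binary trees, with smart constructors.
type TreeRep = '[ 'U, 'E ':*: ('I ':*: 'I) ]

leaf :: Mu TreeRep a
leaf = Con (Here None)

node :: a -> Mu TreeRep a -> Mu TreeRep a -> Mu TreeRep a
node x l r = Con (There (Here (Pair (Elem x) (Pair (Rec l) (Rec r)))))

-- Collect the elements in preorder so that results can be observed.
elems :: Mu TreeRep a -> [a]
elems (Con (Here None)) = []
elems (Con (There (Here (Pair (Elem x) (Pair (Rec l) (Rec r)))))) = x : elems l ++ elems r
elems (Con (There (There v))) = case v of {}
```

For example, elems (gmap (* 2) (node 1 (node 2 leaf leaf) (node 3 leaf leaf))) evaluates to [2,4,6]. No tree-specific map was written; gmap is simply used at the TreeRep code.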
Proofs can be established on generic definitions:
mapId : (F : Poly) (x : μ F A) → map F id x ≡ x
mapComp : (F : Poly) (f : B → C) (g : A → B) (x : μ F A)
        → map F (f ∘ g) x ≡ map F f (map F g x)
A generic map function can thus be instantiated manually, using conversions to and from the generic representation:
to/from
toList : ∀ {A} → μ ListF A → List A
toList con₁ = []
toList (con₂ (x , xs)) = x ∷ toList xs
fromList : ∀ {A} → List A → μ ListF A
fromList [] = con₁
fromList (x ∷ xs) = con₂ (x , (fromList xs))
toTree : ∀ {A} → μ TreeF A → Tree A
toTree con₁ = leaf
toTree (con₂ (x , xs , ys)) = node x (toTree xs) (toTree ys)
fromTree : ∀ {A} → Tree A → μ TreeF A
fromTree leaf = con₁
fromTree (node x xs ys) = con₂ (x , (fromTree xs) , (fromTree ys))
Native DGP with Metaprogramming
Native is better
- Readability
- Interoperability
  - between different representations
  - with existing libraries
A Naive Solution
We can always convert between generic and native definitions, since they behave the same (they are isomorphic).
toList : ∀ {A} → μ ListF A → List A
fromList : ∀ {A} → List A → μ ListF A
mapList : (A → B) → List A → List B
mapList f = toList ∘ map ListF f ∘ fromList
Problem: Time & space inefficient, difficult to reason about, ...
We want translation at will!
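The cost of the naive solution can be seen in a small Haskell sketch. To keep it short it swaps in a simpler representation than the polynomial codes: a hand-written shape functor with a Bifunctor instance from base. ListF, Mu, Con and gmap are hypothetical names here and are not the definitions used in the talk.

```haskell
import Data.Bifunctor (Bifunctor (bimap))

-- Hypothetical shape functor for lists:
-- a is the element type, r the recursive position.
data ListF a r = NilF | ConsF a r

instance Bifunctor ListF where
  bimap _ _ NilF        = NilF
  bimap f g (ConsF a r) = ConsF (f a) (g r)

-- Generic datatype built from a shape functor.
newtype Mu f a = Con (f a (Mu f a))

-- Generic map, written once for every shape with a Bifunctor instance.
gmap :: Bifunctor f => (a -> b) -> Mu f a -> Mu f b
gmap f (Con xs) = Con (bimap f (gmap f) xs)

fromList :: [a] -> Mu ListF a
fromList = foldr (\x rest -> Con (ConsF x rest)) (Con NilF)

toList :: Mu ListF a -> [a]
toList (Con NilF)         = []
toList (Con (ConsF x xs)) = x : toList xs

-- The naive instantiation: every call goes through two intermediate
-- Mu ListF values before the native result list is produced.
mapList :: (a -> b) -> [a] -> [b]
mapList f = toList . gmap f . fromList
```

mapList (+ 1) [1, 2, 3] evaluates to [2,3,4], the same as the native map, but only after converting to the generic form and back. Any reasoning about mapList also has to go through fromList and toList.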
Possible Solutions
- A new programming language design
- A compiler redesign for eliminating intermediate structures
- Metaprogramming
  - Code generation & instrumentation
  - Metaprogramming mechanism
  - Generic definition for generic definitions
- Ornamentation
  - Describing relations between datatypes
Existing Problems & Ongoing Work