Datatype-Generic Programming (in Agda)
Lin Tzu-Chi
What do we mean by generic?*
- A single unified program
- Abstraction from the differences in similar programs
- Usually parametrization

* Jeremy Gibbons (2007): Datatype-generic Programming. In: Proceedings of the 2006 International Conference on Datatype-generic Programming, SSDGP’06, Springer-Verlag, Berlin, Heidelberg, pp. 1–71, doi:10.1007/978-3-540-76786-2_1. Available at http://dl.acm.org/citation.cfm?id=1782894.1782895.
Examples of Generic Programs
Genericity by Value
- For loops and functions are generic.
 
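#include <stdio.h>

/* Declared before use so that main can call it. */
void printn(int n);

int main(void) {
  printn(5);
  printn(10);
  return 0;
}

/* Print a triangle of n rows; row i holds i stars. */
void printn(int n) {
  for (int i = 1; i <= n; i++) {
    for (int j = 0; j < i; j++)
      printf("*");
    printf("\n");
  }
}
    Function in C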
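/* Without the function: the output of printn(5) written out by hand */
printf("*\n");
printf("**\n");
printf("***\n");
printf("****\n");
printf("*****\n");
/* ... and again for printn(10) */
printf("*\n");
printf("**\n");
printf("***\n");
printf("****\n");
printf("*****\n");
printf("******\n");
printf("*******\n");
printf("********\n");
printf("*********\n");
printf("**********\n");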
Genericity by Type
- Non-generic definition: separate datatypes and functions for lists of ℕ and lists of Char.
 
data ListN : Type where
  []  : ListN
  _∷_ : ℕ → ListN → ListN
append : ListN → ListN → ListN
append []       ys = ys
append (x ∷ xs) ys = x ∷ append xs ys
xs : ListN
xs = 0 ∷ 1 ∷ []
    Datatype and function definition in Agda (Haskell/OCaml/...)
data ListChar : Type where
  []  : ListChar
  _∷_ : Char → ListChar → ListChar
append : ListChar → ListChar → ListChar
append []       ys = ys
append (x ∷ xs) ys = x ∷ append xs ys
Genericity by Type
- Generic definition: abstract the element type (ℕ/Char) out of ListN/ListChar.
 
data List (A : Type) : Type where
  []  : List A
  _∷_ : A → List A → List A
append : ∀ {A} → List A → List A → List A
append []       ys = ys
append (x ∷ xs) ys = x ∷ append xs ys
xs : List String
xs = "str1" ∷ "str2" ∷ [] 
    Parametrized Type in Agda (Haskell/OCaml/...)
Honorable Mentions
- Generics in Java is genericity by type
- Genericity by structure
  - C++ Standard Template Library
- Genericity by stage
  - Metaprogramming
    - C++ Templates
    - Template Haskell
- Genericity by ...
 
Common Pattern of Generic Programming
- Identify a family of similar programs
- Parametrize the parts that differ between them
 
Datatype-generic Programming (DGP)
We want similar functions on similar datatypes.
Take map, for example:
data List (A : Type) : Type where
  []  : List A
  _∷_ : A → List A → List A
mapList : (A → B) → List A → List B
mapList f []       = []
mapList f (x ∷ xs) = f x ∷ mapList f xs
map applies f to each element in a list
Datatype-generic Programming (DGP)
- We can find similar map functions on similar types
  - E.g. map on binary trees:
 
data Tree (A : Type) : Type where
  leaf : Tree A
  node : A → Tree A → Tree A → Tree A
mapTree : (A → B) → Tree A → Tree B
mapTree f leaf           = leaf
mapTree f (node x t₁ t₂) = node (f x) (mapTree f t₁) (mapTree f t₂)
Duplication Bad!
data List (A : Type) : Type where
  []  : List A
  _∷_ : A → List A → List A
mapList : (A → B) → List A → List B
mapList f []       = []
mapList f (x ∷ xs) = f x ∷ mapList f xs
data Tree (A : Type) : Type where
  leaf : Tree A
  node : A → Tree A → Tree A → Tree A
mapTree : (A → B) → Tree A → Tree B
mapTree f leaf           = leaf
mapTree f (node x t₁ t₂) = node (f x) (mapTree f t₁) (mapTree f t₂)
data RBTree (A : Type) : Color → Type where    
  leaf  : RBTree A Black   
  nodeR : A → RBTree A Black → RBTree A Black → RBTree A Red
  nodeB : {c1 c2 : Color}
        → A → RBTree A c1    → RBTree A c2    → RBTree A Black
mapRBTree : (A → B) → RBTree A c → RBTree B c
mapRBTree f leaf           = leaf
mapRBTree f (nodeR x t₁ t₂) = nodeR (f x) (mapRBTree f t₁) (mapRBTree f t₂)
mapRBTree f (nodeB x t₁ t₂) = nodeB (f x) (mapRBTree f t₁) (mapRBTree f t₂)
1. Similarities between Tree and List
- Parametrized by an element type A
- First constructor takes no parameter
- Second constructor takes values of the parameter A and/or of the type itself that is being defined.
2. Both mapList and mapTree share the type
(A → B) → T A → T B
3. Similarities between map definitions:
- One clause for each constructor
- Result of a clause is constructed from the same constructor and values within the clause
  - f is applied to the values of the parametrized type
  - the map in question is applied recursively to values of the datatype itself
 
Virtue of DGP
- 'Similarity' is captured formally
 - Establish shared properties (proof reuse), e.g.
 
mapList : (A → B) → List A → List B
mapList f []       = []
mapList f (x ∷ xs) = f x ∷ mapList f xs
mapTree : (A → B) → Tree A → Tree B
mapTree f leaf           = leaf
mapTree f (node x t₁ t₂) = node (f x) (mapTree f t₁) (mapTree f t₂)
Proof reuse makes reasoning and optimization easier.
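To make "shared properties" concrete, here is a minimal Haskell sketch of the two usual map laws (identity and composition) checked on sample values. The names idLawList, idLawTree, compLawList and compLawTree are hypothetical, and these are runtime checks on examples, not proofs. The point is only that, with separate definitions, every law has to be stated (and proven) once per datatype.

```haskell
-- Native datatypes and their hand-written maps, as on the slides
-- (Haskell's built-in list stands in for List).
data Tree a = Leaf | Node a (Tree a) (Tree a)
  deriving (Eq, Show)

mapList :: (a -> b) -> [a] -> [b]
mapList _ []       = []
mapList f (x : xs) = f x : mapList f xs

mapTree :: (a -> b) -> Tree a -> Tree b
mapTree _ Leaf         = Leaf
mapTree f (Node x l r) = Node (f x) (mapTree f l) (mapTree f r)

-- Identity law: mapping id changes nothing. Stated once per datatype.
idLawList :: Eq a => [a] -> Bool
idLawList xs = mapList id xs == xs

idLawTree :: Eq a => Tree a -> Bool
idLawTree t = mapTree id t == t

-- Composition law: one pass with (f . g) equals two passes.
-- The functions (+ 1) and (* 2) are arbitrary sample choices.
compLawList :: [Int] -> Bool
compLawList xs =
  mapList ((+ 1) . (* 2)) xs == mapList (+ 1) (mapList (* 2) xs)

compLawTree :: Tree Int -> Bool
compLawTree t =
  mapTree ((+ 1) . (* 2)) t == mapTree (+ 1) (mapTree (* 2) t)
```

Each law appears twice with nearly identical bodies; a third datatype such as RBTree would need a third copy.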
Typeclasses do not (satisfactorily) solve our problem!
class Functor f where
  fmap :: (a -> b) -> f a -> f b
  
instance Functor List where
  fmap f Nil         = Nil
  fmap f (Cons x xs) = Cons (f x) (fmap f xs)
  
instance Functor Tree where
  fmap f Leaf           = Leaf
  fmap f (Node x t1 t2) = Node (f x) (fmap f t1) (fmap f t2)
- A typeclass gives ad-hoc polymorphism, not parametric polymorphism
- The similarities between the definitions are not exploited
 
Continuing point 2: are typeclasses useful here?
map :: (Functor f) => (a -> b) -> f a -> f b
Datatype-Generic Programming
(in Agda)
Requirements for DGP
- A generic representation for a family of datatypes
  - for datatypes that support map (List, Tree...)
- Corresponding definition for generic functions
 
A Datatype Family Representation
Some of the datatype families mentioned earlier can be given a common representation.
Let's call it the polynomial representation,
which can be seen as the shapes of the constructors.
data List (A : Type) : Type where
  []  : List A
  _∷_ : A → List A → List A
  
data Tree (A : Type) : Type where
  leaf : Tree A
  node : A → Tree A → Tree A → Tree A
Generic Definitions by Polynomials
Thanks to Agda's expressiveness, we can get a taste of the polynomial representation:
data Mono : Type where
  ∅   : Mono
  I   : Mono
  E   : Mono
  _⊗_ : Mono → Mono → Mono
Poly : Type
Poly = List Mono
ListRep : Poly
ListRep = ∅ ∷ E ⊗ I ∷ []
We can define μ, which turns a representation into a datatype:
μ : Poly → Type → Type
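The slides do not show how a Poly is interpreted, so the following Haskell sketch is an assumption inferred from the patterns con (inj₁ tt) and con (inj₂ (inj₁ (x , xs))) used with μ ListRep: each Mono is read as a product of fields, and the list of Monos as a nested sum that ends in the empty type. The names ListShape, MuList, nil, cons and length' are hypothetical.

```haskell
import Data.Void (Void, absurd)

-- Assumed reading of  ListRep = ∅ ∷ E ⊗ I ∷ []  :
--   ∅      ~ ()       a constructor with no fields
--   E ⊗ I  ~ (a, x)   an element paired with a recursive position
--   []     ~ Void     no further constructors
type ListShape a x = Either () (Either (a, x) Void)

-- Tying the knot: the recursive position x is the datatype itself.
newtype MuList a = Con (ListShape a (MuList a))

-- The two "constructors" recovered from the representation.
nil :: MuList a
nil = Con (Left ())

cons :: a -> MuList a -> MuList a
cons x xs = Con (Right (Left (x, xs)))

-- Counterpart of length′: one clause per constructor,
-- plus an impossible case for the empty tail of the sum.
length' :: MuList a -> Int
length' (Con (Left ()))              = 0
length' (Con (Right (Left (_, xs)))) = 1 + length' xs
length' (Con (Right (Right v)))      = absurd v
```

Under this reading, `cons 1 (cons 2 nil)` plays the role of the native list 1 ∷ 2 ∷ [], and the nesting of Left/Right mirrors the nesting of inj₁/inj₂.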
Generic Definitions by Polynomials
μ : Poly → Type → Type
Datatypes denoted by μ should behave the same as their native counterparts:
length′ : {A : Type} → μ ListRep A → ℕ
length′ (con (inj₁ tt))              = 0
length′ (con (inj₂ (inj₁ (x , xs)))) = suc (length′ xs)
length : {A : Type} → List A → ℕ
length []       = 0
length (x ∷ xs) = suc (length xs)
append′ : {A : Type} → μ ListRep A → μ ListRep A → μ ListRep A
append′ (con (inj₁ tt)) ys              = ys
append′ (con (inj₂ (inj₁ (x , xs)))) ys = 
  con (inj₂ (inj₁ (x , (append′ xs ys))))
append : {A : Type} → List A → List A → List A
append [] ys       = ys
append (x ∷ xs) ys = x ∷ append xs ys
Generic Definitions by Polynomials
A generic map function can thus be defined once, to be instantiated manually at each representation F:
map : (F : Poly) → {A B : Type} → (A → B) → μ F A → μ F B
map F {A} {B} f (con xs) = con (mapᴾ F xs)
  where
    mapᴹ : (M : Mono) → ⟦ M ⟧ᴹ (A , μ F A) → ⟦ M ⟧ᴹ (B , μ F B)
    mapᴹ ∅       tt        = tt
    mapᴹ E       a         = f a        -- apply f to an element
    mapᴹ I       x         = map F f x  -- recursive call
    mapᴹ (M ⊗ N) (xs , ys) = mapᴹ M xs , mapᴹ N ys
    mapᴾ : (G : Poly) → ⟦ G ⟧ (A , μ F A) → ⟦ G ⟧ (B , μ F B)
    mapᴾ (M ∷ G) (inj₁ xs) = inj₁ (mapᴹ M xs)  -- preserving
    mapᴾ (M ∷ G) (inj₂ xs) = inj₂ (mapᴾ G xs)  -- constructor choice
Proofs can be established on generic definitions:
mapId : (F : Poly) (x : μ F A) → map F id x ≡ x
mapComp : (F : Poly) (f : B → C) (g : A → B) (x : μ F A)
        → map F (f ∘ g) x ≡ map F f (map F g x)
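For readers more at home in Haskell, the same idea can be sketched without dependent types. This is a swapped-in technique, not the slides' construction: instead of a datatype of codes interpreted by ⟦_⟧, the code formers become type-level combinators, and a helper class (here called Shape, a hypothetical name, as are smap, gmap, Mu, and the constructors below) supplies one clause per code former, mirroring mapᴹ and mapᴾ. The class instances follow the structure of the code and are written once; no instance is written per datatype.

```haskell
{-# LANGUAGE TypeOperators #-}

-- Code formers; `a` is the element type, `r` the recursive position.
data U a r = U                              -- like ∅: no fields
newtype E a r = E a                         -- like E: an element
newtype I a r = I r                         -- like I: a recursive field
data (f :*: g) a r = f a r :*: g a r        -- like ⊗: product of fields
data (f :+: g) a r = L (f a r) | R (g a r)  -- choice of constructor

infixr 6 :*:
infixr 5 :+:

-- One clause per code former, written once for all datatypes.
class Shape f where
  smap :: (a -> b) -> (r -> s) -> f a r -> f b s

instance Shape U where smap _ _ U = U
instance Shape E where smap f _ (E x) = E (f x)  -- apply f to an element
instance Shape I where smap _ g (I r) = I (g r)  -- recursive call
instance (Shape f, Shape g) => Shape (f :*: g) where
  smap f g (x :*: y) = smap f g x :*: smap f g y
instance (Shape f, Shape g) => Shape (f :+: g) where
  smap f g (L x) = L (smap f g x)                -- preserving
  smap f g (R y) = R (smap f g y)                -- constructor choice

-- Fixpoint of a shape, playing the role of μ.
newtype Mu f a = Con (f a (Mu f a))

-- The generic map: defined once, usable at every shape.
gmap :: Shape f => (a -> b) -> Mu f a -> Mu f b
gmap f (Con x) = Con (smap f (gmap f) x)

-- Two representations: lists and binary trees.
type ListRep = U :+: (E :*: I)
type TreeRep = U :+: (E :*: (I :*: I))

fromList :: [a] -> Mu ListRep a
fromList []       = Con (L U)
fromList (x : xs) = Con (R (E x :*: I (fromList xs)))

toList :: Mu ListRep a -> [a]
toList (Con (L U))               = []
toList (Con (R (E x :*: I xs))) = x : toList xs

leaf :: Mu TreeRep a
leaf = Con (L U)

node :: a -> Mu TreeRep a -> Mu TreeRep a -> Mu TreeRep a
node x l r = Con (R (E x :*: I l :*: I r))

-- Elements of a represented tree, root first.
elems :: Mu TreeRep a -> [a]
elems (Con (L U))                       = []
elems (Con (R (E x :*: I l :*: I r))) = x : elems l ++ elems r
```

The same `gmap` serves both representations: `toList (gmap (+ 1) (fromList [1, 2, 3]))` and `elems (gmap (* 2) (node 1 leaf (node 2 leaf leaf)))` go through one definition. Unlike the Agda version, nothing here lets us prove mapId or mapComp inside the language.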
to/from
toList : ∀ {A} → μ ListF A → List A
toList con₁            = []
toList (con₂ (x , xs)) = x ∷ toList xs
fromList : ∀ {A} → List A → μ ListF A
fromList []       = con₁
fromList (x ∷ xs) = con₂ (x , (fromList xs))
toTree : ∀ {A} → μ TreeF A → Tree A
toTree con₁                 = leaf
toTree (con₂ (x , xs , ys)) = node x (toTree xs) (toTree ys)
fromTree : ∀ {A} → Tree A → μ TreeF A
fromTree leaf           = con₁
fromTree (node x xs ys) = con₂ (x , (fromTree xs) , (fromTree ys))
Native DGP with Metaprogramming
Native is better
- Readability
- Interoperability
  - between different representations
  - with existing libraries
 
A Naive Solution
We can always convert between generic and native definitions, since they behave the same (they are isomorphic).
toList   : ∀ {A} → μ ListF A → List A
fromList : ∀ {A} → List A → μ ListF A
mapList : (A → B) → List A → List B
mapList f = toList ∘ map ListF f ∘ fromList
Problem: Time & space inefficient, difficult to reason about, ...
We want translation at will!
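The cost of the naive solution is easier to see when the round trip is written out. The Haskell sketch below uses a simplified, list-only encoding (ListRep, mapRep and the other names are hypothetical, and mapRep is a hand-written stand-in for `map ListF`, not a generic definition). It only illustrates the shape of the composition.

```haskell
-- Simplified generic representation of lists:
-- either the empty list, or an element and a represented tail.
newtype ListRep a = Con (Either () (a, ListRep a))

-- Stand-in for `map ListF`: map over the representation.
mapRep :: (a -> b) -> ListRep a -> ListRep b
mapRep _ (Con (Left ()))       = Con (Left ())
mapRep f (Con (Right (x, xs))) = Con (Right (f x, mapRep f xs))

-- Conversions between the native list and its representation.
fromList :: [a] -> ListRep a
fromList = foldr (\x xs -> Con (Right (x, xs))) (Con (Left ()))

toList :: ListRep a -> [a]
toList (Con (Left ()))       = []
toList (Con (Right (x, xs))) = x : toList xs

-- The naive native map: convert, map generically, convert back.
mapList :: (a -> b) -> [a] -> [b]
mapList f = toList . mapRep f . fromList
```

Written this way, `mapList` makes three passes and builds two intermediate ListRep structures where the native definition makes one pass and builds none. How much of that a compiler removes depends on the language and optimizer, and reasoning about `mapList` now has to go through `toList` and `fromList` as well.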
Possible Solutions
- A new programming language design
- A compiler redesign for eliminating intermediate structures
- Metaprogramming
  - Code generation & instrumentation
 
- Metaprogramming mechanism
- Generic definition for generic definitions
- Ornamentation
  - Describing relations between datatypes
 
Existing Problems & Ongoing Work
Datatype-generic Programming
By zekt