Type
Inference
Prototyping
Engines from
Relational Specifications of
                               Type Systems

안기영 Ki Yung Ahn

2016-02-18 Tuesday (목) Milestone Talk

한국정보과학회 프로그래밍언어연구회 동계 워크숍 (KIISE SIGPL Winter Workshop 2016)
전주 전북대학교 박물관 (Jeonbuk National University Museum, Jeonju, Jeollabuk-do, Korea)

\Gamma
Γ\Gamma
\vdash
\vdash
:
::
\forall
\forall
\lambda
λ\lambda

best viewed on Chrome, available online at https://slides.com/kyagrd/tiper-sigpl2016ko-revised​

Outline

  • Introduction

  • Relational Specifications

  • Logic Programming

  • Preliminary Results using Prolog

  • TIPER project - Progress & Plans

  • Related Work

Outline

  • Introduction

    • Type Systems being re-invented

    • Problem & Solution?

  • Relational Specifications

  • Logic Programming

  • Preliminary Results using Prolog

  • TIPER project - Progress & Plans

  • Related Work

Type Systems being re-invented

Lack of automated tools for building type systems

High development cost to adopt innovations from type theory & PL research

Inflexible and/or Verbose static type systems in mainstream langauges

Static Types
considered harmful, let's
use Dynamic Languages!

Oops, too painful to refactor/API-update
 without static types

Okay, let's add static types
Flow type checker for JavaScript, TypeScript, mypy, Typed Clojure, Typed Lua,  ... ...,  and all those fancy projects on gradual typing

trending in the real world

Type Systems need to be
Flexible & Succinct

(highly Polymorphic)

Problem

(good Type Inference)

difficult to implement good type inference

for highly polymorphic type systems

Solution?

automatically generate implementations
from type system specifications

What is TIPER?

Lex/Yacc : Parsers

TIPER : Type Systems

the missing automation tool
in langauge frontend construction

Outline

  • Introduction

  • Relational Specifications

    • Problems with Algorithmic/Functional spec.
    • Example of Relational Spec. using Prolog
  • Logic Programming

  • Preliminary Results using Prolog

  • TIPER project - Progress & Plans

  • Related Work

Type System on paper

Type System implementation (algorithmic/functional description)

\frac{x:A\;\in\;\Gamma}{ \Gamma\;\vdash\;x\;:\;A}
x:AΓΓx:A\frac{x:A\;\in\;\Gamma}{ \Gamma\;\vdash\;x\;:\;A}
\frac{\Gamma,\,x:A\;\vdash\;e\,:\,B}{ \Gamma\;\vdash\;\lambda x.e\;:\;A\to B}
Γ,x:Ae:BΓλx.e:AB\frac{\Gamma,\,x:A\;\vdash\;e\,:\,B}{ \Gamma\;\vdash\;\lambda x.e\;:\;A\to B}
\frac{ \Gamma\;\vdash\;e_1\,:\,A\to B \qquad \Gamma\;\vdash\;e_2\,:\,A }{ \Gamma\;\vdash\;e_1\;e_2\;:\;B }
Γe1:ABΓe2:AΓe1e2:B\frac{ \Gamma\;\vdash\;e_1\,:\,A\to B \qquad \Gamma\;\vdash\;e_2\,:\,A }{ \Gamma\;\vdash\;e_1\;e_2\;:\;B }
tyChk :: (Ctx, Exp, Type) -> Bool
tyChk(g, Var x, a)         = (x,a) `elem` g 
tyChk(g, Abs x e, Arr a b) = tyChk((x,a):g, e, b)
tyChk(g, App e1 e2, b)     = case tyInf(g,e1) of { Arr a b -> tyChk(g,e2,a)
                                                 ; _       -> False } 
tyInf :: (Ctx, Exp) -> Maybe Type
tyInf(g, Var x)     = lookup x gamma
tyInf(g, Abs x e)   = ... -- actually need some more magic here
tyInf(g, App e1 e2) = case tyInf(gamma,e1) of { Arr a b
                                                  | tyChk(g,e2,a) -> Just b
                                              ; _                 -> Nothing }

(Var)

(Abs)

(App)

Type System on paper

Problems with Algorithmic/Functional description

\frac{x:A\;\in\;\Gamma}{ \Gamma\;\vdash\;x\;:\;A}
x:AΓΓx:A\frac{x:A\;\in\;\Gamma}{ \Gamma\;\vdash\;x\;:\;A}
\frac{\Gamma,\,x:A\;\vdash\;e\,:\,B}{ \Gamma\;\vdash\;\lambda x.e\;:\;A\to B}
Γ,x:Ae:BΓλx.e:AB\frac{\Gamma,\,x:A\;\vdash\;e\,:\,B}{ \Gamma\;\vdash\;\lambda x.e\;:\;A\to B}
\frac{ \Gamma\;\vdash\;e_1\,:\,A\to B \qquad \Gamma\;\vdash\;e_2\,:\,A }{ \Gamma\;\vdash\;e_1\;e_2\;:\;B }
Γe1:ABΓe2:AΓe1e2:B\frac{ \Gamma\;\vdash\;e_1\,:\,A\to B \qquad \Gamma\;\vdash\;e_2\,:\,A }{ \Gamma\;\vdash\;e_1\;e_2\;:\;B }

(Var)

(Abs)

(App)

  • Gap from the original specification on paper
  • Duplication inevitable (type check and type infer )

Type System on paper

Relational Specification

\frac{x:A\;\in\;\Gamma}{ \Gamma\;\vdash\;x\;:\;A}
x:AΓΓx:A\frac{x:A\;\in\;\Gamma}{ \Gamma\;\vdash\;x\;:\;A}
\frac{\Gamma,\,x:A\;\vdash\;e\,:\,B}{ \Gamma\;\vdash\;\lambda x.e\;:\;A\to B}
Γ,x:Ae:BΓλx.e:AB\frac{\Gamma,\,x:A\;\vdash\;e\,:\,B}{ \Gamma\;\vdash\;\lambda x.e\;:\;A\to B}
\frac{ \Gamma\;\vdash\;e_1\,:\,A\to B \qquad \Gamma\;\vdash\;e_2\,:\,A }{ \Gamma\;\vdash\;e_1\;e_2\;:\;B }
Γe1:ABΓe2:AΓe1e2:B\frac{ \Gamma\;\vdash\;e_1\,:\,A\to B \qquad \Gamma\;\vdash\;e_2\,:\,A }{ \Gamma\;\vdash\;e_1\;e_2\;:\;B }

(Var)

(Abs)

(App)

  • Reduces the gap from the description on paper
  • Single Source of Truth (Don't Repeat Yourself )

Relational Spec. using Prolog

:- set_prolog_flag(occurs_check,true).
:- op(500,yfx,$).

type(C,var(X),       T) :- first(X:T,C).
type(C,lam(X,E),A -> B) :- type([X:A|C], E,  B).
type(C,E1 $ E2,      B) :- type(C,E1,A->B),
                           type(C,E2,A).

first(K:V,[K1:V1|Xs]) :- K = K1, V = V1.
first(K:V,[K1:V1|Xs]) :- K\==K1, first(K:V, Xs).
\frac{x:A\;\in\;C}{ C\;\vdash\;x\;:\;A}
x:ACCx:A\frac{x:A\;\in\;C}{ C\;\vdash\;x\;:\;A}
\frac{C,\,x:A\;\vdash\;e\,:\,B}{ C\;\vdash\;\lambda x.e\;:\;A\to B}
C,x:Ae:BCλx.e:AB\frac{C,\,x:A\;\vdash\;e\,:\,B}{ C\;\vdash\;\lambda x.e\;:\;A\to B}
\frac{ C\;\vdash\;e_1\,:\,A\to B \quad C\;\vdash\;e_2\,:\,A }{ C\;\vdash\;e_1\;e_2\;:\;B }
Ce1:ABCe2:ACe1e2:B\frac{ C\;\vdash\;e_1\,:\,A\to B \quad C\;\vdash\;e_2\,:\,A }{ C\;\vdash\;e_1\;e_2\;:\;B }

Outline

  • Introduction

  • Relational Specification

  • Logic Programming

    • Why LP?
    • Why not something else?
  • Preliminary Results using Prolog

  • TIPER project - Progress & Plans

  • Related Work

Logic Programming

  • Executable Relational Specification

  • Semantics exist for LP languages

    • incrementally builds up a substitution

    • can inspect intermediate results (not a black box) 

  • Unification is a primitive operation in LP

    • basic building block for type inference algorithms 

Why not something else?

  • Possible choices for Relational Spec.

    • Inductive Definitions in Interactive Theorem Provers

    • Logic Programming

    • Constraint Solvers / Automated Theorem Provers

  • Inductive Defs in ITPs are good for proofs

    • Not best suited for execution
  • Solvers and ATPs are generally black box

    • No semantics, hard to inspect intermediate result

    • Solvers are usually difficult to extend

Outline

  • Introduction

  • Relational Specification

  • Logic Programming

  • Preliminary Results using Prolog

  • TIPER project - Progress & Plans

  • Related Work

Simply-Typed Lambda Calculus

:- set_prolog_flag(occurs_check,true).
:- op(500,yfx,$).

type(C,var(X),       T) :- first(X:T,C).
type(C,lam(X,E),A -> B) :- type([X:A|C], E,  B).
type(C,E1 $ E2,      B) :- type(C,E1,A->B),
                           type(C,E2,A).

first(K:V,[K1:V1|Xs]) :- K = K1, V = V1.
first(K:V,[K1:V1|Xs]) :- K\==K1, first(K:V, Xs).
\frac{x:A\;\in\;C}{ C\;\vdash\;x\;:\;A}
x:ACCx:A\frac{x:A\;\in\;C}{ C\;\vdash\;x\;:\;A}
\frac{C,\,x:A\;\vdash\;e\,:\,B}{ C\;\vdash\;\lambda x.e\;:\;A\to B}
C,x:Ae:BCλx.e:AB\frac{C,\,x:A\;\vdash\;e\,:\,B}{ C\;\vdash\;\lambda x.e\;:\;A\to B}
\frac{ C\;\vdash\;e_1\,:\,A\to B \quad C\;\vdash\;e_2\,:\,A }{ C\;\vdash\;e_1\;e_2\;:\;B }
Ce1:ABCe2:ACe1e2:B\frac{ C\;\vdash\;e_1\,:\,A\to B \quad C\;\vdash\;e_2\,:\,A }{ C\;\vdash\;e_1\;e_2\;:\;B }

Simply-Typed Lambda Calculus

:- set_prolog_flag(occurs_check,true).
:- op(500,yfx,$).

type(C,var(X),       T) :- first(X:T,C).
type(C,lam(X,E),A -> B) :- type([X:A|C], E,  B).
type(C,E1 $ E2,      B) :- type(C,E1,A->B),
                           type(C,E2,A).

first(K:V,[K1:V1|Xs]) :- K = K1, V = V1.
first(K:V,[K1:V1|Xs]) :- K\==K1, first(K:V, Xs).
?- type([], lam(x,var(x)), A->A). % type checking
true .

?- type([], lam(x,var(x)), T).    % type inference
T = (_G123->_G123) .

?- type([], E,             A->A). % type inhabitation
E = lam(_G234,var(_G234)) .

HM = STLC + Type Poly.

:- set_prolog_flag(occurs_check,true).
:- op(500,yfx,$).

type(C,var(X),       T1) :- first(X:T,C), instantiate(T,T1).
type(C,lam(X,E), A -> B) :- type([X:mono(A)|C],E,B).
type(C,E1 $ E2,      B ) :- type(C,E1,A -> B), type(C,E2,A).
type(C,let(X=E0,E1), T ) :- type(C,E0,A),
                            type([X:poly(C,A)|C],E1,T).

first(K:V,[K1:V1|Xs]) :- K = K1, V=V1.
first(K:V,[K1:V1|Xs]) :- K\==K1, first(K:V, Xs).

instantiate(mono(T),T).
instantiate(poly(C,T),T1) :- copy_term(t(C,T),t(C,T1)).
  • Type binding X:A in STLC corresponds to X:mono(A) in HM
  • poly(C,A) is a type scheme of A closed under the context C
  • Instationation implemented by Prolog's built-in copy_term

HM = STLC + Type Poly.

:- set_prolog_flag(occurs_check,true).
:- op(500,yfx,$).

type(C,var(X),       T1) :- first(X:T,C), instantiate(T,T1).
type(C,lam(X,E), A -> B) :- type([X:mono(A)|C],E,B).
type(C,E1 $ E2,      B ) :- type(C,E1,A -> B), type(C,E2,A).
type(C,let(X=E0,E1), T ) :- type(C,E0,A), type([X:poly(C,A)|C],E1,T).

first(K:V,[K1:V1|Xs]) :- K = K1, V=V1.
first(K:V,[K1:V1|Xs]) :- K\==K1, first(K:V, Xs).

instantiate(mono(T),T).
instantiate(poly(C,T),T1) :- copy_term(t(C,T),t(C,T1)).
?- copy_term(t([],A->B), t([],T)).     
T = (_G993->_G994).  % fresh vars: _G993, _G994 for A, B 

?- copy_term(t([x:A],A->B), t([x:A],T)).  
T = (A->_G1024).     % fresh vars: _G1024 for B only

Type Constructor Polymorphism

(a.k.a. higher-kinded polymorphism)

  -- Tree :: (* -> *) -> * -> *
data Tree    c           a
  = Leaf a              -- Leaf :: Tree c a
  | Node (c (Tree c a)) -- Node :: (c(Tree c a)) -> Tree c a

type BinTree a = Tree Pair a   -- two children on each node
type Pair t = (t,t)

type RoseTree a = Tree List a  -- varying number of children
type List a = [a]
(1)~\forall X^{*}.X \to X
(1) X.XX(1)~\forall X^{*}.X \to X
(2)~\forall X^{*}.\forall Y^{*}.\forall F^{*\to*}.(X \to Y) \to F\,X \to F\,Y
(2) X.Y.F.(XY)FXFY(2)~\forall X^{*}.\forall Y^{*}.\forall F^{*\to*}.(X \to Y) \to F\,X \to F\,Y

HM only supports type polymorphism such as (1) but
not higher-kinded poly. such as (2) supported in Haskell

HM + TyCon Poly.

:- set_prolog_flag(occurs_check,true).
:- op(500,yfx,$).

kind(KC, var(X), K) :- first(X:T,KC).
kind(KC, F $ G, K2) :- kind(KC,F,K1->K2), kind(KC,G,K1).
kind(KC, A -> B, o) :- kind(KC,A,o), kind(KC,B,o).

type(KC,C,var(X),       T1) :- first(X:T,C), instantiate(T,T1).
type(KC,C,lam(X,E), A -> B) :- type([X:mono(A)|C],E,B), kind(KC,A->B,o).
type(KC,C,E1 $ E2,      B ) :- type(KC,C,E1,A -> B), type(KC,C,E2,A).
type(KC,C,let(X=E0,E1), T ) :- type(KC,C,E0,A),
                               type(KC,[X:poly(C,A)|C],E1,T).

first(K:V,[K1:V1|Xs]) :- K = K1, V=V1.
first(K:V,[K1:V1|Xs]) :- K\==K1, first(K:V, Xs).

instantiate(mono(T),T).
instantiate(poly(C,T),T1) :- copy_term(t(C,T),t(C,T1)).

(a specification which kind of works but not really ...)

Note. Not a correct spec. for TyCon Poly. but just for demo.
            The "instantiate" predicate should be modified too.

HM + TyCon Poly.

(a specification which kind of works but not really ...)

?- type(KC, [], lam(x,var(x)), T).       
KC = [_G1578:_G1581|_G1584],   
T = (var(_G1578)->var(_G1578)) .
% OK, got the most general type at first :)

?- type(KC, [], lam(x,lam(y,var(x))), T).
KC = [_G1598:_G1601|_G1604],
T = (var(_G1598)->var(_G1598)->var(_G1598)) ;
KC = [_G1598:_G1601, _G1598:_G1612|_G1618],
T = (var(_G1598)->var(_G1598)->var(_G1598)) ;
KC = [_G1598:_G1601, _G1598:_G1612|_G1618],
T = (var(_G1598)->var(_G1598)->var(_G1598)) ;
KC = [_G1598:_G1601, _G1609:_G1612|_G1618],
T = (var(_G1609)->var(_G1598)->var(_G1609))
% Well, the most general type in 4th solution :(

gets even more unpleasant for HM + TyCon Poly + Kind Poly + ...

Workaround for extensions of HM

Summary

  • Preliminary results using Prolog (FLOPS 2016) shows
    • Logic Programming can be effective for
      Executable Relational Specifications of type systems
    • but Prolog is not a perfect for this purpose
      • eager depth-first search not always good
      • need more systematic support for resolving goals at conceptually different levels (e.g. type, kind, ...)
      • need order irrelevant unification over sets & maps to properly support extensible records with Row Poly.
      • may be neat to have lazy coinductive resolution
  • TIPER aims to support auto-generation of type system implementations from their relational specifications

Outline

  • Introduction

  • Relational Specification

  • Logic Programming

  • Preliminary Results using Prolog

  • TIPER project - Progress & Plans

  • Related Work

(a) Experiment / Research

  • Develop type system specs
    using off-the-shelf LP tools
  • Identify limitations of the existing LP tools & libs for type system specification
  • Investigate theories to help overcome those limitations

(b) Tool Design & Impl.

  • Parser integration
  • Error handling
  • Frontend: surface lang. with Prolog-like syntax 
  • Backend: portable LP eDSL (e.g. μKanren) possible to target multiple language environments

Activities of the TIPER Project

continuous integration by multiple iterations of (a) and (b)

Progress / Ongoing Work

  • Polymorphic Type System Specifications using Prolog
    • HM + Type Constructor Poly. + Kind Poly. + etc.
    • HM + TyCon Poly. + Kind Poly. + Row Poly
      • type inference with extensible records
      • inference only, not supporting type annotation
  • Exploring microKanren
  • Investigating ingredients for Extensible Records

Plans for TIPER

  • Features to support
    • Polymorphisms over Type / Type Constructor / Kind
    • Extensible Records with Row Polymorphism
    • First-Class Polymorphism and Modules
    • Some FL features such as Type Classes and GADTs
    • Some OOPL features such as subtyping
  • Architecture
    • Prolog-like syntax frontend
    • parser integration & error handling
    • portable backend of eDSL for LP (e.g. microKanren)
  • Theories to investigate
    • resolution semantics (coinductive)
    • search strategies
    • handling extra-logical features 

Outline

  • Introduction

  • Relational Specification

  • Logic Programming

  • Preliminary Results using Prolog

  • TIPER project - Progress & Plans

  • Related Work

Related Work

  • Embedded DSL for LP

    • miniKanren and microKanren ( http://miniKanren.org ) ported to more than a dozen of programming languages

  • Coinductive flavors of LP

    • Type Inference by Coinductive Logic Programming
      (Ancona, Lagorio, Zucca 2009)   in post TYPES 2008

    • Proof Relevant Corecursive Resolution  in FLOPS 2016
      (Fu, Komendantskaya, Schrijvers, Pond 2016) 

  • Delimited Continuations for Prolog    in ICLP 2013 
    (Schrijvers, Demoen, Desouter, Wielemaker 2013)

  • Membership-Constraints and Complexity in Logic Programming with Sets  (Stolzenburg 1996)  in FroCoS 1996

Prior attempts in similar spirit

  • Executable Specification of Static Semantics   note: Typol
    (Despeyroux. 1984)  in Semantics of Data Types 1984

  • Extraction of Strong Typing Laws from Action Semantics Definitions (Doh, Schmidt. 1992)   in ESOP 1992

  • Type Inference with Constrained Types              note: HM(X)
    (Odersky, Sulzmann, Wehr. 1999)   TAPOS, 5(1):33-55

  • Type System for the Massses    in Onward 2015
    (Grewe, Erdweg, Wittmann, Mezini 2015)

  • And there are more frameworks for type checker development

    • TyS: a framework to facilitate the dev. of OO type checkers

    • Typical: Taking the Tedium Out of Typing

Conclusion

  • There have been
    • practical work on automated dev. of type checkers
      • mostly for lang. with no parametric polymorphism
      • sometimes demo HM example for inference showcase
    • automating "HM + constraints" at a pedagogical level but not including more adv. features (TyCon poly, Row poly.)
  • Proposed plans for the TIPER project is
    •  to build a practical framework
    • automating development of type checking & inference
    • supporting advanced polymorphic features 

Type Inference Prototyping Engines from Relational specifications of type systems

By 안기영 (Ahn, Ki Yung)

Type Inference Prototyping Engines from Relational specifications of type systems

A declarative and machine executable specification of the Hindley–Milner type system (HM) can be formulated using a Logic Programming (LP) language such as Prolog. Modern functional language implementations such as the GHC supports more advanced polymorphism beyond HM. We introduce the plans and progress of the TIPER project to push this idea of using LP for type system specification further towards a more practical language design/implementation tool. (Note: This is a milestone talk slide for a KIISE/SIGPL winter workshop 2016.)

  • 1,703