Alexander Gryzlov
(IMDEA Software Institute)
Lambda World 2025
October 24, Cádiz, Spain
As you can guess/deduce from the title, this is a talk about combining two topics I've been working on for the past couple of years.
AI is somewhat like "teenage music".
In every decade, it has meant something different:
Let us go back to the basics!
Formulated back in the 1950s as "human-level activities performed by a computer":
We'll focus on the last one:
Here, symbolic (rather than statistical) methods are typically used, and precise guarantees matter the most.
There's a trade-off between the expressive power of a system and its degree of automation.
More versatile systems converge toward general-purpose programming languages.
We'll use an interactive tool (Agda) to build and verify some simple automated ones.
Shoham, [2019] "Verification of Distributed Protocols Using Decidable Logic"
The subfield of symbolic AI focused on theorem proving is referred to as "automated reasoning."
Typically involves:
The algorithms we'll see use first-order syntax:
no binders (like λ) inside terms.
An important notion when reasoning about variables: context.
This is what logicians/type theorists write as capital Greek letters (Γ/Δ).
A finite set of all variables in the expression.
An overapproximation: it can include extra variables not in the term!
We'll use a special (quotient) type Ctx for sets of variables (the order and multiplicity in it don't matter).
Has the usual set operators and predicates: ∈, union, rem, minus
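To make this concrete, here's how Ctx could look in plain Haskell, a minimal sketch that uses Data.Set as a stand-in for the quotient type (Var = String is a simplifying assumption):

import qualified Data.Set as Set

type Var = String          -- simplifying assumption
type Ctx = Set.Set Var     -- a set already ignores order and multiplicity

memberCtx :: Var -> Ctx -> Bool
memberCtx = Set.member                 -- ∈

unionCtx, minusCtx :: Ctx -> Ctx -> Ctx
unionCtx = Set.union                   -- union
minusCtx = Set.difference              -- minus

remCtx :: Var -> Ctx -> Ctx
remCtx = Set.delete                    -- rem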
We're going to compute maps/functions, but can we be precise about them?
(Pure) functions in FP are close to the mathematical definition:
"a function from a set X to a set Y assigns to each element of X exactly one element of Y"
But how do we guarantee the "each" and "exactly one" part?
Two problems beyond purity:
head :: [a] -> a
head [] = error "oops"
head (x:_) = x

loop :: a -> a
loop x = loop x

Non-termination can be costly in critical systems: we typically expect each request/component to finish, even if the system is interactive:
916 such CVEs between 2000 and 2022
"Large-scale analysis of non-termination bugs in real-world OSS projects" (2022)
For reasoning algorithms, this means we always get an answer (though it may take a long time).
We can only write programs which (1) handle every possible input and (2) always terminate.
The first part is relatively trivial (though it often requires restructuring your program);
however, the second involves recursion :(
Programs which "consume" syntactically smaller pieces of input:
data Tree : 𝒰 where
leaf : Tree
node : Tree → Tree → Tree
depth : Tree → ℕ
depth leaf = 0
depth (node l r) = 1 + max (depth l) (depth r)

Here's a simple example that doesn't fit this pattern:
Euclid's GCD algorithm
{-# TERMINATING #-}
gcd : ℕ → ℕ → ℕ
gcd n m =
if m == 0
then n
else gcd m (n % m)

We know n % m < m, but this is not structural!
gcd(105, 30) → 105%30 = 15
gcd(30, 15) → 30%15 = 0
gcd(15, 0) = 15
We have to introduce a "measure" on a type T that decreases according to a well-founded order.
Whatever descending sequence ... < tx < ty < tz you come up with, it cannot decrease forever.
A canonical type with such an order is (ℕ, <).
Any sequence will eventually end with 0.
(ℕ, <) is a linear order, but well-founded orders can also be branching.
Let's introduce a special type for this goal:
record □_ (A : T → 𝒰) (x : T) : 𝒰 where
field call : {y : T} → y < x → A y
-----------------
fix : (A : T → 𝒰)
→ ({t : T} → □ A t → A t)
→ ({t : T} → A t)
-- a non-total fixpoint would be
-- fix : (A → A) → A

□ A means "A can only be called with an argument smaller than its index".
If the order is well-founded, we can implement the fixed-point combinator!
-- implicit
fix : (A : T → 𝒰)
→ ({t : T} → □ A t → A t)
→ ({t : T} → A t)
-- sugar!
fix : (A : T → 𝒰)
→ ∀[ □ A ⇒ A ]
→ ∀[ A ]

Sometimes we want to hide the decreasing argument (make it implicit); other times it's crucial to the computation.
-- explicit
fix : (A : T → 𝒰)
→ ((t : T) → □ A t → A t)
→ ((t : T) → A t)
-- sugar!
fix : (A : T → 𝒰)
→ Π[ □ A ⇒ A ]
→ Π[ A ]

Here's an example of computing a GCD function like this (using the explicit form):
gcd-ty : ℕ → 𝒰
gcd-ty x = (y : ℕ) → y < x → ℕ
gcd-loop : Π[ □ gcd-ty ⇒ gcd-ty ]
gcd-loop x rec y y<x =
caseᵈ y = 0 of
λ where
(yes y=0) → x
(no y≠0) →
rec .call
-- it is safe to do the recursive call
y<x (x % y)
-- remainder is smaller
(%-r-< x y
(≱→< $ contra ≤0→=0 y≠0))

gcd< : Π[ gcd-ty ]
gcd< = fix gcd-ty gcd-loop
gcd : ℕ → ℕ → ℕ
gcd x y =
caseᵗ x >=< y of
λ where
(LT x<y) → gcd< y x x<y
(EQ x=y) → x
(GT y<x) → gcd< x y y<x

To kick-start the computation, we just need to decide which argument goes first:
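For comparison, here's the same computational content as a Haskell sketch via Data.Function.fix: the structure of gcd-loop survives, but the y < x and x % y < y facts live only in comments, since nothing in Haskell checks them:

import Data.Function (fix)
import Numeric.Natural (Natural)

-- rec plays the role of □'s call, minus the proofs
gcdWF :: Natural -> Natural -> Natural
gcdWF = fix loop
  where
    loop rec x y
      | y == 0    = x
      | otherwise = rec y (x `mod` y)  -- x `mod` y < y: the measure decreases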
Sometimes called "a Swiss army knife operator".
A family of (semi)algorithms for solving equations.
Match a pattern with gaps in it against data to fill the gaps.
Many applications:
First-order MGU (most general unifier): the classical form, described in Pierce's TAPL.
Given a pair of terms (or generally a list of pairs),
find a substitution that makes all of them equal (or fail):
A ≟ A {}
A ≟ B FAIL
A ≟ x { x ↦ A }
A ≟ B ⊗ C FAIL
x ≟ B ⊗ C { x ↦ B ⊗ C }
x ⊗ B ≟ A ⊗ y { x ↦ A , y ↦ B }
x ≟ x ⊗ x FAIL
[ x ≟ y
, y ≟ A ] { x ↦ A , y ↦ A }

data Term : 𝒰 where
``_ : Var → Term
_⊗_ : Term → Term → Term
sy : String → Term
-- x ⊗ B
example : Term
example = `` x ⊗ sy "B"
Constr : 𝒰
Constr = Term × Term

We need both internal and external substitution:
-- internal
sub1 : Var → Term → Term → Term
sub1 v t (`` x) =
if v == x then t else `` x
sub1 v t (p ⊗ q) =
sub1 v t p ⊗ sub1 v t q
sub1 v t (sy s) =
sy s
subs1 : Var → Term → List Constr → List Constr
-- external
Sub : 𝒰
Sub = Map Var Term

unify : List Constr → Maybe Sub
unify [] = just emptyM
unify ((tl , tr) ∷ cs) =
if tl == tr
then unify cs
else unifyHead tl tr cs
unifyHead : Term → Term
→ List Constr → Maybe Sub
unifyHead (`` v) tr cs =
if occurs v tr then nothing
else map (insertM v tr) $
unify (subs1 v tr cs)
unifyHead tl (`` v) cs =
... -- symmetrical
unifyHead (lx ⊗ ly) (rx ⊗ ry) cs =
unify ((lx , rx) ∷ (ly , ry) ∷ cs) -- adds constraints!
unifyHead _ _ _ =
nothing

tm-size : Term → ℕ
tm-size (p ⊗ q) = 1 + tm-size p + tm-size q
tm-size _ = 1
term-sizes : List Constr → ℕ

Solution: combine the two measures, (1) the context size and (2) the total term size!
We can combine two (or generally N) well-founded orders:
(a,b) < (x,y) := a < x OR (a = x AND b < y)
Either the first component decreases, or it stays the same and the second one decreases!
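The combined order is directly computable; a one-line Haskell transcription (Natural standing in for ℕ):

import Numeric.Natural (Natural)

lexLt :: (Natural, Natural) -> (Natural, Natural) -> Bool
lexLt (a, b) (x, y) = a < x || (a == x && b < y)

-- e.g. (2, 999) `lexLt` (3, 0): the second component may grow
-- arbitrarily, as long as the first one shrinks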
For unification this means:
Input : 𝒰
Input = Ctx × List Constr
wf-tm : Ctx → Term → 𝒰
wf-tm c t = vars t ⊆ c
wf-input : Input → 𝒰
-- each term in the constraint
-- list is WF

unify-ty : ℕ × ℕ → 𝒰
unify-ty (x , y) =
(inp : Input)
→ wf-input inp
→ x = size (inp .fst)
→ y = term-sizes (inp .snd)
→ Maybe Sub
-- before
unifyHead (lx ⊗ ly) (rx ⊗ ry) cs =
unify ((lx , rx) ∷ (ly , ry) ∷ cs)
-- after
unify-head-loop rec (ctx , cs) wf (lx ⊗ ly) (rx ⊗ ry) wl wr ex ey =
rec .call prf-<
(ctx , cs') prf-wf
refl refl
where
cs' : List Constr
cs' = (lx , rx) ∷ (ly , ry) ∷ cs
prf-< : (size ctx , term-sizes cs') < (size ctx , term-sizes cs)
prf-< = ...
prf-wf : wf-input (ctx , cs')
prf-wf = ...
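Putting the pieces together, here's a proof-free Haskell rendering of the whole algorithm; the constructor names are hypothetical stand-ins for the Agda ones, and termination is exactly what the lexicographic measure establishes, Haskell just takes it on faith:

import qualified Data.Map as Map

data Term = V String | Term :*: Term | Sy String
  deriving (Eq, Show)

type Sub = Map.Map String Term

occurs :: String -> Term -> Bool
occurs v (V x)     = v == x
occurs v (p :*: q) = occurs v p || occurs v q
occurs _ (Sy _)    = False

subst :: String -> Term -> Term -> Term   -- sub1
subst v t (V x)     = if v == x then t else V x
subst v t (p :*: q) = subst v t p :*: subst v t q
subst _ _ (Sy s)    = Sy s

unify :: [(Term, Term)] -> Maybe Sub
unify [] = Just Map.empty
unify ((tl, tr) : cs)
  | tl == tr  = unify cs
  | otherwise = unifyHead tl tr cs

unifyHead :: Term -> Term -> [(Term, Term)] -> Maybe Sub
unifyHead (V v) tr cs
  | occurs v tr = Nothing              -- occurs check: x ≟ x ⊗ x fails here
  | otherwise   = Map.insert v tr <$>
      unify [ (subst v tr a, subst v tr b) | (a, b) <- cs ]
unifyHead tl tr@(V _) cs = unifyHead tr tl cs   -- the symmetrical case
unifyHead (lx :*: ly) (rx :*: ry) cs =
  unify ((lx, rx) : (ly, ry) : cs)     -- adds constraints, shrinks term sizes
unifyHead _ _ _ = Nothing

For example, unify [(V "x" :*: Sy "B", Sy "A" :*: V "y")] evaluates to Just (fromList [("x", Sy "A"), ("y", Sy "B")]), mirroring the x ⊗ B ≟ A ⊗ y example above.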
Classical constraint satisfaction task:
Given a boolean formula with variables,
find an assignment of variables that makes it true (or fail).
Compared to unification: the variables now range over booleans rather than terms, and we search for a satisfying assignment instead of computing a substitution.
-- tautologies (true for every assignment)
True
P ∧ Q ⇒ P ∨ Q
((P ⇒ Q) ⇒ P) ⇒ P -- aka Peirce's law
-- satisfiable (there is an assignment)
P ∧ Q ⇒ Q ∧ R -- e.g. P = False works;
-- but P = Q = True, R = False
-- simplifies it to True ⇒ False, so it's not a tautology
-- unsatisfiable
P ∧ ¬P
False
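Before any cleverness, note the brute-force baseline: enumerate all 2^n assignments. A tiny Haskell sketch, representing a formula as an assignment predicate (a simplifying assumption):

import Control.Monad (replicateM)
import Data.List (find)

type Assignment = [(String, Bool)]

satBrute :: [String] -> (Assignment -> Bool) -> Maybe Assignment
satBrute vars holds =
  find holds [ zip vars bs | bs <- replicateM (length vars) [False, True] ]

So, can we always do better than trying all 2^n candidates?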
Generally, in the worst case, no! :(
Cook–Levin theorem (1971): SAT is NP-complete
(btw, this was the birth of the NP-completeness concept).
But some heuristics can make non-worst-case instances tractable.
Davis–Putnam–Logemann–Loveland algorithm, 1961
DPLL and similar algorithms assume the input is in
conjunctive normal form (CNF): a big conjunction of clauses, i.e. disjunctions of literals (possibly negated variables)
( A ∨ ¬B)
∧ (¬C ∨ D ∨ E)
∧ ...

We can always transform a formula into CNF thanks to boolean reasoning principles (e.g. De Morgan's law: ¬(P ∨ Q) = ¬P ∧ ¬Q)
True ~ ∅
False ~ ()
P ∨ Q ~ (P ∨ Q)
P ∧ Q ~ (P) ∧ (Q)
P ∧ Q ⇒ Q ∧ R ~ (¬P ∨ ¬Q ∨ R)
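Here's what that transformation could look like in Haskell, a naive sketch: push negations inward, then distribute. (The naive distribution can blow up exponentially; practical encoders use tricks such as the Tseitin encoding.)

data Formula = Atom String | Not Formula
             | And Formula Formula | Or Formula Formula

-- step 1: push negations down to the atoms (De Morgan + double negation)
nnf :: Formula -> Formula
nnf (Not (Not p))   = nnf p
nnf (Not (And p q)) = Or  (nnf (Not p)) (nnf (Not q))
nnf (Not (Or  p q)) = And (nnf (Not p)) (nnf (Not q))
nnf (And p q)       = And (nnf p) (nnf q)
nnf (Or  p q)       = Or  (nnf p) (nnf q)
nnf p               = p

-- step 2: distribute ∨ over ∧; a clause is a list of signed variables
cnf :: Formula -> [[(String, Bool)]]
cnf f = go (nnf f)
  where
    go (And p q)      = go p ++ go q
    go (Or  p q)      = [ c ++ d | c <- go p, d <- go q ]
    go (Atom v)       = [[(v, True)]]
    go (Not (Atom v)) = [[(v, False)]]
    go _              = error "unreachable after nnf"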
data Lit (Γ : Ctx) : 𝒰 where
Pos : (v : Var) → v ∈ Γ → Lit Γ
Neg : (v : Var) → v ∈ Γ → Lit Γ
var : Lit Γ → Var
var (Pos v _) = v
var (Neg v _) = v
positive : Lit Γ → Bool
positive (Pos _ _) = true
positive _ = false

Unlike in unification, we push the well-formedness constraint into the literals:
Clause : Ctx → 𝒰
Clause Γ = List (Lit Γ)
CNF : Ctx → 𝒰
CNF Γ = List (Clause Γ)
literals : CNF Γ → List (Lit Γ)
literals = nub ∘ concat

unit-clause : CNF Γ → Maybe (Lit Γ)
unit-clause [] = nothing
unit-clause ( [] ∷ c) = unit-clause c
unit-clause ((x ∷ []) ∷ c) = just x
unit-clause ((_ ∷ _ ∷ _) ∷ c) = unit-clause c
unit-propagate : (l : Lit Γ) → CNF Γ → CNF (rem (var l) Γ)
unit-propagate l [] = []
unit-propagate l (f ∷ c) =
if has l f
then unit-propagate l c
else delete-var (var l) f ∷ unit-propagate l c
one-lit-rule : CNF Γ → Maybe (Σ[ l ꞉ Lit Γ ] (CNF (rem (var l) Γ)))
one-lit-rule cnf =
map (λ l → l , unit-propagate l cnf) $
unit-clause cnf

Heuristic 1: if a clause consists of a single literal, it must be true; propagate it through the formula:
(A ∨ ¬B)
∧ (C)
∧ ...

pure-literal-rule :
(c : CNF Γ)
→ (Σ[ purelits ꞉ List (Lit Γ) ]
(let vs = map var purelits in
(vs ≬ Γ) × CNF (minus Γ vs)))
⊎ (∀ {l} → l ∈ literals c → negate l ∈ literals c)
...

Heuristic 2 (aka affirmative-negative rule):
The idea is to eliminate every literal that occurs only positively or only negatively (purely): assign it immediately and delete every clause containing it.
posneg-count : CNF Γ → Lit Γ → ℕ
posneg-count cnf l =
let m = count (has l) cnf
n = count (has $ negate l) cnf
in
m + n
splitting-rule : (c : CNF Γ)
→ Any positive (literals c)
→ Lit Γ
The previous two rules try to eliminate guessing as much as possible, but eventually, we're going to have to guess a value.
This is where backtracking still happens.
After the pure-literal rule, every remaining literal occurs with both polarities, so the splitting rule is guaranteed to find one.
Answer = Map Var Bool
dpll-loop : (CNF Γ → Maybe Answer)
→ CNF Γ → Maybe Answer
dpll-loop rec cnf =
if null? cnf then just emptyM -- trivially true
else if has [] cnf then nothing -- trivially false
else
maybe
(maybe
(let l = splitting-rule cnf in
map (either (insertLit l)
(insertLit (negate l))) $
rec (unit-propagate l cnf)
<+> rec (unit-propagate (negate l) cnf))
(λ (ls , c) → map (insertLits ls) $ rec c)
(pure-literal-rule cnf))
(λ (l , c) → map (insertLit l) $ rec c)
(one-lit-rule cnf)
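For reference, here's the same loop with the proofs stripped, as a self-contained Haskell sketch (lists stand in for Map and Ctx, and the split literal is picked naively instead of via a posneg count; unlike the Agda version, nothing certifies termination or correctness):

import Control.Applicative ((<|>))
import Data.List (nub)
import Data.Maybe (listToMaybe)

type Lit    = (String, Bool)
type Clause = [Lit]
type CNF    = [Clause]
type Answer = [(String, Bool)]

neg :: Lit -> Lit
neg (v, b) = (v, not b)

-- heuristic 1: drop satisfied clauses, delete the negated literal elsewhere
unitPropagate :: Lit -> CNF -> CNF
unitPropagate l = map (filter (/= neg l)) . filter (l `notElem`)

unitClause :: CNF -> Maybe Lit
unitClause cnf = listToMaybe [ l | [l] <- cnf ]

-- heuristic 2: literals whose negation never occurs
pureLiterals :: CNF -> [Lit]
pureLiterals cnf = [ l | l <- ls, neg l `notElem` ls ]
  where ls = nub (concat cnf)

dpll :: CNF -> Maybe Answer
dpll [] = Just []                           -- trivially true
dpll cnf
  | [] `elem` cnf = Nothing                 -- trivially false
  | Just l  <- unitClause cnf   = (l :) <$> dpll (unitPropagate l cnf)
  | (l : _) <- pureLiterals cnf = (l :) <$> dpll (filter (l `notElem`) cnf)
  | otherwise =                             -- split: guess l, backtrack to ¬l
      let l = head (concat cnf)
      in  ((l :)     <$> dpll (unitPropagate l cnf))
      <|> ((neg l :) <$> dpll (unitPropagate (neg l) cnf))

A quick check: dpll [[("P", True)], [("P", False)]] returns Nothing (P ∧ ¬P), while dpll [[("P", True), ("Q", False)]] returns Just [("P", True)].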
Context always decreases:
This is actually simpler than unification!
We don't need the lexicographic pair, just a single number:
DPLL-ty : ℕ → 𝒰
DPLL-ty x =
{Γ : Ctx}
→ x = size Γ
→ CNF Γ → Maybe Answer
...
(map (either (insertLit l)
(insertLit (negate l))) $
rec (unit-propagate l cnf)
<+> rec (unit-propagate (negate l) cnf))
...

data Flag : 𝒰 where
guessed deduced : Flag
Trail : Ctx → 𝒰
Trail Γ = List (Lit Γ × Flag)
trail-lits : Trail Γ → List (Lit Γ)
trail-lits = map fst
trail→answer : Trail Γ → Answer
trail→answer =
fold-r emp λ (l , _) → upd (var l) (positive l)

unit-subpropagate-loop : (CNF Γ → Trail Γ → CNF Γ × Trail Γ)
→ CNF Γ → Trail Γ → CNF Γ × Trail Γ
unit-subpropagate-loop rec cnf tr =
let cnf' = map (filter (not ∘ trail-has tr ∘ negate)) cnf
newunits = literals (filter (is-fresh-unit-clause tr) cnf')
in
if null newunits
then (cnf' , tr)
else rec cnf' (map (λ l → l , deduced) newunits ++ tr)
We get rid of the pure literal rule but perform unit propagation in batches:
backtrack : Trail Γ → Maybe (Lit Γ × Trail Γ)
backtrack [] = nothing
backtrack ((_ , deduced) ∷ ts) = backtrack ts
backtrack ((p , guessed) ∷ ts) = just (p , ts)
dpli-loop : CNF Γ
→ (Trail Γ → Maybe Answer)
→ Trail Γ → Maybe Answer
dpli-loop cnf rec tr =
let (cnf' , tr') = unit-subpropagate cnf tr in
if has [] cnf' then -- reached False
maybe
nothing
(λ (p , trb) → rec ((negate p , deduced) ∷ trb))
(backtrack tr)
else
...

Then we either run into an inconsistency and have to backtrack:
dpli-loop : CNF Γ
→ (Trail Γ → Maybe Answer)
→ Trail Γ → Maybe Answer
dpli-loop cnf rec tr =
let (cnf' , tr') = unit-subpropagate cnf tr in
if has [] cnf' then
...
else -- need to guess
let ps = unassigned cnf tr' in
if null ps
then just (trail→answer tr')
else rec ((splitting-rule' cnf ps , guessed) ∷ tr)

Or we have to make a choice:
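Schematically, a hypothetical run on (¬A ∨ B) ∧ (¬A ∨ ¬B) exercises both branches:

guess A        trail = (A , guessed) ∷ []
propagate      (¬A ∨ B) and (¬A ∨ ¬B) become unit: deduce B and ¬B
conflict       cnf' now contains the empty clause
backtrack      skips the deduced literals, finds the guess A
flip it        trail = (¬A , deduced) ∷ []   -- conflict gone, only B left to guess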
For unit propagation, the measure is the count of literals still unused in the trail:
(x2 because of polarity)
y = 2 · size Γ ∸ length tr

However, this only works when the trail is unique (invariant #1).
For the guess case, the unused trail literals also work:
We take the old trail and add a new literal onto it, exhausting unused ones.
But what about the backtracking case?
Here trb is a suffix of the old trail - it shrinks!
We had the same situation in unification: two kinds of recursive calls, one of which decreases the measure while the other increases it.
Looks like we need to use the lexicographic product again, but with what?
rec ((splitting-rule' cnf ps , guessed) ∷ tr)

rec ((negate p , deduced) ∷ trb)

If we look closely at the tree of recursive calls, we notice that
we never return to discarded branches.
So what sticks around is the set of rejected literals.
The full measure is then a lexicographic product: a vector (an N-ary product) counting the non-rejected assignments at each guess level, followed by the number of unused variables in the trail!
(whew)
For all of this to work, we need the old trail uniqueness invariant and a new one:
The rejected vector/stack also needs an invariant:
If a variable is on level n, its negation appears in the trail after dropping the first n guessed variables.
Just need to prove that all of these are preserved :/
Uniq (trail-lits tr) ×
(∀ x → (x , guessed) ∈ tr
→ negate x ∉ tail-of x (trail-lits tr))
∀ x (f : Fin (size Γ))
→ x ∈ lookup rj f
→ negate x ∈ (trail-lits $ drop-guessed tr (count-guessed tr ∸ fin→ℕ f))
DPLI-ty : {Γ : Ctx} → Vec ℕ (size Γ) × ℕ → 𝒰
DPLI-ty {Γ} (x , y) =
(tr : Trail Γ)
→ Trail-invariant tr
→ (rj : Rejectstack Γ)
→ Rejectstack-invariant rj tr
→ x = map (λ q → 2 · size Γ ∸ size q) rj
→ y = 2 · size Γ ∸ length tr
→ Maybe Answer
Cook, Podelski, Rybalchenko, [2011] "Proving program termination"
Hoder, Voronkov, [2009] "Comparing unification algorithms in first-order theorem proving"
Vardi, [2015] "The SAT Revolution: Solving, Sampling, and Counting"
Harrison, [2009] "Handbook of Practical Logic and Automated Reasoning"
Partially funded by the European Union (GA 101039196). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the European Research Council can be held responsible for them.