Substitution

\(\lambda\)-calculus

2020 James B. Wilson

Colorado State University

Recall: Variables are symbols we can replace

  • Fix a context, e.g. arithmetic, geometry.
  • The language has an alphabet, e.g. arithmetic \(+,-,0,1,\times\), geometry \(\bot, \|,\angle\)
  • Everything else is a variable, e.g. arithmetic \(m,n,a,b,c,x\), geometry \(P,Q, \ell,p\)

Check point

Technically variables are not assigned values, i.e. \(x=3\) is meaningless as \(x\) would cease to be a variable.

 

So when we write:

If \(x=3\) then \(y=x^2=9\)

We mean:

Substituting 3 for \(x\) in \(y=x^2\) yields \(y=9\).

The Usual Story

 

Set Theory tells us:

A function \(f:X \to Y\) is a set \(f\subseteq\{(x,y)\mid x\in X, y\in Y\}\) where

  1. \(X\) is the domain: \((\forall x\in X)(\exists y\in Y)((x,y)\in f)\), and
  2. \(f\) is well-defined: \((x,y),(x,y')\in f \Rightarrow y=y'.\)

To replace variables is to evaluate a function.
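The two set-theoretic conditions can be checked mechanically on small finite examples. The following is a minimal Python sketch, not part of the original slides; the name `is_function` and the sample sets `X`, `Y`, `square` are hypothetical and chosen only for illustration.

```python
def is_function(pairs, domain, codomain):
    """Check the two set-theoretic conditions on a finite set of pairs."""
    # Every pair must lie in domain x codomain.
    if any(x not in domain or y not in codomain for (x, y) in pairs):
        return False
    # Condition 1: every x in the domain appears as a first coordinate.
    total = all(any(x == a for (a, _) in pairs) for x in domain)
    # Condition 2 (well-defined): no x is paired with two different outputs.
    well_defined = all(
        y == y2 for (x, y) in pairs for (x2, y2) in pairs if x == x2
    )
    return total and well_defined


X = {0, 1, 2}
Y = {0, 1, 4}
square = {(x, x * x) for x in X}            # the squaring function on X

print(is_function(square, X, Y))             # True
print(is_function(square | {(1, 4)}, X, Y))  # False: 1 has two outputs
print(is_function({(0, 0), (1, 1)}, X, Y))   # False: 2 has no output
```

Note that the check needs the sets `X` and `Y` up front, which is exactly what becomes a problem in the next slide.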

 

Main Problem

  • Want to substitute not only numbers but also \(+\), \(=\), etc.
  • What is the domain of all things "+"?  Is it even a set? (No! See Russell's Paradox.)
  • We need consistent substitution without sets!

Consistent Substitution

  • One variable at a time (\(\lambda\)-abstraction).
  • Rename variables (\(\alpha\)-conversion).
  • Substitute into variables (\(\beta\)-reduction).
  • Name functions (\(\eta\)-reduction).

The funny names are historical and, while not memorable, they are still used today in logic and informatics.

Anonymous Functions

"\(\lambda\)'s"


 

Goal: substitute for a variable in \(M\) consistently.

Rule: one variable at a time, denote it:

\[x\mapsto M\]

This is an anonymous-function, also called a "\(\lambda\)".

 

Vocabulary:

The variable \(x\) is bound in \(x\mapsto M\); other variables are free.

Examples in the language of arithmetic

In \(x\mapsto 2\cdot x+3\cdot y\),  \(x\) is bound and \(y\) is free.

Note: Programming languages use notation like x:->2*x+3*y and lambda x : 2*x+3*y

 

In \(y\mapsto 2\cdot x+3\cdot y\),  \(x\) is free and \(y\) is bound.

 

 

In \(y\mapsto (x\mapsto 2\cdot x+3\cdot y)\),  both \(x\) and \(y\) are bound variables.

 

 

In \(2\cdot x+3\cdot y\) both \(x\) and \(y\) are free; this is not an anonymous-function at all.
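The bound/free bookkeeping in these examples can be sketched in Python. This is an illustrative encoding, not the slides' own: a term is assumed to be a symbol (a string), a tuple of juxtaposed terms, or a hypothetical `Lam(var, body)` standing for `var ↦ body`. Every symbol, including `+`, counts as potentially free.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lam:
    """The anonymous function var ↦ body (hypothetical encoding)."""
    var: str
    body: object  # a symbol (str), a tuple of terms, or another Lam


def free_symbols(term):
    """Return the set of symbols that occur free in term."""
    if isinstance(term, str):      # a lone symbol is free in itself
        return {term}
    if isinstance(term, Lam):      # var ↦ body binds var inside body
        return free_symbols(term.body) - {term.var}
    # otherwise: a tuple of juxtaposed terms
    return set().union(*(free_symbols(t) for t in term))


body = ("2", "*", "x", "+", "3", "*", "y")   # 2·x + 3·y

print(sorted(free_symbols(Lam("x", body))))            # ['*', '+', '2', '3', 'y']
print(sorted(free_symbols(Lam("y", Lam("x", body)))))  # ['*', '+', '2', '3']
print(sorted(free_symbols(body)))                      # ['*', '+', '2', '3', 'x', 'y']
```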

Higher-order Examples

In \(+\mapsto 2\cdot x+3\cdot y\),  \(+\) is bound and \(2,3,x,y\) are free.

 

Use this to replace addition with an appropriate operation, say the program that adds polynomials or matrices.

Higher-order Examples

In \(0\mapsto x^2+1=0\),  \(0\) is bound, all else is free.

Use this to update \(0\) to whatever zero we need from context, e.g. the zero matrix or the zero polynomial.

Higher-order Examples

 

In \(=~\mapsto x^2+1=0\),  \(=\) is bound. This would allow us to replace "=" with, say, congruence mod 10.
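Binding "=" can be imitated in Python by passing the comparison in as an argument. In this rough sketch the names `claim` and `congruent_mod_10` are hypothetical, and the free variable \(x\) is simply fixed to 3 for the demonstration.

```python
import operator


def congruent_mod_10(a, b):
    """Congruence mod 10, a stand-in for '='."""
    return (a - b) % 10 == 0


x = 3
# = ↦ (x^2 + 1 = 0), with `eq` playing the role of the bound symbol "="
claim = lambda eq: eq(x**2 + 1, 0)

print(claim(operator.eq))        # False: 10 is not 0 as integers
print(claim(congruent_mod_10))   # True: 10 is congruent to 0 mod 10
```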

 

History

Church introduced this view of functions as built from simple concepts in the 1930s.  Instead of \(x\mapsto M\) he used

\[\lambda x.M.\]

That made it match other ways to bind variables like \(\forall x.M\) and \(\exists x.M\).

Arrow notation is popular in math because of a later invention by Eilenberg-Mac Lane called Category Theory.

Today anonymous functions are still called "\(\lambda\)'s" in logic and informatics in honor of the origins.

Substitution 

"\(\beta\)-reduction"


 

Given: an anonymous-function \(x\mapsto M\) and a term \(C\) 

Replace: \(x\) with \(C\), written

\[[x:=C](x\mapsto M)\quad\rhd\quad [x:=C]M\]

where \([x:=C]M\) is just a place-holder for the result of substituting \(C\) everywhere there is an \(x\).  This is called evaluation (or \(\beta\)-reduction).

Evaluation

Examples

 

\[[x:=5](x\mapsto 2\cdot x+3\cdot y) \qquad \rhd\qquad 2\cdot 5+3\cdot y.\]

 

\[[y:=7](y\mapsto 2\cdot x+3\cdot y) \qquad\rhd\qquad 2\cdot x+3\cdot 7.\]

 

 

\[[x:=5][y:=7](y\mapsto(x\mapsto 2\cdot x+3\cdot y))\qquad\rhd\qquad 2\cdot 5+3\cdot 7.\]

 

 

\(\rhd\) is a symbol for "yields", i.e. the result of our work. 
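The third example can be replayed with Python's own anonymous functions, where applying a `lambda` plays the role of \([x:=C]\). This is only an informal analogy; the names `f`, `g`, and `result` are chosen for illustration.

```python
# y ↦ (x ↦ 2·x + 3·y): one variable at a time
f = lambda y: (lambda x: 2 * x + 3 * y)

g = f(7)        # [y:=7] yields x ↦ 2·x + 3·7, still an anonymous function
result = g(5)   # [x:=5] yields 2·5 + 3·7

print(result)   # 31
```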

Higher-order Example

 

Define

\[\begin{bmatrix}a & b\\ c& d\end{bmatrix}\boxplus \begin{bmatrix} s & t\\ u & v\end{bmatrix} := \begin{bmatrix} a+_{\mathbb{R}}s & b+_{\mathbb{R}}t\\ c+_{\mathbb{R}}u & d+_{\mathbb{R}}v\end{bmatrix}\]

Now

\[[+:=\boxplus](+\mapsto x+y) \quad\rhd\quad x \boxplus y\]

replaces the variable of addition with an actual addition, here the addition of \(2\times 2\)-matrices.

We do this so intuitively that we normally just use + instead of new symbols like \(\boxplus\) and \(+_{\mathbb{R}}\), but these are different.

Higher-order Example Continued

 

 

 

 

 

 

 

\[\begin{aligned} \left[x:=\left[\begin{smallmatrix}a & b \\ c & d\end{smallmatrix}\right]\right] & \left[y:=\left[\begin{smallmatrix}s & t \\ u & v\end{smallmatrix}\right]\right][+:=\boxplus](x\mapsto (y\mapsto (+\mapsto x+y))) \\
& \quad\rhd\quad \left[x:=\left[\begin{smallmatrix}a & b \\ c & d\end{smallmatrix}\right]\right]\left[y:=\left[\begin{smallmatrix}s & t \\ u & v\end{smallmatrix}\right]\right](x\mapsto(y\mapsto(x \boxplus y)))\\
& \quad\rhd\quad \left[x:=\left[\begin{smallmatrix}a & b \\ c & d\end{smallmatrix}\right]\right]\left(x\mapsto \left(x \boxplus \left[\begin{smallmatrix}s & t \\ u & v\end{smallmatrix}\right]\right)\right)\\
& \quad\rhd\quad \begin{bmatrix}a & b \\ c & d\end{bmatrix}\boxplus \begin{bmatrix}s & t \\ u & v\end{bmatrix} \end{aligned}\]

But even that is not the end: we keep reducing by substituting in the definition of \(\boxplus\) to get:

\[\rhd\quad \begin{bmatrix} a+_{\mathbb{R}}s & b+_{\mathbb{R}}t\\ c+_{\mathbb{R}}u & d+_{\mathbb{R}}v\end{bmatrix}\]
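The same idea can be tried out in Python by making the addition an explicit argument. In this sketch `boxplus`, `add_with`, and the sample matrices are hypothetical names and data, with \(2\times 2\) matrices assumed to be stored as nested tuples.

```python
import operator


def boxplus(A, B):
    """Entrywise sum of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(a + b for a, b in zip(row_a, row_b))
        for row_a, row_b in zip(A, B)
    )


# x ↦ (y ↦ (+ ↦ x + y)); `plus` plays the role of the bound symbol +
add_with = lambda x: lambda y: lambda plus: plus(x, y)

X = ((1, 2), (3, 4))
Y = ((10, 20), (30, 40))

print(add_with(X)(Y)(boxplus))       # ((11, 22), (33, 44))
print(add_with(2)(3)(operator.add))  # 5, the same term with ordinary addition
```

The one term `add_with` serves both cases; only the operation substituted for `plus` changes.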

Rename Variables

"\(\alpha\)-conversion"


Variable Capture Issue:

When \(x\) and \(a\) are different variables then 

\(x\mapsto a\) is a constant function, it always outputs \(a\).

 

Of course, we mean to eventually replace \(a\) with actual values, e.g. 5.  So we actually have

\[a\mapsto( x\mapsto a)\]

 

Replace \(a\) with 5, no problem:

\[[a:=5](a\mapsto (x\mapsto a))\quad \rhd\quad x\mapsto 5.\]

 

Variable Capture Issue:

 

Replace \(a\) with \(z\), again no problem:

\[[a:=z](a\mapsto (x\mapsto a))\quad \rhd\quad x\mapsto z.\]

 

Replace \(a\) with \(x\), again no problem:

\[[a:=x](a\mapsto (x\mapsto a))\quad \rhd\quad x\mapsto x.\]

MISTAKE: This is not a constant function; it is the identity function!
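The mistake is easy to reproduce with a substitution routine that never checks for capture. The following Python sketch assumes a hypothetical encoding (`Lam(var, body)` for `var ↦ body`, strings for symbols, tuples for juxtaposition) and a deliberately naive `naive_subst`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lam:
    """The anonymous function var ↦ body (hypothetical encoding)."""
    var: str
    body: object


def naive_subst(var, c, term):
    """Replace each free var in term by c, with NO check for capture."""
    if isinstance(term, str):
        return c if term == var else term
    if isinstance(term, Lam):
        if term.var == var:        # var is re-bound here, nothing to replace
            return term
        return Lam(term.var, naive_subst(var, c, term.body))
    return tuple(naive_subst(var, c, t) for t in term)


const = Lam("x", "a")                  # x ↦ a

print(naive_subst("a", "5", const))    # Lam(var='x', body='5')
print(naive_subst("a", "z", const))    # Lam(var='x', body='z')
print(naive_subst("a", "x", const))    # Lam(var='x', body='x'), the identity!
```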

Substitution needs Some Rules

Basic problem: here \(x\) is used as a free variable

\[[a:=x](a\mapsto M)\]

But substituting \(x\mapsto a\) for \(M\) makes 

\[[a:=x](a\mapsto (x\mapsto a))\]

where now the free \(x\) is also bound inside.

 

No variable should be both free and bound. We call this confusing situation a "captured" variable.

 

We fix this by renaming one use of the variable.

Suppose that \(x\) is free in \(C\) and \(y\) is free in \(M\). Then

\[[y:=C](x\mapsto M)\]

is defined to mean first rename the "captured" variable \(x\) with an unused variable \(z\), i.e.

 

\[[x:=z](x\mapsto M) \quad\rhd\quad [x:=z]M\]

 

Then substitute for \(y\)

\[\begin{aligned} [y:=C](x\mapsto M) & \rhd [y:=C](z\mapsto [x:=z]M)\\ & \rhd z\mapsto [x:=z,y:=C]M \end{aligned}\]

Example

\[\begin{aligned} [a:=x](x\mapsto a) & \rhd [a:=x](z\mapsto [x:=z]a)\\ & \rhd [a:=x](z\mapsto a)\\ & \rhd (z\mapsto x) \end{aligned}\]

Assume \(x,a,z\) are distinct variables.

So by renaming the "captured" variable \(x\) we indeed keep \(x\mapsto a\) as a constant function.
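The rename-then-substitute recipe can be sketched as a short Python routine. Everything here is an assumption made for illustration: the encoding (`Lam(var, body)`, strings for symbols, tuples for juxtaposition), the helper names `free_symbols`, `fresh`, and `subst`, and the choice of `z0, z1, ...` as the supply of unused variables.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Lam:
    """The anonymous function var ↦ body (hypothetical encoding)."""
    var: str
    body: object


def free_symbols(term):
    """Return the set of symbols that occur free in term."""
    if isinstance(term, str):
        return {term}
    if isinstance(term, Lam):
        return free_symbols(term.body) - {term.var}
    return set().union(*(free_symbols(t) for t in term))


def fresh(avoid):
    """Return the first of z0, z1, ... that is not in avoid."""
    return next(f"z{i}" for i in count() if f"z{i}" not in avoid)


def subst(var, c, term):
    """[var:=c]term, renaming a bound variable whenever it would capture."""
    if isinstance(term, str):
        return c if term == var else term
    if isinstance(term, tuple):
        return tuple(subst(var, c, t) for t in term)
    # term is a Lam from here on
    if term.var == var or var not in free_symbols(term.body):
        return term                               # nothing free to replace
    if term.var in free_symbols(c):               # capture: rename first
        z = fresh(free_symbols(c) | free_symbols(term.body) | {var})
        renamed = subst(term.var, z, term.body)   # [x:=z]M
        return Lam(z, subst(var, c, renamed))     # z ↦ [x:=z, y:=C]M
    return Lam(term.var, subst(var, c, term.body))


print(subst("a", "5", Lam("x", "a")))   # Lam(var='x', body='5')
print(subst("a", "x", Lam("x", "a")))   # Lam(var='z0', body='x')
```

With the renaming in place, replacing \(a\) by \(x\) still gives a constant function, here written with the unused variable `z0`.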

Formal Substitution Rules of Curry-Feys

Identity Function:   \[[x:=C](x\mapsto x) \quad \rhd\quad C\]

E.g. \([x:=5](x\mapsto x) \quad\rhd\quad 5\)

Constant Function (but avoid variable capture): when \(x\) does not occur free in \(A\), \[[x:=C](x\mapsto A)\quad\rhd\quad A\]

E.g. \([x:=5](x\mapsto 27y)\quad\rhd\quad 27y\)

Recursion: \[[x:=C](x\mapsto AB)\quad\rhd\quad [x:=C]A [x:=C]B\]

E.g.

\[\begin{aligned} [x:=5](x\mapsto 3\cdot x) & \rhd [x:=5](x\mapsto 3\cdot) [x:=5](x\mapsto x)\\ & \rhd\quad 3\cdot 5 \end{aligned}\]

(Optional Read) Full Details

So far we have let \([x:=c]M\) be understood from the informal words "replace all \(x\)'s with \(c\)'s".  Here is what that becomes in formal language:

Notation Rules for \([x:=c]M\)

  1. (Identity Result) \([x:=c]x\) stands for \(c\).
  2. (Constant Result) \([x:=c]a\) stands for \(a\), for every alphabet term or variable \(a\neq x\).
  3. (Recursive Result) \([x:=c](AB)\) stands for \([x:=c]A[x:=c]B\).

(Optional Read) Full Details

With the notation of \([x:=c]M\) we can give a formal description of substitution, making variable capture completely automatic to detect and avoid.

Rule 1

\([x:=c](x\mapsto (x\mapsto A))\) stands for \(x\mapsto A\).

This is our earlier "Constant" rule, but notice that the innermost bound \(x\) won't be replaced: it is captured, and so will be renamed to escape, \[[x:=c](x\mapsto (z\mapsto [x:=z]A))\] and thus output \(z\mapsto [x:=z]A\), which is \(\alpha\)-equivalent to \(x\mapsto A\).

(Optional Read) Full Details

Rule 2

\([x:=c](x\mapsto (y\mapsto A))\) stands for \(y\mapsto A\),

if \(x\) is not free in \(A\).

Again this is our earlier "Constant" rule: since \(x\) is not free in \(A\), the function \(y\mapsto A\) does not change with \(x\), so it is constant and we get it back.

(Optional Read) Full Details

Rule 3

\([x:=c](x\mapsto (y\mapsto A))\) stands for \(y\mapsto [x:=c]A\),

if \(x\) is free in \(A\) but \(y\) is not free in \(c\).

Notice \(x\mapsto (y\mapsto A)\) behaves just like \(y\mapsto (x\mapsto A)\).  Then \([x:=c]\) moves into the innermost function and we proceed by recursion.

(Optional Read) Full Details

Rule 4

\([x:=c](x\mapsto (y\mapsto A))\) stands for \(z\mapsto [x:=c,y:=z]A\),

if \(x\) is free in \(A\) and \(y\) is free in \(c\).

This was our earlier strategy for avoiding captured variables.

Naming Functions

"\(\eta\)-reduction"


 

Issue: while anonymous functions are enough, the notation is bulky.

Rule: introduce a new class of variables, called functions, and use these to name anonymous functions

\[f:x\mapsto M\]

The act of replacing \(x\mapsto M\) with a function variable \(f\) is called naming or \(\eta\)-reduction.

 

Notation: \(f(C)\) stands for \([x:=C](x\mapsto M)\).
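Python mirrors this naming step directly: binding an anonymous function to a name lets us write `f(C)` in place of the bulkier application. In this small illustration the term \(2\cdot x+3\) and the name `f` are chosen arbitrarily.

```python
# Bulky: apply the anonymous function x ↦ 2·x + 3 directly
print((lambda x: 2 * x + 3)(5))   # 13

# Named: f : x ↦ 2·x + 3, so f(5) stands for [x:=5](x ↦ 2·x + 3)
f = lambda x: 2 * x + 3
print(f(5))                       # 13
```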

Naming Functions

Result

Functions without sets.

No domains

No codomains

No well-definedness axioms

No axioms at all

Further Reading

  • Online guide
  • Hindley-Seldin Lambda-Calculus and Combinators, Cambridge U. Press, Chapter 1.
