Scientific Computing
in Lean
Tomáš Skřivan
Carnegie Mellon University
Hoskinson Center for Formal Mathematics
9.5.2024
Why Lean
for Scientific Computing?
The language of science is mathematics
... and Lean speaks it!
Something interesting has to come out of it!
m_i \ddot x_i = \sum_{j \neq i} G \frac{m_i m_j}{r_{ij}^2}\hat r_{ij}
\begin{align*}
v_i^{n+1} &= v_i^n + \Delta t \, \sum_j F(x_i^n,x_j^n) \\
x_i^{n+1} &= x_i^n + \Delta t \, v_i^{n+1}
\end{align*}
def update (x v : Array Vec3) : Array Vec3 × Array Vec3 :=
  let v := v + Δt • force x   -- vᵢⁿ⁺¹ = vᵢⁿ + Δt Σⱼ F(xᵢⁿ, xⱼⁿ)
  let x := x + Δt • v         -- xᵢⁿ⁺¹ = xᵢⁿ + Δt vᵢⁿ⁺¹
  (x, v)
Why Lean?
- mathematical abstraction
- generality, composability
- seamless mix of symbolic and numerical methods
- interactivity and (ab)use of tactic mode
- computer algebra system/source code transformations
- specification to implementation
- code optimization
- extensible syntax
- formal correctness
Examples demonstrating
usefulness of Lean
- symbolic/automatic differentiation
- general framework for function transformation
- synthesizing approximations
- specification to implementation
- array expression optimization
- interactive code optimization
- probabilistic programs
Symbolic and automatic
differentiation
def foo (x : R) : R := x^2
def foo_deriv (x : R) : R := 2*x
AD
- automatic differentiation (AD) synthesizes derivatives of a program
- application in machine learning, optimization, and scientific computing
Symbolic and automatic
differentiation
def foo (x : R) : R := x^2
noncomputable
def foo_deriv (x : R) : R := fderiv R (fun x' => foo x') x 1
noncomputable
def fderiv (f : X → Y) (x : X) : X →L[𝕜] Y :=
if h : ∃ (f' : X →L[𝕜] Y), (fun x' => f x' - f x - f' (x' - x)) =o[𝓝 x] fun x' => x' - x then
Classical.choose h
else
0
- definition of derivative in mathlib
- using mathlib derivative
def foo_deriv (x : R) : R := (fderiv R (fun x' => foo x') x 1) rewrite_by fun_trans
- fun_trans is a general tactic for function transformation
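As a sketch of what fun_trans does when used as a tactic (a hypothetical example, assuming SciLean's real scalar R and the registered fderiv rules):

```lean
-- fun_trans symbolically computes the derivative of x'^2 + x'
-- and reduces the goal to `2*x + 1 = 2*x + 1` (sketch, not verified here)
example (x : R) :
    fderiv R (fun x' : R => x'^2 + x') x 1 = 2*x + 1 := by
  fun_trans
```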
General function transformation
- setting up fderiv as a function transformation
- definition:
@[fun_trans]
noncomputable
def fderiv (f : X → Y) (x : X) : X →L[𝕜] Y := ...
@[fun_trans]
theorem fderiv_add :
(fderiv 𝕜 fun x => f x + g x)
=
fun x => fderiv 𝕜 f x + fderiv 𝕜 g x := ...
@[fun_trans]
theorem fderiv_comp :
fderiv 𝕜 (fun x => f (g x))
=
fun x => fun dx =>L[𝕜] fderiv 𝕜 f (g x) (fderiv 𝕜 g x dx) := ...
@[fun_trans]
theorem fderiv_id :
(fderiv 𝕜 fun x : X => x) = fun _ => fun dx =>L[𝕜] dx := ...
- transformation rules:
- code demo
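A sketch of how these rules compose: fderiv_comp handles the outer square, while fderiv_add and fderiv_id handle the inner sum (a hypothetical example; it assumes the rules above are registered):

```lean
-- chain rule (fderiv_comp) on the outer function,
-- addition (fderiv_add) and identity (fderiv_id) on the inner `x + x`
example (x : 𝕜) :
    fderiv 𝕜 (fun x : 𝕜 => (x + x)^2) x
      = fun dx =>L[𝕜] 2*(x + x)*(dx + dx) := by
  fun_trans
```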
Synthesizing approximations
noncomputable
def sqrt (y : R) : R := (fun x => x^2)⁻¹ y
def mySqrt (y : R) : Approx P (sqrt y) := ...
- many operations, like solving systems of equations, computing integrals, or solving ODEs, cannot be computed exactly
- we need a way to construct approximations
- example of constructing approximation of sqrt
#check mySqrt.is_approx -- "∀ y, sqrt y = limit p, mySqrt p y"
#check P -- Filter ApproxParam
#eval mySqrt {absTol := 1e-9} 2.0 -- 1.414214
#eval mySqrt {maxSteps := 1} 2.0 -- 1.5
- ideally, Approx carries a theorem relating the approximation to the exact value (e.g. mySqrt.is_approx)
- code demo
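One concrete iteration mySqrt could be built from is the Babylonian (Newton) step for inverting fun x => x^2. A minimal sketch in plain Float code (not the actual SciLean construction):

```lean
-- Babylonian iteration: x_{n+1} = (x_n + y/x_n)/2 converges to sqrt y
def sqrtStep (y x : Float) : Float := (x + y/x) / 2

def sqrtIter (y : Float) : Nat → Float
  | 0     => y                       -- initial guess
  | n + 1 => sqrtStep y (sqrtIter y n)

#eval sqrtIter 2.0 1  -- 1.5, matching `mySqrt {maxSteps := 1} 2.0`
#eval sqrtIter 2.0 5  -- ≈ 1.414214
```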
Array expression optimization
- custom interactive optimizations
- example: the basic linear algebra operation saxpy
def saxpy (a : Float) (x y : Float^[n]) := a•x+y
def saxpy_optimized (a : Float) (x y : Float^[n]) := x.mapIdx (fun i xi => a*xi + y[i])
def_optimize saxpy by
simp only [HAdd.hAdd,Add.add,HSMul.hSMul,SMul.smul]
simp only [mapMono_mapIdxMono]
- semi-automatic with def_optimize macro
- code demo
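The point of the rewrite is fusion: a•x + y naively makes two passes over the data (one scale, one add), while the optimized form makes a single indexed pass. A sketch of the two shapes using plain Array Float (hypothetical names; it assumes core's Array.mapIdx with a Nat index):

```lean
-- unoptimized shape: two traversals (scale, then add)
def saxpySlow (a : Float) (x y : Array Float) : Array Float :=
  let ax := x.map (fun xi => a * xi)                  -- first pass
  (Array.range y.size).map (fun i => ax[i]! + y[i]!)  -- second pass

-- optimized shape, as `def_optimize` produces: one fused pass
def saxpyFast (a : Float) (x y : Array Float) : Array Float :=
  x.mapIdx fun i xi => a * xi + y[i]!
```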
Probabilistic programming
structure Rand (X : Type) [MeasurableSpace X] where
μ : Erased (Measure X)
is_prob : IsProbabilityMeasure μ.out
rand : RandG StdGen X
- definition of a random variable
- for x : Rand X
- x.μ and x.is_prob are used for static analysis
- x.rand is the pseudorandom generator used by the compiler to emit executable code
- example of probabilistic program
def test (θ : R) : Rand R := do
let b ← flip θ
if b then
pure 0
else
pure (-θ/2)
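For this program the expectation can be read off directly: the coin lands true with probability θ (value 0) and false with probability 1-θ (value -θ/2):

\begin{align*}
\mathbb{E}[\texttt{test}\,\theta]
  &= \theta \cdot 0 + (1-\theta)\cdot\left(-\frac{\theta}{2}\right)
   = -\frac{\theta(1-\theta)}{2}
\end{align*}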
Solving Laplace equation and
Walk on Spheres
\begin{align*}
\Delta u(x) &= 0 \qquad &x \in \Omega \\
u(x) &= g(x) \qquad &x \in \partial \Omega
\end{align*}
- Laplace equation
- mean value property
\begin{align*}
u(x) &= \frac{1}{4\pi r^2} \int_{S(x,r)} u(y) \,dy \qquad B(x,r) \subset \Omega
\end{align*}
- recursive definition
\begin{align*}
u_n(x) &= \frac{1}{4\pi r_n^2} \int_{S(x,r_n)} u_{n-1}(y) \,dy \\
r_n &= \text{dist}(x_n,\partial \Omega) \\
u_0(x) &= g(x)
\end{align*}
- apply Monte Carlo method
\frac{1}{4\pi r_n^2} \int_{S(x,r_n)} u_{n-1}(y) \,dy = \mathbb{E}_{y\sim P}[u_{n-1}(y)]
u(x) = u_\infty(x)
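The recursion above suggests a direct Walk-on-Spheres estimator in the Rand monad. A sketch, with distToBoundary, uniformSphere, and the stopping tolerance ε as hypothetical helpers:

```lean
-- one Walk-on-Spheres sample of u(x) (sketch, hypothetical helpers)
partial def wos (g : Vec3 → Float) (x : Vec3) : Rand Float := do
  let r := distToBoundary x       -- r = dist(x, ∂Ω)
  if r < ε then
    pure (g x)                    -- close to ∂Ω: evaluate boundary data g
  else
    let y ← uniformSphere x r     -- sample y uniformly on the sphere S(x, r)
    wos g y                       -- recurse: estimate u at y
```

Averaging many such samples gives the Monte Carlo estimate of the expectation on the previous slide.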
Mathlib limitations
- many standard mathematical notions are too rigid for reasoning about programs
- reasoning about programs needs an easy-to-use calculus rather than strong existence guarantees
mathematics → programs
- Banach spaces → convenient vector spaces
- manifolds → diffeology
- strong/Bochner integral → weak/Pettis integral
- measurable spaces → quasi-Borel spaces
important properties
- mathematics: completeness, compactness
- programs: better-behaved function spaces, e.g. a cartesian closed category
SciLean
- website: https://github.com/lecopivo/SciLean
- work in progress book/documentation: https://github.com/lecopivo/scientific-computing-lean/
- only first chapter on using arrays
- design principles of SciLean
- usability, composability and generality
- performance
- formal correctness
- Lean is an amazing new language and many of its features designed for theorem proving offer a new and exciting design space for a scientific computing language