Applied Measure Theory for Probabilistic Modeling

Chad Scherrer

July 2021

Introduction: Post-Covid Travel Planning

Choose a destination "randomly"
Your choice of map matters!
One perspective:
- Transform the "dart distribution" to a distribution on the globe
Our perspective for today:
- Transform "uniform on the globe" to a measure on a map
- Consider our "dart distribution" using this as a base measure
Work with measures in terms of relative densities

Toy Problem: Approximating Beta(1.5,4)

Find a Normal distribution to approximate p = Beta(1.5, 4)

Standard Normal

density(Normal(), x)

logdensity(::Normal{()}, x) = -x^2 / 2

basemeasure(::Normal{()}) = (1/sqrt2π) * Lebesgue(ℝ)

density(Normal(), Lebesgue(ℝ), x)
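
To make the role of the base measure concrete, here is a minimal sketch in plain Julia with no packages. The names logdensity_std and logdensity_lebesgue are hypothetical helpers, not MeasureTheory.jl API. They only illustrate that the log-density relative to Lebesgue(ℝ) is the log-density relative to the base measure plus the log of the base measure's constant factor 1/sqrt(2π).

```julia
# Log-density of the standard normal relative to its base measure,
# (1/sqrt(2π)) * Lebesgue(ℝ), as on the slide.
logdensity_std(x) = -x^2 / 2

# Relative to plain Lebesgue measure, add the log of the constant factor.
# (Hypothetical helper name, for illustration only.)
logdensity_lebesgue(x) = logdensity_std(x) - log(sqrt(2π))

# The result agrees with the familiar normalized Gaussian pdf.
x = 0.3
pdf_formula = exp(-x^2 / 2) / sqrt(2π)
@assert isapprox(exp(logdensity_lebesgue(x)), pdf_formula)
```

Keeping the constant in the base measure means it is only paid for when a density relative to Lebesgue(ℝ) is actually requested.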

A Different Parameterization

A measure can have multiple parameterizations

Here (μ, logσ) lets the parameters range over all of ℝ², with no positivity constraint on the scale

q(θ) = Normal(μ=θ[1], logσ=θ[2])

function logdensity(d::Normal{(:μ,:logσ)}, x)
    μ, logσ = d.μ, d.logσ
    return -logσ - 0.5(exp(-2logσ) * (x - μ) ^ 2)
end
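
As a quick sanity check, the sketch below compares the (μ, logσ) form with the usual (μ, σ) form in plain Julia. Both functions are hypothetical stand-alone versions, not the library's methods, and both are taken relative to (1/sqrt(2π)) * Lebesgue(ℝ), so no normalizing constant appears.

```julia
# Hypothetical stand-alone log-densities in two parameterizations.
logdensity_sigma(μ, σ, x) = -log(σ) - 0.5 * ((x - μ) / σ)^2
logdensity_logsigma(μ, logσ, x) = -logσ - 0.5 * exp(-2 * logσ) * (x - μ)^2

# The two forms agree whenever logσ == log(σ).
μ, σ, x = 0.3, 2.0, 1.7
@assert isapprox(logdensity_sigma(μ, σ, x), logdensity_logsigma(μ, log(σ), x))

# Any real logσ is valid, including negative values (σ < 1), which is
# what makes this form convenient for unconstrained optimization.
value = logdensity_logsigma(0.0, -1.5, 0.1)
```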

Computing the KL Divergence

D_\text{KL}(p || q) = \mathbb{E}_p[\log p - \log q]

p = Beta(1.5, 4)

q(θ) = Normal(μ=θ[1], logσ=θ[2])

logdensity(p, q, x) evaluates log p(x) - log q(x), the log-density of p relative to q
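
The expectation can also be approximated numerically. The following is a minimal sketch in plain Julia using a midpoint rule on (0, 1). The function names and the choice of quadrature are assumptions made for illustration, not the method used in the talk. It uses the closed form 1 / B(1.5, 4) = 315 / 32 for the Beta normalizing constant.

```julia
# Log-density of Beta(1.5, 4) relative to Lebesgue measure on (0, 1).
logp(x) = log(315 / 32) + 0.5 * log(x) + 3 * log(1 - x)

# Normalized Gaussian log-density in the (μ, logσ) parameterization,
# with θ = [μ, logσ].
logq(θ, x) = -0.5 * log(2π) - θ[2] - 0.5 * exp(-2 * θ[2]) * (x - θ[1])^2

# Midpoint-rule approximation of E_p[log p - log q] over (0, 1).
function kl_estimate(θ; n = 100_000)
    h = 1 / n
    total = 0.0
    for k in 1:n
        x = (k - 0.5) * h            # midpoint of the k-th cell
        total += exp(logp(x)) * (logp(x) - logq(θ, x)) * h
    end
    return total
end

kl_good = kl_estimate([0.27, log(0.17)])   # candidate with small scale
kl_bad = kl_estimate([0.0, 0.0])           # standard normal
@assert 0 < kl_good < kl_bad
```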

Minimizing the KL Divergence

julia> using Symbolics; @variables μ logσ x;

julia> ℓ = logdensity(p, q([μ,logσ]), x)
3.2 + logσ + 0.5log(x) + 3log(1 - x) + 0.5exp(-2logσ)*((x - μ)^2)

julia> Symbolics.derivative(ℓ,μ)
0.5exp(-2logσ)*(2μ - (2x))

julia> Symbolics.derivative(ℓ,logσ)
1 - (exp(-2logσ)*((x - μ)^2))

\mu = \mathbb{E}_p[x] \approx 0.27

\sigma^2 = \mathbb{V}_p[x] \approx 0.03
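
Setting the expectation under p of each derivative to zero gives μ = E_p[x] and exp(2 logσ) = E_p[(x - μ)²], so the optimum matches the first two moments of p. A minimal sketch in plain Julia, using the standard closed forms for the Beta mean and variance (the helper names are hypothetical):

```julia
# Closed-form mean and variance of Beta(α, β).
beta_mean(α, β) = α / (α + β)
beta_var(α, β) = α * β / ((α + β)^2 * (α + β + 1))

α, β = 1.5, 4.0
μ_opt = beta_mean(α, β)            # 1.5 / 5.5
var_opt = beta_var(α, β)           # 6 / (5.5^2 * 6.5)
logσ_opt = 0.5 * log(var_opt)      # the parameter actually optimized

θ_opt = [μ_opt, logσ_opt]
```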

Parameterized Measures

Ways of writing Normal(0,2)

-\log {\color{darkorange} \sigma} - \frac{1}{2}\left(\frac{x - \color{blue} \mu}{\color{darkorange} \sigma}\right)^2

Normal(0,2)
Normal(μ=0, σ=2)
Normal(σ=2)
Normal(mean=0, std=2)
Normal(mu=0, sigma=2)

-\frac{1}{2} \left( \log {\color{darkorange} \sigma^2} + \frac{(x - {\color{blue} \mu})^2}{\color{darkorange} \sigma^2} \right)

Normal(μ=0, σ²=4)
Normal(mean=0, var=4)

\frac{1}{2} \left( \log {\color{darkorange} \tau} - {\color{darkorange} \tau} (x - {\color{blue} \mu})^2 \right)

Normal(μ=0, τ=0.25)

-{\color{darkorange} \log \sigma} - \frac{(x - {\color{blue} \mu})^2}{2 e^{2 {\color{darkorange} \log \sigma}}}

Normal(μ=0, logσ=0.69)
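
The four formulas describe the same measure, which can be checked numerically. The sketch below uses hypothetical stand-alone functions in plain Julia (not the library's methods), each relative to (1/sqrt(2π)) * Lebesgue(ℝ), and evaluates all four at one point for Normal(0,2).

```julia
# One hypothetical log-density function per parameterization.
ld_sigma(x; μ, σ) = -log(σ) - 0.5 * ((x - μ) / σ)^2
ld_var(x; μ, σ²) = -0.5 * (log(σ²) + (x - μ)^2 / σ²)
ld_tau(x; μ, τ) = 0.5 * (log(τ) - τ * (x - μ)^2)
ld_logsigma(x; μ, logσ) = -logσ - (x - μ)^2 / (2 * exp(2 * logσ))

x = 1.3
vals = [
    ld_sigma(x; μ = 0.0, σ = 2.0),
    ld_var(x; μ = 0.0, σ² = 4.0),
    ld_tau(x; μ = 0.0, τ = 0.25),
    ld_logsigma(x; μ = 0.0, logσ = log(2.0)),
]

# All four parameterizations give the same value.
@assert all(v -> isapprox(v, vals[1]), vals)
```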

Computing Relative Log-Density

[Figure: diagram relating Beta(α, β), Lebesgue(𝕀), Normal(μ, σ²), (1/√(2π)) Lebesgue(ℝ), and Lebesgue(ℝ)]

IID Products

d = Beta(2,4) ^ (40,64)

A PowerMeasure replicates a given measure over some shape.

IID = Independent and Identically Distributed
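
For an IID product, the log-density is the sum of the per-entry log-densities. A minimal sketch in plain Julia, assuming densities relative to Lebesgue measure on (0, 1). The helper names are hypothetical and do not reflect how PowerMeasure is implemented.

```julia
# Log-density of Beta(2, 4) relative to Lebesgue measure on (0, 1);
# the normalizing constant is 1 / B(2, 4) = 20.
logdensity_beta24(x) = log(20) + log(x) + 3 * log(1 - x)

# Hypothetical helper: for an IID product over any array shape, sum the
# per-entry log-densities.
logdensity_power(logd, xs) = sum(logd, xs)

xs = rand(40, 64)     # arbitrary points in [0, 1) with the product's shape
total = logdensity_power(logdensity_beta24, xs)
```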

Products with Index Dependence

d = For(40,64) do i,j
    Beta(i,j)
end

For(indices) do j
    # maybe more computations
    # ...
    some_measure(j)
end

For produces independent samples with varying parameters.
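
With index-dependent parameters the log-density is still a sum, but each term uses its own measure. The sketch below is plain Julia with hypothetical helper names. It assumes integer Beta parameters so that B(a, b) = (a-1)! (b-1)! / (a+b-1)!, and uses a small shape to keep the example short.

```julia
# log(n!) by direct summation; adequate for the small n used here.
function logfactorial(n::Integer)
    s = 0.0
    for k in 2:n
        s += log(k)
    end
    return s
end

# Log-density of Beta(a, b) relative to Lebesgue measure on (0, 1),
# for positive integers a and b.
function logdensity_beta(a::Integer, b::Integer, x)
    logB = logfactorial(a - 1) + logfactorial(b - 1) - logfactorial(a + b - 1)
    return (a - 1) * log(x) + (b - 1) * log(1 - x) - logB
end

# Hypothetical helper: entry (i, j) is scored under its own Beta(i, j).
function logdensity_for(xs)
    return sum(logdensity_beta(i, j, xs[i, j])
               for i in axes(xs, 1), j in axes(xs, 2))
end

xs = fill(0.5, 3, 4)
total = logdensity_for(xs)
```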

Markov Chains

mc = Chain(Normal(μ=0.0)) do x Normal(μ=x) end
r = rand(mc)

Define a new chain, take a sample

julia> take(r,100) == take(r,100)
true

This returns a deterministic iterator

julia> logdensity(mc, take(r, 1000))
-517.0515965372

Evaluate on any finite subsequence
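
The log-density of a finite path factors into one term per transition. The following is a stand-alone sketch in plain Julia of a Gaussian random walk like the chain above. It assumes unit scale and densities relative to (1/sqrt(2π)) * Lebesgue(ℝ), and the function names are hypothetical, not the Chain API.

```julia
using Random

# Hypothetical sampler: x₁ ~ Normal(μ=0), then x_{t+1} ~ Normal(μ=x_t).
function sample_walk(rng, n)
    xs = Vector{Float64}(undef, n)
    prev = 0.0
    for t in 1:n
        xs[t] = prev + randn(rng)
        prev = xs[t]
    end
    return xs
end

# Log-density of a finite path: one standard-normal term per increment.
function logdensity_walk(xs)
    total = 0.0
    prev = 0.0
    for x in xs
        total -= (x - prev)^2 / 2
        prev = x
    end
    return total
end

rng = MersenneTwister(1)
path = sample_walk(rng, 1000)
ℓ = logdensity_walk(path)
```

Each term has expectation -1/2, so values near -500 are typical for 1000 steps, which is consistent with the value shown above.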

Symbolic Evaluations

julia> using MeasureTheory, Symbolics

julia> @variables μ τ
2-element Vector{Num}:
 μ
 τ

julia> d = Normal(μ=μ, τ=τ) ^ 1000;

julia> x = randn(1000);

julia> ℓ = logdensity(d, x) |> expand
500.0log(τ) + 3.81μ*τ - (503.81τ) - (500.0τ*(μ^2))
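
The coefficients in this output can be read off by expanding the sum by hand, using the (μ, τ) log-density \frac{1}{2}(\log\tau - \tau(x - \mu)^2) with n = 1000 observations. This is a sketch of the algebra, not output from the library:

```latex
\ell(\mu, \tau)
  = \sum_{i=1}^{n} \frac{1}{2}\left( \log \tau - \tau (x_i - \mu)^2 \right)
  = \frac{n}{2} \log \tau
    + \mu \tau \sum_{i=1}^{n} x_i
    - \frac{\tau}{2} \sum_{i=1}^{n} x_i^2
    - \frac{n}{2} \tau \mu^2
```

Matching terms, 3.81 is \sum_i x_i and 503.81 is \frac{1}{2} \sum_i x_i^2 for this particular draw of x.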

Types and functions are generic, so symbolic manipulations work out of the box
Compare MeasureTheory.jl (above) with Distributions.jl (below):

julia> logdensity(Distributions.Normal(μ, 1 / √τ), 2.0)
ERROR: MethodError: no method matching logdensity(::Num, ::Float64)

Working with Likelihoods

prior = HalfNormal()

d = Normal(σ=2.0) ^ 10
lik = Likelihood(d, x)

post = prior ⊙ lik

\begin{aligned} \color{#009cfa} \sigma &\color{#009cfa}\sim \text{Normal}_+(0,1) \\ \color{#e47045} x_n &\color{#e47045} \sim \text{Normal}(0,\sigma) \end{aligned}

{\color{#3ba64c} P(\sigma | x)} \propto {\color{#009cfa} P(\sigma)} {\color{#e47045} P(x | \sigma)}
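
On the log scale, the pointwise product of prior and likelihood becomes a sum. A minimal sketch in plain Julia of the unnormalized log-posterior for σ, with hypothetical helper names and made-up data. It does not use the Likelihood or ⊙ API, and the grid search is only for illustration.

```julia
# HalfNormal() prior on σ, up to an additive constant.
logprior(σ) = -σ^2 / 2

# Log-likelihood of σ for data x under xₙ ~ Normal(0, σ), relative to
# (1/sqrt(2π)) * Lebesgue(ℝ) in each coordinate.
loglik(σ, x) = sum(xn -> -log(σ) - 0.5 * (xn / σ)^2, x)

# Unnormalized log-posterior; zero posterior mass outside σ > 0.
logpost(σ, x) = σ > 0 ? logprior(σ) + loglik(σ, x) : -Inf

x = 2.0 .* randn(10)              # hypothetical data
σ_grid = 0.1:0.1:5.0
σ_map = σ_grid[argmax([logpost(σ, x) for σ in σ_grid])]
```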

Packages Using MeasureTheory.jl

From Moritz Schauer
- Mitosis.jl
- MitosisStochasticDiffEq.jl
- ZigZagBoomerang.jl can use MeasureTheory for sparse posteriors
From me
- In Soss.jl, every Model is also an AbstractMeasure

Funding

Thanks to PlantingSpace (https://planting.space) for funding in Spring 2021

Thank You!

https://github.com/cscherrer/MeasureTheory.jl

https://informativeprior.com/