# Conditioned diffusions in geometric statistics

Means, bridges, and shape variation along phylogenetic trees

Stefan Sommer, University of Copenhagen

Faculty of Science, University of Copenhagen

INRIA Sophia-Antipolis, 2023

# shapes - geometric statistics - diffusion means - phylogenetics

w/ Sarang Joshi, Frank v.d. Meulen, Moritz Schauer, Benjamin Eltzner, Stephan Huckemann, Mathias H. Jensen, Pernille E.H. Hansen, Mads Nielsen, Rasmus Nielsen, Christy Hipsley, Sofia Stoustrup

Villum Foundation

Novo Nordisk Foundation

University of Copenhagen

# Deformations and shape

$$E_{s_0,s_1}(\phi)=R(\phi)+\frac1\lambda S(\phi.s_0,s_1)$$

action: $$\phi.s=\phi\circ s$$ (shapes), $$\phi.s=s\circ\phi^{-1}$$ (images)

$$\phi$$ warp of domain $$\Omega$$ (2D or 3D space)

landmarks: $$s=(x_1,\ldots,x_n)$$

curves: $$s: \mathbb S^1\to\mathbb R^2$$

surfaces: $$s: \mathbb S^2\to\mathbb R^3$$

Figure labels: $$s_0$$, $$s_1$$

# Riemannian view

$$R(\phi_t)=\int_0^T\|\partial_t \phi_t\|_{\phi_t}^2dt$$

$$\phi_t:[0,T]\to\mathrm{Diff}(\Omega)$$ path of diffeomorphisms (parameter t)

Figure labels: $$\mathrm{Diff}(\Omega)$$, $$\mathrm{Id}_{\mathrm{Diff}(\Omega)}$$, $$\phi_t$$, $$\partial_t \phi_t$$, $$\phi$$

LDDMM: Grenander, Miller, Trouvé, Younes, Christensen, Joshi, et al.

# Evolution with noise

$$\partial_t \phi_t = F(\phi_t)\ \to\ d\phi_t=F(\phi_t)dt\color{blue}{+\sigma(\phi_t) dW_t}$$

Figure labels: $$\mathrm{Diff}(\Omega)$$, $$\mathrm{Id}_{\mathrm{Diff}(\Omega)}$$, $$\phi_t$$

Markussen, CVIU'07; Budhiraja, Dupuis, Maroulas, Bernoulli'10
Trouvé, Vialard, QAM'12; Vialard, SPA'13; Marsland/Shardlow, SIIMS'17
Arnaudon, Holm, Sommer, IPMI'17; FoCM'18; JMIV'19
Arnaudon, v.d. Meulen, Schauer, Sommer'21

geodesic ODE

perturbed SDE
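The step from the geodesic ODE to the perturbed SDE can be illustrated with a scalar stand-in for $$\phi_t$$. This is a minimal sketch only: the drift `F`, the noise amplitude `sigma`, and the Euler-Maruyama discretization are illustrative assumptions, not the models from the papers cited above.

```python
import math
import random

def euler_maruyama(x0, F, sigma, T=1.0, n_steps=1000, rng=None):
    """Simulate dx = F(x) dt + sigma(x) dW with the Euler-Maruyama scheme."""
    if rng is None:
        rng = random.Random(0)
    dt = T / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        x = x + F(x) * dt + sigma(x) * dW
        path.append(x)
    return path

# Hypothetical scalar drift standing in for the geodesic equation.
F = lambda x: -x
ode_path = euler_maruyama(1.0, F, sigma=lambda x: 0.0)  # sigma = 0: plain ODE
sde_path = euler_maruyama(1.0, F, sigma=lambda x: 0.3)  # same drift plus noise
```

With `sigma` set to zero the scheme reduces to the explicit Euler method for the ODE, so the two paths differ only through the blue noise term.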

# Geometric statistics

Statistics of geometric data:

- plane directions: $$\mathbb{S}^1$$

- geographical data: $$\mathbb{S}^2$$

- 3D directions: $$\mathrm{SO}(3), \mathbb{S}^2$$

- angles: $$\mathbb{T}^N$$

- shapes

# Least-squares $$\leftrightarrow$$ probabilistic

Deterministic:

• $$\phi_t$$ geodesic evolution
• squared distances:
$$\quad d(s_0,s_1)^2$$
• Riemannian least-squares

Stochastic:

• $$\phi_t$$ stochastic process
• (log) transition density
$$\quad -\log p_T(s_1; s_0)$$
• ML/MAP
• bridge:
$$\quad \phi_t|\phi_T.s_0=s_1$$
• bridge + noise in observation:
$$\quad \phi_t|\phi_T.s_0+\epsilon=s_1$$
• parametric families of probability distributions $$\mu_\theta$$
• likelihood from density:
$$\quad\mathcal{L}(\theta; y_1,\ldots,y_N)=\prod_{i=1}^Np_\theta(y_i)$$
• ML/MAP estimates:
$$\quad\bar{\theta}=\mathrm{argmax}_\theta\mathcal{L}(\theta; y_1,\ldots,y_N)$$
• Diffusion mean:
$$\quad x_t\in M$$ Brownian motion
$$\quad\theta=x_0$$
• assume $$y\sim x_T$$:
$$\quad\bar{x}_{\mathrm{diffusion}}=\mathrm{argmax}_\theta\mathcal{L}(\theta)$$
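The diffusion mean can be made concrete on the circle $$\mathbb{S}^1$$, where the Brownian transition density is a wrapped normal. The sketch below assumes the unit circle, Brownian motion with generator $$\frac12\Delta$$, a truncated wrapping sum, and a plain grid search for the maximizer; the sample angles are made up.

```python
import math

def heat_kernel_circle(y, x, t, n_wraps=10):
    """Transition density p_t(y; x) of Brownian motion on the unit circle."""
    d = y - x
    total = sum(
        math.exp(-((d + 2.0 * math.pi * k) ** 2) / (2.0 * t))
        for k in range(-n_wraps, n_wraps + 1)
    )
    return total / math.sqrt(2.0 * math.pi * t)

def log_likelihood(x0, samples, t):
    """log L(theta) with theta = x0, assuming y_i ~ x_T for Brownian motion from x0."""
    return sum(math.log(heat_kernel_circle(y, x0, t)) for y in samples)

def diffusion_mean(samples, t, n_grid=3600):
    """Grid-search maximizer of the likelihood over starting points x0."""
    grid = [-math.pi + 2.0 * math.pi * i / n_grid for i in range(n_grid)]
    return max(grid, key=lambda x0: log_likelihood(x0, samples, t))

samples = [0.1, 0.3, -0.2, 0.25, 0.05]  # angles in radians (made-up data)
x_bar = diffusion_mean(samples, t=0.5)
```

For concentrated data and small `t` the estimate is close to the arithmetic mean of the angles; for larger `t` the wrapping terms contribute and the estimate can differ from the Fréchet mean.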

Generalization of Euclidean statistical notions and techniques.

• i.i.d. samples $$y_1,\ldots,y_N\in M$$
• Fréchet mean:
$$\bar{x}=\mathrm{argmin}_{x\in M}\sum_{i=1}^Nd(x,y_i)^2$$
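As a small illustration of the Fréchet mean, the following sketch minimizes the sum of squared geodesic distances on the circle $$\mathbb{S}^1$$ by fixed-point iteration with the Log map. The choice of manifold, the initialization `x_init`, and the unit step size are assumptions made for the example; the iteration finds a local minimizer, which need not be unique.

```python
import math

def wrap(angle):
    """Map an angle to [-pi, pi]; on the circle this plays the role of the Log map."""
    return math.atan2(math.sin(angle), math.cos(angle))

def frechet_objective(x, samples):
    """Sum of squared geodesic distances from x to the samples."""
    return sum(wrap(y - x) ** 2 for y in samples)

def frechet_mean_circle(samples, x_init, n_iter=100):
    """Move x along the average of the Log-mapped samples until it stops changing."""
    x = x_init
    for _ in range(n_iter):
        x = wrap(x + sum(wrap(y - x) for y in samples) / len(samples))
    return x

samples = [3.0, -3.0, 3.1]  # angles clustered around pi (made-up data)
x_bar = frechet_mean_circle(samples, x_init=samples[0])
```

The example data straddle the cut at $$\pm\pi$$, so the arithmetic mean of the raw angles would land far from the cluster while the intrinsic mean stays inside it.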

Nye, White, JMIV'14;
Sommer, IPMI'15; Sommer, Svane, JGM'15;
Hansen, Eltzner, Huckemann, Sommer, GSI'21, Bernoulli'23

# Uniqueness and asymptotics

Hotz, Huckemann'11; Le, Barden'14

Eltzner, Huckemann'19; Hansen, Eltzner, Huckemann, Sommer'23

# Estimation: Simulation of Conditioned Semimartingales on Riemannian Manifolds

Jensen, Mallasto, Sommer 2019; Jensen, Sommer 2021, 2022

# Guided bridges

$$dx_t = b(t,x_t)dt +\sigma(t,x_t)dW_t$$

Delyon/Hu 2006:

$$\sigma$$ invertible:

• guided bridge proposal: $$dy_t = b(t,y_t)dt - \frac{y_t-v}{T-t}dt + \sigma(t,y_t)dW_t$$
• $$y_T=v$$ a.s.
• $$x_t|x_T=v$$ absolutely continuous w.r.t. $$y_t$$
• $$\mathbb E_{x_t|x_T=v}[f(x_t)]\propto \mathbb E_{y_t}[f(y_t)\varphi(y_t)]$$
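The guided proposal is straightforward to simulate. The sketch below is a scalar Euler-Maruyama discretization of the proposal SDE; the drift `b`, the diffusion coefficient `sigma`, and the step count are hypothetical choices, and the correction factor $$\varphi$$ is not computed because its form is not given here.

```python
import math
import random

def guided_bridge(x0, v, b, sigma, T=1.0, n_steps=1000, seed=0):
    """Simulate dy = b dt - (y - v)/(T - t) dt + sigma dW on [0, T]."""
    rng = random.Random(seed)
    dt = T / n_steps
    y = x0
    path = [y]
    for i in range(n_steps):
        t = i * dt
        dW = rng.gauss(0.0, math.sqrt(dt))
        # original drift, guiding term pulling towards v, original noise
        y = y + b(t, y) * dt - (y - v) / (T - t) * dt + sigma(t, y) * dW
        path.append(y)
    return path

# Hypothetical mean-reverting drift and constant noise.
path = guided_bridge(0.0, 1.0, b=lambda t, y: -y, sigma=lambda t, y: 0.5)
```

In the discretized scheme the last step moves the process onto `v` up to one drift and noise increment, which mirrors $$y_T=v$$ a.s. in the continuous-time limit.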

Figure labels: $$x_0$$, $$x_t$$, $$v$$

# Bridges on Lie groups and homogeneous spaces

• $$A$$ quadratic form on $$\mathfrak{so}(3)$$
• $$x_t\in \mathrm{SO}(3)$$ Brownian motion
• $$\theta=(x_0,A)$$
• $$(\bar{x},\bar{A})=\mathrm{argmax}_\theta\mathcal{L}(\theta)$$

$$\pi$$

Thompson'16; Sommer, Joshi, Højgaard'22

# Stochastic morphometry along phylogenies

- Rules of morphological change

- Drivers of morphological change (ecology, historical contingency)

- Mechanisms of morphological change (genetic basis)

# Shapes in phylogenetics

1. forward probabilistic model
2. tree pruning for shapes
3. MCMC / variational inference:
   1. likelihoods
   2. parameter estimation
   3. gene/character covariance
   4. interpolation
   5. hypothesis testing
   6. tree inference

# Felsenstein's pruning algorithm for shapes

Figure labels: Brownian motion along each edge; branch (independent children)

incorporate leaf observations $$x_{V_T}$$ into probabilistic model:
$$p(X_t|x_{V_T})$$

Doob’s h-transform

$$h_s(x)=\prod_{t\in\mathrm{ch}(s)}h_{s\to t}(x)$$

conditioned process $$X^*_t$$

approximations $$\tilde{h}$$

guided process $$X^\circ_t$$
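The product rule for $$h_s$$ can be sketched in the one case where everything is explicit: scalar Brownian motion on a tree, where each $$h$$ is an unnormalized Gaussian. This is an assumption made for illustration; for shapes the exact $$h$$ is unavailable, which is why approximations $$\tilde{h}$$ and guided processes are used. The tree encoding (a leaf is a float, an internal node is a list of `(child, branch_length)` pairs) is hypothetical.

```python
import math

def log_normal_pdf(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def prune(node, sigma2=1.0):
    """Return (log_c, m, V) such that h_node(x) = exp(log_c) * N(x; m, V).

    A leaf is its observed value (V = 0); an internal node is a list of
    (child, branch_length) pairs with positive branch lengths.
    """
    if isinstance(node, (int, float)):
        return 0.0, float(node), 0.0
    log_c, m, V = None, None, None
    for child, length in node:
        c_k, m_k, V_k = prune(child, sigma2)
        V_k += sigma2 * length  # Brownian motion along the edge widens h
        if m is None:
            log_c, m, V = c_k, m_k, V_k
        else:
            # product over independent children: product of two Gaussians in x
            log_c += c_k + log_normal_pdf(m, m_k, V + V_k)
            m, V = (m * V_k + m_k * V) / (V + V_k), V * V_k / (V + V_k)
    return log_c, m, V

def log_likelihood(tree, x_root, sigma2=1.0):
    """Log-density of the leaf observations given the root value x_root."""
    log_c, m, V = prune(tree, sigma2)
    return log_c + log_normal_pdf(x_root, m, V)

# Root with two leaves at branch lengths 0.5 and 2.0 (made-up data).
tree = [(1.0, 0.5), (-1.0, 2.0)]
ll = log_likelihood(tree, x_root=0.3)
```

The recursion runs from the leaves to the root, and the root likelihood equals the product of the independent leaf densities in this two-leaf example.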

# Stochastically evolving shapes

shape $$s_0$$

shape $$s_1$$

stoch. evolution $$s_0\rightarrow s_1$$

Riemannian Brownian motion:

$$dx_t= -\frac12g(x_t)^{kl}\Gamma(x_t)_{kl}dt + \sqrt{g(x_t)^*}dW_t$$

$$\phi_t$$

# Eulerian shape process

Shape process:

$dX_t=K(X_t)\circ dW_t$

Kernel matrix:

$K(X_t)^i_j=k(x_i,x_j)$

$$X_t$$ landmarks at time $$t$$:

$X_t=\begin{pmatrix}x_{1,t}\\y_{1,t}\\\vdots\\x_{n,t}\\y_{n,t}\end{pmatrix}$

$$X_0$$

$$t=\frac12$$

$$t=3$$
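A minimal simulation sketch of the Eulerian shape process, under stated assumptions: a Gaussian kernel with hypothetical amplitude `alpha` and width `scale`, one 2D Brownian motion per landmark, and a Heun scheme, which is a standard discretization for Stratonovich SDEs. Landmarks are stored as a list of `(x, y)` pairs rather than one stacked vector.

```python
import math
import random

def kernel(p, q, alpha=0.1, scale=0.5):
    """Gaussian kernel k(p, q); alpha and scale are illustrative values."""
    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return alpha * math.exp(-d2 / (2.0 * scale ** 2))

def apply_kernel(X, dW):
    """Compute K(X) dW: landmark i moves by sum_j k(x_i, x_j) dW_j."""
    return [
        tuple(sum(kernel(p, q) * dw[c] for q, dw in zip(X, dW)) for c in range(2))
        for p in X
    ]

def heun_step(X, dW):
    """One Heun (predictor-corrector) step for dX = K(X) o dW."""
    inc0 = apply_kernel(X, dW)
    X_pred = [(p[0] + d[0], p[1] + d[1]) for p, d in zip(X, inc0)]
    inc1 = apply_kernel(X_pred, dW)
    return [
        (p[0] + 0.5 * (a[0] + b[0]), p[1] + 0.5 * (a[1] + b[1]))
        for p, a, b in zip(X, inc0, inc1)
    ]

def simulate(X0, T=1.0, n_steps=200, seed=0):
    rng = random.Random(seed)
    dt = T / n_steps
    s = math.sqrt(dt)
    X = list(X0)
    for _ in range(n_steps):
        dW = [(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in X]
        X = heun_step(X, dW)
    return X

# Landmarks on the unit circle as initial shape X_0 (illustrative data).
n = 16
X0 = [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)) for i in range(n)]
XT = simulate(X0)
```

Because the kernel couples nearby landmarks, neighbouring points receive correlated increments and the outline deforms smoothly instead of each landmark diffusing independently.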

# Conditioned shape process

Conditioning on hitting target $$v$$ at time $$T>0$$:

$X_t|X_T=v$

Itô stochastic process:

$dx_t=b(t,x_t)dt+\sigma(t,x_t)dW_t$

Bridge:

$dx^*_t=b(t,x^*_t)dt+a(t,x^*_t)\nabla_x\log \rho_t(x^*_t)dt+\sigma(t,x^*_t)dW_t$

Score $$\nabla_x\log \rho_t$$ intractable in general.

$\rho_t(x)=p_{T-t}(v;x)$

$a(t,x)=\sigma(t,x)\sigma(t,x)^T$
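For intuition, the score is explicit in the simplest case. A worked example, assuming zero drift and constant scalar noise $$\sigma$$ in $$\mathbb R^d$$ (an assumption made only for this illustration):

$$\rho_t(x)=p_{T-t}(v;x)=(2\pi\sigma^2(T-t))^{-d/2}\exp\left(-\frac{\|v-x\|^2}{2\sigma^2(T-t)}\right)$$

$$\nabla_x\log\rho_t(x)=\frac{v-x}{\sigma^2(T-t)},\qquad a=\sigma^2 I$$

$$dx^*_t=\frac{v-x^*_t}{T-t}dt+\sigma dW_t$$

This is the Brownian bridge, and its drift coincides with the guiding term $$-\frac{y_t-v}{T-t}$$ of the Delyon/Hu proposal. For the kernel-driven shape process no comparable closed form is known.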

black: $$X_0$$, red: $$v$$

# Approximate bridges

Auxiliary process:

$d\tilde{x}_t=\tilde{b}(t,\tilde{x}_t)dt+\tilde{\sigma}(t,\tilde{x}_t)dW_t$

Approximate bridge:

$d\tilde{x}_t=\tilde{b}(t,\tilde{x}_t)dt+\tilde{a}(t,\tilde{x}_t)\nabla_x\log \tilde{\rho}_t(\tilde{x}_t)dt+\tilde{\sigma}(t,\tilde{x}_t)dW_t$

E.g. linear process: score $$\nabla_x\log \tilde{\rho}_t$$ is known in closed form

(almost) explicitly computable likelihood ratio:

$\frac{d\mathbb P^*}{d\tilde{\mathbb P}}=\frac{\tilde{\rho}_T(v)}{\rho_T(v)}\Psi(\tilde{x}_t)$

van der Meulen, Schauer et al.
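A scalar sketch of how such a proposal and its weight might be computed. Several things here are assumptions rather than statements from this deck: the proposal keeps the original drift `b` and borrows only the score from the auxiliary process, the auxiliary process is driftless Brownian motion with the same constant `sigma`, and $$\Psi$$ is taken as $$\exp\int_0^T (b-\tilde b)\,\nabla_x\log\tilde\rho_t\,dt$$, the form recalled from the guided-proposal literature for equal diffusion coefficients. Check it against the cited papers before relying on it.

```python
import math
import random

def guided_proposal(x0, v, b, sigma, T=1.0, n_steps=1000, seed=0):
    """Simulate dx = b dt + a * score dt + sigma dW and accumulate log Psi.

    Assumed auxiliary process: d x~ = sigma dW, whose score is
    grad_x log rho~_t(x) = (v - x) / (sigma^2 (T - t)).
    """
    rng = random.Random(seed)
    dt = T / n_steps
    a = sigma ** 2
    x, log_psi = x0, 0.0
    for i in range(n_steps):
        t = i * dt
        score = (v - x) / (a * (T - t))
        drift = b(t, x)
        log_psi += drift * score * dt  # (b - b~) * score, with b~ = 0
        x += (drift + a * score) * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
    return x, log_psi

# Hypothetical mean-reverting drift; the weight corrects for ignoring it in the score.
x_T, log_psi = guided_proposal(0.0, 1.0, b=lambda t, x: -x, sigma=0.5)
```

When `b` is zero the proposal is the exact Brownian bridge and the accumulated log-weight vanishes; otherwise `exp(log_psi)` can serve as an importance weight or inside an MCMC acceptance ratio.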

# v.d. Meulen/Schauer bridges

v.d. Meulen, Schauer, Arnaudon, Sommer, SIIMS'22

# From single edges to trees

Bridge: figure with $$x_0$$, $$v$$

Leaf conditioning: figure with $$x_0$$, $$h$$, $$v_1$$, $$v_2$$

van der Meulen, Schauer'20; van der Meulen'22
Stoustrup, Nielsen, van der Meulen, Sommer

Backwards filter: recursive, leaves to root

Forward guiding: root to leaves

Figure panels: tree, backwards filtering, forwards guiding (labels $$x_0$$, $$h$$, $$v$$, $$v_1$$, $$v_2$$)

# MCMC

v.d. Meulen, Schauer, Arnaudon, Sommer, SIIMS'22

# Geometry, stochastics, geometric statistics

code: http://bitbucket.com/stefansommer/jaxgeometry

Centre for Computational Evolutionary Morphometry: http://www.ccem.dk

slides: https://slides.com/stefansommer

Stochastic Morphometry: https://www.ccem.dk/stochastic-morphometry/

References:

• Philipp Harms, Peter W. Michor, Xavier Pennec, Stefan Sommer: Geometry of sample spaces, Diff. Geom. and its Appl., 2023, arXiv:2010.08039
• Hansen, Eltzner, Huckemann, Sommer: Diffusion Means in Geometric Spaces, Bernoulli, 2023, arXiv:2105.12061
• Grong, Sommer: Most probable paths for anisotropic Brownian motions on manifolds, FoCM 2022, arXiv:2110.15634
• Højgaard, Joshi, Sommer: Discrete-Time Observations of Brownian Motion on Lie Groups and Homogeneous Spaces: Sampling and Metric Estimation, Algorithms, 2022,
• Jensen, Sommer: Mean Estimation on the Diagonal of Product Manifolds, Algorithms, 2022, https://www.mdpi.com/1999-4893/15/3/92
• Arnaudon, v.d. Meulen, Schauer, Sommer: Diffusion bridges for stochastic Hamiltonian systems and shape evolutions, SIIMS, 2022, arXiv:2002.00885
• Hansen, Eltzner, Sommer: Diffusion Means and Heat Kernel on Manifolds, 2021, GSI 2021, arXiv:2103.00588.
• Højgaard Jensen, Sommer: Simulation of Conditioned Diffusions on Riemannian Manifolds, 2021, arXiv:2105.13190.
• Sommer, Bronstein: Horizontal Flows and Manifold Stochastics in Geometric Deep Learning, TPAMI, 2020, doi: 10.1109/TPAMI.2020.2994507
• Arnaudon, Holm, Sommer: A Geometric Framework for Stochastic Shape Analysis, Foundations of Computational Mathematics, 2019, arXiv:1703.09971.
• Højgaard Jensen, Mallasto, Sommer: Simulation of Conditioned Diffusions on the Flat Torus, GSI 2019, arXiv:1906.09813.
• Sommer, Svane: Modelling Anisotropic Covariance using Stochastic Development and Sub-Riemannian Frame Bundle Geometry, JoGM, 2017, arXiv:1512.08544.
• Sommer: Anisotropically Weighted and Nonholonomically Constrained Evolutions, Entropy, 2017, arXiv:1609.00395.
• Arnaudon, Holm, Sommer: A Stochastic Large Deformation Model for Computational Anatomy, IPMI 2017, arXiv:1612.05323.
• Sommer: Anisotropic Distributions on Manifolds: Template Estimation and Most Probable Paths, IPMI 2015,

By Stefan Sommer
