Conditioned diffusions in geometric statistics

Means, bridges, and shape variation along phylogenetic trees

Stefan Sommer, University of Copenhagen

Faculty of Science, University of Copenhagen

AI Topology, 2024

shapes - geometric statistics - diffusion means - phylogenetics

w/ Sarang Joshi, Frank v.d. Meulen, Moritz Schauer, Benjamin Eltzner, Stephan Huckemann, Mathias H. Jensen, Pernille E.H. Hansen, Mads Nielsen, Rasmus Nielsen, Christy Hipsley, Sofia Stoustrup

Villum Foundation

Novo Nordisk Foundation

University of Copenhagen

Geometric statistics

Statistics of geometric data:

- plane directions: \(\mathbb{S}^1\)

- geographical data: \(\mathbb{S}^2\)

- 3D directions: \(\mathrm{SO}(3), \mathbb{S}^2\)

- angles: \(\mathbb{T}^N\)

- shapes

Statistical shape analysis

Least-squares \(\leftrightarrow\) probabilistic

Deterministic:

  • geodesics replacing vectors
  • square distances:
    \(\quad d(s_0,s_1)^2\)
  • Riemannian least-squares

Stochastic:

  • stochastic processes
  • (log) transition density
    \(\quad -\log p_T(s_1; s_0)\)
  • ML/MAP
  • bridge:
    \(\quad \phi_t|\phi_T.s_0=s_1\)
  • bridge + noise in observation:
    \(\quad \phi_t|\phi_T.s_0+\epsilon=s_1\)
  • parametric families of probability distributions \(\mu_\theta\)
  • likelihood from density:
    \(\quad\mathcal{L}(\theta; y_1,\ldots,y_N)=\prod_{i=1}^Np_\theta(y_i)\)
  • ML/MAP estimates:
    \(\quad\bar{\theta}=\mathrm{argmax}_\theta\mathcal{L}(\theta; y_1,\ldots,y_N)\)
  • Diffusion mean:
    \(\quad x_t\in M\) Brownian motion
    \(\quad\theta=x_0\)
  • assume \(y\sim x_T\):
    \(\quad\bar{x}_{\mathrm{diffusion}}=\mathrm{argmax}_\theta\mathcal{L}(\theta)\)
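
A minimal sketch of the diffusion mean on \(\mathbb{S}^1\), where the heat kernel is a wrapped Gaussian and the likelihood can be maximised directly. The grid search, the truncation of the wrapping sum, the sample values and the function names are illustrative assumptions, not taken from the talk.

```python
import math

def heat_kernel_circle(y, x, T, wraps=10):
    """Transition density p_T(y; x) of Brownian motion on S^1 (angles mod 2*pi),
    written as a wrapped Gaussian truncated to 2*wraps + 1 terms."""
    total = sum(
        math.exp(-(y - x + 2 * math.pi * k) ** 2 / (2 * T))
        for k in range(-wraps, wraps + 1)
    )
    return total / math.sqrt(2 * math.pi * T)

def log_likelihood(x0, ys, T):
    """log L(theta) with theta = x0 and samples assumed distributed as x_T."""
    return sum(math.log(heat_kernel_circle(y, x0, T)) for y in ys)

def diffusion_mean(ys, T, grid_size=720):
    """Maximise the likelihood over a uniform grid of starting points x0."""
    grid = [2 * math.pi * i / grid_size for i in range(grid_size)]
    return max(grid, key=lambda x0: log_likelihood(x0, ys, T))

# Hypothetical sample of angles clustered around 0 (mod 2*pi).
ys = [0.1, 0.3, 6.2, 0.2, 6.1]
x_bar = diffusion_mean(ys, T=0.5)
```

The estimate depends on the diffusion time \(T\); for small \(T\) and concentrated data it is expected to lie close to the least-squares mean.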

Generalization of Euclidean statistical notions and techniques.

  • i.i.d. samples \(y_1,\ldots,y_N\in M\)
  • Fréchet mean:
    \(\bar{x}=\mathrm{argmin}_{x\in M}\sum_{i=1}^Nd(x,y_i)^2\)
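
For comparison with the likelihood-based estimators, the Fréchet mean can be sketched on \(\mathbb{S}^1\) with the usual fixed-point iteration \(x\leftarrow \mathrm{Exp}_x\big(\tfrac1N\sum_i\mathrm{Log}_x y_i\big)\). The function names and sample values are hypothetical, and the iteration is only expected to behave well when the data are concentrated.

```python
import math

def wrap(angle):
    """Log map on S^1: signed angle difference mapped into (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def frechet_mean_circle(ys, n_iter=100, tol=1e-12):
    """Gradient descent on x -> sum_i d(x, y_i)^2 with unit step size."""
    x = ys[0]
    for _ in range(n_iter):
        step = sum(wrap(y - x) for y in ys) / len(ys)  # mean of Log_x(y_i)
        x = (x + step) % (2 * math.pi)                 # Exp map on S^1
        if abs(step) < tol:
            break
    return x

# Hypothetical sample of angles clustered around 0 (mod 2*pi).
ys = [0.1, 0.3, 6.2, 0.2, 6.1]
x_frechet = frechet_mean_circle(ys)
```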

Nye, White, JMIV'14;
Sommer,IPMI'15; Sommer,Svane,JGM'15;
Hansen,Eltzner,Huckemann,Sommer,GSI'21,Bernoulli'23

Means in geometric statistics

Uniqueness and asymptotics

Hotz,Huckemann'11; Le,Barden'14

Eltzner,Huckemann'19; Hansen,Eltzner,Huckemann,Sommer'23

Estimation: Simulation of Conditioned Semimartingales on Riemannian Manifolds

Jensen, Mallasto, Sommer 2019 ; Jensen, Sommer 2021, 2022

Guided bridges

\[dx_t = b(t,x_t)dt +\sigma(t,x_t)dW_t\]

Delyon/Hu 2006:

\(\sigma\) invertible:

  • guided bridge proposal: $$dy_t = b(t,y_t)dt - \frac{y_t-v}{T-t}dt + \sigma(t,y_t)dW_t$$
  • \(y_T=v\) a.s.
  • \(x_t|x_T=v\) absolutely continuous w.r.t. \(y_t\)
  • \(\mathbb E_{x_t|x_T=v}[f(x_t)]\propto \mathbb E_{y_t}[f(y_t)\varphi(y_t)]\)
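
The guided proposal can be sketched with an Euler-Maruyama discretisation in one dimension. The drift \(b(t,y)=-y\), the constant \(\sigma=0.5\), the step count and the function name are assumptions chosen for illustration; the correction factor \(\varphi\) is not computed here.

```python
import math
import random

def guided_bridge(x0, v, T, drift, sigma, n_steps=1000, seed=0):
    """Euler-Maruyama for dy = b(t,y)dt - (y - v)/(T - t)dt + sigma(t,y)dW."""
    rng = random.Random(seed)
    dt = T / n_steps
    y = x0
    path = [x0]
    for k in range(n_steps):
        t = k * dt
        guiding = -(y - v) / (T - t)          # pulls the path towards v
        dW = rng.gauss(0.0, math.sqrt(dt))    # Brownian increment
        y = y + (drift(t, y) + guiding) * dt + sigma(t, y) * dW
        path.append(y)
    return path

# Hypothetical example: mean-reverting drift, constant noise.
path = guided_bridge(
    x0=0.0, v=1.0, T=1.0,
    drift=lambda t, y: -y,
    sigma=lambda t, y: 0.5,
)
```

In the discretisation the endpoint matches \(v\) only up to the noise of the final step, which shrinks with the step size.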

[Figure labels: \(v\), \(x_0\), \(x_t\)]

Heat kernel approximations

Bridges on Lie groups and homogeneous spaces

  • \(A\) quadratic form on \(\mathfrak{so}(3)\)
  • \(x_t\in \mathrm{SO}(3)\) Brownian motion
  • \(\theta=(x_0,A)\)
  • \((\bar{x},\bar{A})=\mathrm{argmax}_\theta\mathcal{L}(\theta)\)
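
A rough sketch of how a forward sample of such a Brownian motion might be drawn, using products of exponentials of small Gaussian increments in the Lie algebra. It assumes that \(A\) is diagonal in the standard basis of \(\mathfrak{so}(3)\), represented by per-axis standard deviations `scales`; that parameterisation and all function names are illustrative assumptions.

```python
import math
import random

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def so3_exp(w):
    """Rodrigues' formula for the matrix exponential of the skew matrix of w."""
    theta = math.sqrt(sum(c * c for c in w))
    K = [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]
    K2 = matmul(K, K)
    if theta < 1e-12:
        a, b = 1.0, 0.5                       # limits of the coefficients
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]

def brownian_motion_so3(x0, T, scales, n_steps=200, seed=0):
    """Multiply x0 from the right by exponentials of Gaussian increments."""
    rng = random.Random(seed)
    dt = T / n_steps
    x = x0
    for _ in range(n_steps):
        w = [s * rng.gauss(0.0, math.sqrt(dt)) for s in scales]
        x = matmul(x, so3_exp(w))
    return x

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x_T = brownian_motion_so3(identity, T=1.0, scales=(1.0, 0.5, 0.2))
```

Each factor is an exact rotation, so the sample stays on \(\mathrm{SO}(3)\) up to floating-point error.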

\(\pi\)

Thompson'16; Sommer,Joshi,Højgaard'22

Guided proposals on manifolds

Corstanje,van der Meulen,Schauer,Sommer'24

Guiding using the heat kernel and comparison manifolds

Stochastic morphometry along phylogenies

A return to morphology:

- Rules of morphological change

- Drivers of morphological change (ecology, historical contingency)

- Mechanisms of morphological change (genetic basis)

Deformations and shape

\[E_{s_0,s_1}(\phi)=R(\phi)+\frac1\lambda S(\phi.s_0,s_1)\]

action: \(\phi.s=\phi\circ s\) (shapes), \(\phi.s=s\circ\phi^{-1}\) (images)

\( \phi \) warp of domain \(\Omega\) (2D or 3D space)

landmarks: \(s=(x_1,\ldots,x_n)\)

curves: \(s: \mathbb S^1\to\mathbb R^2\)

surfaces: \(s: \mathbb S^2\to\mathbb R^3\)

[Figure labels: \(s_0\), \(s_1\)]

Riemannian view

\[R(\phi_t)=\int_0^T\|\partial_t \phi_t\|_{\phi_t}^2dt\]

\( \phi_t:[0,T]\to\mathrm{Diff}(\Omega) \) path of diffeomorphisms (parameter t)

[Figure labels: \(\mathrm{Diff}(\Omega)\), \(\mathrm{Id}_{\mathrm{Diff}(\Omega)}\), \(\phi_t\), \(\partial_t \phi_t\), \(\phi\)]

LDDMM: Grenander, Miller, Trouve, Younes, Christensen, Joshi, et al.

Evolution with noise

\[\partial_t \phi_t = F(\phi_t)\ \to\ d\phi_t=F(\phi_t)dt\color{blue}{+\sigma(\phi_t) dW_t}\]

Markussen,CVIU'07; Budhiraja,Dupuis,Maroulas,Bernoulli'10
Trouve,Vialard,QAM'12;Vialard,SPA'13;Marsland/Shardlow,SIIMS'17
Arnaudon,Holm,Sommer,IPMI'17; FoCM'18; JMIV'19
Arnaudon,v.d. Meulen,Schauer,Sommer'21

geodesic ODE

perturbed SDE

Shapes in phylogenetics

  1. forward probabilistic model
  2. tree pruning for shapes
  3. MCMC / variational inference:
    1. likelihoods
    2. parameter estimation
    3. gene/character covariance
    4. interpolation
    5. hypothesis testing
    6. tree inference

Felsenstein's pruning algorithm for shapes

[Figure label, repeated four times: Brownian motion]

branch (independent children)

incorporate leaf observations \(x_{V_T}\) into probabilistic model:
\(p(X_t|x_{V_T})\)

Doob’s h-transform

\(h_s(x)=\prod_{t\in\mathrm{ch}(s)}h_{s\to t}(x)\)

conditioned process \(X^*_t\)

approximations \(\tilde{h}\)

guided process \(X^\circ_t\)
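
The product structure of \(h_s\) can be illustrated in the simplest case where everything is tractable: scalar Brownian motion along the branches with exactly observed leaves, so that each \(h\) is an unnormalised Gaussian. This is a sketch of that special case only (for shapes \(h\) has to be approximated); the tree encoding as nested dictionaries and the function names are hypothetical.

```python
import math

def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def backward_filter(node):
    """Return (log_c, mean, var) with h_node(x) = exp(log_c) * N(x; mean, var).

    Leaves carry an observed "value"; every non-root node carries the
    "length" (> 0) of the branch leading to it.
    """
    if "value" in node:                       # leaf: point mass at the observation
        return 0.0, node["value"], 0.0
    log_c = mean = var = None
    for child in node["children"]:
        c_log, c_mean, c_var = backward_filter(child)
        c_var += child["length"]              # h_{s->t}: convolve with the BM kernel
        if mean is None:
            log_c, mean, var = c_log, c_mean, c_var
        else:                                 # product of two Gaussians in x
            log_c += c_log + log_normal_pdf(mean, c_mean, var + c_var)
            new_var = 1.0 / (1.0 / var + 1.0 / c_var)
            mean = new_var * (mean / var + c_mean / c_var)
            var = new_var
    return log_c, mean, var

# Hypothetical tree: a root with two leaves at branch length 1.
tree = {"children": [
    {"length": 1.0, "value": 0.0},
    {"length": 1.0, "value": 2.0},
]}
log_c, mean, var = backward_filter(tree)

def h_root(x):
    """Likelihood of the leaf observations given root state x."""
    return math.exp(log_c + log_normal_pdf(x, mean, var))
```

Evaluating `h_root` at a root state gives the likelihood of the leaf observations, computed recursively from the leaves to the root.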

Stochastically evolving shapes

shape \(s_0\)

shape \(s_1\)

stoch. evolution \(s_0\rightarrow s_1\)

Riemannian Brownian motion:

\[dx_t= -\frac12g(x_t)^{kl}\Gamma(x_t)_{kl}dt + \sqrt{g(x_t)^*}dW_t\]

Eulerian shape process

Shape process:

\[dX_t=K(X_t)\circ dW_t\]

Kernel matrix:

\[K(X_t)^i_j=k(x_i,x_j)\]

\(X_t\) landmarks at time \(t\):

\[X_t=\begin{pmatrix}x_{1,t}\\y_{1,t}\\\vdots\\x_{n,t}\\y_{n,t}\end{pmatrix}\]
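
A minimal simulation sketch of the landmark process \(dX_t=K(X_t)\circ dW_t\), using a Heun (predictor-corrector) step since the equation is in Stratonovich form. The Gaussian kernel, its width `scale`, the noise `amplitude`, one 2D Brownian motion per landmark, and the function names are all assumptions made for illustration.

```python
import math
import random

def kernel(p, q, scale):
    """Gaussian kernel k(p, q); the kernel choice is an assumption."""
    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return math.exp(-d2 / (2 * scale ** 2))

def apply_kernel(X, dW, scale):
    """Compute K(X) dW: landmark i moves by sum_j k(x_i, x_j) dW_j."""
    return [
        (
            sum(kernel(p, q, scale) * w[0] for q, w in zip(X, dW)),
            sum(kernel(p, q, scale) * w[1] for q, w in zip(X, dW)),
        )
        for p in X
    ]

def simulate_shape(X0, T, n_steps, scale=0.5, amplitude=0.1, seed=0):
    """Heun scheme for the Stratonovich SDE dX = K(X) o dW."""
    rng = random.Random(seed)
    dt = T / n_steps
    X = list(X0)
    for _ in range(n_steps):
        dW = [
            (amplitude * rng.gauss(0.0, math.sqrt(dt)),
             amplitude * rng.gauss(0.0, math.sqrt(dt)))
            for _ in X
        ]
        step = apply_kernel(X, dW, scale)                 # Euler increment
        predictor = [(p[0] + s[0], p[1] + s[1]) for p, s in zip(X, step)]
        corr = apply_kernel(predictor, dW, scale)         # increment at predictor
        X = [
            (p[0] + 0.5 * (s[0] + c[0]), p[1] + 0.5 * (s[1] + c[1]))
            for p, s, c in zip(X, step, corr)
        ]
    return X

# Landmarks on the unit circle as a hypothetical initial shape X_0.
n = 16
X0 = [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
      for i in range(n)]
X_half = simulate_shape(X0, T=0.5, n_steps=100)
```

Because nearby landmarks share most of their noise through the kernel, neighbouring points tend to move together.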

[Figure panels: \(X_0\), \(t=\frac12\), \(t=3\)]

Conditioned shape process

Conditioning on hitting target \(v\) at time \(T>0\):

\[X_t|X_T=v\]

Itô stochastic process:

\[dx_t=b(t,x_t)dt+\sigma(t,x_t)dW_t\]

Bridge:

\[dx^*_t=b(t,x^*_t)dt+a(t,x^*_t)\nabla_x\log \rho_t(x^*_t)dt\\+\sigma(t,x^*_t)dW_t\]

Score \(\nabla_x\log \rho_t\) intractable in general, with:

\[\rho_t(x)=p_{T-t}(v;x)\]

\[a(t,x)=\sigma(t,x)\sigma(t,x)^T\]
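
One case where the score is explicit, given as a worked check under the assumption \(b=0\) and constant invertible \(\sigma\) on \(\mathbb R^d\) (not the shape process itself). The transition density is then Gaussian:

\[\rho_t(x)=p_{T-t}(v;x)\propto\exp\Big(-\frac{(v-x)^Ta^{-1}(v-x)}{2(T-t)}\Big),\qquad \nabla_x\log\rho_t(x)=\frac{a^{-1}(v-x)}{T-t}\]

so that \(a\nabla_x\log\rho_t(x)=\frac{v-x}{T-t}\) and the bridge becomes

\[dx^*_t=\frac{v-x^*_t}{T-t}dt+\sigma dW_t\]

which is the Brownian bridge, with the same pulling term \(-\frac{y_t-v}{T-t}\) as in the Delyon/Hu guided proposal.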

black: \(X_0\), red: \(v\)

Approximate bridges

Auxiliary process:

\[d\tilde{x}_t=\tilde{b}(t,\tilde{x}_t)dt+\tilde{\sigma}(t,\tilde{x}_t)dW_t\]

Approximate bridge:

\[d\tilde{x}_t=\tilde{b}(t,\tilde{x}_t)dt+\tilde{a}(t,\tilde{x}_t)\nabla_x\log \tilde{\rho}_t(\tilde{x}_t)dt\\+\tilde{\sigma}(t,\tilde{x}_t)dW_t\]

E.g. linear process: score \(\nabla_x\log \tilde{\rho}_t\) is known in closed form

(almost) explicitly computable likelihood ratio:

\[\frac{d\mathbb P^*}{d\tilde{\mathbb P}}=\frac{\tilde{\rho}_T(v)}{\rho_T(v)}\Psi(\tilde{x}_t)\]

van der Meulen, Schauer et al.

v.d. Meulen/Schauer bridges

v.d. Meulen,Schauer,Arnaudon,Sommer,SIIMS'22

From single edges to trees

Bridge:

Leaf conditioning:

[Figure labels: \(x_0\), \(v\); \(x_0\), \(h\), \(v_1\), \(v_2\)]

van der Meulen, Schauer'20; van der Meulen'22
Stoustrup, Nielsen, van der Meulen, Sommer

Backwards filter: recursive, leaves to root

Forward guiding: root to leaves

[Figure labels: \(v\), \(v_1\), \(v_2\), \(h\), \(x_0\)]

tree

backwards filtering

forwards guiding

MCMC

v.d. Meulen,Schauer,Arnaudon,Sommer,SIIMS'22

Geometry, stochastics, geometric statistics

JaxGeometry: https://github.com/computationalevolutionarymorphometry/jaxgeometry

Hyperiax: https://github.com/computationalevolutionarymorphometry/hyperiax

CCEM: http://www.ccem.dk

slides: https://slides.com/stefansommer

References:

  • Philipp Harms, Peter W. Michor, Xavier Pennec, Stefan Sommer: Geometry of sample spaces, Diff. Geom. and its Appl., 2023, arXiv:2010.08039
  • Hansen, Eltzner, Huckemann, Sommer: Diffusion Means in Geometric Spaces, Bernoulli, 2023, arXiv:2105.12061
  • Grong, Sommer: Most probable paths for anisotropic Brownian motions on manifolds, FoCM 2022, arXiv:2110.15634
  • Højgaard, Joshi, Sommer: Discrete-Time Observations of Brownian Motion on Lie Groups and Homogeneous Spaces: Sampling and Metric Estimation, Algorithms, 2022
  • Jensen, Sommer: Mean Estimation on the Diagonal of Product Manifolds, Algorithms, 2022, https://www.mdpi.com/1999-4893/15/3/92
  • Arnaudon, v.d. Meulen, Schauer, Sommer: Diffusion bridges for stochastic Hamiltonian systems and shape evolutions, SIIMS, 2022, arXiv:2002.00885
  • Hansen, Eltzner, Sommer: Diffusion Means and Heat Kernel on Manifolds, GSI 2021, arXiv:2103.00588
  • Højgaard Jensen, Sommer: Simulation of Conditioned Diffusions on Riemannian Manifolds, 2021, arXiv:2105.13190
  • Sommer, Bronstein: Horizontal Flows and Manifold Stochastics in Geometric Deep Learning, TPAMI, 2020, doi: 10.1109/TPAMI.2020.2994507
  • Arnaudon, Holm, Sommer: A Geometric Framework for Stochastic Shape Analysis, Foundations of Computational Mathematics, 2019, arXiv:1703.09971
  • Højgaard Jensen, Mallasto, Sommer: Simulation of Conditioned Diffusions on the Flat Torus, GSI 2019, arXiv:1906.09813
  • Sommer, Svane: Modelling Anisotropic Covariance using Stochastic Development and Sub-Riemannian Frame Bundle Geometry, JoGM, 2017, arXiv:1512.08544
  • Sommer: Anisotropically Weighted and Nonholonomically Constrained Evolutions, Entropy, 2017, arXiv:1609.00395
  • Arnaudon, Holm, Sommer: A Stochastic Large Deformation Model for Computational Anatomy, IPMI 2017, arXiv:1612.05323
  • Sommer: Anisotropic Distributions on Manifolds: Template Estimation and Most Probable Paths, IPMI 2015, doi: 10.1007/978-3-319-19992-4_15