Riemannian gradient flows in shape analysis

Klas Modin

What can you do with a metric and a function?

Newton's equation on \(M\)
Riemannian gradient flow on \(M\)

\nabla_{\dot x}\dot x = -\nabla V(x)

\nabla_{\dot x}\dot x = -\nabla V(x)

\dot x = -\nabla V(x)

\dot x = -\nabla V(x)

Formal geometric setting

\dot g = -\nabla^{\mathcal G} E(g)

\dot g = -\nabla^{\mathcal G} E(g)

Lie group \(G\) acting on \(Q\)
Template (source) \(q_0\in Q\)
Target \(q_1 \in Q\)
Riemannian metric \(\mathcal G\) on \(G\)
Symmetric under isotropy subgroup \[ \mathcal G_{g\cdot q}(g\cdot \dot q,g\cdot\dot q) = G_q(\dot q,\dot q)\qquad \forall g\in G_{q_0}\]
"Energy" functional \[ E(g) = F_{q_1}(g\cdot q_0)\]

Def: momentum map \(J: T^*Q\to \mathfrak{g}^* \) for cotangent lifted action

\langle J(q,p),\xi \rangle = \langle p,\xi\cdot q \rangle

\langle J(q,p),\xi \rangle = \langle p,\xi\cdot q \rangle

Proposition: gradient flow is

\dot g = - u(g\cdot q_0)\cdot g

\dot g = - u(g\cdot q_0)\cdot g

u(q) := \mathcal A^{-1}J(q,d F_{q_1}(q))

u(q) := \mathcal A^{-1}J(q,d F_{q_1}(q))

Computing the gradient

Simplest special case: right-invariant metric \[ \mathcal G_{e}(\xi,\xi) = \langle \mathcal A\xi,\xi\rangle, \qquad \mathcal A:\mathfrak g\to\mathfrak g^*\]

-\nabla^{\mathcal G} E(g)

-\nabla^{\mathcal G} E(g)

G

Q

e

q_0

q_0

g(t)

g(t)

q_1

q_1

q(t) = g(t)\cdot q_0

q(t) = g(t)\cdot q_0

Geometric picture

Proposition: gradient flow on \(\mathrm{Orb}_G(q_0)\) is

\dot q = - u(q)\cdot q

\dot q = - u(q)\cdot q

u(q) := \mathcal A^{-1}J(q,d F_{q_1}(q))

u(q) := \mathcal A^{-1}J(q,d F_{q_1}(q))

Induced gradient flow on \(q_0\)-orbit

\(\mathcal G\) induces metric \(\bar{\mathcal G}\) on \(\mathrm{Orb}_G(q_0)\)

\dot q = -\nabla^{\bar{\mathcal G}} F_{q_1}(q)

\dot q = -\nabla^{\bar{\mathcal G}} F_{q_1}(q)

Typical form

F_{q_1}(q) = d(s,s_1) + \varepsilon e(r,r_0)

F_{q_1}(q) = d(s,s_1) + \varepsilon e(r,r_0)

Choice of \(F_{q_1}\) and regularization

Shape space \(S\)
Regularization space \(R\)
Take \(Q = S\times R\)
Take \(q_0 = (s_0,r_0)\)
Take \(q_1 = (s_1,r_0)\)

S

R

F_{q_1}(g\cdot q_0) = d(g\cdot s_0,s_1) + \varepsilon e(g\cdot r_0,r_0)

F_{q_1}(g\cdot q_0) = d(g\cdot s_0,s_1) + \varepsilon e(g\cdot r_0,r_0)

distance or divergence

r_0

r_0

s_0

s_0

s_1

s_1

Gradient flow

\dot g = - u(g\cdot q_0)\cdot g

\dot g = - u(g\cdot q_0)\cdot g

u(q) = \mathcal A^{-1}J(q,F_{q_*}(q))

u(q) = \mathcal A^{-1}J(q,F_{q_*}(q))

Lie-Euler method

q_k = g_k\cdot q_0

q_k = g_k\cdot q_0

\xi_k = \mathcal{A}^{-1}J(q_k,d F_{q_*})

\xi_k = \mathcal{A}^{-1}J(q_k,d F_{q_*})

g_{k+1} = \exp(-h\xi_k)g_k

g_{k+1} = \exp(-h\xi_k)g_k

Generic time discretization

Guides object-oriented design of shape analysis software

List of examples

Polar decompositions (in optimal transport)
multivariate Gaussians, smooth, Dirac deltas?, entropic regularization? (Schrödinger bridge)
Toda flow
connections to KdV, QR-algorithm, etc, etc
QR decomposition
by "silly" entropy gradient flow
Brockett's flow
new interpretation as entropy gradient flow
Density matching
using optimal information transport
LDDMM
also simple regularization using deformation tensor
Shape reconstruction in tomography

horizontal slice

I

R

fiber

\pi

\pi

fiber

I

W_1

W_1

K

P(n)

P(n)

A

Q

Example: \(QR\) decomposition

\mathrm{Hor}_A = \{ V\in T_A\mathrm{GL}(n) \mid \ell(VA^{-1}) = 0 \}

\mathrm{Hor}_A = \{ V\in T_A\mathrm{GL}(n) \mid \ell(VA^{-1}) = 0 \}

K = \{ R\in \mathrm{GL}(n)\mid \ell(R)=0, R_{ii}>0 \} \Rightarrow T_R K = \mathrm{Hor}_R

K = \{ R\in \mathrm{GL}(n)\mid \ell(R)=0, R_{ii}&gt;0 \} \Rightarrow T_R K = \mathrm{Hor}_R

\(G = \mathrm{GL}(n)\)
\(Q = P(n)\) (sym. pos. def. matrices)
\(\mathcal A\) such that:

\(QR\) example

Manifold of inverse covariance matrices

P(n) = \{ W\in \mathbb{R}^{n\times n}\mid W=W^\top, W>0 \}

P(n) = \{ W\in \mathbb{R}^{n\times n}\mid W=W^\top, W&gt;0 \}

\displaystyle p(x;W^{-1}) = \sqrt{\frac{|W|}{(2\pi)^n}}\exp(-\frac{1}{2}x^\top W x)

\displaystyle p(x;W^{-1}) = \sqrt{\frac{|W|}{(2\pi)^n}}\exp(-\frac{1}{2}x^\top W x)

\(QR\) example

Fisher-Rao metric on \(P(n)\)

T_W P(n) = \{ U\in \mathbb{R}^{n\times n}\mid U=U^\top \}

T_W P(n) = \{ U\in \mathbb{R}^{n\times n}\mid U=U^\top \}

U

W

\bar{\mathcal G}_W(U,U) = \frac{1}{2}\mathrm{tr}(W^{-1}UW^{-1}U)

\bar{\mathcal G}_W(U,U) = \frac{1}{2}\mathrm{tr}(W^{-1}UW^{-1}U)

\(QR\) example

Fisher-Rao invariance

U

W

g_W(U,U) = \frac{1}{2}\mathrm{tr}(W^{-1}UW^{-1}U)

g_W(U,U) = \frac{1}{2}\mathrm{tr}(W^{-1}UW^{-1}U)

\mathrm{GL}(n)\times P(n) \ni (A,W) \mapsto A^\top W A \in P(n)

\mathrm{GL}(n)\times P(n) \ni (A,W) \mapsto A^\top W A \in P(n)

Right action of \(\mathrm{GL}(n)\) on \(P(n)\) is transitive

\bar{\mathcal G}_{A^\top W A}(A^\top U A,A^\top U A) = \bar{\mathcal G}_W(U,U)

\bar{\mathcal G}_{A^\top W A}(A^\top U A,A^\top U A) = \bar{\mathcal G}_W(U,U)

\(QR\) example

Compatible metric on GL(n)

\displaystyle \mathcal G_A(V,V) = \frac{1}{2}\mathrm{tr}\big(\ell(VA^{-1})^\top\ell(VA^{-1})+\sigma(VA^{-1})\sigma(VA^{-1}) \big)

\displaystyle \mathcal G_A(V,V) = \frac{1}{2}\mathrm{tr}\big(\ell(VA^{-1})^\top\ell(VA^{-1})+\sigma(VA^{-1})\sigma(VA^{-1}) \big)

V

A

\(QR\) example

Use relative entropy as "energy"

\displaystyle F_{W_1}(W) = \frac{n}{2}- \frac{1}{2}\mathrm{tr}(W_1W^{-1}) + \frac{1}{2}\log(\det(W_1W^{-1}))

\displaystyle F_{W_1}(W) = \frac{n}{2}- \frac{1}{2}\mathrm{tr}(W_1W^{-1}) + \frac{1}{2}\log(\det(W_1W^{-1}))

Notice: no regularization used here

\(QR\) example

Final "silly QR" flow

\displaystyle \dot R = \frac{1}{2} R^{-\top}(W_1-R^\top R) + ZR, \qquad Z\in\mathfrak{o}(n)

\displaystyle \dot R = \frac{1}{2} R^{-\top}(W_1-R^\top R) + ZR, \qquad Z\in\mathfrak{o}(n)

\(QR\) example

Convergence

Convexity lemma:

-\mathrm{Hess}(E)_R = \mathcal G_R

-\mathrm{Hess}(E)_R = \mathcal G_R

Corollary:

d^2(R(t),R_\infty) \leq \, \mathrm{e}^{-2 t} d^2(R(0),R_\infty)

d^2(R(t),R_\infty) \leq \, \mathrm{e}^{-2 t} d^2(R(0),R_\infty)

\(QR\) example

R = \begin{bmatrix} 3 & -1 \\ 0 & 2 \end{bmatrix}

R = \begin{bmatrix} 3 &amp; -1 \\ 0 &amp; 2 \end{bmatrix}

W_1 = \pi(R) = R^\top R

W_1 = \pi(R) = R^\top R

Example

\(QR\) example

Example

-\bar H(R(t))

-\bar H(R(t))

\bar d(R(t),R_\infty)^2

\bar d(R(t),R_\infty)^2

\text{slope of }\exp(-2t)

\text{slope of }\exp(-2t)

\(QR\) example

Example: Brockett's flow

E(Q) = \mathrm{tr}(NQ^\top M Q), \qquad M\in P(n), \; N \in D(n)

E(Q) = \mathrm{tr}(NQ^\top M Q), \qquad M\in P(n), \; N \in D(n)

\(G = \mathrm{O}(n)\) with canonical metric (Killing form)
Brockett's energy functional:

Brockett example

Gives gradient flow:

\dot Q = - Q(NQ^\top M Q - Q^\top MQN)

\dot Q = - Q(NQ^\top M Q - Q^\top MQN)

New interpretation

Action of O(n) on P(n) not transitive
Energy functional defined by relative entropy

Brockett example

\displaystyle F_{N}(W) = \frac{n}{2}- \frac{1}{2}\mathrm{tr}(NW^{-1}) + \frac{1}{2}\log(\det(NW^{-1}))

\displaystyle F_{N}(W) = \frac{n}{2}- \frac{1}{2}\mathrm{tr}(NW^{-1}) + \frac{1}{2}\log(\det(NW^{-1}))

Proposition: Fisher-Rao gradient flow restricted to orbits is

\dot W = [W,[W^{-1},N]]

\dot W = [W,[W^{-1},N]]

Corollary: Expressed in \(\Sigma = W^{-1}\) we get

\dot \Sigma = [\Sigma,[\Sigma,N]]

\dot \Sigma = [\Sigma,[\Sigma,N]]

Double bracket form of Brockett's flow

Example: density matching

\((M,\mathsf{g})\) compact Riemannian manifold
\(G = \mathrm{Diff}(M)\)
\(S = \mathrm{Dens}(M) = \{\rho \in C^\infty(M)\mid \rho>0, \int_M \rho\mu = 1\}\)
\(R = \mathrm{Dens}(M)\), so \(Q=\mathrm{Dens}(M)\times\mathrm{Dens}(M)\)
Action by pushforward (not composition!)
Right-invariant metric \[ \langle v,v\rangle = \int_M \mathsf{g}(v,\Delta v)\mu + \cdots\]
Energy functional \[ F_{\rho_1}(\rho,\sigma) = d_F^2(\rho_1,\rho) + \epsilon d_F^2(1,\sigma)\]

Density example

lots of structure!

(with M. Bauer and S. Joshi)

Riemannian submersion

\mathrm{Diff}(M)

\mathrm{Diff}(M)

\mathrm{Prob}(M)

\mathrm{Prob}(M)

\mathrm{Id}

\mathrm{Id}

\mu

\mu

\mu_0

\mu_0

\pi(\varphi)=\varphi^*\mu

\pi(\varphi)=\varphi^*\mu

\(H^1\) metric

\mathrm{Hor}

\mathrm{Hor}

Fisher-Rao metric = explicit geodesics

Density gradient flow

Density example

Momentum map \(J:T^*\mathrm{Dens}(M)\to \mathfrak X(M)^*\)\[ J(\rho,p) = \rho\nabla p\]

Gradient flow on orbits of \(\mathrm{Dens}(M)\times\mathrm{Dens}(M)\)

\dot \rho = - v\cdot \nabla \rho - \rho\, \mathrm{div}(v)

\dot \rho = - v\cdot \nabla \rho - \rho\, \mathrm{div}(v)

\dot \sigma = - v\cdot \nabla \sigma - \sigma\, \mathrm{div}(v)

\dot \sigma = - v\cdot \nabla \sigma - \sigma\, \mathrm{div}(v)

\displaystyle \Delta v = \rho \nabla \frac{\delta F_{\rho_1}}{\delta\rho}(\rho,\sigma) + \varepsilon\sigma \nabla \frac{\delta F_{\rho_1}}{\delta\sigma}(\rho,\sigma)

\displaystyle \Delta v = \rho \nabla \frac{\delta F_{\rho_1}}{\delta\rho}(\rho,\sigma) + \varepsilon\sigma \nabla \frac{\delta F_{\rho_1}}{\delta\sigma}(\rho,\sigma)

Application: draw samples from non-uniform distribution on \(M\)

Problem 1: given \(\mu\in\mathrm{Dens}(M)\) generate \(N\) samples from \(\mu\)

Most cases: use Monte-Carlo based methods

Special case here:

\(M\) low dimensional
\(\mu\) very non-uniform
\(N\) very large

transport map approach

might be useful

Density example

Optimal transport problem

Problem 2: given \(\mu\in\mathrm{Dens}(M)\) find \(\varphi\in\mathrm{Diff}(M)\) minimizing

under constraint \(\varphi_*\mu_0 = \mu\)

Studied case: (Moselhy and Marzouk 2012, Reich 2013, ...)

\(\mathrm{dist}\) = \(L^2\)-Wasserstein distance
\(\Rightarrow\) optimal mass transport problem
\(\Rightarrow\) solve Monge-Ampere equation (heavily non-linear PDE)

E(\varphi) = \mathrm{dist}(\mathrm{id},\varphi)^2

E(\varphi) = \mathrm{dist}(\mathrm{id},\varphi)^2

Our notion:

use optimal information transport

Density example

Simple 2D example

Warp computation time (256*256 gridsize, 100 time-steps): ~1s

Sample computation time (10^7 samples): < 1s

Density example

NOT good for image registration

\(\rho_0\)

\(\rho_1\)

Density example

...but works with regularization!

\(\rho_0\)

\(\rho_1\)

Density example

Application: CT of breathing lung

Data: breathing cycle of rat, CT imaging

Density example

Results

Regularized density flow

LDDMM

Density example

Example: LDDMM

\(M\) Riemannian manifold
\(G = \mathrm{Diff}(M)\)
\(S = C^\infty(M)\)
\(R = \mathrm{Diff}(M)\)
Action by composition
Right-invariant \(H^1\) metric from \(\mathrm{Met}\)
Energy functional \[ F_{I_1}(I,\eta) = \vert I_1-I \vert_{L^2}^2 + \epsilon d_{\mathcal A}^2(\mathrm{id},\eta)\]

LDDMM example

Why is LDDMM computationally expensive?

Because \( \frac{\delta d^2_\mathcal A(\mathrm{id},\cdot)}{\delta \eta}\) is expensive

Deformation regularization

\((M,\mathsf g)\) Riemannian manifold
\(G = \mathrm{Diff}(M)\)
\(S = C^\infty(M)\)
\(R = \mathrm{Met}(M)\)
Action by composition and push-forward
Right-invariant metric defined by kernel
Energy functional \[ F_{I_1}(I,\mathsf h) = \vert I_1-I \vert_{L^2}^2 + \epsilon d_{\mathrm{Met}}^2(\mathsf{g},\mathsf h)\]
That is \[E(\varphi) = \vert I_1-I_0\circ\varphi^{-1} \vert_{L^2}^2 + \epsilon d_{\mathrm{Met}}^2(\mathsf{g},\varphi_*\mathsf g)\]

LDDMM example

explicit formula (cf. Peter's talk)

(cf. Joshi, Pennec, and others)

deformation

tensor

Resulting gradient flow

Momentum map \(J:T^*(C^\infty(M)\times\mathrm{Met}(M))\to \mathfrak X(M)^*\)\[ J(I,\mathsf h,p,\alpha) = p\nabla I + \vert \mathsf h\vert^{1/2}\mathrm{div}(\alpha)\]

Gradient flow on orbit in \(C^\infty(M)\times\mathrm{Met}(M)\)

\dot I = - v\cdot \nabla I

\dot I = - v\cdot \nabla I

\dot \mathsf h = - \mathcal L_v \mathsf h

\dot \mathsf h = - \mathcal L_v \mathsf h

\displaystyle \mathcal A v = (I-I_0)\nabla I + \varepsilon \vert\mathsf h\vert^{1/2}\mathrm{div}(\mathsf h - \mathsf g)

\displaystyle \mathcal A v = (I-I_0)\nabla I + \varepsilon \vert\mathsf h\vert^{1/2}\mathrm{div}(\mathsf h - \mathsf g)

LDDMM example

Example: reconstruction

\((M,\mathsf{g})\) Riemannian manifold
\(G = \mathrm{Diff}(M)\)
\(S = \) some shape space (image or density)
\(R = \mathrm{Dens}(M)\) or \(\mathrm{Met}(M)\)
Forward model \(T: S \to \mathcal H\) (think ray-transform)
Energy functional

Reconstruction example

(with O. Öktem)

E(\varphi) = \vert T(\varphi_*\rho_0) - \mathbf{d}\vert_{\mathcal H}^2 + \varepsilon \mathrm{dist}^2(\mathsf{g},\varphi_*\mathsf{g})

E(\varphi) = \vert T(\varphi_*\rho_0) - \mathbf{d}\vert_{\mathcal H}^2 + \varepsilon \mathrm{dist}^2(\mathsf{g},\varphi_*\mathsf{g})

Resulting gradient flow

Reconstruction example

Gradient flow on orbit in \(\mathrm{Dens}(M)\times\mathrm{Met}(M)\)

\dot \rho = - \mathrm{div}(\rho v)

\dot \rho = - \mathrm{div}(\rho v)

\dot \mathsf h = - \mathcal L_v \mathsf h

\dot \mathsf h = - \mathcal L_v \mathsf h

\displaystyle \mathcal A v = \rho\nabla T^*(T\rho-\mathbf d) + \varepsilon \vert\mathsf h\vert^{1/2}\mathrm{div}(\mathsf h - \mathsf g)

\displaystyle \mathcal A v = \rho\nabla T^*(T\rho-\mathbf d) + \varepsilon \vert\mathsf h\vert^{1/2}\mathrm{div}(\mathsf h - \mathsf g)

Preliminary numerical results

Reconstruction example

THANKS!

References

M.
Geometry of Matrix Decompositions Seen Through Optimal Transport and Information Geometry
J. Geom. Mech., 9(3):335-390, 2017
Bauer, Joshi, M.
Diffeomorphic density matching
by optimal information transport
SIAM J. Imaging Sci., 8(3):1718-1751, 2015

Slides available at: slides.com/kmodin

Riemannian gradient flows in shape analysis

By Klas Modin

Riemannian gradient flows in shape analysis

Presentation given 2017-11-13 at the Isaac Newton Institute in Cambridge.

2,332

Klas Modin PRO

Mathematician at Chalmers University of Technology and the University of Gothenburg

klasmodin.github.io

Riemannian gradient flows in shape analysis

Klas Modin

What can you do with a metric and a function?

Formal geometric setting

Computing the gradient

Geometric picture

Induced gradient flow on \(q_0\)-orbit

Choice of \(F_{q_1}\) and regularization

Generic time discretization

List of examples

Example: \(QR\) decomposition

Manifold of inverse covariance matrices

Fisher-Rao metric on \(P(n)\)

Fisher-Rao invariance

Compatible metric on GL(n)

Use relative entropy as "energy"

Final "silly QR" flow

Convergence

Example

Example

Example: Brockett's flow

New interpretation

Example: density matching

Riemannian submersion

Density gradient flow

Application: draw samples from non-uniform distribution on \(M\)

Optimal transport problem

Simple 2D example

NOT good for image registration

...but works with regularization!

Application: CT of breathing lung

Results

Example: LDDMM

Deformation regularization

Resulting gradient flow

Example: reconstruction

Resulting gradient flow

Preliminary numerical results

THANKS!

Riemannian gradient flows in shape analysis

More from Klas Modin