# Riemannian gradient flows in shape analysis

## What can you do with a metric and a function?

• Newton's equation on $$M$$

• Riemannian gradient flow on $$M$$
\nabla_{\dot x}\dot x = -\nabla V(x)
$\nabla_{\dot x}\dot x = -\nabla V(x)$
\dot x = -\nabla V(x)
$\dot x = -\nabla V(x)$

## Formal geometric setting

\dot g = -\nabla^{\mathcal G} E(g)
$\dot g = -\nabla^{\mathcal G} E(g)$
• Lie group $$G$$ acting on $$Q$$
• Template (source) $$q_0\in Q$$
• Target $$q_1 \in Q$$
• Riemannian metric $$\mathcal G$$ on $$G$$
• Symmetric under isotropy subgroup $\mathcal G_{g\cdot q}(g\cdot \dot q,g\cdot\dot q) = G_q(\dot q,\dot q)\qquad \forall g\in G_{q_0}$
• "Energy" functional $E(g) = F_{q_1}(g\cdot q_0)$

Def: momentum map $$J: T^*Q\to \mathfrak{g}^*$$ for cotangent lifted action

\langle J(q,p),\xi \rangle = \langle p,\xi\cdot q \rangle
$\langle J(q,p),\xi \rangle = \langle p,\xi\cdot q \rangle$

\dot g = - u(g\cdot q_0)\cdot g
$\dot g = - u(g\cdot q_0)\cdot g$
u(q) := \mathcal A^{-1}J(q,d F_{q_1}(q))
$u(q) := \mathcal A^{-1}J(q,d F_{q_1}(q))$

Simplest special case: right-invariant metric $\mathcal G_{e}(\xi,\xi) = \langle \mathcal A\xi,\xi\rangle, \qquad \mathcal A:\mathfrak g\to\mathfrak g^*$

-\nabla^{\mathcal G} E(g)
$-\nabla^{\mathcal G} E(g)$
G
$G$
Q
$Q$
e
$e$
q_0
$q_0$
g(t)
$g(t)$
q_1
$q_1$
q(t) = g(t)\cdot q_0
$q(t) = g(t)\cdot q_0$

## Geometric picture

Proposition: gradient flow on $$\mathrm{Orb}_G(q_0)$$ is

\dot q = - u(q)\cdot q
$\dot q = - u(q)\cdot q$
u(q) := \mathcal A^{-1}J(q,d F_{q_1}(q))
$u(q) := \mathcal A^{-1}J(q,d F_{q_1}(q))$

## Induced gradient flow on $$q_0$$-orbit

$$\mathcal G$$ induces metric $$\bar{\mathcal G}$$ on $$\mathrm{Orb}_G(q_0)$$

\dot q = -\nabla^{\bar{\mathcal G}} F_{q_1}(q)
$\dot q = -\nabla^{\bar{\mathcal G}} F_{q_1}(q)$

Typical form

F_{q_1}(q) = d(s,s_1) + \varepsilon e(r,r_0)
$F_{q_1}(q) = d(s,s_1) + \varepsilon e(r,r_0)$

## Choice of $$F_{q_1}$$ and regularization

• Shape space $$S$$
• Regularization space $$R$$
• Take $$Q = S\times R$$
• Take $$q_0 = (s_0,r_0)$$
• Take $$q_1 = (s_1,r_0)$$
S
$S$
R
$R$
F_{q_1}(g\cdot q_0) = d(g\cdot s_0,s_1) + \varepsilon e(g\cdot r_0,r_0)
$F_{q_1}(g\cdot q_0) = d(g\cdot s_0,s_1) + \varepsilon e(g\cdot r_0,r_0)$

distance or divergence

r_0
$r_0$
s_0
$s_0$
s_1
$s_1$

\dot g = - u(g\cdot q_0)\cdot g
$\dot g = - u(g\cdot q_0)\cdot g$
u(q) = \mathcal A^{-1}J(q,F_{q_*}(q))
$u(q) = \mathcal A^{-1}J(q,F_{q_*}(q))$

Lie-Euler method

q_k = g_k\cdot q_0
$q_k = g_k\cdot q_0$
\xi_k = \mathcal{A}^{-1}J(q_k,d F_{q_*})
$\xi_k = \mathcal{A}^{-1}J(q_k,d F_{q_*})$
g_{k+1} = \exp(-h\xi_k)g_k
$g_{k+1} = \exp(-h\xi_k)g_k$

## Generic time discretization

Guides object-oriented design of shape analysis software

## List of examples

• Polar decompositions (in optimal transport)
multivariate Gaussians, smooth, Dirac deltas?, entropic regularization? (Schrödinger bridge)
• Toda flow
connections to KdV, QR-algorithm, etc, etc
• QR decomposition
• Brockett's flow
new interpretation as entropy gradient flow
• Density matching
using optimal information transport
• LDDMM
also simple regularization using deformation tensor
• Shape reconstruction in tomography

horizontal slice

I
$I$
R
$R$

fiber

\pi
$\pi$

fiber

I
$I$
W_1
$W_1$
K
$K$
P(n)
$P(n)$
A
$A$
Q
$Q$

## Example: $$QR$$ decomposition

\mathrm{Hor}_A = \{ V\in T_A\mathrm{GL}(n) \mid \ell(VA^{-1}) = 0 \}
$\mathrm{Hor}_A = \{ V\in T_A\mathrm{GL}(n) \mid \ell(VA^{-1}) = 0 \}$
K = \{ R\in \mathrm{GL}(n)\mid \ell(R)=0, R_{ii}>0 \} \Rightarrow T_R K = \mathrm{Hor}_R
$K = \{ R\in \mathrm{GL}(n)\mid \ell(R)=0, R_{ii}>0 \} \Rightarrow T_R K = \mathrm{Hor}_R$
• $$G = \mathrm{GL}(n)$$
• $$Q = P(n)$$ (sym. pos. def. matrices)
• $$\mathcal A$$ such that:

$$QR$$ example

## Manifold of inverse covariance matrices

P(n) = \{ W\in \mathbb{R}^{n\times n}\mid W=W^\top, W>0 \}
$P(n) = \{ W\in \mathbb{R}^{n\times n}\mid W=W^\top, W>0 \}$
\displaystyle p(x;W^{-1}) = \sqrt{\frac{|W|}{(2\pi)^n}}\exp(-\frac{1}{2}x^\top W x)
$\displaystyle p(x;W^{-1}) = \sqrt{\frac{|W|}{(2\pi)^n}}\exp(-\frac{1}{2}x^\top W x)$

$$QR$$ example

## Fisher-Rao metric on $$P(n)$$

T_W P(n) = \{ U\in \mathbb{R}^{n\times n}\mid U=U^\top \}
$T_W P(n) = \{ U\in \mathbb{R}^{n\times n}\mid U=U^\top \}$
U
$U$
W
$W$
\bar{\mathcal G}_W(U,U) = \frac{1}{2}\mathrm{tr}(W^{-1}UW^{-1}U)
$\bar{\mathcal G}_W(U,U) = \frac{1}{2}\mathrm{tr}(W^{-1}UW^{-1}U)$

$$QR$$ example

## Fisher-Rao invariance

U
$U$
W
$W$
g_W(U,U) = \frac{1}{2}\mathrm{tr}(W^{-1}UW^{-1}U)
$g_W(U,U) = \frac{1}{2}\mathrm{tr}(W^{-1}UW^{-1}U)$
\mathrm{GL}(n)\times P(n) \ni (A,W) \mapsto A^\top W A \in P(n)
$\mathrm{GL}(n)\times P(n) \ni (A,W) \mapsto A^\top W A \in P(n)$

Right action of $$\mathrm{GL}(n)$$ on $$P(n)$$ is transitive

\bar{\mathcal G}_{A^\top W A}(A^\top U A,A^\top U A) = \bar{\mathcal G}_W(U,U)
$\bar{\mathcal G}_{A^\top W A}(A^\top U A,A^\top U A) = \bar{\mathcal G}_W(U,U)$

$$QR$$ example

## Compatible metric on GL(n)

\displaystyle \mathcal G_A(V,V) = \frac{1}{2}\mathrm{tr}\big(\ell(VA^{-1})^\top\ell(VA^{-1})+\sigma(VA^{-1})\sigma(VA^{-1}) \big)
$\displaystyle \mathcal G_A(V,V) = \frac{1}{2}\mathrm{tr}\big(\ell(VA^{-1})^\top\ell(VA^{-1})+\sigma(VA^{-1})\sigma(VA^{-1}) \big)$
V
$V$
A
$A$

$$QR$$ example

## Use relative entropy as "energy"

\displaystyle F_{W_1}(W) = \frac{n}{2}- \frac{1}{2}\mathrm{tr}(W_1W^{-1}) + \frac{1}{2}\log(\det(W_1W^{-1}))
$\displaystyle F_{W_1}(W) = \frac{n}{2}- \frac{1}{2}\mathrm{tr}(W_1W^{-1}) + \frac{1}{2}\log(\det(W_1W^{-1}))$

Notice: no regularization used here

$$QR$$ example

## Final "silly QR" flow

\displaystyle \dot R = \frac{1}{2} R^{-\top}(W_1-R^\top R) + ZR, \qquad Z\in\mathfrak{o}(n)
$\displaystyle \dot R = \frac{1}{2} R^{-\top}(W_1-R^\top R) + ZR, \qquad Z\in\mathfrak{o}(n)$

$$QR$$ example

## Convergence

Convexity lemma:

-\mathrm{Hess}(E)_R = \mathcal G_R
$-\mathrm{Hess}(E)_R = \mathcal G_R$

Corollary:

d^2(R(t),R_\infty) \leq \, \mathrm{e}^{-2 t} d^2(R(0),R_\infty)
$d^2(R(t),R_\infty) \leq \, \mathrm{e}^{-2 t} d^2(R(0),R_\infty)$

$$QR$$ example

R = \begin{bmatrix} 3 & -1 \\ 0 & 2 \end{bmatrix}
$R = \begin{bmatrix} 3 & -1 \\ 0 & 2 \end{bmatrix}$
W_1 = \pi(R) = R^\top R
$W_1 = \pi(R) = R^\top R$

## Example

$$QR$$ example

## Example

-\bar H(R(t))
$-\bar H(R(t))$
\bar d(R(t),R_\infty)^2
$\bar d(R(t),R_\infty)^2$
\text{slope of }\exp(-2t)
$\text{slope of }\exp(-2t)$

$$QR$$ example

## Example: Brockett's flow

E(Q) = \mathrm{tr}(NQ^\top M Q), \qquad M\in P(n), \; N \in D(n)
$E(Q) = \mathrm{tr}(NQ^\top M Q), \qquad M\in P(n), \; N \in D(n)$
• $$G = \mathrm{O}(n)$$ with canonical metric (Killing form)
• Brockett's energy functional:

Brockett example

\dot Q = - Q(NQ^\top M Q - Q^\top MQN)
$\dot Q = - Q(NQ^\top M Q - Q^\top MQN)$

## New interpretation

• Action of O(n) on P(n) not transitive
• Energy functional defined by relative entropy

Brockett example

\displaystyle F_{N}(W) = \frac{n}{2}- \frac{1}{2}\mathrm{tr}(NW^{-1}) + \frac{1}{2}\log(\det(NW^{-1}))
$\displaystyle F_{N}(W) = \frac{n}{2}- \frac{1}{2}\mathrm{tr}(NW^{-1}) + \frac{1}{2}\log(\det(NW^{-1}))$

Proposition: Fisher-Rao gradient flow restricted to orbits is

\dot W = [W,[W^{-1},N]]
$\dot W = [W,[W^{-1},N]]$

Corollary: Expressed in $$\Sigma = W^{-1}$$ we get

\dot \Sigma = [\Sigma,[\Sigma,N]]
$\dot \Sigma = [\Sigma,[\Sigma,N]]$

Double bracket form of Brockett's flow

## Example: density matching

• $$(M,\mathsf{g})$$ compact Riemannian manifold
• $$G = \mathrm{Diff}(M)$$
• $$S = \mathrm{Dens}(M) = \{\rho \in C^\infty(M)\mid \rho>0, \int_M \rho\mu = 1\}$$
• $$R = \mathrm{Dens}(M)$$, so $$Q=\mathrm{Dens}(M)\times\mathrm{Dens}(M)$$
• Action by pushforward (not composition!)
• Right-invariant metric $\langle v,v\rangle = \int_M \mathsf{g}(v,\Delta v)\mu + \cdots$
• Energy functional $F_{\rho_1}(\rho,\sigma) = d_F^2(\rho_1,\rho) + \epsilon d_F^2(1,\sigma)$

Density example

lots of structure!

(with M. Bauer and S. Joshi)

## Riemannian submersion

\mathrm{Diff}(M)
$\mathrm{Diff}(M)$
\mathrm{Prob}(M)
$\mathrm{Prob}(M)$
\mathrm{Id}
$\mathrm{Id}$
\mu
$\mu$
\mu_0
$\mu_0$
\pi(\varphi)=\varphi^*\mu
$\pi(\varphi)=\varphi^*\mu$

$$H^1$$ metric

\mathrm{Hor}
$\mathrm{Hor}$

Fisher-Rao metric = explicit geodesics

Density example

• Momentum map $$J:T^*\mathrm{Dens}(M)\to \mathfrak X(M)^*$$$J(\rho,p) = \rho\nabla p$

Gradient flow on orbits of $$\mathrm{Dens}(M)\times\mathrm{Dens}(M)$$

\dot \rho = - v\cdot \nabla \rho - \rho\, \mathrm{div}(v)
$\dot \rho = - v\cdot \nabla \rho - \rho\, \mathrm{div}(v)$
\dot \sigma = - v\cdot \nabla \sigma - \sigma\, \mathrm{div}(v)
$\dot \sigma = - v\cdot \nabla \sigma - \sigma\, \mathrm{div}(v)$
\displaystyle \Delta v = \rho \nabla \frac{\delta F_{\rho_1}}{\delta\rho}(\rho,\sigma) + \varepsilon\sigma \nabla \frac{\delta F_{\rho_1}}{\delta\sigma}(\rho,\sigma)
$\displaystyle \Delta v = \rho \nabla \frac{\delta F_{\rho_1}}{\delta\rho}(\rho,\sigma) + \varepsilon\sigma \nabla \frac{\delta F_{\rho_1}}{\delta\sigma}(\rho,\sigma)$

## Application: draw samples from non-uniform distribution on $$M$$

Problem 1: given $$\mu\in\mathrm{Dens}(M)$$ generate $$N$$ samples from $$\mu$$

Most cases: use Monte-Carlo based methods

Special case here:

• $$M$$ low dimensional
• $$\mu$$ very non-uniform
• $$N$$ very large

transport map approach

might be useful

Density example

## Optimal transport problem

Problem 2: given $$\mu\in\mathrm{Dens}(M)$$ find  $$\varphi\in\mathrm{Diff}(M)$$ minimizing

under constraint $$\varphi_*\mu_0 = \mu$$

Studied case: (Moselhy and Marzouk 2012, Reich 2013, ...)

• $$\mathrm{dist}$$ = $$L^2$$-Wasserstein distance
• $$\Rightarrow$$ optimal mass transport problem
• $$\Rightarrow$$ solve Monge-Ampere equation (heavily non-linear PDE)
E(\varphi) = \mathrm{dist}(\mathrm{id},\varphi)^2
$E(\varphi) = \mathrm{dist}(\mathrm{id},\varphi)^2$

Our notion:

• use optimal information transport

Density example

## Simple 2D example

Warp computation time (256*256 gridsize, 100 time-steps): ~1s

Sample computation time (10^7 samples): < 1s

Density example

## NOT good for image registration

$$\rho_0$$

$$\rho_1$$

Density example

## ...but works with regularization!

$$\rho_0$$

$$\rho_1$$

Density example

## Application: CT of breathing lung

Data: breathing cycle of rat, CT imaging

Density example

## Results

Regularized density flow

LDDMM

Density example

## Example: LDDMM

• $$M$$ Riemannian manifold
• $$G = \mathrm{Diff}(M)$$
• $$S = C^\infty(M)$$
• $$R = \mathrm{Diff}(M)$$
• Action by composition
• Right-invariant $$H^1$$ metric from $$\mathrm{Met}$$
• Energy functional $F_{I_1}(I,\eta) = \vert I_1-I \vert_{L^2}^2 + \epsilon d_{\mathcal A}^2(\mathrm{id},\eta)$

LDDMM example

Why is LDDMM computationally expensive?

Because $$\frac{\delta d^2_\mathcal A(\mathrm{id},\cdot)}{\delta \eta}$$ is expensive

## Deformation regularization

• $$(M,\mathsf g)$$ Riemannian manifold
• $$G = \mathrm{Diff}(M)$$
• $$S = C^\infty(M)$$
• $$R = \mathrm{Met}(M)$$
• Action by composition and push-forward
• Right-invariant metric defined by kernel
• Energy functional $F_{I_1}(I,\mathsf h) = \vert I_1-I \vert_{L^2}^2 + \epsilon d_{\mathrm{Met}}^2(\mathsf{g},\mathsf h)$
• That is $E(\varphi) = \vert I_1-I_0\circ\varphi^{-1} \vert_{L^2}^2 + \epsilon d_{\mathrm{Met}}^2(\mathsf{g},\varphi_*\mathsf g)$

LDDMM example

explicit formula (cf. Peter's talk)

(cf. Joshi, Pennec, and others)

deformation

tensor

• Momentum map $$J:T^*(C^\infty(M)\times\mathrm{Met}(M))\to \mathfrak X(M)^*$$$J(I,\mathsf h,p,\alpha) = p\nabla I + \vert \mathsf h\vert^{1/2}\mathrm{div}(\alpha)$

Gradient flow on orbit in $$C^\infty(M)\times\mathrm{Met}(M)$$

\dot I = - v\cdot \nabla I
$\dot I = - v\cdot \nabla I$
\dot \mathsf h = - \mathcal L_v \mathsf h
$\dot \mathsf h = - \mathcal L_v \mathsf h$
\displaystyle \mathcal A v = (I-I_0)\nabla I + \varepsilon \vert\mathsf h\vert^{1/2}\mathrm{div}(\mathsf h - \mathsf g)
$\displaystyle \mathcal A v = (I-I_0)\nabla I + \varepsilon \vert\mathsf h\vert^{1/2}\mathrm{div}(\mathsf h - \mathsf g)$

LDDMM example

## Example: reconstruction

• $$(M,\mathsf{g})$$ Riemannian manifold
• $$G = \mathrm{Diff}(M)$$
• $$S =$$ some shape space (image or density)
• $$R = \mathrm{Dens}(M)$$ or $$\mathrm{Met}(M)$$
• Forward model $$T: S \to \mathcal H$$ (think ray-transform)
• Energy functional

Reconstruction example

(with O. Öktem)

E(\varphi) = \vert T(\varphi_*\rho_0) - \mathbf{d}\vert_{\mathcal H}^2 + \varepsilon \mathrm{dist}^2(\mathsf{g},\varphi_*\mathsf{g})
$E(\varphi) = \vert T(\varphi_*\rho_0) - \mathbf{d}\vert_{\mathcal H}^2 + \varepsilon \mathrm{dist}^2(\mathsf{g},\varphi_*\mathsf{g})$

Reconstruction example

Gradient flow on orbit in $$\mathrm{Dens}(M)\times\mathrm{Met}(M)$$

\dot \rho = - \mathrm{div}(\rho v)
$\dot \rho = - \mathrm{div}(\rho v)$
\dot \mathsf h = - \mathcal L_v \mathsf h
$\dot \mathsf h = - \mathcal L_v \mathsf h$
\displaystyle \mathcal A v = \rho\nabla T^*(T\rho-\mathbf d) + \varepsilon \vert\mathsf h\vert^{1/2}\mathrm{div}(\mathsf h - \mathsf g)
$\displaystyle \mathcal A v = \rho\nabla T^*(T\rho-\mathbf d) + \varepsilon \vert\mathsf h\vert^{1/2}\mathrm{div}(\mathsf h - \mathsf g)$

## Preliminary numerical results

Reconstruction example

# THANKS!

References

Slides available at: slides.com/kmodin

By Klas Modin

# Riemannian gradient flows in shape analysis

Presentation given 2017-11-13 at the Isaac Newton Institute in Cambridge.

• 1,801