Riemannian gradient flows in shape analysis

Klas Modin

What can you do with a metric and a function?

  • Newton's equation on \(M\)



     
  • Riemannian gradient flow on \(M\)
\nabla_{\dot x}\dot x = -\nabla V(x)
x˙x˙=V(x)\nabla_{\dot x}\dot x = -\nabla V(x)
\dot x = -\nabla V(x)
x˙=V(x)\dot x = -\nabla V(x)

Formal geometric setting

\dot g = -\nabla^{\mathcal G} E(g)
g˙=GE(g)\dot g = -\nabla^{\mathcal G} E(g)
  • Lie group \(G\) acting on \(Q\)
  • Template (source) \(q_0\in Q\)
  • Target \(q_1 \in Q\)
  • Riemannian metric \(\mathcal G\) on \(G\)
  • Symmetric under isotropy subgroup \[ \mathcal G_{g\cdot q}(g\cdot \dot q,g\cdot\dot q) = G_q(\dot q,\dot q)\qquad \forall g\in G_{q_0}\]
  • "Energy" functional \[ E(g) = F_{q_1}(g\cdot q_0)\]

Def: momentum map \(J: T^*Q\to \mathfrak{g}^* \) for cotangent lifted action

 

 

\langle J(q,p),\xi \rangle = \langle p,\xi\cdot q \rangle
J(q,p),ξ=p,ξq\langle J(q,p),\xi \rangle = \langle p,\xi\cdot q \rangle

Proposition: gradient flow is

 

 

 

 

\dot g = - u(g\cdot q_0)\cdot g
g˙=u(gq0)g\dot g = - u(g\cdot q_0)\cdot g
u(q) := \mathcal A^{-1}J(q,d F_{q_1}(q))
u(q):=A1J(q,dFq1(q))u(q) := \mathcal A^{-1}J(q,d F_{q_1}(q))

Computing the gradient

Simplest special case: right-invariant metric \[ \mathcal G_{e}(\xi,\xi) = \langle \mathcal A\xi,\xi\rangle, \qquad \mathcal A:\mathfrak g\to\mathfrak g^*\]

-\nabla^{\mathcal G} E(g)
GE(g)-\nabla^{\mathcal G} E(g)
G
GG
Q
QQ
e
ee
q_0
q0q_0
g(t)
g(t)g(t)
q_1
q1q_1
q(t) = g(t)\cdot q_0
q(t)=g(t)q0q(t) = g(t)\cdot q_0

Geometric picture

Proposition: gradient flow on \(\mathrm{Orb}_G(q_0)\) is

 

 

 

 

\dot q = - u(q)\cdot q
q˙=u(q)q\dot q = - u(q)\cdot q
u(q) := \mathcal A^{-1}J(q,d F_{q_1}(q))
u(q):=A1J(q,dFq1(q))u(q) := \mathcal A^{-1}J(q,d F_{q_1}(q))

Induced gradient flow on \(q_0\)-orbit

\(\mathcal G\) induces metric \(\bar{\mathcal G}\) on \(\mathrm{Orb}_G(q_0)\)

\dot q = -\nabla^{\bar{\mathcal G}} F_{q_1}(q)
q˙=GˉFq1(q)\dot q = -\nabla^{\bar{\mathcal G}} F_{q_1}(q)

Typical form

 

 

 

 

F_{q_1}(q) = d(s,s_1) + \varepsilon e(r,r_0)
Fq1(q)=d(s,s1)+εe(r,r0)F_{q_1}(q) = d(s,s_1) + \varepsilon e(r,r_0)

Choice of \(F_{q_1}\) and regularization

  • Shape space \(S\)
  • Regularization space \(R\)
  • Take \(Q = S\times R\)
  • Take \(q_0 = (s_0,r_0)\)
  • Take \(q_1 = (s_1,r_0)\)
S
SS
R
RR
F_{q_1}(g\cdot q_0) = d(g\cdot s_0,s_1) + \varepsilon e(g\cdot r_0,r_0)
Fq1(gq0)=d(gs0,s1)+εe(gr0,r0)F_{q_1}(g\cdot q_0) = d(g\cdot s_0,s_1) + \varepsilon e(g\cdot r_0,r_0)

distance or divergence

r_0
r0r_0
s_0
s0s_0
s_1
s1s_1

Gradient flow

 

 

 

\dot g = - u(g\cdot q_0)\cdot g
g˙=u(gq0)g\dot g = - u(g\cdot q_0)\cdot g
u(q) = \mathcal A^{-1}J(q,F_{q_*}(q))
u(q)=A1J(q,Fq(q))u(q) = \mathcal A^{-1}J(q,F_{q_*}(q))

Lie-Euler method

 

 

 

 

 

q_k = g_k\cdot q_0
qk=gkq0q_k = g_k\cdot q_0
\xi_k = \mathcal{A}^{-1}J(q_k,d F_{q_*})
ξk=A1J(qk,dFq)\xi_k = \mathcal{A}^{-1}J(q_k,d F_{q_*})
g_{k+1} = \exp(-h\xi_k)g_k
gk+1=exp(hξk)gkg_{k+1} = \exp(-h\xi_k)g_k

Generic time discretization

Guides object-oriented design of shape analysis software

List of examples

  • Polar decompositions (in optimal transport)
    multivariate Gaussians, smooth, Dirac deltas?, entropic regularization? (Schrödinger bridge)
  • Toda flow
    connections to KdV, QR-algorithm, etc, etc
  • QR decomposition
    by "silly" entropy gradient flow
  • Brockett's flow
    new interpretation as entropy gradient flow
  • Density matching
    using optimal information transport
  • LDDMM
    also simple regularization using deformation tensor
  • Shape reconstruction in tomography

horizontal slice

I
II
R
RR

fiber

\pi
π\pi

fiber

I
II
W_1
W1W_1
K
KK
P(n)
P(n)P(n)
A
AA
Q
QQ

Example: \(QR\) decomposition

\mathrm{Hor}_A = \{ V\in T_A\mathrm{GL}(n) \mid \ell(VA^{-1}) = 0 \}
HorA={VTAGL(n)(VA1)=0}\mathrm{Hor}_A = \{ V\in T_A\mathrm{GL}(n) \mid \ell(VA^{-1}) = 0 \}
K = \{ R\in \mathrm{GL}(n)\mid \ell(R)=0, R_{ii}>0 \} \Rightarrow T_R K = \mathrm{Hor}_R
K={RGL(n)(R)=0,Rii>0}TRK=HorRK = \{ R\in \mathrm{GL}(n)\mid \ell(R)=0, R_{ii}>0 \} \Rightarrow T_R K = \mathrm{Hor}_R
  • \(G = \mathrm{GL}(n)\)
  • \(Q = P(n)\) (sym. pos. def. matrices)
  • \(\mathcal A\) such that:

\(QR\) example

Manifold of inverse covariance matrices

P(n) = \{ W\in \mathbb{R}^{n\times n}\mid W=W^\top, W>0 \}
P(n)={WRn×nW=W,W>0}P(n) = \{ W\in \mathbb{R}^{n\times n}\mid W=W^\top, W>0 \}
\displaystyle p(x;W^{-1}) = \sqrt{\frac{|W|}{(2\pi)^n}}\exp(-\frac{1}{2}x^\top W x)
p(x;W1)=W(2π)nexp(12xWx)\displaystyle p(x;W^{-1}) = \sqrt{\frac{|W|}{(2\pi)^n}}\exp(-\frac{1}{2}x^\top W x)

\(QR\) example

Fisher-Rao metric on \(P(n)\)

T_W P(n) = \{ U\in \mathbb{R}^{n\times n}\mid U=U^\top \}
TWP(n)={URn×nU=U}T_W P(n) = \{ U\in \mathbb{R}^{n\times n}\mid U=U^\top \}
U
UU
W
WW
\bar{\mathcal G}_W(U,U) = \frac{1}{2}\mathrm{tr}(W^{-1}UW^{-1}U)
GˉW(U,U)=12tr(W1UW1U)\bar{\mathcal G}_W(U,U) = \frac{1}{2}\mathrm{tr}(W^{-1}UW^{-1}U)

\(QR\) example

Fisher-Rao invariance

U
UU
W
WW
g_W(U,U) = \frac{1}{2}\mathrm{tr}(W^{-1}UW^{-1}U)
gW(U,U)=12tr(W1UW1U)g_W(U,U) = \frac{1}{2}\mathrm{tr}(W^{-1}UW^{-1}U)
\mathrm{GL}(n)\times P(n) \ni (A,W) \mapsto A^\top W A \in P(n)
GL(n)×P(n)(A,W)AWAP(n)\mathrm{GL}(n)\times P(n) \ni (A,W) \mapsto A^\top W A \in P(n)

Right action of \(\mathrm{GL}(n)\) on \(P(n)\) is transitive

\bar{\mathcal G}_{A^\top W A}(A^\top U A,A^\top U A) = \bar{\mathcal G}_W(U,U)
GˉAWA(AUA,AUA)=GˉW(U,U)\bar{\mathcal G}_{A^\top W A}(A^\top U A,A^\top U A) = \bar{\mathcal G}_W(U,U)

\(QR\) example

Compatible metric on GL(n)

\displaystyle \mathcal G_A(V,V) = \frac{1}{2}\mathrm{tr}\big(\ell(VA^{-1})^\top\ell(VA^{-1})+\sigma(VA^{-1})\sigma(VA^{-1}) \big)
GA(V,V)=12tr((VA1)(VA1)+σ(VA1)σ(VA1))\displaystyle \mathcal G_A(V,V) = \frac{1}{2}\mathrm{tr}\big(\ell(VA^{-1})^\top\ell(VA^{-1})+\sigma(VA^{-1})\sigma(VA^{-1}) \big)
V
VV
A
AA

\(QR\) example

Use relative entropy as "energy"

\displaystyle F_{W_1}(W) = \frac{n}{2}- \frac{1}{2}\mathrm{tr}(W_1W^{-1}) + \frac{1}{2}\log(\det(W_1W^{-1}))
FW1(W)=n212tr(W1W1)+12log(det(W1W1))\displaystyle F_{W_1}(W) = \frac{n}{2}- \frac{1}{2}\mathrm{tr}(W_1W^{-1}) + \frac{1}{2}\log(\det(W_1W^{-1}))

Notice: no regularization used here

\(QR\) example

Final "silly QR" flow

\displaystyle \dot R = \frac{1}{2} R^{-\top}(W_1-R^\top R) + ZR, \qquad Z\in\mathfrak{o}(n)
R˙=12R(W1RR)+ZR,Zo(n)\displaystyle \dot R = \frac{1}{2} R^{-\top}(W_1-R^\top R) + ZR, \qquad Z\in\mathfrak{o}(n)

\(QR\) example

Convergence

Convexity lemma:

 

 

-\mathrm{Hess}(E)_R = \mathcal G_R
Hess(E)R=GR-\mathrm{Hess}(E)_R = \mathcal G_R

Corollary:

 

 

d^2(R(t),R_\infty) \leq \, \mathrm{e}^{-2 t} d^2(R(0),R_\infty)
d2(R(t),R) e2td2(R(0),R) d^2(R(t),R_\infty) \leq \, \mathrm{e}^{-2 t} d^2(R(0),R_\infty)

\(QR\) example

R = \begin{bmatrix} 3 & -1 \\ 0 & 2 \end{bmatrix}
R=[3102]R = \begin{bmatrix} 3 & -1 \\ 0 & 2 \end{bmatrix}
W_1 = \pi(R) = R^\top R
W1=π(R)=RRW_1 = \pi(R) = R^\top R

Example

\(QR\) example

Example

-\bar H(R(t))
Hˉ(R(t))-\bar H(R(t))
\bar d(R(t),R_\infty)^2
dˉ(R(t),R)2\bar d(R(t),R_\infty)^2
\text{slope of }\exp(-2t)
slope of exp(2t)\text{slope of }\exp(-2t)

\(QR\) example

Example: Brockett's flow

E(Q) = \mathrm{tr}(NQ^\top M Q), \qquad M\in P(n), \; N \in D(n)
E(Q)=tr(NQMQ),MP(n),  ND(n)E(Q) = \mathrm{tr}(NQ^\top M Q), \qquad M\in P(n), \; N \in D(n)
  • \(G = \mathrm{O}(n)\) with canonical metric (Killing form)
  • Brockett's energy functional:

Brockett example

  • Gives gradient flow:
\dot Q = - Q(NQ^\top M Q - Q^\top MQN)
Q˙=Q(NQMQQMQN)\dot Q = - Q(NQ^\top M Q - Q^\top MQN)

New interpretation

  • Action of O(n) on P(n) not transitive
  • Energy functional defined by relative entropy

Brockett example

\displaystyle F_{N}(W) = \frac{n}{2}- \frac{1}{2}\mathrm{tr}(NW^{-1}) + \frac{1}{2}\log(\det(NW^{-1}))
FN(W)=n212tr(NW1)+12log(det(NW1))\displaystyle F_{N}(W) = \frac{n}{2}- \frac{1}{2}\mathrm{tr}(NW^{-1}) + \frac{1}{2}\log(\det(NW^{-1}))

Proposition: Fisher-Rao gradient flow restricted to orbits is

 

 

\dot W = [W,[W^{-1},N]]
W˙=[W,[W1,N]]\dot W = [W,[W^{-1},N]]

Corollary: Expressed in \(\Sigma = W^{-1}\) we get

 

\dot \Sigma = [\Sigma,[\Sigma,N]]
Σ˙=[Σ,[Σ,N]]\dot \Sigma = [\Sigma,[\Sigma,N]]

Double bracket form of Brockett's flow

Example: density matching

  • \((M,\mathsf{g})\) compact Riemannian manifold
  • \(G = \mathrm{Diff}(M)\)
  • \(S = \mathrm{Dens}(M) = \{\rho \in C^\infty(M)\mid \rho>0, \int_M \rho\mu = 1\}\)
  • \(R = \mathrm{Dens}(M)\), so \(Q=\mathrm{Dens}(M)\times\mathrm{Dens}(M)\)
  • Action by pushforward (not composition!)
  • Right-invariant metric \[ \langle v,v\rangle = \int_M \mathsf{g}(v,\Delta v)\mu + \cdots\]
  • Energy functional \[ F_{\rho_1}(\rho,\sigma) = d_F^2(\rho_1,\rho) + \epsilon d_F^2(1,\sigma)\]

Density example

lots of structure!

(with M. Bauer and S. Joshi)

Riemannian submersion

\mathrm{Diff}(M)
Diff(M)\mathrm{Diff}(M)
\mathrm{Prob}(M)
Prob(M)\mathrm{Prob}(M)
\mathrm{Id}
Id\mathrm{Id}
\mu
μ\mu
\mu_0
μ0\mu_0
\pi(\varphi)=\varphi^*\mu
π(φ)=φμ\pi(\varphi)=\varphi^*\mu

\(H^1\) metric

\mathrm{Hor}
Hor\mathrm{Hor}

Fisher-Rao metric = explicit geodesics

Density gradient flow

Density example

  • Momentum map \(J:T^*\mathrm{Dens}(M)\to \mathfrak X(M)^*\)\[ J(\rho,p) = \rho\nabla p\]

Gradient flow on orbits of \(\mathrm{Dens}(M)\times\mathrm{Dens}(M)\)

 

 

 

 

 

\dot \rho = - v\cdot \nabla \rho - \rho\, \mathrm{div}(v)
ρ˙=vρρ div(v)\dot \rho = - v\cdot \nabla \rho - \rho\, \mathrm{div}(v)
\dot \sigma = - v\cdot \nabla \sigma - \sigma\, \mathrm{div}(v)
σ˙=vσσ div(v)\dot \sigma = - v\cdot \nabla \sigma - \sigma\, \mathrm{div}(v)
\displaystyle \Delta v = \rho \nabla \frac{\delta F_{\rho_1}}{\delta\rho}(\rho,\sigma) + \varepsilon\sigma \nabla \frac{\delta F_{\rho_1}}{\delta\sigma}(\rho,\sigma)
Δv=ρδFρ1δρ(ρ,σ)+εσδFρ1δσ(ρ,σ)\displaystyle \Delta v = \rho \nabla \frac{\delta F_{\rho_1}}{\delta\rho}(\rho,\sigma) + \varepsilon\sigma \nabla \frac{\delta F_{\rho_1}}{\delta\sigma}(\rho,\sigma)

Application: draw samples from non-uniform distribution on \(M\)

Problem 1: given \(\mu\in\mathrm{Dens}(M)\) generate \(N\) samples from \(\mu\)

Most cases: use Monte-Carlo based methods

Special case here:

  • \(M\) low dimensional
  • \(\mu\) very non-uniform
  • \(N\) very large

transport map approach

might be useful

Density example

Optimal transport problem

Problem 2: given \(\mu\in\mathrm{Dens}(M)\) find  \(\varphi\in\mathrm{Diff}(M)\) minimizing


 

under constraint \(\varphi_*\mu_0 = \mu\)

Studied case: (Moselhy and Marzouk 2012, Reich 2013, ...)

  • \(\mathrm{dist}\) = \(L^2\)-Wasserstein distance
  • \(\Rightarrow\) optimal mass transport problem
  • \(\Rightarrow\) solve Monge-Ampere equation (heavily non-linear PDE)
E(\varphi) = \mathrm{dist}(\mathrm{id},\varphi)^2
E(φ)=dist(id,φ)2E(\varphi) = \mathrm{dist}(\mathrm{id},\varphi)^2

Our notion:

  • use optimal information transport

Density example

Simple 2D example

Warp computation time (256*256 gridsize, 100 time-steps): ~1s

Sample computation time (10^7 samples): < 1s

Density example

NOT good for image registration

\(\rho_0\)

\(\rho_1\)

Density example

...but works with regularization!

\(\rho_0\)

\(\rho_1\)

Density example

Application: CT of breathing lung

Data: breathing cycle of rat, CT imaging

Density example

Results

Regularized density flow

LDDMM

Density example

Example: LDDMM

  • \(M\) Riemannian manifold
  • \(G = \mathrm{Diff}(M)\)
  • \(S = C^\infty(M)\)
  • \(R = \mathrm{Diff}(M)\)
  • Action by composition
  • Right-invariant \(H^1\) metric from \(\mathrm{Met}\)
  • Energy functional \[ F_{I_1}(I,\eta) = \vert I_1-I \vert_{L^2}^2 + \epsilon d_{\mathcal A}^2(\mathrm{id},\eta)\]

LDDMM example

Why is LDDMM computationally expensive?

Because \( \frac{\delta d^2_\mathcal A(\mathrm{id},\cdot)}{\delta \eta}\) is expensive

Deformation regularization

  • \((M,\mathsf g)\) Riemannian manifold
  • \(G = \mathrm{Diff}(M)\)
  • \(S = C^\infty(M)\)
  • \(R = \mathrm{Met}(M)\)
  • Action by composition and push-forward
  • Right-invariant metric defined by kernel
  • Energy functional \[ F_{I_1}(I,\mathsf h) = \vert I_1-I \vert_{L^2}^2 + \epsilon d_{\mathrm{Met}}^2(\mathsf{g},\mathsf h)\]
  • That is \[E(\varphi) = \vert I_1-I_0\circ\varphi^{-1} \vert_{L^2}^2 + \epsilon d_{\mathrm{Met}}^2(\mathsf{g},\varphi_*\mathsf g)\]

LDDMM example

explicit formula (cf. Peter's talk)

(cf. Joshi, Pennec, and others)

deformation

tensor

Resulting gradient flow

  • Momentum map \(J:T^*(C^\infty(M)\times\mathrm{Met}(M))\to \mathfrak X(M)^*\)\[ J(I,\mathsf h,p,\alpha) = p\nabla I + \vert \mathsf h\vert^{1/2}\mathrm{div}(\alpha)\]

Gradient flow on orbit in \(C^\infty(M)\times\mathrm{Met}(M)\)

 

 

 

 

 

\dot I = - v\cdot \nabla I
I˙=vI\dot I = - v\cdot \nabla I
\dot \mathsf h = - \mathcal L_v \mathsf h
h˙=Lvh\dot \mathsf h = - \mathcal L_v \mathsf h
\displaystyle \mathcal A v = (I-I_0)\nabla I + \varepsilon \vert\mathsf h\vert^{1/2}\mathrm{div}(\mathsf h - \mathsf g)
Av=(II0)I+εh1/2div(hg)\displaystyle \mathcal A v = (I-I_0)\nabla I + \varepsilon \vert\mathsf h\vert^{1/2}\mathrm{div}(\mathsf h - \mathsf g)

LDDMM example

Example: reconstruction

  • \((M,\mathsf{g})\) Riemannian manifold
  • \(G = \mathrm{Diff}(M)\)
  • \(S = \) some shape space (image or density)
  • \(R = \mathrm{Dens}(M)\) or \(\mathrm{Met}(M)\)
  • Forward model \(T: S \to \mathcal H\) (think ray-transform)
  • Energy functional

Reconstruction example

(with O. Öktem)

E(\varphi) = \vert T(\varphi_*\rho_0) - \mathbf{d}\vert_{\mathcal H}^2 + \varepsilon \mathrm{dist}^2(\mathsf{g},\varphi_*\mathsf{g})
E(φ)=T(φρ0)dH2+εdist2(g,φg)E(\varphi) = \vert T(\varphi_*\rho_0) - \mathbf{d}\vert_{\mathcal H}^2 + \varepsilon \mathrm{dist}^2(\mathsf{g},\varphi_*\mathsf{g})

Resulting gradient flow

Reconstruction example

Gradient flow on orbit in \(\mathrm{Dens}(M)\times\mathrm{Met}(M)\)

 

 

 

 

 

\dot \rho = - \mathrm{div}(\rho v)
ρ˙=div(ρv)\dot \rho = - \mathrm{div}(\rho v)
\dot \mathsf h = - \mathcal L_v \mathsf h
h˙=Lvh\dot \mathsf h = - \mathcal L_v \mathsf h
\displaystyle \mathcal A v = \rho\nabla T^*(T\rho-\mathbf d) + \varepsilon \vert\mathsf h\vert^{1/2}\mathrm{div}(\mathsf h - \mathsf g)
Av=ρT(Tρd)+εh1/2div(hg)\displaystyle \mathcal A v = \rho\nabla T^*(T\rho-\mathbf d) + \varepsilon \vert\mathsf h\vert^{1/2}\mathrm{div}(\mathsf h - \mathsf g)

Preliminary numerical results

Reconstruction example

THANKS!

References

Slides available at: slides.com/kmodin

Riemannian gradient flows in shape analysis

By Klas Modin

Riemannian gradient flows in shape analysis

Presentation given 2017-11-13 at the Isaac Newton Institute in Cambridge.

  • 1,873