## Riemannian principal bundles

Diagram: $$E \hookleftarrow H$$, projection $$\pi\colon E \to B$$ with $$E/H\simeq B$$, and horizontal distribution $$\mathrm{Hor}$$.

Invariant Riemannian metric on $$E$$

$$\Rightarrow$$ $$\pi$$ Riemannian submersion

## Riemannian principal bundles

Diagram: $$G \hookleftarrow H$$, projection $$\pi\colon G \to B$$ with $$G/H\simeq B$$, identity $$e\in G$$, and horizontal distribution $$\mathrm{Hor}$$.

Left cosets $$[g] = g\cdot H$$

Semi-invariant Riemannian metric on $$G$$

$$\Rightarrow$$ $$\pi$$ Riemannian submersion

## Riemannian principal bundles

Diagram: $$G \hookleftarrow G_{b_0}$$, projection $$\pi(g) = g\cdot b_0$$ onto $$B$$ with $$G/G_{b_0}\simeq B$$, base point $$b_0\in B$$ with isotropy subgroup $$G_{b_0}$$, identity $$e\in G$$, and horizontal distribution $$\mathrm{Hor}$$. The diagram shows the horizontal flow and the vertical flow.

Left cosets $$[g] = g\cdot G_{b_0}$$

Semi-invariant Riemannian metric on $$G$$

$$\Rightarrow$$ $$\pi$$ Riemannian submersion

Factorization of $$g\in G$$ with $$b_1 = \pi(g) = g\cdot b_0$$:

Horizontal curve $$\gamma$$ with $$\gamma'(t)\in \mathrm{Hor}$$ and endpoint $$\gamma(1)\in G$$

$$g = \gamma(1)(\gamma(1)^{-1}g) = \gamma(1)h = k h$$

Polar cone $$K$$, with $$k\in K$$

## Optimal mass transport (OMT)

$$\displaystyle\min_{\varphi_*\mu_0=\mu_1} \int_{M} d_M^2(\varphi(x),x)\, \mu_0$$

Diagram: densities $$\mu_0$$ and $$\mu_1$$ on $$M$$, with push-forward $$\varphi_*\mu_0$$.

Monge problem, $$L^2$$ version

## Optimal mass transport (OMT)

$$\displaystyle\min_{\varphi_*\mu_0=\mu_1} \int_{\mathbb{R}^n} \lvert \varphi(x)-x \rvert^2\, \mu_0$$

Diagram: densities $$\mu_0$$ and $$\mu_1$$ on $$\mathbb{R}^n$$, with push-forward $$\varphi_*\mu_0$$.

Monge problem, $$L^2$$ version
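
As a small illustration of the Monge problem on $$\mathbb{R}$$, the sketch below replaces $$\mu_0$$ and $$\mu_1$$ by equal-weight point masses, so that transport maps become permutations. It checks by brute force that the monotone (sorted) matching attains the minimal quadratic cost. The sample data, the seed and the function names are assumptions made for this example and are not part of the talk.

```python
import itertools
import numpy as np

def transport_cost(x, y, perm):
    """Quadratic cost of sending x[i] to y[perm[i]], all points with equal weight."""
    return float(np.mean((y[list(perm)] - x) ** 2))

def monotone_assignment(x, y):
    """Permutation matching the k-th smallest x to the k-th smallest y."""
    perm = np.empty(len(x), dtype=int)
    perm[np.argsort(x)] = np.argsort(y)
    return perm

rng = np.random.default_rng(0)          # illustrative data only
x = rng.normal(size=6)                  # discrete stand-in for mu_0
y = rng.normal(loc=1.0, scale=2.0, size=6)  # discrete stand-in for mu_1

perm_sorted = monotone_assignment(x, y)
cost_sorted = transport_cost(x, y, perm_sorted)

# Brute force over all 6! = 720 candidate maps.
cost_best = min(
    transport_cost(x, y, p) for p in itertools.permutations(range(len(x)))
)

print(cost_sorted, cost_best)
```

In one dimension the monotone map is the gradient of a convex function, which is the discrete counterpart of the polar cone picture used later in the talk.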

## Riemannian structure of OMT

Principal bundle $$\mathrm{Diff}(M) \to \mathrm{Dens}(M)\simeq \mathrm{Diff}(M)/\mathrm{Diff}_{\mu_0}(M)$$ with projection $$\pi(\varphi)=\varphi_*\mu_0$$. Diagram: $$\mathrm{Id}$$, $$\mu_0$$, $$\mu_1$$.

Riemannian metric

$$\displaystyle\mathcal{G}_\varphi(\dot\varphi,\dot\varphi) = \int_{M}\left\vert \dot\varphi \right\vert^2 \mu_0$$

Invariance: $$\eta\in\mathrm{Diff}_{\mu_0}(M)$$

$$\displaystyle\mathcal{G}_\varphi(\dot\varphi,\dot\varphi) = \mathcal{G}_{\varphi\circ\eta}(\dot\varphi\circ\eta,\dot\varphi\circ\eta)$$

Induced metric

$$\overline{\mathcal{G}}_\mu(\dot\mu,\dot\mu) = \int_M \lvert \nabla \theta\rvert^2 \mu, \qquad \dot\rho + \operatorname{div}(\rho \nabla\theta) = 0, \; \rho = \mu/dx$$

[Benamou & Brenier (2000), Otto (2001)]

## Geodesics on $$\operatorname{Diff}(\mathbb{R}^n)$$

Geodesic equation:

$$\ddot\varphi = 0 \Rightarrow \varphi(t) = \mathrm{Id} + t\,v_0$$

Horizontal directions at the identity:

$$\mathrm{Hor}_{\mathrm{Id}} = \nabla C^\infty(\mathbb{R}^n)$$

Diagram: $$\mathrm{Id}$$, $$\mu_0$$, $$K$$, $$\mu_1$$.

Easy to prove:

Polar cone $$K$$ is isomorphic to strictly convex smooth functions via $$\phi \mapsto \nabla\phi$$

Hard to prove:

Polar cone $$K$$ is a section of the principal bundle

Brenier's decomposition of transport maps:

$$\varphi = \nabla\phi\circ\eta, \; \eta\in \operatorname{Diff}_{\mu_0}(\mathbb{R}^n)$$

## Geodesic distance on $$\operatorname{Diff}(\mathbb{R}^n)$$

Geodesic curve:

$$\varphi(t) = (1-t)\varphi_0 + t \varphi_1$$

$$\mathrm{dist}(\varphi_0,\varphi_1)^2 = \int_0^1 \mathcal{G}_{\varphi(t)}(\dot\varphi(t),\dot\varphi(t))\, dt = \int_0^1\int_{\mathbb{R}^n} \lvert \dot\varphi(t)\rvert^2\mu_0\, dt = \int_{\mathbb{R}^n} \lvert \varphi_1(x)-\varphi_0(x) \rvert^2 \mu_0$$

Diagram: $$\mathrm{Id}$$, $$\mu_0$$, $$K$$, $$\mu_1$$.

In particular:

$$J(\varphi) = \mathrm{dist}(\mathrm{Id},\varphi)^2$$

## Monge-Ampere equation on $$\mathbb{R}^n$$

Geodesic curve:

$$\varphi(t) = \underbrace{\nabla( |x|^2/2 + t f )}_{\nabla\phi}$$

Diagram: $$\mathrm{Id}$$, $$\mu_0$$, $$K$$, $$\mu_1$$.

In particular:

$$J(\varphi) = \mathrm{dist}(\mathrm{Id},\varphi)^2$$

$$(\nabla\phi)_*\mu_0 = \mu_1$$

$$\displaystyle \Rightarrow \operatorname{det}(\nabla^2\phi) = \frac{\rho_0}{\rho_1\circ \nabla\phi}$$
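
A quick numerical sanity check of the Monge-Ampere equation in one dimension, assuming zero-mean Gaussian densities $$\rho_0$$, $$\rho_1$$ with standard deviations `s0`, `s1` (values chosen only for illustration). For this case the candidate potential is $$\phi(x) = \tfrac{s_1}{2 s_0}x^2$$, and the sketch compares $$\phi''$$ with $$\rho_0/(\rho_1\circ\phi')$$ on a grid using finite differences.

```python
import numpy as np

def gaussian_density(x, s):
    """Density of a zero-mean Gaussian with standard deviation s."""
    return np.exp(-x**2 / (2 * s**2)) / np.sqrt(2 * np.pi * s**2)

s0, s1 = 0.7, 1.9                      # illustrative standard deviations
x = np.linspace(-2.0, 2.0, 401)
h = x[1] - x[0]

phi = 0.5 * (s1 / s0) * x**2           # candidate convex potential
grad_phi = np.gradient(phi, h)         # numerical phi'
hess_phi = np.gradient(grad_phi, h)    # numerical phi''

# Drop two points at each end, where one-sided differences are less accurate.
lhs = hess_phi[2:-2]
rhs = (gaussian_density(x, s0) / gaussian_density(grad_phi, s1))[2:-2]
residual = float(np.max(np.abs(lhs - rhs)))

print(residual)
```

Both sides reduce to the constant $$s_1/s_0$$ here, so the residual only measures rounding error.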

## Linear optimal mass transport

Trivial observation: if $$\varphi_0(x) = A_0 x$$ and $$\varphi_1(x) = A_1 x$$ are linear diffeomorphisms, then the geodesic between them consists of linear diffeomorphisms

Consequence: $$GL(n)$$ is a totally geodesic subgroup of $$\operatorname{Diff}(\mathbb{R}^n)$$

Corresponding subspace of densities (statistical submanifold): multivariate Gaussians with zero mean

$$\displaystyle \rho(x) = \frac{1}{\sqrt{(2\pi)^n\det(\Sigma)}}\exp\left(-\frac{1}{2}x^\top \Sigma^{-1}x\right), \quad \Sigma \in P(n)$$

## Bundle structure

Principal bundle $$GL(n) \to P(n)\simeq GL(n)/O(n,\Sigma_0)$$ with projection $$\pi(A)=A\Sigma_0 A^\top$$. Diagram: $$I$$, $$\Sigma_0$$, $$\Sigma_1$$.

Riemannian metric on $$GL(n)$$:

$$\mathcal G_A(\dot A,\dot A) = \mathrm{tr}(\Sigma_0 \dot A^\top \dot A)$$

Induced metric on $$P(n)$$:

$$\bar{\mathcal G}_\Sigma(\dot \Sigma,\dot \Sigma) = \mathrm{tr}(\Sigma SS), \quad \dot\Sigma = S\Sigma + \Sigma S$$

Polar cone: $$K = P(n)$$
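
The relation between the two metrics can be checked numerically. The sketch below takes a tangent vector $$\dot\Sigma$$, solves $$S\Sigma + \Sigma S = \dot\Sigma$$ for a symmetric $$S$$, and uses the lift $$\dot A = SA$$. It then compares $$\mathcal G_A(\dot A,\dot A)$$ with $$\mathrm{tr}(\Sigma SS)$$. The random matrices, the helper name `solve_lyapunov_spd` and the choice of lift are assumptions made for this example.

```python
import numpy as np

def solve_lyapunov_spd(sigma, sigma_dot):
    """Solve S @ sigma + sigma @ S = sigma_dot for S, with sigma symmetric positive definite."""
    lam, u = np.linalg.eigh(sigma)
    rhs = u.T @ sigma_dot @ u
    # In the eigenbasis the equation decouples: S_ij * (lam_i + lam_j) = rhs_ij.
    s_eig = rhs / (lam[:, None] + lam[None, :])
    return u @ s_eig @ u.T

rng = np.random.default_rng(1)          # illustrative data only
n = 3
m = rng.normal(size=(n, n))
sigma0 = m @ m.T + n * np.eye(n)        # reference covariance Sigma_0
a = np.eye(n) + 0.1 * rng.normal(size=(n, n))   # a matrix close to the identity
sigma = a @ sigma0 @ a.T                # pi(A)

x = rng.normal(size=(n, n))
sigma_dot = x + x.T                     # symmetric tangent vector at sigma
s = solve_lyapunov_spd(sigma, sigma_dot)

a_dot = s @ a                           # candidate lift of sigma_dot
pushed = a_dot @ sigma0 @ a.T + a @ sigma0 @ a_dot.T   # derivative of pi along a_dot
upstairs = np.trace(sigma0 @ a_dot.T @ a_dot)          # metric on GL(n)
downstairs = np.trace(sigma @ s @ s)                   # induced metric on P(n)

print(upstairs, downstairs)
```

The two printed numbers agree because $$\mathrm{tr}(\Sigma_0 A^\top S S A) = \mathrm{tr}(A\Sigma_0 A^\top SS)$$ by cyclicity of the trace.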

## Bundle structure

Monge-Ampere equation for $$P\in K = P(n)$$:

$$\pi(P)=P\Sigma_0 P = \Sigma_1$$

## Bundle structure

Factorization theorem:

$$A = PQ$$

with $$A\in GL(n)$$ and $$P\in K = P(n)$$
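
A minimal sketch of the factorization, under the assumption that $$Q$$ lies in $$O(n,\Sigma_0) = \{Q : Q\Sigma_0 Q^\top = \Sigma_0\}$$ (the slides do not spell out this definition). It uses the standard closed form $$P = \Sigma_0^{-1/2}(\Sigma_0^{1/2}\Sigma_1\Sigma_0^{1/2})^{1/2}\Sigma_0^{-1/2}$$ for the symmetric positive definite solution of $$P\Sigma_0 P = \Sigma_1$$, then sets $$Q = P^{-1}A$$. It also compares $$\mathcal G$$-distance squared from $$I$$ to $$P$$ with the closed-form trace expression. The helper names and the random test data are hypothetical.

```python
import numpy as np

def spd_sqrt(m):
    """Symmetric square root of a symmetric positive definite matrix."""
    lam, u = np.linalg.eigh(m)
    return (u * np.sqrt(lam)) @ u.T

def polar_factor(sigma0, sigma1):
    """Symmetric positive definite P with P @ sigma0 @ P = sigma1."""
    r = spd_sqrt(sigma0)
    r_inv = np.linalg.inv(r)
    return r_inv @ spd_sqrt(r @ sigma1 @ r) @ r_inv

rng = np.random.default_rng(2)          # illustrative data only
n = 3
m = rng.normal(size=(n, n))
sigma0 = m @ m.T + n * np.eye(n)
a = np.eye(n) + 0.2 * rng.normal(size=(n, n))   # a matrix close to the identity
sigma1 = a @ sigma0 @ a.T               # pi(A)

p = polar_factor(sigma0, sigma1)
q = np.linalg.solve(p, a)               # Q = P^{-1} A

# Squared distance from I to P in the metric tr(Sigma_0 dA^T dA) ...
cost = np.trace(sigma0 @ (p - np.eye(n)).T @ (p - np.eye(n)))
# ... compared with the closed-form trace expression.
r = spd_sqrt(sigma0)
closed_form = np.trace(sigma0) + np.trace(sigma1) - 2 * np.trace(spd_sqrt(r @ sigma1 @ r))

print(cost, closed_form)
```

Here `q` preserves `sigma0` because $$P^{-1}\Sigma_1 P^{-1} = \Sigma_0$$, which is the Monge-Ampere equation read backwards.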

## Bundle structure

Projected gradient flow of $$J$$ starting at $$A$$:

$$\dot B = -\mathrm{Pr}\nabla_{\mathcal G}J(B), \; B(0) = A$$

## Bundle structure

Flow starting at $$A$$:

$$\dot B = \Omega B, \; B(0) = A$$

$$\Sigma_1 \Omega + \Omega\Sigma_1 = 2\Sigma_1 (B^{-1}-B^{-\top})$$

## Bundle structure

Relative entropy (Kullback-Leibler):

$$\displaystyle H_{\Sigma_1}(\Sigma) = -\frac{1}{2}\mathrm{tr}(\Sigma_1^{-1}\Sigma) + \frac{1}{2}\log\det(\Sigma_1^{-1}\Sigma)$$

Gradient flow $$\Sigma(t)$$ on $$P(n)$$:

$$\dot \Sigma = \mathrm{Pr}\nabla_{\bar{\mathcal G}}H_{\Sigma_1}(\Sigma)$$

$$\dot \Sigma = 2I - \Sigma_1^{-1}\Sigma - \Sigma\Sigma_1^{-1}$$
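
The explicit flow $$\dot \Sigma = 2I - \Sigma_1^{-1}\Sigma - \Sigma\Sigma_1^{-1}$$ is a linear matrix ODE with fixed point $$\Sigma_1$$, so it is easy to integrate. The sketch below uses explicit Euler steps (a choice made for brevity, not the method of the talk) and checks that $$\Sigma(t)$$ approaches $$\Sigma_1$$ while $$H_{\Sigma_1}$$ increases to its maximum $$-n/2$$. Step size, step count and test matrices are illustrative assumptions.

```python
import numpy as np

def flow_rhs(sigma, sigma1_inv):
    """Right-hand side 2I - Sigma_1^{-1} Sigma - Sigma Sigma_1^{-1}."""
    return 2 * np.eye(len(sigma)) - sigma1_inv @ sigma - sigma @ sigma1_inv

def relative_entropy(sigma, sigma1_inv):
    """H_{Sigma_1}(Sigma) = -tr(Sigma_1^{-1} Sigma)/2 + log det(Sigma_1^{-1} Sigma)/2."""
    m = sigma1_inv @ sigma
    _, logdet = np.linalg.slogdet(m)
    return -0.5 * np.trace(m) + 0.5 * logdet

rng = np.random.default_rng(3)          # illustrative data only
n = 3
m0 = rng.normal(size=(n, n))
m1 = rng.normal(size=(n, n))
sigma0 = m0 @ m0.T + np.eye(n)          # start point Sigma_0
sigma1 = m1 @ m1.T + np.eye(n)          # target Sigma_1 (eigenvalues >= 1)
sigma1_inv = np.linalg.inv(sigma1)

dt = 0.05                               # small compared with 1 / (2 * max eigenvalue of Sigma_1^{-1})
sigma = sigma0.copy()
h_start = relative_entropy(sigma, sigma1_inv)
for _ in range(10000):
    sigma = sigma + dt * flow_rhs(sigma, sigma1_inv)
h_end = relative_entropy(sigma, sigma1_inv)

print(np.max(np.abs(sigma - sigma1)), h_start, h_end)
```

Writing $$E = \Sigma - \Sigma_1$$, the flow becomes $$\dot E = -\Sigma_1^{-1}E - E\Sigma_1^{-1}$$, which explains the exponential convergence seen in the output.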

## Bundle structure

Lifted gradient flow $$P(t)$$ on $$K = P(n)$$ for

$$\displaystyle F(P) = H_{\Sigma_1}(P\Sigma_0 P)$$

$$\dot P = P^{-1}\Sigma_0^{-1} - \Sigma_1^{-1}P + V$$

Hessian of $$F(P)$$ strictly positive on $$K$$ $$\Rightarrow$$ unique limit!

## Wasserstein-Otto vs. Fisher-Rao

$$\mathrm{Dens}(M)$$ with tangent spaces $$T_\mu\mathrm{Dens}(M)\simeq C^\infty_0(M)$$

Wasserstein

$$\displaystyle\overline{\mathcal{G}}_{\rho dx}(\dot\rho dx,\dot\rho dx) = \int_{M} |\nabla\theta|^2\rho, \qquad \dot\rho + \mathrm{div}(\rho \nabla\theta) = 0$$

Dependent on Riemannian structure of $$M$$

Fisher-Rao

$$\displaystyle\overline{\mathcal{G}}_\mu(\dot\mu,\dot\mu) = \int_{M} \frac{\dot\mu}{\mu}\frac{\dot\mu}{\mu}\mu$$

Independent of Riemannian structure of $$M \Rightarrow \mathrm{Diff}(M)$$-invariance

Zero-mean Gaussians, parameterized by $$\Sigma$$ or by $$W$$:

$$\displaystyle \rho(x) = \frac{1}{\sqrt{(2\pi)^n\det(\Sigma)}}\exp\left(-\frac{1}{2}x^\top \Sigma^{-1}x\right)$$

$$\displaystyle \rho(x) = \sqrt{\frac{\det(W)}{(2\pi)^n}}\exp\left(-\frac{1}{2}x^\top W x\right)$$

## Brockett flow

Diagram: $$P(n)$$, $$D(n)$$, $$N$$, $$W_1$$ and the orbit $$\mathrm{Orb}(W_1)$$.

$$H_N(W)$$ relative entropy functional

$$\displaystyle H_{N}(W) = -\frac{1}{2}\mathrm{tr}(N W^{-1}) + \frac{1}{2}\log\det(N W^{-1})$$

Functional $$F(Q) = H_N(Q^\top W_1 Q)$$ on $$O(n)$$

# Heat flow

Wasserstein-Otto metric

$$\overline{\mathcal{G}}_\mu(\dot\mu,\dot\mu) = \int_M \lvert \nabla \theta\rvert^2 \mu, \qquad \dot\rho + \operatorname{div}(\rho \nabla\theta) = 0, \; \rho = \mu/dx$$

$$\Rightarrow$$ Riemannian gradient flow $$\dot\rho = -\nabla_{\overline{\mathcal G}}F(\rho)$$

$$\dot\rho = \operatorname{div}\left(\rho \nabla\frac{\delta F}{\delta\rho}\right)$$

Take $$F(\rho) = \int_M \log(\rho) \rho \Rightarrow \frac{\delta F}{\delta\rho} = \log(\rho)+1$$

$$\Rightarrow \; \dot\rho = \operatorname{div}(\rho\nabla\log\rho) = \Delta\rho$$
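
Since the heat equation is the gradient flow of $$F(\rho) = \int_M \log(\rho)\rho$$, the functional $$F$$ should decrease along solutions while total mass stays fixed. The sketch below checks this on a periodic one-dimensional grid with an explicit finite-difference scheme. The grid size, time step and initial density are assumptions chosen for illustration.

```python
import numpy as np

def entropy(rho, dx):
    """Discrete version of F(rho) = integral of rho * log(rho)."""
    return float(np.sum(rho * np.log(rho)) * dx)

def heat_step(rho, dx, dt):
    """One explicit Euler step of rho_t = rho_xx on a periodic grid."""
    lap = (np.roll(rho, -1) - 2 * rho + np.roll(rho, 1)) / dx**2
    return rho + dt * lap

n = 200
x = np.linspace(0.0, 1.0, n, endpoint=False)
dx = 1.0 / n
rho = 1.0 + 0.5 * np.sin(2 * np.pi * x)   # positive density with unit mass
dt = 0.4 * dx**2                          # below the stability bound dx^2 / 2

masses = [rho.sum() * dx]
entropies = [entropy(rho, dx)]
for _ in range(2000):
    rho = heat_step(rho, dx, dt)
    masses.append(rho.sum() * dx)
    entropies.append(entropy(rho, dx))

print(masses[0], masses[-1], entropies[0], entropies[-1])
```

With this time step each update is a convex combination of neighbouring values, so the decrease of the discrete $$F$$ follows from convexity of $$r\mapsto r\log r$$.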

# IPM and Toda

Same potential, different Riemannian metrics:

IPM: $$L^2$$ on velocity ($$H^{1}$$ on stream function)

Toda: $$H^{-1}$$ on velocity ($$L^2$$ on stream function)

Gradient flows on $$\mathrm{Diff}_\mu(S^2)$$

Figure labels: gravity; low density (light particles); high density (heavy particles).

Diagram: $$\mathrm{Diff}_\mu(M^2)$$, $$\mathrm{Id}$$, $$\rho_0$$, $$\rho_1$$.

# Summary

• IPM and Toda $$\Rightarrow$$ Riemannian gradient flows on $$\mathrm{Diff}_\mu(M)$$ (or quantized on $$\mathrm{SO}(n)$$)
• Same potential function
• Different (right-invariant) Riemannian metrics: IPM $$L^2$$, Toda $$H^{-1}$$
• Stronger metric $$\Rightarrow$$ more regular flow: IPM is an ODE on $$\mathrm{Diff}_\mu^s(M)$$ (for $$s>2$$), Toda is not an ODE on $$\mathrm{Diff}_\mu^s(M)$$

By Klas Modin

# Wasserstein-Otto geometry

Tutorial talk given 2023-11 in Banff.
