Théo Dumont
PhD student in optimal transport & geometry @ Université Gustave Eiffel
slides available at https://slides.com/theodumont/geometric-ot
Gaspard Monge
1746-1818
Leonid Kantorovitch
1912-1986
Yann Brenier
1957-
Definition. (OT Kantorovitch problem)
not feasible by a map!
\(\pi\) is induced by a transport map \(\varphi\)
\(\pi\) is a transport plan
relaxation
\(|\!\det d\varphi^{-1}|\mu_0 \circ \varphi^{-1}\)
[Monge, 1781], [Kantorovitch, 1942]
Optimal Transport
Definition. (OT Monge problem)
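The remark "not feasible by a map!" can be checked on a toy case. The sketch below assumes a hypothetical pair of discrete measures (one Dirac mass at 0, two half-masses at -1 and +1) and the quadratic cost; it enumerates all maps and then tests one plan that splits the mass.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy measures: one Dirac mass at 0, two half-masses at -1 and +1.
mu0 = {0: Fraction(1)}
mu1 = {-1: Fraction(1, 2), 1: Fraction(1, 2)}

def pushforward(mu, phi):
    """Push the discrete measure `mu` forward through the map `phi` (a dict)."""
    out = {}
    for x, mass in mu.items():
        out[phi[x]] = out.get(phi[x], Fraction(0)) + mass
    return out

# Monge: enumerate every map from supp(mu0) to supp(mu1), keep the admissible ones.
sources, targets = list(mu0), list(mu1)
feasible_maps = [
    dict(zip(sources, images))
    for images in product(targets, repeat=len(sources))
    if pushforward(mu0, dict(zip(sources, images))) == mu1
]

# Kantorovich: a plan is allowed to split the mass sitting at 0.
plan = {(0, -1): Fraction(1, 2), (0, 1): Fraction(1, 2)}

def marginals(pi):
    """Return the first and second marginals of a discrete plan."""
    first, second = {}, {}
    for (x, y), mass in pi.items():
        first[x] = first.get(x, Fraction(0)) + mass
        second[y] = second.get(y, Fraction(0)) + mass
    return first, second

plan_is_admissible = marginals(plan) == (mu0, mu1)
plan_cost = sum(mass * (x - y) ** 2 for (x, y), mass in plan.items())
```

No map is admissible here (`feasible_maps` is empty), while the splitting plan has the right marginals: this is the situation the relaxation to plans is designed to handle.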
Can we say that the solution of (K) is a map?
Theorem. (Brenier)
If \(\mu_0\) has a density, then (K) has a unique solution \(\pi\), and \(\pi\) is induced by a transport map of the form \(\varphi=\nabla f\) with \(f:\mathbb R^n\to\mathbb R\) convex.
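In dimension one, the gradient of a convex function is simply a nondecreasing map. As a discrete analogue only (discrete measures have no density, so the theorem itself does not apply), the sketch below assumes two hypothetical uniform measures on four points each and compares a brute-force search over all bijections with the monotone rearrangement, for the quadratic cost.

```python
from itertools import permutations

xs = [0.0, 1.0, 2.5, 4.0]    # hypothetical support of mu0 (uniform weights)
ys = [3.0, -1.0, 0.5, 2.0]   # hypothetical support of mu1 (uniform weights)

def cost(assignment):
    """Squared L2(mu0) distance between id and the map xs[i] -> ys[assignment[i]]."""
    return sum((x - ys[j]) ** 2 for x, j in zip(xs, assignment)) / len(xs)

# Brute force over all bijections between the two supports.
best = min(permutations(range(len(ys))), key=cost)

# Monotone rearrangement: the i-th smallest x is sent to the i-th smallest y.
order_x = sorted(range(len(xs)), key=xs.__getitem__)
order_y = sorted(range(len(ys)), key=ys.__getitem__)
monotone = [None] * len(xs)
for ix, iy in zip(order_x, order_y):
    monotone[ix] = iy
monotone = tuple(monotone)
```

On this example the brute-force optimum coincides with the monotone map, which is the 1D picture behind \(\varphi=\nabla f\) with \(f\) convex.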
relaxation
\(\pi\) is induced by a transport map \(\varphi\)
\(\pi\) is a transport plan
Monge (maps)
Kantorovitch (plans)
[Brenier, 1987]
Optimal Transport
Smooth densities:
can we recover classical results of OT theory with a geometric picture?
Definition. (Smooth OT problem)
Smooth OT
Smooth maps:
not right-invariant under all of \(\operatorname{Diff}(\mathbb R^n)\)! Invariant only under the right action of \(\operatorname{Diff}_{\mu_0}(\mathbb R^n)\)
Theorem. (Geodesic equation on \(\operatorname{Diff}(\mathbb R^n)\))
\(\ddot\varphi=0\),
or, if we write \(\dot\varphi=v\circ\varphi\), the inviscid Burgers equation \(\dot v+\nabla_v v=0\).
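A quick numerical sanity check of the link between straight-line particle paths and inviscid Burgers, in one dimension. The sketch assumes a hypothetical linear initial velocity \(v_0(x)=Ax\), for which the flow and the Eulerian velocity are known in closed form; the names `A`, `phi`, `v` and `burgers_residual` are illustrative.

```python
A = 0.7  # slope of the hypothetical initial velocity field v0(x) = A * x

def phi(t, x):
    """Straight-line flow: each particle keeps its initial velocity v0(x)."""
    return x + t * A * x

def v(t, y):
    """Eulerian velocity defined by d(phi)/dt = v(t, phi): v(t, y) = A*y / (1 + A*t)."""
    return A * y / (1.0 + A * t)

def burgers_residual(t, y, h=1e-5):
    """Central finite-difference estimate of dv/dt + v * dv/dy at (t, y)."""
    dv_dt = (v(t + h, y) - v(t - h, y)) / (2 * h)
    dv_dy = (v(t, y + h) - v(t, y - h)) / (2 * h)
    return dv_dt + v(t, y) * dv_dy

samples = [(t, y) for t in (0.0, 0.3, 0.9) for y in (-2.0, 0.5, 3.0)]
max_residual = max(abs(burgers_residual(t, y)) for t, y in samples)
# Consistency check: the Lagrangian velocity A*x equals v evaluated along the flow.
max_mismatch = max(abs(v(t, phi(t, x)) - A * x) for t, x in samples)
```

Both quantities vanish up to discretisation and rounding error, as expected when particles travel at constant velocity.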
horizontal?
finding the shortest geodesic from \(\operatorname{id}\) to \(\mathcal C(\mu_0,\mu_1)\)
\(\displaystyle \text{OT}(\mu_0,\mu_1)=\min_{\varphi\,\in \,\mathcal C(\mu_0,\mu_1)} d^2(\operatorname{id},\varphi)\)
A first link with OT
[Otto, 2001], [Kriegl & Michor, 1997], [Modin, 2015]
Diffeomorphism group
\(\pi: \operatorname{Diff}(\mathbb R^n)\longrightarrow\operatorname{Dens}(\mathbb R^n)\)
\(\varphi\longmapsto\varphi_*\mu_0\)
Fiber over \(\mu_1\):
\(\{\varphi\mid \varphi_*\mu_0=\mu_1\}=\mathcal C(\mu_0,\mu_1)\)
Fiber over \(\mu_0\):
\(\{\varphi\mid \varphi_*\mu_0=\mu_0\}=\operatorname{Diff}_{\mu_0}(\mathbb R^n)\)
\(d\pi(\varphi): T_\varphi\operatorname{Diff}(\mathbb R^n)\longrightarrow T_{\mu}\operatorname{Dens}(\mathbb R^n)\)
\(v\circ\varphi\longmapsto -\operatorname{div}(\mu v)\)
The submersion
\(\pi:\varphi\mapsto\varphi_*\mu_0\) is a smooth submersion.
vertical distribution:
horizontal distribution:
Any \(u\in\mathfrak X(\mathbb R^n)\) can be written as
\(u=v+\nabla p\)
with \(\operatorname{div}(\mu_0 v)=0\) and \(p\in C^{\infty}(\mathbb R^n)\).
Theorem. (Helmholtz/Hodge decomposition)
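A short worked computation showing why the two parts of the decomposition are orthogonal in \(L^2(\mu_0)\). It assumes enough decay at infinity to integrate by parts without boundary terms.

```latex
\begin{aligned}
\langle v,\nabla p\rangle_{L^2(\mu_0)}
&=\int_{\mathbb R^n}\langle v,\nabla p\rangle\,\mu_0\,dx
 =-\int_{\mathbb R^n} p\,\operatorname{div}(\mu_0 v)\,dx
 =0 .
\end{aligned}
```

So the fields with \(\operatorname{div}(\mu_0 v)=0\) and the gradient fields \(\nabla p\) are \(L^2(\mu_0)\)-orthogonal complements of each other.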
right-invariance under action of fiber \(\operatorname{Diff}_{\mu_0}(\mathbb R^n)\)
\(\pi\) induces a metric on \(\operatorname{Diff}(\mathbb R^n)/\operatorname{Diff}_{\mu_0}(\mathbb R^n)\simeq\operatorname{Dens}(\mathbb R^n)\)
independent of \(G\)
depends on \(G\)!
The submersion
(smooth submersion)
\(\pi\) is a Riemannian submersion
Pythagoras \(\implies\dot\varphi\) has to be horizontal!
where \(\dot\mu\) and \(\nabla p\) are linked by \(\dot\mu=-\operatorname{div}(\mu\nabla p)\)
Theorem. (Induced metric on \(\operatorname{Dens}(\mathbb R^n)\))
where \(\Delta_\mu p=\operatorname{div}(\mu\nabla p)\).
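For reference, a sketch of the standard form of this induced metric (often called the Otto metric), written with the two relations above. The norm notation \(\|\dot\mu\|_\mu\) is introduced here for illustration only, and the integration by parts assumes enough decay at infinity.

```latex
\begin{aligned}
\|\dot\mu\|_\mu^2
&=\int_{\mathbb R^n}|\nabla p|^2\,\mu\,dx
 =-\int_{\mathbb R^n}p\,\operatorname{div}(\mu\nabla p)\,dx
 =\int_{\mathbb R^n}p\,\dot\mu\,dx
 =\int_{\mathbb R^n}\dot\mu\,(-\Delta_\mu)^{-1}\dot\mu\,dx .
\end{aligned}
```

The last equality uses \(\dot\mu=-\operatorname{div}(\mu\nabla p)=-\Delta_\mu p\), i.e. \(p=(-\Delta_\mu)^{-1}\dot\mu\) up to an additive constant.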
The submersion
(Riemannian submersion)
what do the geodesics look like in \(\operatorname{Dens}(\mathbb R^n)\)?
geodesics in \(\operatorname{Dens}(\mathbb R^n)\)
horizontal geodesics in \(\operatorname{Diff}(\mathbb R^n)\)
\(\iff\)
(Hamilton-Jacobi)
(continuity equation)
Theorem. (Geodesic equation on \(\operatorname{Dens}(\mathbb R^n)\))
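A worked derivation, under the assumption that the velocity of a horizontal geodesic is a gradient field \(v=\nabla p\): inserting it into inviscid Burgers gives a Hamilton-Jacobi equation for \(p\), and the density follows the continuity equation.

```latex
\begin{aligned}
v=\nabla p:\qquad
\nabla_v v &= (\nabla p\cdot\nabla)\nabla p=\nabla\!\left(\tfrac12|\nabla p|^2\right),\\
\dot v+\nabla_v v=0 &\iff \nabla\!\left(\dot p+\tfrac12|\nabla p|^2\right)=0,\\
\dot\mu+\operatorname{div}(\mu\nabla p)&=0 .
\end{aligned}
```

The second line determines \(p\) only up to a function of time, which can be absorbed into \(p\) to obtain \(\dot p+\tfrac12|\nabla p|^2=0\).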
The induced geodesic distance on \(\operatorname{Dens}(\mathbb R^n)\) is the OT distance!
(see previous computation \(d^2(\operatorname{id},\varphi)=\text{OT}(\mu_0,\mu_1)\))
what's the final map?
at time \(t=1\): \(\varphi_1=\operatorname{id}+\nabla p=\nabla f\) with \(f=\tfrac12|x|^2+p\)
\(\implies\) Brenier's theorem
The submersion
(Riemannian submersion)
\(\implies\) this is just finding the curve of minimal energy between \(\mu_0\) and \(\mu_1\) in \(\operatorname{Dens}(\mathbb R^n)\),
i.e. finding a (horizontal) geodesic!
where \(\dot\mu_t+\operatorname{div}(\mu_t v_t)=0\), and where \(\mu_t\) has the right endpoints.
[Benamou & Brenier, 2000]
Usefulness:
Applications
Benamou-Brenier
Theorem. (Benamou-Brenier, dynamic formulation of OT)
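The dynamic formulation is usually written as \(\text{OT}(\mu_0,\mu_1)=\min\int_0^1\!\int|v_t|^2\,d\mu_t\,dt\) under the continuity-equation constraint. As a sanity check, the sketch below assumes two hypothetical centered 1D Gaussians with standard deviations `S0` and `S1`, takes the candidate path \(\mu_t=\mathcal N(0,\sigma_t^2)\) with \(\sigma_t=(1-t)\sigma_0+t\sigma_1\) and velocity \(v_t(x)=(\sigma_1-\sigma_0)x/\sigma_t\), and compares its kinetic energy with the known closed form \((\sigma_1-\sigma_0)^2\) of the squared Wasserstein distance. It does not prove minimality; it only checks that this path attains the closed-form value and satisfies the constraint.

```python
import math

S0, S1 = 1.0, 2.5  # hypothetical standard deviations of mu_0 and mu_1

def sigma(t):
    """Standard deviation along the candidate path."""
    return (1.0 - t) * S0 + t * S1

def density(t, x):
    """Density of N(0, sigma(t)^2) at x."""
    s = sigma(t)
    return math.exp(-0.5 * (x / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def velocity(t, x):
    """Candidate velocity field v_t(x) = (S1 - S0) * x / sigma(t)."""
    return (S1 - S0) * x / sigma(t)

def kinetic_energy(n_t=50, n_x=2000, half_width=30.0):
    """Midpoint-rule estimate of int_0^1 int |v_t|^2 mu_t dx dt."""
    dt, dx = 1.0 / n_t, 2.0 * half_width / n_x
    total = 0.0
    for i in range(n_t):
        t = (i + 0.5) * dt
        for j in range(n_x):
            x = -half_width + (j + 0.5) * dx
            total += velocity(t, x) ** 2 * density(t, x) * dx * dt
    return total

def continuity_residual(t, x, h=1e-4):
    """Finite-difference estimate of d(mu)/dt + d(mu * v)/dx at (t, x)."""
    dmu_dt = (density(t + h, x) - density(t - h, x)) / (2 * h)
    flux = lambda y: density(t, y) * velocity(t, y)
    return dmu_dt + (flux(x + h) - flux(x - h)) / (2 * h)

energy = kinetic_energy()
closed_form = (S1 - S0) ** 2
```

The computed `energy` matches `closed_form`, and `continuity_residual` is zero up to finite-difference error at any sampled point.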
Any \(\varphi\in\operatorname{Diff}(\mathbb R^n)\) can be written as
\(\varphi=\nabla f\circ\phi\)
with \(f\in C^{\infty}(\mathbb R^n)\) and \(\phi\in\operatorname{Diff}_{\mu_0}(\mathbb R^n)\).
[Brenier, 1987]
Applications
Polar factorization
Theorem. (Polar factorization)
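A discrete 1D analogue of the factorization, as a sketch only: for a uniform measure on finitely many points, any map with distinct values splits into a nondecreasing map (the 1D counterpart of \(\nabla f\)) composed with a permutation of the points (which preserves the uniform measure). The data `xs` and `phi` are hypothetical.

```python
xs = [0.0, 1.0, 2.0, 3.0, 4.0]      # support of a uniform measure mu_0 (sorted)
phi = [2.0, -1.0, 5.0, 0.5, 3.0]    # hypothetical map values phi(xs[i])

# rank[i] = position of phi[i] in sorted order: the permutation sends xs[i] to xs[rank[i]].
by_value = sorted(range(len(phi)), key=phi.__getitem__)
rank = [0] * len(phi)
for position, i in enumerate(by_value):
    rank[i] = position

# Monotone part: g(xs[k]) = k-th smallest value of phi, nondecreasing in k.
monotone_part = sorted(phi)

# Recompose: phi(xs[i]) = g(xs[rank[i]]).
recomposed = [monotone_part[rank[i]] for i in range(len(phi))]
```

Here `rank` plays the role of \(\phi\in\operatorname{Diff}_{\mu_0}\) and `monotone_part` the role of \(\nabla f\); `recomposed` reproduces `phi` exactly.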
The gradient flow of \(\mu\) w.r.t. a functional \(F\) is
Fréchet derivative
Example:
entropy \(\to\) heat flow
entropy + potential \(\to\) Fokker-Planck equation.
Applications
Gradient flows
Definition. (Gradient flow)
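In the usual convention, the gradient flow reads \(\dot\mu=\operatorname{div}\!\big(\mu\nabla\tfrac{\delta F}{\delta\mu}\big)\); for the entropy \(F(\mu)=\int\mu\log\mu\) this gives the heat equation \(\dot\mu=\Delta\mu\). The sketch below is a minimal numerical illustration under stated assumptions (1D periodic grid, explicit finite differences, a hypothetical initial bump): the discrete entropy decreases along the heat flow while the mass is conserved.

```python
import math

def entropy(mu, dx):
    """Discrete version of int mu log(mu) dx (zero cells contribute 0)."""
    return sum(m * math.log(m) for m in mu if m > 0.0) * dx

def heat_step(mu, dx, dt):
    """One explicit finite-difference step of d(mu)/dt = d^2(mu)/dx^2, periodic boundary."""
    n = len(mu)
    r = dt / dx ** 2
    return [mu[i] + r * (mu[(i + 1) % n] - 2.0 * mu[i] + mu[i - 1]) for i in range(n)]

n, dx = 64, 1.0 / 64
dt = 0.4 * dx ** 2  # r = 0.4 <= 0.5 keeps the explicit scheme stable
# Hypothetical initial density: a cosine bump, normalised to unit mass.
mu = [1.0 + 0.9 * math.cos(2.0 * math.pi * i * dx) for i in range(n)]
mass0 = sum(mu) * dx
mu = [m / mass0 for m in mu]

entropies, masses = [entropy(mu, dx)], [sum(mu) * dx]
for _ in range(200):
    mu = heat_step(mu, dx, dt)
    entropies.append(entropy(mu, dx))
    masses.append(sum(mu) * dx)
```

With `r <= 0.5` each step is a convex averaging of neighbouring cells, which is why the discrete entropy cannot increase.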
OT = finding the shortest geodesic from \(\operatorname{id}\) to constraint set \(\mathcal C(\mu_0,\mu_1)\)
horizontal
We recover:
cf. Benamou-Brenier formulation
Riemannian submersion \((\operatorname{Diff}(\mathbb R^n),L^2(\mu_0))\overset{\pi}{\longrightarrow}(\operatorname{Dens}(\mathbb R^n), \text{OT})\)
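Geodesic equation on... | is... |
---|---|
\(\operatorname{Diff}(\mathbb R^n)\) | inviscid Burgers |
\(\operatorname{Diff}_{\mu_0}(\mathbb R^n)\) | incompressible Euler |
\(\operatorname{Dens}(\mathbb R^n)\) | Hamilton-Jacobi + continuity equation |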
Gradient flow of... | is... |
---|---|
entropy | heat flow |
entropy + potential | Fokker-Planck |
loss functional L | training of an infinitely wide neural network |
[Chizat & Bach, 2018]
(local)
Recap
| | OT | Inform. theory | Unbalanced OT | LDDMM | Metamorphoses |
|---|---|---|---|---|---|
| top space | | | | | |
| metric | | | | | |
| right-invariant? | | | | | |
| bottom space | | | | anything with an action | anything with an action |
| action | | | | ... | ... |
| metric | Wasserstein | Fisher-Rao | Wasserstein-Fisher-Rao | induced metric | induced metric |
[Bauer, Bruveris & Michor, 2016], [Gallouët & Vialard, 2018], [Younes, 2010], [Trouvé & Younes, 2005], [Modin, 2015]
finite-dimensional equivalents when restricting to Gaussian measures:
submersion \(\operatorname{GL}(n)\to \operatorname{PSD}(n)\),
induces Bures-Wasserstein and Fisher-Rao
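In one dimension the Gaussian case can be checked directly. The sketch below assumes two hypothetical 1D Gaussians and compares the 1D quantile formula for the quadratic OT cost, \(\int_0^1 (F_0^{-1}(u)-F_1^{-1}(u))^2\,du\), with the closed form \((m_0-m_1)^2+(\sigma_0-\sigma_1)^2\), which is what the Bures-Wasserstein distance reduces to for \(n=1\). The function names are illustrative.

```python
from statistics import NormalDist

def w2_squared_quantile(dist0, dist1, n=20000):
    """Midpoint-rule estimate of int_0^1 (F0^{-1}(u) - F1^{-1}(u))^2 du (1D OT cost)."""
    total = 0.0
    for k in range(n):
        u = (k + 0.5) / n
        total += (dist0.inv_cdf(u) - dist1.inv_cdf(u)) ** 2
    return total / n

def bures_wasserstein_1d(m0, s0, m1, s1):
    """Closed form for 1D Gaussians: (m0 - m1)^2 + (s0 - s1)^2."""
    return (m0 - m1) ** 2 + (s0 - s1) ** 2

# Hypothetical example: N(0, 1) versus N(1, 2^2).
numeric = w2_squared_quantile(NormalDist(0.0, 1.0), NormalDist(1.0, 2.0))
closed = bures_wasserstein_1d(0.0, 1.0, 1.0, 2.0)
```

The quadrature converges slowly in the tails of the quantile functions, so the two values agree only up to a small discretisation error.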
Some other submersions
L. Ambrosio, N. Gigli, and G. Savaré. Gradient flows: in metric spaces and in the space of probability measures, 2005.
J.-D. Benamou and Y. Brenier. A computational fluid mechanics solution to the Monge-Kantorovich mass transfer problem, 2000.
M. Bauer, M. Bruveris, and P.W. Michor. Uniqueness of the Fisher–Rao metric on the space of smooth densities, 2016.
Y. Brenier. Décomposition polaire et réarrangement monotone des champs de vecteurs, 1987.
L. Chizat and F. Bach. On the global convergence of gradient descent for over-parameterized models using optimal transport, 2018.
W. Gangbo, H.K. Kim, and T. Pacini. Differential forms on Wasserstein space and infinite-dimensional Hamiltonian systems, 2010.
T. Gallouët and F.-X. Vialard. The Camassa–Holm equation as an incompressible Euler equation: A geometric point of view, 2018.
A. Kriegl and P.W. Michor. The convenient setting of global analysis, 1997.
K. Modin. Geometry of matrix decompositions seen through optimal transport and information geometry, 2016.
F. Otto. The geometry of dissipative evolution equations: the porous medium equation, 2001.
A. Trouvé and L. Younes. Metamorphoses through Lie group action, 2005.
L. Younes. Shapes and diffeomorphisms, 2010.
References
By Théo Dumont
Talk about the infinite-dimensional Riemannian geometry of Optimal Transport for the shape analysis seminar (https://shape-analysis.github.io/) at the MAP5 lab.