Vertical gradient flows for optimal transport

Erik Jansson

(Joint work with Klas Modin)

The Monge problem

(Figure: densities \(\mu_0\) and \(\mu_1\) on \(\mathbb{R}^n\), with the pushforward \(\varphi_* \mu_0\).)

\min_{\varphi \in \operatorname{Diff}(\mathbb{R}^n)} J(\varphi) = \int_{\mathbb{R}^n} |\varphi(x)-x|^2 \mathrm{d}\mu_0(x) \\ \text{s.t. } \varphi \in C(\mu_0,\mu_1) = \{\varphi \in \operatorname{Diff}(\mathbb R^n)| \varphi_* \mu_0 = \mu_1\}

Geometric structure

\operatorname{Diff}(\mathbb R^n)/\operatorname{Diff}_{\mu_0}(\mathbb R^n)\\ \cong \operatorname{Dens}(\mathbb R^n)

In brief: Gradient flow horizontally or vertically

(Figure: \(\operatorname{Diff}(\mathbb{R}^n)\), containing \(\text{Id}\), and \(\operatorname{Dens}(\mathbb{R}^n)\), containing \(\mu_0\) and \(\mu_1\).)

(More info: Modin, 2017 and references therein)

The Gaussian Monge problem

\(\mu_0\) and \(\mu_1\) are both (zero-mean) normal distributions on \(\mathbb{R}^n\).

Normal distributions \(\cong\) \(P(n)\), positive-definite symmetric matrices

(Figure: \(\mu_0 = \mathcal{N}(0,\Sigma_0)\) and \(\mu_1 = \mathcal{N}(0,\Sigma_1)\).)

\min J (A ) = \operatorname{Tr}((I - A )\Sigma_0 (I - A )^T ) \\ \text{s.t. } A\Sigma_0 A^T = \Sigma_1
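As a numerical illustration of the constrained problem above, the following sketch uses the standard closed-form solution for Gaussian optimal transport, \(A = \Sigma_0^{-1/2}(\Sigma_0^{1/2}\Sigma_1\Sigma_0^{1/2})^{1/2}\Sigma_0^{-1/2}\). That formula is not stated on the slides, and the helper names (`spd_sqrt`, `gaussian_monge_map`, `cost`) and the random test covariances are assumptions made for this example only.

```python
import numpy as np

def spd_sqrt(S):
    """Symmetric square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(w)) @ V.T

def gaussian_monge_map(Sigma0, Sigma1):
    """Symmetric positive-definite A satisfying A Sigma0 A^T = Sigma1."""
    R = spd_sqrt(Sigma0)
    R_inv = np.linalg.inv(R)
    return R_inv @ spd_sqrt(R @ Sigma1 @ R) @ R_inv

def cost(A, Sigma0):
    """J(A) = Tr((I - A) Sigma0 (I - A)^T)."""
    I = np.eye(A.shape[0])
    return np.trace((I - A) @ Sigma0 @ (I - A).T)

# Hypothetical test data: two random, well-conditioned covariance matrices.
rng = np.random.default_rng(0)
n = 4
M0, M1 = rng.standard_normal((2, n, n))
Sigma0 = M0 @ M0.T + n * np.eye(n)
Sigma1 = M1 @ M1.T + n * np.eye(n)

A_opt = gaussian_monge_map(Sigma0, Sigma1)
constraint_residual = np.linalg.norm(A_opt @ Sigma0 @ A_opt.T - Sigma1)

# Another feasible matrix: Sigma1^{1/2} Q Sigma0^{-1/2} with Q orthogonal.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A_other = spd_sqrt(Sigma1) @ Q @ np.linalg.inv(spd_sqrt(Sigma0))

print("constraint residual:", constraint_residual)
print("J(A_opt) =", cost(A_opt, Sigma0), " J(A_other) =", cost(A_other, Sigma0))
```

Both matrices satisfy the constraint, but only the symmetric positive-definite one attains the minimum, so `cost(A_opt, Sigma0)` should not exceed `cost(A_other, Sigma0)`.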

Geometric structure

\operatorname{GL}(n)/\operatorname{O}(\Sigma_0,n)\\ \cong \operatorname{P}(n)
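The quotient structure can be checked numerically. The sketch below assumes that \(\operatorname{O}(\Sigma_0,n)\) denotes the matrices \(U\) with \(U\Sigma_0 U^T = \Sigma_0\) and that the projection to \(\operatorname{P}(n)\) is \(A \mapsto A\Sigma_0 A^T\); both are readings of the notation rather than statements from the slides, and the names `spd_sqrt` and `project` are hypothetical.

```python
import numpy as np

def spd_sqrt(S):
    """Symmetric square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(w)) @ V.T

def project(A, Sigma0):
    """Assumed bundle projection GL(n) -> P(n): A |-> A Sigma0 A^T."""
    return A @ Sigma0 @ A.T

rng = np.random.default_rng(1)
n = 3
M = rng.standard_normal((n, n))
Sigma0 = M @ M.T + n * np.eye(n)

# A generic (almost surely invertible) matrix A.
A = rng.standard_normal((n, n)) + n * np.eye(n)

# An element of the stabilizer of Sigma0: U = Sigma0^{1/2} Q Sigma0^{-1/2}
# with Q orthogonal, so that U Sigma0 U^T = Sigma0.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
R = spd_sqrt(Sigma0)
U = R @ Q @ np.linalg.inv(R)

# Moving along the fibre (A -> A U) does not change the projected covariance.
same_fibre = np.allclose(project(A @ U, Sigma0), project(A, Sigma0))
print("U preserves Sigma0:", np.allclose(U @ Sigma0 @ U.T, Sigma0))
print("A and A U lie in the same fibre:", same_fibre)
```

Under these assumptions, vertical motion is exactly motion of the form \(A \mapsto AU\): the constraint \(A\Sigma_0 A^T = \Sigma_1\) is preserved while the cost \(J\) can change.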

(Figure: \(\operatorname{GL}(n)\), containing \(\text{Id}\), and \(\operatorname{P}(n)\), containing \(\Sigma_0\) and \(\Sigma_1\).)

In brief: Gradient flow horizontally or vertically

Vertical matrix flow

(Figure: \(\Sigma_0\), \(\Sigma_1\), and \(\text{Id}\).)

E. Jansson, K. Modin, Convergence of the vertical gradient flow for the Gaussian Monge problem, J. Comput. Dyn. (accepted), 2023.

How to prove convergence?

Idea: Show that \(\frac{\mathrm d}{\mathrm d t} J \to 0\), and that this means we hit the polar cone.

\dot B = \Omega B, \quad B(0) = A \\ \Sigma_1 \Omega + \Omega \Sigma_1 = 2\Sigma_1 (B^{-1}-B^{-T})

Questions!

Convergence rate in linear case?

Experiment: random matrices with known factorization \(A = PU\); measure the distance from \(B\) to \(P\).
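A test matrix with a known factorization could be generated as in the sketch below. It assumes \(\Sigma_0 = I\), with \(P\) symmetric positive definite and \(U\) orthogonal, so that \(A = PU\) is a left polar decomposition; the slides do not specify the setup, and the helper names `random_polar_pair`, `polar_factors` and `distance_to_polar_factor` are hypothetical. The flow itself is not integrated here, so the distance is evaluated only at the initial condition \(B(0) = A\).

```python
import numpy as np

def random_polar_pair(n, rng):
    """Random SPD matrix P and orthogonal matrix U."""
    M = rng.standard_normal((n, n))
    P = M @ M.T + n * np.eye(n)
    U, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return P, U

def polar_factors(A):
    """Left polar decomposition A = P U, computed from the SVD of A."""
    W, s, Vh = np.linalg.svd(A)
    P = (W * s) @ W.T   # P = W diag(s) W^T = (A A^T)^{1/2}
    U = W @ Vh          # orthogonal factor
    return P, U

def distance_to_polar_factor(B, P):
    """Frobenius distance from the current matrix B to the target P."""
    return np.linalg.norm(B - P)

rng = np.random.default_rng(2)
n = 5
P, U = random_polar_pair(n, rng)
A = P @ U

# The polar factors of an invertible matrix are unique, so they are recovered.
P_rec, U_rec = polar_factors(A)
initial_distance = distance_to_polar_factor(A, P)
print("recovered P matches:", np.allclose(P_rec, P))
print("distance from B(0) = A to P:", initial_distance)
```

Tracking `distance_to_polar_factor(B(t), P)` along a numerically integrated flow would then give an empirical convergence rate, given a discretization of the flow, which this sketch does not provide.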

Interesting for other, similar matrix flows.

Questions!

Gaussian case: a pre-study for further work on gradient flows in the infinite-dimensional case?

Existence? Convergence to minimizer?
Discretization?

\dot \eta+ 2\eta^{-1}+2\operatorname{Id}= -\nabla p \circ \eta \\ \nabla \cdot \rho_1 \nabla p = \nabla\cdot \rho_1 (\text{id}-\eta^{-1})