# Overview

Sinkhorn
algorithm

Geometric hydrodynamics

Heat flow
Otto's calculus

Quantum mechanics

## Discrete optimal transport

$$p,q \in \mathbb{R}_+^N$$  with  $$\sum_i p_i=\sum_i q_i = 1$$

find coupling matrix $$\gamma\in \mathbb{R}_+^{N\times N}$$ minimizing

\displaystyle E(\gamma) = \langle C,\gamma \rangle = \sum_{ij} C_{ij}\gamma_{ij}

under the constraint

\displaystyle \sum_i\gamma_{ij} = p_j \qquad \sum_j\gamma_{ij} = q_i

## Sinkhorn algorithm

Relative entropy $$\mathcal H(\gamma\mid K) = \sum_{ij}\gamma_{ij}\log(\gamma_{ij}/K_{ij})$$

Entropic regularization: find coupling matrix $$\gamma\in \mathbb{R}_+^{N\times N}$$ minimizing

\displaystyle E(\gamma) = \langle C,\gamma \rangle + \varepsilon\mathcal H(\gamma\mid 1)

under the same constraint

\displaystyle = \varepsilon\mathcal H(\gamma\mid K)
K_{ij} = \mathrm{e}^{-C_{ij}/\varepsilon}
\gamma = D_a K D_b
\displaystyle a_{k+1} = \frac{q}{K^\top b_k}, \qquad b_k = \frac{p}{K a_k}

(Sinkhorn's theorem)

## Dynamical formulation of OT

\dot S + \frac{1}{2}|\nabla S|^2 = 0
\displaystyle\dot\rho + \mathrm{div}(\rho\nabla S) = 0

[Benamou & Brenier 2000]

Idea: transport map obtained through compressible fluid

Lagrangian for density $$\rho(t,x)$$ and vector field $$v(t,x)$$

under constraints $$\rho(0,\cdot)=\rho_0,\; \rho(1,\cdot)=\rho_1,\;$$

L(\rho ,v) =\int_{[0,1]\times M} \frac{1}{2}|v|^2\rho

Wasserstein-Otto Riemannian metric on $$\mathrm{Diff}(M)$$

$$\dot\rho + \mathrm{div}(\rho v) = 0$$

H(\rho ,m) =\int_{[0,1]\times M} \frac{1}{2\rho}|m|^2

$$\dot\rho + \mathrm{div}(m) = 0$$

Hamiltonian formulation is convex

Hamilton's equations for horizontal momentum $$m=\rho\nabla S$$

\rho \in \mathrm{Dens}(M) = \{\rho\mid \int_M\rho = 1, \rho>0 \}
[S] \in T_\rho^*\mathrm{Dens}(M) = \{[S]\mid [S] = S + \mathbb R \}

transport equation for $$v=\nabla S$$

Hamilton-Jacobi equation

### Summary:

$$L^2$$ optimal transport corresponds to BVP for Hamiltonian system on $$T^*\mathrm{Dens}(M)$$

\dot S + \frac{1}{2}|\nabla S|^2 = 0
\displaystyle\dot\rho + \mathrm{div}(\rho\nabla S) = 0

with bc $$\rho(0,\cdot) = \rho_0$$ and $$\rho(1,\cdot) = \rho_1$$

Hamiltonian is $$H(\rho,S) = \int_M \frac{1}{2}|\nabla S|^2\rho$$

### Aim:

Dynamical formulation corresponding to the Sinkhorn algorihm

$$\Rightarrow$$ Sinkhorn algorithm as space and time discretization of inf. dim. flow equation

$$L^2$$ OT Hamiltonian

T^*\mathrm{Dens}(M) \ni (\rho,[S]) \mapsto \underbrace{\sqrt{\rho}\mathrm{e}^{\mathrm{i}S/\hbar}}_{\psi} \in P C^\infty(M,\mathbb{C}\backslash\{0\})

is a symplectomorphism

\displaystyle H(\rho,[S]) = \frac{1}{2}\int_M |\nabla S|^2\rho
\displaystyle + \frac{\hbar^2}{8}\int_M \frac{|\nabla \rho|^2}{\rho}

perturbed by Fisher information potential

... gives  Schrödinger Hamiltonian $$H(\psi) = \frac{\hbar^2}{2} \Vert \nabla\psi \Vert^2$$

## Imag. Planck's constant $$\hbar = \pm2\mathrm{i}\epsilon$$

$$L^2$$ OT Hamiltonian

T^*\mathrm{Dens}(M) \ni (\rho,[S]) \mapsto \Big( \underbrace{\sqrt{\rho\mathrm{e}^{-S/\epsilon}}}_{\psi_{+}}, \underbrace{\sqrt{\rho\mathrm{e}^{S/\epsilon}}}_{\psi_{-}} \Big)

\displaystyle H(\rho,[S]) = \frac{1}{2}\int_M |\nabla S|^2\rho
\displaystyle - \frac{\epsilon^2}{2}\int_M \frac{|\nabla \rho|^2}{\rho}

perturbed by Fisher information potential

Heat flow forward and

backward in time!

Important observations: $$\psi_+\psi_- = \rho$$ and $$\epsilon\nabla\log(\psi_-/\psi_+) = \nabla S$$

is a symplectomorphism w.r.t. $$D\psi_+\wedge D\psi_-$$

## Perturbed dynamical OT

Minimize perturbed (but still convex) functional

under constraints $$\rho(0,\cdot)=\rho_0,\; \rho(1,\cdot)=\rho_1,\;$$

J(\rho ,m) =\frac{1}{2}\int_{[0,1]\times M} \frac{1}{\rho}\Big( |m|^2 + \epsilon^2 |\nabla \rho|^2\Big)

$$\dot\rho + \mathrm{div}(m) = 0$$

becomes "double heat flow" equation (with $$\bar\psi_-(t,\cdot) = \psi_-(1-t,\cdot)$$)

$\dot\psi_+ = \epsilon\Delta\psi_+, \quad \dot{\bar\psi}_- = \epsilon\Delta\bar\psi_-$

coupled by bc $$\psi_+(0,\cdot)\bar\psi_-(1,\cdot) = \rho_0$$ and $$\psi_+(1,\cdot)\bar\psi_-(0,\cdot) = \rho_1$$

Algebraic formulation in terms of heat kernel $$\mathrm{e}^{\epsilon\Delta}$$

with $$a=\psi_+(0,\cdot)$$ and $$b=\psi_-(1,\cdot)$$

$$a\mathrm{e}^{\epsilon\Delta}b = \rho_0,\quad b\mathrm{e}^{\epsilon\Delta}a = \rho_1$$

$$a_{k+1}\mathrm{e}^{\epsilon\Delta}b_k = \rho_0,\quad b_{k+1}\mathrm{e}^{\epsilon\Delta}a_{k+1} = \rho_1$$

Fixed-point iteration over $$a$$ $$\Rightarrow$$ Sinkhorn algorithm

## Minimization problem for $$\psi_+$$, $$\psi_-$$

Convex functional on $$C^\infty([0,1]\times M,\mathbb{R}_+)^3$$

H(\psi_+\psi_- | \rho) = \int_{[0,1]\times M} \psi_+\psi_- \log\Big(\frac{\psi_+\psi_-}{\rho}\Big)

(entropy of $$\psi_+\psi_-$$ relative to $$\rho$$)

under linear constraints $$\dot\psi_{\pm} = \pm\epsilon\Delta\psi_\pm$$

\displaystyle\dot a = -a \log\left( \frac{a \mathrm{e}^{\varepsilon\Delta}b}{\rho_0} \right)
\displaystyle\dot b = -b \log\left( \frac{b \mathrm{e}^{\varepsilon\Delta}a}{\rho_1} \right)
\alpha = \log a, \quad \beta = \log b

Change of variables

\displaystyle\dot \alpha = -\alpha - \log \left( \mathrm{e}^{\varepsilon\Delta}\mathrm{e}^{\beta} \right) + \log\rho_0
\displaystyle\dot \beta = -\beta - \log \left( \mathrm{e}^{\varepsilon\Delta}\mathrm{e}^{\alpha} \right) + \log\rho_1

## Well-posedness

\displaystyle\dot \alpha = -\alpha - \log \left( \mathrm{e}^{\varepsilon\Delta}\mathrm{e}^{\beta} \right) + \log\rho_0
\displaystyle\dot \beta = -\beta - \log \left( \mathrm{e}^{\varepsilon\Delta}\mathrm{e}^{\alpha} \right) + \log\rho_1

Observation: smooth ODE on Banach space $$X$$

For example ($$M$$ compact):

• $$X=H^s(M)$$ with $$s>d/2$$
• $$X=L^\infty(M)$$

Consequence: local well-posedness automatic (Picard iterations)

Conjecture: global existence due to maximum principle

## Apply splitting + forward Euler

\displaystyle a_{k+1} = \frac{\rho_0^h a_k^{1-h}}{(\mathrm{e}^{\epsilon\Delta}b_k)^h }
\displaystyle b_{k+1} = \frac{\rho_1^h b_k^{1-h}}{(\mathrm{e}^{\epsilon\Delta}a_{k+1})^h }
\displaystyle\dot \beta = -\beta - \log \left( \mathrm{e}^{\varepsilon\Delta}\mathrm{e}^{\alpha} \right) + \log\rho_1
\displaystyle\dot \alpha = -\alpha - \log \left( \mathrm{e}^{\varepsilon\Delta}\mathrm{e}^{\beta} \right) + \log\rho_0
\displaystyle \alpha_{k+1} = (1-h)\alpha_k - h\log \left( \mathrm{e}^{\varepsilon\Delta}\mathrm{e}^{\beta_k} \right) + h\log\rho_0
\displaystyle \beta_{k+1} = (1-h)\beta_k - h\log \left( \mathrm{e}^{\varepsilon\Delta}\mathrm{e}^{\alpha_{k+1}} \right) + h\log\rho_1

Stepsize $$h>0$$

Original variables $$a,b$$ and $$h=1$$ gives

$$h=1$$ gives Sinkhorn before spatial discretization!

## Example on $$T^2$$

I use $$\epsilon = 0.01$$

\rho_0
\rho_1

## Example on $$T^2$$

I use $$\epsilon = 0.01$$

\rho_0
\rho_1

# Results

$$t=0$$

$$t=1$$

# Results

$$t=0$$

$$t=1$$

Transport map is a diffeomorphism

## $$L_\infty$$-error

Can we understand this?

## 101 on linear stability for Euler's method

\dot y = \lambda y
( 1+h\lambda )

Test equation

Euler's method

## 101 on linear stability for Euler's method

\dot y = \lambda y
\lvert 1+h\lambda \rvert

Test equation

Euler's method

h\lambda

## 101 on linear stability for Euler's method

\dot y = \begin{bmatrix}-1 & -1 \\ 0 & 0 \end{bmatrix} y + \begin{bmatrix}\log\rho_0 \\ 0 \end{bmatrix}
\lvert 1+h\lambda \rvert

Sinkhorn equation with $$\epsilon=0$$ (first part in splitting)

Euler's method

h\lambda

## 101 on linear stability for Euler's method

\dot y = \begin{bmatrix}-1 &0 \\ 0 &0 \end{bmatrix} y + b
\lvert 1+h\lambda \rvert

Euler's method

h\lambda

Sinkhorn equation with $$\epsilon=0$$ (first part in splitting)

## 101 on linear stability for Euler's method

\dot y = \begin{bmatrix}-1 &0 \\ 0 &0 \end{bmatrix} y + b
\lvert 1+h\lambda \rvert

Sinkhorn equation with $$\epsilon=0$$

Euler's method

h\lambda

explains initial transient

## 101 on linear stability for Euler's method

\dot y = \begin{bmatrix}-1 &0 \\ 0 &0 \end{bmatrix} y + b
\lvert 1+h\lambda \rvert

Sinkhorn equation with $$\epsilon=0$$

Euler's method

h\lambda

problem

## 101 on linear stability for Euler's method

\dot y = \begin{bmatrix}-1 &0 \\ 0 &-\epsilon \end{bmatrix} y + b + \cdots
\lvert 1+h\lambda \rvert

Sinkhorn equation with $$\epsilon>0$$

Euler's method

h\lambda

because of maximum principle

## 101 on linear stability for Euler's method

\dot y = \begin{bmatrix}-1 &0 \\ 0 &-\epsilon \end{bmatrix} y + b + \cdots
\lvert 1+h\lambda \rvert

Sinkhorn equation with $$\epsilon>0$$

Euler's method

h\lambda

increasing stepsize

## 101 on linear stability for Euler's method

\dot y = \begin{bmatrix}-1 &0 \\ 0 &-\epsilon \end{bmatrix} y + b + \cdots
\lvert 1+h\lambda \rvert

Sinkhorn equation with $$\epsilon>0$$

Euler's method

h\lambda

increasing stepsize

By Klas Modin

# Dynamic formulation of the Sinkhorn algorithm for optimal transport

Presentation given 2019-06 in Duck Creek Village, Utah.

• 1,483