Flavien Léger

NNCC spaces in optimization

Joint works with Pierre-Cyril Aubin-Frankowski, and with Gabriele Todeschi and François-Xavier Vialard

Overview

Minimize \[\mathcal{E}\colon X\to\mathbb{R}\cup\{+\infty\}\]

using a function \(c(x,y)\) as a “movement limiter”

⏵ \(X\) : possibly infinite-dimensional

⏵ \(\mathcal{E},c\) : possibly nonsmooth

(Jacobs–Lee–L ’21)

1. Implicit and explicit methods with a cost \(c(x,y)\)
2. Evolution variational inequalities (EVIs)
3. NNCC spaces

Implicit method with cost \(c(x,y)\)

\[\inf_{x\in X} \mathcal{E}(x)\]

\(X\) an arbitrary set, \(\mathcal{E}\colon X\to\mathbb{R}\cup\{+\infty\}\)

Proximal point method (\(\tau>0\)), \((X,d)\) metric space:

\[x_{n+1}\in\operatorname*{argmin}_{x\in X} \frac{d^2}{2\tau}(x,x_n)+\mathcal{E}(x)\]

Cost function \(c\colon X\times X\to\mathbb{R}\), \(c(x,y)\geq 0\), \(c(x,x)=0\):

\[x_{n+1}\in\operatorname*{argmin}_{x\in X} c(x,x_n)+\mathcal{E}(x)\]

Implicit method with cost \(c(x,y)\)

\[x_{n+1}\in\operatorname*{argmin}_{x\in X} \mathcal{E}(x)+c(x,x_n)\]

\[\mathcal{E}(x)=\inf_{y\in X} \mathcal{E}(x)+c(x,y)\quad\longrightarrow\quad\inf_{x\in X} \mathcal{E}(x)=\inf_{x\in X, \,y\in X} \underbrace{\mathcal{E}(x)+c(x,y)}_{\eqqcolon\phi(x,y)}\]

\[\iff\quad\left\{\begin{aligned} y_{n+1} &\in \operatorname*{argmin}_{y\in X} \phi(x_n,y)\\ x_{n+1} &\in \operatorname*{argmin}_{x\in X} \phi(x,y_{n+1}) \end{aligned}\right.\qquad\text{(AM)}\]

Remarks/Motivations

Tailored \(c(x,y)\)
Regularizing operator
Gradient flows “\(\dot x(t)=-\nabla \mathcal{E}(x(t))\)” (\(\tau\to0\))

⏵ Define gradient flows in nonsmooth settings (Ambrosio–Gigli–Savaré ’05)

\[x_{n+1}\in\operatorname*{argmin}_{x\in X} \frac{d^2}{2\tau}(x,x_n)+\mathcal{E}(x)\]

⏵ \(c(x,y)\) as a proxy for \(d^2(x,y)\) (Rankin–Wong ’24)

\[x_{n+1}\in\operatorname*{argmin}_{x\in X} c(x,x_n)+\mathcal{E}(x)\]
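The implicit scheme can be tried out on a finite set, where the argmin is computed by brute force. The following is a minimal sketch under illustrative assumptions that are not from the talk: the grid `X`, the energy `E(x) = |x - 1|`, the quadratic cost with `tau = 0.5`, and the helper names `implicit_step` and `implicit_method`.

```python
# Implicit step x_{n+1} in argmin_x E(x) + c(x, x_n), on a finite set X.
# Illustrative assumptions: X is a 1-D grid, E(x) = |x - 1|, and
# c(x, y) = (x - y)^2 / (2 * tau) with tau = 0.5.

def implicit_step(X, E, c, x_n):
    """Return a minimizer of x -> E(x) + c(x, x_n) over the finite set X."""
    return min(X, key=lambda x: E(x) + c(x, x_n))

def implicit_method(X, E, c, x0, n_steps):
    """Run n_steps implicit steps from x0 and return all iterates."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(implicit_step(X, E, c, xs[-1]))
    return xs

tau = 0.5
X = [i / 10 for i in range(-50, 51)]          # grid on [-5, 5]
E = lambda x: abs(x - 1.0)                    # nonsmooth energy
c = lambda x, y: (x - y) ** 2 / (2 * tau)     # c >= 0 and c(x, x) = 0

iterates = implicit_method(X, E, c, x0=4.0, n_steps=12)
energies = [E(x) for x in iterates]
```

Because \(c\geq 0\) and \(c(x,x)=0\), comparing \(x_{n+1}\) with the candidate \(x_n\) shows that the energies cannot increase along the iterates, whatever the cost.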

Explicit method with cost \(c(x,y)\)

\[\inf_{x\in X} \mathcal{E}(x)\]

\(X\) an arbitrary set, \(\mathcal{E}\colon X\to\mathbb{R}\cup\{+\infty\}\)

\(Y\) another set, \(c\colon X\times Y\to\mathbb{R}\)

How to define an explicit method with cost \(c(x,y)\)?

c-concavity

Definition. \(\mathcal{E}\) is c-concave if there exists \(h\colon Y\to \mathbb{R}\cup\{+\infty\}\) s.t. \[\mathcal{E}(x)=\inf_{y\in Y}c(x,y)+h(y).\]

Smallest such \(h\) is the c-transform \(\mathcal{E}^c(y)=\sup_{x\in X} \mathcal{E}(x)-c(x,y).\)

Example. \(X=Y=\mathbb{R}^d\), \(c(x,y)=\frac{L}{2}\lVert x-y\rVert^2\)

\(\mathcal{E}\) is \(c\)-concave \(\iff \nabla^2 \mathcal{E}\preccurlyeq L\, I_{d\times d}\)

[Figure: graphs of \(\mathcal{E}\) and of \(c(x,y)+\mathcal{E}^c(y)\), in one case where \(\mathcal{E}\) is \(c\)-concave and one case where \(\mathcal{E}\) is not \(c\)-concave]
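The quadratic example can be checked numerically by tabulating the c-transform on grids and comparing \(\mathcal{E}\) with \(\inf_y c(x,y)+\mathcal{E}^c(y)\). This is a rough sketch, assuming one-dimensional grids `X` and `Y` and two hypothetical test energies (one with second derivative below `L`, one above); the function names are illustrative.

```python
# Numerical check of c-concavity on grids, for c(x, y) = (L/2) * (x - y)^2.
# Illustrative assumptions: X and Y are finite 1-D grids, and the two energies are
# E_good(x) = (L/4) x^2 (second derivative L/2 <= L) and
# E_bad(x)  = L x^2     (second derivative 2L > L).

L = 2.0
X = [i * 0.5 for i in range(-8, 9)]      # grid on [-4, 4], step 0.5
Y = [j * 0.25 for j in range(-16, 17)]   # finer grid on [-4, 4], step 0.25

def c(x, y):
    return 0.5 * L * (x - y) ** 2

def c_transform(E):
    """E^c(y) = sup_x E(x) - c(x, y), tabulated on Y."""
    return {y: max(E(x) - c(x, y) for x in X) for y in Y}

def envelope_gap(E):
    """max over x of [inf_y c(x, y) + E^c(y)] - E(x).

    The bracket is always >= E(x); the gap is zero exactly when E coincides
    with its c-concave envelope on the grids.
    """
    Ec = c_transform(E)
    return max(min(c(x, y) + Ec[y] for y in Y) - E(x) for x in X)

gap_good = envelope_gap(lambda x: 0.25 * L * x * x)
gap_bad = envelope_gap(lambda x: L * x * x)
```

In this sketch `gap_good` vanishes (up to rounding) while `gap_bad` is strictly positive, which matches the Hessian criterion on the grids chosen here.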

Explicit algorithm

Suppose \(\mathcal{E}\) is c-concave:

\[\mathcal{E}(x)=\inf_{y\in Y}c(x,y)+\mathcal{E}^c(y)\]

\[\longrightarrow\,\inf_{x\in X} \mathcal{E}(x)=\inf_{x\in X, \,y\in Y} \underbrace{c(x,y)+\mathcal{E}^c(y)}_{\eqqcolon\phi(x,y)}\]

Nonsmooth settings

\[\begin{aligned} y_{n+1} &\in \operatorname*{argmin}_{y\in Y} \phi(x_n,y)\\ x_{n+1} &\in \operatorname*{argmin}_{x\in X} \phi(x,y_{n+1}) \end{aligned}\]

Smooth settings: \(X,Y\) finite-dimensional manifolds, twisted \(c\in C^1(X\times Y)\), \(\mathcal{E}\in C^1(X)\)

\[\begin{aligned} -\nabla_xc(x_n,y_{n+1})&=-\nabla \mathcal{E}(x_n)\\ \nabla_xc(x_{n+1},y_{n+1})&=0 \end{aligned}\]

\[c(x,y)=\left\{\begin{aligned} &\frac{L}{2}\lVert x-y\rVert^2 &&\longrightarrow\,\text{Gradient descent}\\ &\text{Bregman I} &&\longrightarrow\,\text{Mirror descent}\\ &\text{Bregman II} &&\longrightarrow\,\text{Natural gradient descent} \\ &\text{Riemannian} &&\longrightarrow\,\text{Riemannian gradient descent} \end{aligned} \right.\]

(“Gradient descent with a general cost” L–Aubin-Frankowski ’23)
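As a worked instance of the gradient descent case, the two smooth-setting equations can be solved by hand for the quadratic cost on \(X=Y=\mathbb{R}^d\). This is a direct computation, written out here as a sketch:

\[\begin{aligned}
c(x,y)=\tfrac{L}{2}\lVert x-y\rVert^2 \;&\Longrightarrow\; \nabla_x c(x,y)=L(x-y),\\
-\nabla_xc(x_n,y_{n+1})=-\nabla\mathcal{E}(x_n) \;&\Longrightarrow\; y_{n+1}=x_n-\tfrac{1}{L}\nabla\mathcal{E}(x_n),\\
\nabla_xc(x_{n+1},y_{n+1})=0 \;&\Longrightarrow\; x_{n+1}=y_{n+1}=x_n-\tfrac{1}{L}\nabla\mathcal{E}(x_n).
\end{aligned}\]

That is, gradient descent with step size \(1/L\).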

Recap

\[\mathcal{E}(x)=\inf_{y\in Y} \phi(x,y) \quad \longrightarrow\quad \inf_{x\in X} \mathcal{E}(x)=\inf_{x\in X, \,y\in Y} \phi(x,y)\]

Alternating Minimization (AM) of \(\phi\)

Implicit

\[\phi(x,y)=\mathcal{E}(x)+c(x,y)\]

Explicit: assume \(\mathcal{E}\) is c-concave

\[\phi(x,y)=c(x,y)+\mathcal{E}^c(y)\]

Implicit+Explicit (forward–backward): \(\mathcal{E}(x)=\mathcal{E}_1(x)+\mathcal{E}_2(x)\)

Assume \(\mathcal{E}_2\) is c-concave

\[\phi(x,y)=\mathcal{E}_1(x)+c(x,y)+(\mathcal{E}_2)^c(y)\]
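For the quadratic cost \(c(x,y)=\frac{L}{2}\lVert x-y\rVert^2\), the forward–backward case reduces to a gradient step on \(\mathcal{E}_2\) followed by a proximal step on \(\mathcal{E}_1\). Below is a minimal one-dimensional sketch, assuming \(\mathcal{E}_1(x)=\alpha\lvert x\rvert\) and \(\mathcal{E}_2(x)=\frac{a}{2}(x-b)^2\) with \(L\geq a\); these choices, the parameter values and the function names are illustrative and not from the talk.

```python
# Forward-backward step as alternating minimization of
#   phi(x, y) = E1(x) + c(x, y) + (E2)^c(y),   c(x, y) = (L/2) * (x - y)^2.
# Illustrative 1-D assumptions: E1(x) = alpha * |x|, E2(x) = 0.5 * a * (x - b)^2,
# with L >= a so that E2'' <= L (E2 is c-concave).

def soft_threshold(v, t):
    """Minimizer of x -> t * |x| + 0.5 * (x - v)^2."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def forward_backward(x0, alpha, a, b, L, n_steps):
    x = x0
    for _ in range(n_steps):
        grad_E2 = a * (x - b)
        y = x - grad_E2 / L                    # y-step: explicit (forward) step on E2
        x = soft_threshold(y, alpha / L)       # x-step: argmin_x E1(x) + c(x, y)
    return x

x_fb = forward_backward(x0=0.0, alpha=1.0, a=1.0, b=3.0, L=2.0, n_steps=60)
x_star = soft_threshold(3.0, 1.0)  # minimizer of |x| + 0.5 * (x - 3)^2
```

With these values the iterates approach `x_star`, the minimizer of \(\mathcal{E}_1+\mathcal{E}_2\).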

1. Implicit and explicit methods with a cost \(c(x,y)\)
2. Evolution variational inequalities (EVIs)
3. NNCC spaces

Evolution Variational Inequalities (EVIs)

\(X,Y\) two arbitrary sets, \(\phi\colon X\times Y\to\mathbb{R}\cup\{+\infty\}\) proper

\[x_{n} \in\operatorname*{argmin}_{x\in X} \phi(x,y_{n}),\qquad y_{n+1} \in \operatorname*{argmin}_{y\in Y} \phi(x_n,y)\]

Definition. Let \(\lambda\in[0,1)\). We say that \((x_n,y_n)_n\) satisfies the EVI if \(\forall n\geq 0\),

\[\forall x\in X,y\in Y,\quad(1-\lambda)\phi(x_n,y_n)+\phi(x,y_{n+1})\leq \phi(x,y)+(1-\lambda)\phi(x,y_n).\]

⏵ Nonsmooth, intrinsic

⏵ Condition on \(\phi\) and on the choice of iterates

Theorem (L–Aubin-Frankowski ’23)

\[\text{EVI}(\lambda=0)\implies\phi(x_n,y_n)\leq \phi(x,y)+\frac{\phi(x,y_0)-\phi(x_0,y_0)}{n}\]

\[\text{EVI}(\lambda>0)\implies\phi(x_n,y_n)\leq \phi(x,y)+\frac{\lambda[\phi(x,y_0)-\phi(x_0,y_0)]}{\Lambda^n-1},\qquad \Lambda\coloneqq(1-\lambda)^{-1}\]

Background on EVIs

Consider implicit method

\[x_{n} \in\operatorname*{argmin}_{x\in X} \mathcal{E}(x)+\frac{d^2}{2\tau}(x,x_{n-1})\]

EVI \((\lambda=0)\)

\[\forall x\in X,\quad \mathcal{E}(x_n)+\frac{1}{2\tau}d^2(x_n,x_{n-1})\leq \mathcal{E}(x)+\frac{1}{2\tau}d^2(x,x_{n-1})-\frac{1}{2\tau}d^2(x,x_n)\]

Euclidean: \(\mathcal{E}\) convex
\((X,d)\) non-positively curved, Mayer/Jost: \(\mathcal{E}\) convex on geodesics
Ambrosio–Gigli–Savaré: \(\mathcal{E}\) convex on curves \(x(t)\) such that \(d^2(x,x_{n-1})\) is \(1\)-convex, i.e. \(t\mapsto d^2(x(t),x_{n-1})-t^2 \,d^2(x(1),x(0))\) is convex

\(\phi(x,y)\): extends the five-point property of Csiszár–Tusnády ’84

Convergence rates from EVIs

\[x_{n} \in\operatorname*{argmin}_{x\in X} \phi(x,y_{n}),\qquad y_{n+1} \in \operatorname*{argmin}_{y\in Y} \phi(x_n,y)\]

Theorem (L–Aubin-Frankowski ’23)

Suppose the EVI holds. Then for all \(x\in X\), \(y\in Y\) and \(n\geq 1\):

If \(\lambda=0\), then

\[\phi(x_n,y_n)\leq \phi(x,y)+\frac{\phi(x,y_0)-\phi(x_0,y_0)}{n}\]

If \(\lambda>0\), then

\[\phi(x_n,y_n)\leq \phi(x,y)+\frac{\lambda[\phi(x,y_0)-\phi(x_0,y_0)]}{\Lambda^n-1},\]

where \(\Lambda\coloneqq(1-\lambda)^{-1}>1\).
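The sublinear bound can be observed on a toy problem. The sketch below assumes the implicit method on the real line with \(\mathcal{E}(x)=\lvert x\rvert\) and \(c(x,y)=\frac{1}{2\tau}(x-y)^2\), a convex Euclidean setting; the starting point `y0`, the comparison point and the variable names are illustrative choices.

```python
# Check the sublinear bound (lambda = 0) along the implicit method with
# phi(x, y) = E(x) + c(x, y).
# Illustrative 1-D assumptions: E(x) = |x| (convex), c(x, y) = (x - y)^2 / (2 * tau),
# for which the x-step is a soft-thresholding and the y-step is y_{n+1} = x_n.

tau = 1.0
E = lambda x: abs(x)
c = lambda x, y: (x - y) ** 2 / (2 * tau)
phi = lambda x, y: E(x) + c(x, y)

def x_step(y):
    """argmin_x |x| + (x - y)^2 / (2 * tau)."""
    if y > tau:
        return y - tau
    if y < -tau:
        return y + tau
    return 0.0

y0 = 10.0
xs, ys = [x_step(y0)], [y0]
for n in range(20):
    ys.append(xs[-1])          # y_{n+1} in argmin_y phi(x_n, y), i.e. y_{n+1} = x_n
    xs.append(x_step(ys[-1]))  # x_{n+1} in argmin_x phi(x, y_{n+1})

x_ref = y_ref = 0.0            # comparison point (x, y): the minimizer of E
bounds_hold = all(
    phi(xs[n], ys[n]) <= phi(x_ref, y_ref) + (phi(x_ref, ys[0]) - phi(xs[0], ys[0])) / n
    for n in range(1, 21)
)
```

Here `bounds_hold` records whether \(\phi(x_n,y_n)\) stays below the right-hand side of the \(\lambda=0\) estimate for the first twenty iterations.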

1. Implicit and explicit methods with a cost \(c(x,y)\)
2. Evolution variational inequalities (EVIs)
3. NNCC spaces

Definition (L–Todeschi–Vialard ’24)

\(X, Y\) two arbitrary sets, \(c\colon X\times Y\to\mathbb{R}\cup\{+\infty\}\).

\((X\times Y,c)\) is an NNCC space if for each \((x_0,x_1,\bar y)\in X\times X\times Y\), there exists a path \(x(\cdot)\) from \(x_0\) to \(x_1\) such that \(\forall t\in[0,1]\), \(\forall y\in Y\), \[c(x(t),\bar y)-c(x(t),y)\leq (1-t)[c(x_0,\bar y)-c(x_0,y)]+t[c(x_1,\bar y)-c(x_1,y)].\]

\((x(t),\bar y)\) is called a generalized c-segment.

(Think: \(t\mapsto c(x(t),\bar y)-c(x(t),y)\) is convex)
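As a quick sanity check of the definition, consider the quadratic cost \(c(x,y)=\lVert x-y\rVert^2\) on \(X=Y=\mathbb{R}^d\) with the straight line \(x(t)=(1-t)x_0+t\,x_1\) (an illustrative choice of path). Expanding the squares, the terms \(\lVert x(t)\rVert^2\) cancel:

\[c(x(t),\bar y)-c(x(t),y)=\lVert x(t)-\bar y\rVert^2-\lVert x(t)-y\rVert^2=2\langle x(t),\,y-\bar y\rangle+\lVert\bar y\rVert^2-\lVert y\rVert^2.\]

This is affine in \(t\), so the inequality holds with equality for every \(y\) and \(\bar y\).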

NNCC spaces

History. Variant of the Ma–Trudinger–Wang (MTW) condition studied by Kim and McCann.

Original setting is smooth and finite-dimensional: \(c\in C^4(X\times Y)\).

Ma, Trudinger, Wang, Loeper, Kim, McCann, Villani, Figalli, Guillen, Kitagawa

Basic finite-dim examples:

\(c(x,y)=\lVert x-y\rVert^2\)
\(c(x,y)=\) Bregman divergence
Any smooth reparametrization \(c(x,y)=\lVert F(x)-G(y)\rVert^2\)...
Sphere

Theory. NNCC preserved by products, projections, pullbacks.

Stable under Gromov–Hausdorff.

EVIs in NNCC spaces

Focus on implicit method \(\phi(x,y)=\mathcal{E}(x)+c(x,y)\)

\[x_{n} \in\operatorname*{argmin}_{x\in X} \mathcal{E}(x)+c(x,x_{n-1})\]

Theorem (L–Todeschi–Vialard ’24)

⏵ \((X\times X,c)\) NNCC space

⏵ \(\mathcal{E}(\cdot)-\mu\,c(\cdot,x_n)\) convex on generalized c-segments \((x(t),x_{n-1})\)

⏵ Unique argmins

⏵ \(c\) satisfies \(\displaystyle\liminf_{t\to 0}\frac{c(x(t),x(0))}{t}=0.\)

Then EVI:

\[\mathcal{E}(x_n)+c(x_n,x_{n-1})\leq \mathcal{E}(x)+ c(x,x_{n-1})-(1+\mu)c(x,x_{n})\qquad\text{(EVI)}\]

\[1+\mu=(1-\lambda)^{-1}\]
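A sketch of how the inequality (EVI) relates to the general EVI for \(\phi(x,y)=\mathcal{E}(x)+c(x,y)\): since \(c\geq0\) and \(c(x,x)=0\), the choice \(y_{n+1}=x_n\) is a minimizer of \(y\mapsto\phi(x_n,y)\), hence \(y_n=x_{n-1}\). Taking \(y=x\) in the general EVI gives

\[(1-\lambda)\big[\mathcal{E}(x_n)+c(x_n,x_{n-1})\big]+\mathcal{E}(x)+c(x,x_n)\leq \mathcal{E}(x)+(1-\lambda)\big[\mathcal{E}(x)+c(x,x_{n-1})\big].\]

When \(\mathcal{E}(x)\) is finite, cancelling it on both sides and dividing by \(1-\lambda\) yields

\[\mathcal{E}(x_n)+c(x_n,x_{n-1})\leq \mathcal{E}(x)+c(x,x_{n-1})-\frac{1}{1-\lambda}\,c(x,x_n),\]

which is (EVI) with \(1+\mu=(1-\lambda)^{-1}\).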


Sublinear (\(\mu=0\)) and linear (\(\mu>0\)) convergence rates


Examples of NNCC spaces

Theorem (L–Todeschi–Vialard ’24)

\(X\), \(Y\) Polish spaces, \(c\in C(X\times Y)\).

\[\mathcal{T}_c(\mu,\nu)=\inf_{\pi\in\Pi(\mu,\nu)}\int c(x,y)\,d\pi\]

If \((X\times Y,c)\) is an NNCC space then so is \((\mathcal{P}(X)\times \mathcal{P}(Y), \mathcal{T}_c)\).

Corollary: \((\mathcal{P}_2(X)\times \mathcal{P}_2(X), W_2^2)\) is an NNCC space when \(X=\)

\(\mathbb{R}^d\)
the sphere
Bures–Wasserstein...

Generalized c-segments \((\mu(t),\nu)\):

⏵ \((T_0,S)\) optimal coupling of \((\mu_0,\nu)\)

⏵ \((T_1,S)\) optimal coupling of \((\mu_1,\nu)\)

⏵ \(\forall \omega\in\Omega\), \(t\mapsto (T_t(\omega),S(\omega))\) c-segment

⏵ \(\mu(t)=(T_t)_\#\mathbb{P}\)

Examples of NNCC spaces

Bures–Wasserstein

\[\operatorname{BW}^2(\Sigma_1,\Sigma_2) = \operatorname{tr}(\Sigma_1) + \operatorname{tr}(\Sigma_2) - 2 \operatorname{tr}\left(\sqrt{\Sigma_1^{1/2}\Sigma_2\Sigma_1^{1/2}}\right)\]

Gromov–Wasserstein, \(\mathbf{X}=[X,f,\mu]\) and \(\mathbf{Y}=[Y,g,\nu]\)

\[\operatorname{GW}^2(\mathbf{X},\mathbf{Y})=\inf_{\pi\in\Pi(\mu,\nu)}\int\lvert f(x,x')-g(y,y')\rvert^2\,d\pi(x,y)\,d\pi(x',y')\,.\]

\[\mathbf{X}(t)=[X_0\times X_1,\,\, (1-t)f_0+t\,f_1, \,\, (T_0,T_1)_\#\mathbb{P}].\]

Unbalanced OT

Hellinger, Fisher–Rao

Thank you!

(LOL-24 2024-06-17) NNCC spaces in optimization
