Flavien Léger
NNCC spaces in optimization
joint works with Pierre-Cyril Aubin-Frankowski &
Gabriele Todeschi and François-Xavier Vialard
Overview
Minimize \[\mathcal{E}\colon X\to\mathbb{R}\cup\{+\infty\}\]
using a function \(c(x,y)\) as a “movement limiter”
⏵ \(X\) : possibly infinite-dimensional
⏵ \(\mathcal{E},c\) : possibly nonsmooth
(Jacobs–Lee–L ’21)
1. Implicit and explicit methods with a cost \(c(x,y)\)
2. Evolution variational inequalities (EVIs)
3. NNCC spaces
Implicit method with cost \(c(x,y)\)
Proximal point method: \((X,d)\) metric space
General cost: \(X\) an arbitrary set, cost function \(c\colon X\times X\to\mathbb{R}\) with \(c(x,y)\geq 0\), \(c(x,x)=0\)
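A minimal sketch of the implicit scheme, assuming the update \(x_{n+1}\in\operatorname{argmin}_x \mathcal{E}(x)+c(x,x_n)\), which matches the choice \(\phi(x,y)=\mathcal{E}(x)+c(x,y)\) made in the NNCC part of the talk. The grid, the energy \(\lvert x\rvert\), the quartic cost and the function names are illustrative assumptions, not taken from the slides.

```python
def implicit_step(energy, cost, x_prev, candidates):
    """Return a minimizer of energy(x) + cost(x, x_prev) over a finite candidate set."""
    return min(candidates, key=lambda x: energy(x) + cost(x, x_prev))


def implicit_method(energy, cost, x0, candidates, n_steps):
    """Iterate the implicit step and return the iterates x_0, ..., x_n."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(implicit_step(energy, cost, xs[-1], candidates))
    return xs


def energy(x):
    """Illustrative nonsmooth energy."""
    return abs(x)


def cost(x, y):
    """Illustrative non-quadratic cost with cost >= 0 and cost(x, x) = 0."""
    return (x - y) ** 4 / 0.1


grid = [i / 100 for i in range(-300, 301)]  # finite stand-in for X
iterates = implicit_method(energy, cost, x0=2.0, candidates=grid, n_steps=50)
values = [energy(x) for x in iterates]
```

Because \(c\geq 0\) and \(c(x,x)=0\), each step can only decrease \(\mathcal{E}\): \(\mathcal{E}(x_{n+1})\leq \mathcal{E}(x_{n+1})+c(x_{n+1},x_n)\leq \mathcal{E}(x_n)\).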
Remarks/Motivations
- Tailored \(c(x,y)\)
- Regularizing operator
- Gradient flows “\(\dot x(t)=-\nabla \mathcal{E}(x(t))\)” (\(\tau\to0\))
⏵ Define gradient flows in nonsmooth settings
⏵ \(c(x,y)\) as a proxy for \(d^2(x,y)\)
(Ambrosio–Gigli–Savaré ’05)
(Rankin–Wong ’24)
(AM)
Explicit method with cost \(c(x,y)\)
How can one define an explicit method with a cost \(c(x,y)\)?
\(X\) an arbitrary set, \(Y\) another set
c-concavity
Definition. \(\mathcal{E}\) is c-concave if there exists \(h\colon Y\to \mathbb{R}\cup\{+\infty\}\) s.t. \[\mathcal{E}(x)=\inf_{y\in Y}c(x,y)+h(y).\]
The smallest such \(h\) is the c-transform \(\mathcal{E}^c(y)=\sup_{x\in X} \mathcal{E}(x)-c(x,y).\)
Example. \(X=Y=\mathbb{R}^d\), \(c(x,y)=\frac{L}{2}\lVert x-y\rVert^2\). Then \(\mathcal{E}\) is \(c\)-concave \(\iff \nabla^2 \mathcal{E}\preccurlyeq L\, I_{d\times d}\).
[Figure: two panels, “\(\mathcal{E}\) is \(c\)-concave” and “\(\mathcal{E}\) is not \(c\)-concave”, showing \(\mathcal{E}\) together with the curves \(c(x,y)+\mathcal{E}^c(y)\)]
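The definition can be checked numerically on finite grids. The sketch below computes the c-transform and then the envelope \(x\mapsto\min_y c(x,y)+\mathcal{E}^c(y)\), which always lies above \(\mathcal{E}\) and coincides with it exactly when \(\mathcal{E}\) is c-concave. The grids on \([-1,1]\), the test energies \(\frac{a}{2}x^2\) and all function names are illustrative assumptions.

```python
def c_transform(energy_vals, xs, ys, cost):
    """E^c(y) = max_x E(x) - c(x, y), computed over finite grids."""
    return [max(e - cost(x, y) for x, e in zip(xs, energy_vals)) for y in ys]


def envelope(transform_vals, xs, ys, cost):
    """x -> min_y c(x, y) + E^c(y); equals E when E is c-concave."""
    return [min(cost(x, y) + h for y, h in zip(ys, transform_vals)) for x in xs]


def concavity_gap(a, L, xs, ys):
    """Largest gap between the envelope and E(x) = a/2 x^2 for c = L/2 (x - y)^2."""
    def cost(x, y):
        return 0.5 * L * (x - y) ** 2

    energy_vals = [0.5 * a * x * x for x in xs]
    env = envelope(c_transform(energy_vals, xs, ys, cost), xs, ys, cost)
    return max(v - e for v, e in zip(env, energy_vals))


xs = [i / 20 for i in range(-20, 21)]  # grid for X = [-1, 1]
ys = [j / 40 for j in range(-40, 41)]  # finer grid for Y = [-1, 1]

gap_smooth = concavity_gap(a=1.0, L=2.0, xs=xs, ys=ys)  # E'' = 1 <= L
gap_steep = concavity_gap(a=3.0, L=2.0, xs=xs, ys=ys)   # E'' = 3 > L
```

In this toy setting the gap vanishes (up to rounding) when the curvature \(a\) is below \(L\) and is strictly positive when it exceeds \(L\), in line with the example.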
Explicit algorithm
Suppose \(\mathcal{E}\) is c-concave:
Nonsmooth settings
Smooth settings: \(X,Y\) finite-dimensional manifolds, twisted \(c\in C^1(X\times Y)\), \(\mathcal{E}\in C^1(X)\)
(“Gradient descent with a general cost” L–Aubin-Frankowski ’23)
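A minimal sketch of the explicit method for the quadratic cost, assuming the two-step form \(y_{n+1}\in\operatorname{argmin}_y c(x_n,y)+\mathcal{E}^c(y)\), \(x_{n+1}\in\operatorname{argmin}_x c(x,y_{n+1})\), which in the smooth setting reads \(\nabla_x c(x_n,y_{n+1})=\nabla\mathcal{E}(x_n)\) and \(\nabla_x c(x_{n+1},y_{n+1})=0\). These formulas are not visible on the slide and are stated here as an assumption based on the cited paper. For \(c(x,y)=\frac{L}{2}\lVert x-y\rVert^2\) they reduce to gradient descent with step \(1/L\). The test energy and the function name are hypothetical.

```python
def explicit_step_quadratic(grad_energy, x, L):
    """One explicit step for c(x, y) = L/2 ||x - y||^2 (assumed form, see lead-in)."""
    g = grad_energy(x)
    # y-step: solve grad_x c(x_n, y) = grad E(x_n), i.e. L (x_n - y) = grad E(x_n).
    y_next = [xi - gi / L for xi, gi in zip(x, g)]
    # x-step: minimize c(., y_next), which gives x_{n+1} = y_{n+1}.
    x_next = list(y_next)
    return x_next, y_next


def energy(x):
    """Illustrative energy E(x) = 1/2 (x_1^2 + 4 x_2^2); its Hessian is bounded by L = 4."""
    return 0.5 * (x[0] ** 2 + 4.0 * x[1] ** 2)


def grad_energy(x):
    return [x[0], 4.0 * x[1]]


x = [1.0, 1.0]
history = [energy(x)]
for _ in range(100):
    x, _y = explicit_step_quadratic(grad_energy, x, L=4.0)
    history.append(energy(x))
```

With this choice the energy decreases monotonically along the iterates, as expected for gradient descent with step \(1/L\) on an \(L\)-smooth function.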
Recap
Explicit: assume \(\mathcal{E}\) is c-concave
Alternating Minimization (AM) of \(\phi\)
Implicit
Implicit+Explicit (forward–backward): \(\mathcal{E}(x)=\mathcal{E}_1(x)+\mathcal{E}_2(x)\)
Assume \(\mathcal{E}_2\) is c-concave
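For reference, one way to write the three methods as alternating minimization, \(y_{n+1}\in\operatorname{argmin}_y \phi(x_n,y)\) then \(x_{n+1}\in\operatorname{argmin}_x \phi(x,y_{n+1})\). The implicit \(\phi\) is the one stated later in the talk; the explicit and forward–backward \(\phi\) are assumptions based on the cited paper and are not visible on this slide.
\[
\begin{aligned}
\text{Explicit:}&\quad \phi(x,y)=c(x,y)+\mathcal{E}^c(y),\\
\text{Implicit:}&\quad \phi(x,y)=\mathcal{E}(x)+c(x,y),\\
\text{Implicit+Explicit:}&\quad \phi(x,y)=\mathcal{E}_1(x)+c(x,y)+\mathcal{E}_2^c(y).
\end{aligned}
\]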
1. Implicit and explicit methods with a cost \(c(x,y)\)
2. Evolution variational inequalities (EVIs)
3. NNCC spaces
Evolution Variational Inequalities (EVIs)
\(X,Y\) two arbitrary sets, \(\phi\colon X\times Y\to\mathbb{R}\cup\{+\infty\}\) proper
Definition. Let \(\lambda\in[0,1)\). We say that \((x_n,y_n)_n\) satisfies the EVI if, for all \(n\geq 0\),
\[\forall x\in X,y\in Y,\quad(1-\lambda)\phi(x_n,y_n)+\phi(x,y_{n+1})\leq \phi(x,y)+(1-\lambda)\phi(x,y_n).\]
⏵ Nonsmooth, intrinsic
⏵ Condition on \(\phi\) and on the choice of iterates
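A small numerical check of the definition with \(\lambda=0\), assuming the implicit setting \(\phi(x,y)=\mathcal{E}(x)+c(x,y)\) with \(\mathcal{E}(x)=\lvert x\rvert\) on \(\mathbb{R}\) and \(c(x,y)=\frac{1}{2\tau}(x-y)^2\), so that the iterates are those of the proximal point method. The step \(\tau\), the starting point and the test grid are illustrative choices.

```python
def soft_threshold(v, tau):
    """Proximal map of E(x) = |x|: argmin_x |x| + (x - v)^2 / (2 tau)."""
    return max(abs(v) - tau, 0.0) * (1.0 if v > 0 else -1.0)


tau = 0.3


def phi(x, y):
    """phi(x, y) = E(x) + c(x, y) with E = |.| and a quadratic cost."""
    return abs(x) + (x - y) ** 2 / (2 * tau)


# Alternating minimization: y_{n+1} = x_n, then x_{n+1} = prox(y_{n+1}).
ys = [2.0]
xs = [soft_threshold(ys[0], tau)]
for _ in range(10):
    ys.append(xs[-1])
    xs.append(soft_threshold(ys[-1], tau))

# Largest value of (left side - right side) of the EVI over test points (x, y).
grid = [k / 10 for k in range(-30, 31)]
worst = max(
    phi(xs[n], ys[n]) + phi(x, ys[n + 1]) - phi(x, y) - phi(x, ys[n])
    for n in range(10)
    for x in grid
    for y in grid
)
```

Here `worst` stays nonpositive up to rounding, which is the EVI on the tested points; this follows from the \(1/\tau\)-strong convexity of \(x\mapsto\phi(x,y_n)\).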
Theorem (L–Aubin-Frankowski '23)
Background on EVIs
Consider the implicit method. Conditions giving the EVI \((\lambda=0)\):
- Euclidean: \(\mathcal{E}\) convex
- \((X,d)\) non-positively curved (Mayer/Jost): \(\mathcal{E}\) convex on geodesics
- Ambrosio–Gigli–Savaré: \(\mathcal{E}\) convex on curves \(x(t)\) such that \(d^2(x,x_{n-1})\) is \(1\)-convex, i.e. \(t\mapsto d^2(x(t),x_{n-1})-t^2 \,d^2(x(1),x(0))\) is convex
- \(\phi(x,y)\): extends the five-point property of Csiszár–Tusnády ’84
Convergence rates from EVIs
Theorem (L–Aubin-Frankowski '23)
Suppose the EVI holds. Then for all \(x\in X\), \(y\in Y\) and \(n\geq 1\):
If \(\lambda=0\),
\[\phi(x_n,y_n)\leq \phi(x,y)+\frac{\phi(x,y_0)-\phi(x_0,y_0)}{n}\]
If \(\lambda>0\),
\[\phi(x_n,y_n)\leq \phi(x,y)+\frac{\lambda[\phi(x,y_0)-\phi(x_0,y_0)]}{\Lambda^n-1},\]
where \(\Lambda\coloneqq(1-\lambda)^{-1}>1\).
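Both rates can be checked numerically on a toy problem. The sketch assumes \(\phi(x,y)=\mathcal{E}(x)+\frac{1}{2\tau}(x-y)^2\) with \(\mathcal{E}(x)=\frac{\mu}{2}x^2\) on \(\mathbb{R}\), and the value \(\lambda=\mu\tau/(1+\mu\tau)\), which is a hand computation for this specific example rather than a formula from the slides. The initialization takes \(x_0\) as the minimizer of \(\phi(\cdot,y_0)\).

```python
tau, mu = 1.0, 1.0                   # step size and strong-convexity constant
lam = mu * tau / (1 + mu * tau)      # assumed EVI constant for this example
Lam = 1 / (1 - lam)


def phi(x, y):
    """phi(x, y) = E(x) + c(x, y) with E(x) = mu/2 x^2 and a quadratic cost."""
    return 0.5 * mu * x * x + (x - y) ** 2 / (2 * tau)


def prox(v):
    """argmin_x phi(x, v)."""
    return v / (1 + mu * tau)


ys = [1.0]
xs = [prox(ys[0])]
for _ in range(20):
    ys.append(xs[-1])                # y_{n+1} = x_n minimizes phi(x_n, .)
    xs.append(prox(ys[-1]))          # x_{n+1} minimizes phi(., y_{n+1})


def gap0(x):
    """Numerator phi(x, y_0) - phi(x_0, y_0) appearing in both bounds."""
    return phi(x, ys[0]) - phi(xs[0], ys[0])


grid = [k / 4 for k in range(-8, 9)]
sublinear_ok = all(
    phi(xs[n], ys[n]) <= phi(x, y) + gap0(x) / n + 1e-12
    for n in range(1, 21) for x in grid for y in grid
)
linear_ok = all(
    phi(xs[n], ys[n]) <= phi(x, y) + lam * gap0(x) / (Lam ** n - 1) + 1e-12
    for n in range(1, 21) for x in grid for y in grid
)
```

Both flags come out true on the tested points; the linear bound is the sharper one for large \(n\) since \(\Lambda^n-1\) grows geometrically.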
1. Implicit and explicit methods with a cost \(c(x,y)\)
2. Evolution variational inequalities (EVIs)
3. NNCC spaces
NNCC spaces
\(X, Y\) two arbitrary sets, \(c\colon X\times Y\to\mathbb{R}\cup\{+\infty\}\).
Definition (L–Todeschi–Vialard '24)
\((X\times Y,c)\) is an NNCC space if for each \((x_0,x_1,\bar y)\in X\times X\times Y\), there exists a path \(x(\cdot)\) from \(x_0\) to \(x_1\) such that \(\forall y\in Y\), \[c(x(t),\bar y)-c(x(t),y)\leq (1-t)[c(x_0,\bar y)-c(x_0,y)]+t[c(x_1,\bar y)-c(x_1,y)].\]
\((x(t),\bar y)\) is called a generalized c-segment.
(Think: \(t\mapsto c(x(t),\bar y)-c(x(t),y)\) is convex)
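As a sanity check of the definition, take \(X=Y=\mathbb{R}^d\) and \(c(x,y)=\lVert x-y\rVert^2\) (an illustrative worked case). Expanding the squares, the \(\lVert x\rVert^2\) terms cancel:
\[c(x,\bar y)-c(x,y)=\lVert \bar y\rVert^2-\lVert y\rVert^2+2\langle x,\,y-\bar y\rangle.\]
This is affine in \(x\), so along the straight segment \(x(t)=(1-t)x_0+t\,x_1\) the NNCC inequality holds with equality for every \(y\), independently of \(\bar y\).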
NNCC spaces
History. Variant of the Ma–Trudinger–Wang (MTW) condition studied by Kim and McCann.
Original setting is smooth and finite-dimensional, \(c\in C^4(X\times Y)\).
Ma, Trudinger, Wang, Loeper, Kim, McCann, Villani, Figalli, Guillen, Kitagawa
Basic finite-dim examples:
- \(c(x,y)=\lVert x-y\rVert^2\)
- \(c(x,y)=\) Bregman divergence
- Any smooth reparametrization \(c(x,y)=\lVert F(x)-G(y)\rVert^2\)...
- Sphere
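The Bregman example can be tested numerically. The sketch assumes the convention \(c(x,y)=u(x)-u(y)-\langle\nabla u(y),x-y\rangle\) with \(u(x)=\sum_i x_i\log x_i-x_i\) (generalized Kullback–Leibler divergence) and uses the straight segment in \(x\) as the candidate path; both choices are assumptions, since the slide does not specify them.

```python
import math
import random


def kl_bregman(x, y):
    """Bregman divergence of u(x) = sum x_i log x_i - x_i (generalized KL)."""
    return sum(a * math.log(a / b) - a + b for a, b in zip(x, y))


def nncc_defect(x0, x1, y_bar, y, t):
    """Left side minus right side of the NNCC inequality along the straight segment."""
    xt = [(1 - t) * a + t * b for a, b in zip(x0, x1)]

    def diff(x):
        return kl_bregman(x, y_bar) - kl_bregman(x, y)

    return diff(xt) - ((1 - t) * diff(x0) + t * diff(x1))


def random_point():
    """Random positive vector in R^3 (illustrative test data)."""
    return [random.uniform(0.1, 2.0) for _ in range(3)]


random.seed(0)
defects = [
    nncc_defect(random_point(), random_point(), random_point(), random_point(),
                random.random())
    for _ in range(200)
]
max_defect = max(abs(d) for d in defects)
```

The defect vanishes up to rounding because \(x\mapsto c(x,\bar y)-c(x,y)\) is affine for a Bregman divergence taken in this order of arguments.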
Theory. NNCC preserved by products, projections, pullbacks.
Stable under Gromov–Hausdorff.
EVIs in NNCC spaces
Focus on implicit method \(\phi(x,y)=\mathcal{E}(x)+c(x,y)\)
Theorem (L–Todeschi–Vialard '24)
⏵ \((X\times X,c)\) NNCC space
⏵ \(\mathcal{E}(\cdot)-\mu\,c(\cdot,x_n)\) convex on generalized c-segments \((x(t),x_{n-1})\)
⏵ Unique argmins
⏵ \(c\) satisfies \(\displaystyle\liminf_{t\to 0}\frac{c(x(t),x(0))}{t}=0.\)
Then EVI.
Consequence: sublinear (\(\mu=0\)) and linear (\(\mu>0\)) convergence rates
Examples of NNCC spaces
\(X\), \(Y\) Polish spaces, \(c\in C(X\times Y)\).
\[\mathcal{T}_c(\mu,\nu)=\inf_{\pi\in\Pi(\mu,\nu)}\int c(x,y)\,d\pi\]
Theorem (L–Todeschi–Vialard '24)
If \((X\times Y,c)\) is an NNCC space then so is \((\mathcal{P}(X)\times \mathcal{P}(Y), \mathcal{T}_c)\).
Generalized c-segments \((\mu(t),\nu)\):
⏵ \((T_0,S)\) optimal coupling of \((\mu_0,\nu)\)
⏵ \((T_1,S)\) optimal coupling of \((\mu_1,\nu)\)
⏵ \(\forall \omega\in\Omega\), \(t\mapsto (T_t(\omega),S(\omega))\) c-segment
⏵ \(\mu(t)=(T_t)_\#\mathbb{P}\)
Corollary: \((\mathcal{P}_2(X)\times \mathcal{P}_2(X), W_2^2)\) is an NNCC space when \(X=\)
- \(\mathbb{R}^d\)
- the sphere
- Bures–Wasserstein...
[Figure: measures \(\mu\) and \(\nu\)]
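The construction of generalized c-segments can be illustrated on small point clouds. The sketch assumes uniform empirical measures with five points in \(\mathbb{R}^2\), \(c(x,y)=\lVert x-y\rVert^2\) (so base c-segments are straight lines), and a brute-force search over permutations for the optimal coupling; the base measure of the segment is called `nu_bar` and the test measure `nu`. All names and data are hypothetical.

```python
import itertools
import random


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def optimal_matching(source, target):
    """Brute-force optimal assignment: returns (W_2^2, permutation) for uniform weights."""
    n = len(source)
    best = min(
        itertools.permutations(range(n)),
        key=lambda s: sum(sq_dist(source[s[j]], target[j]) for j in range(n)),
    )
    cost = sum(sq_dist(source[best[j]], target[j]) for j in range(n)) / n
    return cost, best


def w2_sq(mu, nu):
    return optimal_matching(mu, nu)[0]


def cloud():
    """Random 5-point cloud in the plane (illustrative test data)."""
    return [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(5)]


random.seed(1)
mu0, mu1, nu_bar, nu = cloud(), cloud(), cloud(), cloud()

# T_0, T_1: points of mu0, mu1 optimally matched to each point of nu_bar (the map S).
_, s0 = optimal_matching(mu0, nu_bar)
_, s1 = optimal_matching(mu1, nu_bar)


def mu_t(t):
    """Empirical measure of T_t = (1 - t) T_0 + t T_1."""
    return [
        tuple((1 - t) * a + t * b for a, b in zip(mu0[s0[j]], mu1[s1[j]]))
        for j in range(5)
    ]


def diff(m):
    return w2_sq(m, nu_bar) - w2_sq(m, nu)


# Left side minus right side of the NNCC inequality at a few times t.
defects = [
    diff(mu_t(t)) - ((1 - t) * diff(mu0) + t * diff(mu1))
    for t in (0.25, 0.5, 0.75)
]
```

The defects are nonpositive up to rounding, which is the NNCC inequality for \(W_2^2\) on these samples; brute-force matching is only practical for very small clouds.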
Examples of NNCC spaces
Bures–Wasserstein
Gromov–Wasserstein \(\mathbf{X}=[X,f,\mu]\) and \(\mathbf{Y}=[Y,g,\nu]\)
\[\operatorname{GW}^2(\mathbf{X},\mathbf{Y})=\inf_{\pi\in\Pi(\mu,\nu)}\int\lvert f(x,x')-g(y,y')\rvert^2\,d\pi(x,y)\,d\pi(x',y')\,.\]
Unbalanced OT
Hellinger, Fisher–Rao
Thank you!
(LOL-24 2024-06-17) NNCC spaces in optimization
By Flavien Léger