ADMM and GCS
Meeting with Russ and Pablo
October 23rd 2023
Bernhard Paus Graesdal
Motivation
- Solve larger GCS instances
- Leverage parallel computation
- Custom GCS solver
- (not depend on Mosek)
- Amazon told me today that not having access to Mosek has been a bottleneck for them in using GCS
ADMM Background
- The general ADMM problem takes the form:
\begin{aligned}
\min \quad& f(x) + g(z) \\
\text{s.t.}
\quad & Ax + Bz = c
\end{aligned}
- \(f: \R^n \rightarrow \R\) and \(g: \R^m \rightarrow \R\) are convex functions
- \(A \in \R^{l \times n}\) and \(B \in \R^{l \times m}\) are coefficient matrices
- \(x \in \R^n\) and \(z \in \R^m\) are optimization variables
ADMM Background
\begin{aligned}
\min \quad& f(x) + g(z) \\
\text{s.t.}
\quad & Ax + Bz = c
\end{aligned}
- Form the Augmented Lagrangian
\begin{aligned}
L_\rho(x, z, y) = f(x) + g(z) &+ y^\intercal(Ax + Bz - c)
+ \frac{1}{2}\rho \| Ax + Bz - c\|_2^2
\end{aligned}
- \(\rho > 0, \ \rho \in \R\) can be seen as a penalty parameter
- \(\|\cdot\|_2\) denotes the Euclidean norm
- \(y \in \R^l\) is a vector of dual variables
ADMM Background
\begin{aligned}
\min \quad& f(x) + g(z) \\
\text{s.t.}
\quad & Ax + Bz = c
\end{aligned}
1. Initialization: Choose initial guesses \(x^{(0)}\), \(z^{(0)}\), and \(y^{(0)}\).
2. Iteration: For \(k = 0, 1, 2, \ldots\) until convergence, update:
\begin{aligned}
L_\rho(x, z, y) = f(x) + g(z) &+ y^\intercal(Ax + Bz - c)
+ \frac{1}{2}\rho \| Ax + Bz - c\|_2^2
\end{aligned}
\begin{aligned}
x^{(k+1)} & := \argmin_x L_\rho(x, z^{(k)}, y^{(k)}) \\
z^{(k+1)} & := \argmin_z L_\rho(x^{(k+1)}, z, y^{(k)}) \\
y^{(k+1)} & := y^{(k)} + \rho \left( Ax^{(k+1)} + Bz^{(k+1)} - c \right)
\end{aligned}
3. Convergence Check: Stop the iterations if the stopping criteria are satisfied.
Scaled price variables
- Introduce a scaled dual variable \( u := (1/\rho) y \), and reformulate the Augmented Lagrangian (by completing the square) as:
\begin{aligned}
L_\rho(x, z, u) =& f(x) + g(z) - (\rho/2) \|u\|_2^2
+ (\rho/2) \| Ax + Bz - c + u\|_2^2
\end{aligned}
- Gives simpler update steps:
2. Iteration: For \(k = 0, 1, 2, \ldots\) until convergence, update:
\begin{aligned}
x^{(k+1)} & := \argmin_x f(x) + (\rho/2) \| Ax + Bz^{(k)} - c + u^{(k)} \|_2^2 \\
z^{(k+1)} & := \argmin_z g(z) + (\rho/2) \| Ax^{(k+1)} + Bz - c + u^{(k)} \|_2^2 \\
u^{(k+1)} & := u^{(k)} + Ax^{(k+1)} + Bz^{(k+1)} - c
\end{aligned}
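- (Not from the slides) A minimal numpy sketch of these scaled-form updates on a small lasso instance, \( f(x) = \tfrac{1}{2}\|Ax - b\|_2^2 \), \( g(z) = \gamma \|z\|_1 \), with the constraint \( x - z = 0 \); the data, \(\gamma\), and \(\rho\) are made up for illustration:
import numpy as np

rng = np.random.default_rng(0)
m, n = 30, 10
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
gamma, rho = 0.1, 1.0

x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)  # u is the scaled dual, u = y / rho

# x-update solves (A^T A + rho I) x = A^T b + rho (z - u)
M = A.T @ A + rho * np.eye(n)
Atb = A.T @ b

def soft_threshold(v, kappa):
    # Proximal operator of kappa * ||.||_1, i.e. the z-update for the l1 term.
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

for k in range(200):
    x = np.linalg.solve(M, Atb + rho * (z - u))  # x-update
    z = soft_threshold(x + u, gamma / rho)       # z-update
    u = u + x - z                                # scaled price update

print("primal residual ||x - z||:", np.linalg.norm(x - z))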
ADMM & Constrained Convex Optimization
- Put the problem into ADMM form, where \(\tilde{I}_\mathcal{C}\) denotes the indicator function of the set \(\mathcal{C}\):
\begin{aligned}
\min \quad& f(x) \\
\text{s.t.}
\quad & x \in \mathcal{C}
\end{aligned}
\iff
\begin{aligned}
\min \quad& f(x) + \tilde{I}_\mathcal{C}(z) \\
\text{s.t.}
\quad & x - z = 0
\end{aligned}
- Update steps now take the form (with \(\Pi_\mathcal{C}\) the Euclidean projection onto \(\mathcal{C}\)):
\begin{aligned}
x^{(k+1)} & := \argmin_x \left( f(x) + (\rho/2) \| x - z^{(k)} + u^{(k)} \|_2^2 \right) \\
z^{(k+1)} & := \Pi_\mathcal{C}( x^{(k+1)} + u^{(k)}) \\
u^{(k+1)} & := u^{(k)} + x^{(k+1)} - z^{(k+1)}
\end{aligned}
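- (Not from the slides) A minimal numpy sketch of these updates for a toy instance with \( f(x) = \tfrac{1}{2}\|x - a\|_2^2 \) and \(\mathcal{C} = [0,1]^n\), so the \(z\)-update is just a clip; all data is illustrative:
import numpy as np

rng = np.random.default_rng(1)
n = 5
a = 2.0 * rng.standard_normal(n)
rho = 1.0

x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)

for k in range(100):
    # x-update: argmin_x (1/2)||x - a||^2 + (rho/2)||x - z + u||^2 (closed form)
    x = (a + rho * (z - u)) / (1.0 + rho)
    # z-update: projection of x + u onto the box [0, 1]^n
    z = np.clip(x + u, 0.0, 1.0)
    # scaled price update
    u = u + x - z

print("x:", np.round(x, 3))
print("clip(a, 0, 1):", np.round(np.clip(a, 0.0, 1.0), 3))  # should match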
ADMM & Global Variable Consensus
- Put the problem into ADMM form:
\begin{aligned}
\min \quad& \sum_{i=1}^N f_i(x)
\end{aligned}
\iff
\begin{aligned}
\min \quad& \sum_{i=1}^N f_i(x_i) \\
\text{s.t.}
\quad & x_i - z = 0, \quad i = 1, \ldots, N
\end{aligned}
- Update steps now take the form:
\begin{aligned}
x^{(k+1)}_i & := \argmin_{x_i} \left( f_i(x_i) + (\rho/2) \| x_i - z^{(k)} + u^{(k)}_i \|_2^2 \right) \\
z^{(k+1)} & := \frac{1}{N} \sum^N_{i=1} (x_i^{(k+1)} + u^{(k)}_i ) \\
u^{(k+1)}_i & := u^{(k)}_i + x_i^{(k+1)} - z^{(k+1)}
\end{aligned}
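- (Not from the slides) A minimal numpy sketch of consensus ADMM with \( f_i(x) = \tfrac{1}{2}\|x - a_i\|_2^2 \); the consensus variable should approach the mean of the \(a_i\). Data and \(\rho\) are illustrative:
import numpy as np

rng = np.random.default_rng(2)
N, n = 4, 3
a = rng.standard_normal((N, n))
rho = 1.0

x = np.zeros((N, n))   # local variables x_i
z = np.zeros(n)        # global (consensus) variable
u = np.zeros((N, n))   # scaled duals u_i

for k in range(100):
    # x_i-updates: argmin_{x_i} (1/2)||x_i - a_i||^2 + (rho/2)||x_i - z + u_i||^2
    # (independent across i, so they could run in parallel)
    x = (a + rho * (z - u)) / (1.0 + rho)
    # z-update: average of x_i + u_i
    z = np.mean(x + u, axis=0)
    # price updates
    u = u + x - z

print("z:", np.round(z, 3))
print("mean of a_i:", np.round(a.mean(axis=0), 3))  # should match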
ADMM & Conic Programming
- Put the problem into ADMM form:
\begin{aligned}
\min \quad& f(x) + g(z) \\
\text{s.t.}
\quad & Ax + Bz = c
\end{aligned}
- Use the splitting \( Ax \in K \iff Ax = s_1, \; s_2 \in K, \; s_1 = s_2 \)
- Put the \( s_1 = s_2 \) constraint into the cost
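- (Not from the slides) A small numpy sketch of this splitting for the illustrative problem \(\min \tfrac{1}{2}\|x - x_0\|_2^2\) s.t. \(Ax \in K\), with \(K\) the nonnegative orthant; here \(s_1\) is eliminated as \(s_1 = Ax\), the \(s_2\)-update is the cone projection, and \(s_1 = s_2\) is the dualized consensus constraint:
import numpy as np

rng = np.random.default_rng(3)
m, n = 6, 4
A = rng.standard_normal((m, n))
x0 = rng.standard_normal(n)
rho = 1.0

x = np.zeros(n)
s2 = np.zeros(m)   # the copy constrained to lie in K
u = np.zeros(m)    # scaled dual for the consensus constraint s_1 = s_2

# x-update solves (I + rho A^T A) x = x0 + rho A^T (s2 - u), with s_1 = A x eliminated
M = np.eye(n) + rho * A.T @ A

for k in range(300):
    x = np.linalg.solve(M, x0 + rho * A.T @ (s2 - u))
    s1 = A @ x
    s2 = np.maximum(s1 + u, 0.0)   # projection onto K (nonnegative orthant)
    u = u + s1 - s2                # price update on s_1 - s_2

print("worst violation of A x >= 0:", (A @ x).min())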
ADMM and GCS
Two paths:
1. Solve the MICP directly
  - Easier to start with
  - Things decompose more nicely
  - No (or little) convergence guarantees?
2. Solve the convex relaxation
1. Solve the MICP directly
\newcommand{\set}[1]{\left\{#1\right\}}
\begin{aligned}
\min \quad& \sum_{e := (u,v) \in \mathcal{E}} y_e l_e(x_u, x_v) \\
\text{s.t.}
\quad & y_v := (y_e)_{e \in \mathcal{E}_v} \\
\quad & y_v \in \Phi_v, \quad \forall v \in \mathcal{V} \\
\quad & x_v \in \mathcal{X}_v, \quad \forall v \in \mathcal{V} \\
\quad & y_e \in \set{0,1} \quad \forall e \in \mathcal{E}
\end{aligned}
- Flow polytope:
\begin{aligned}
\Phi_v :=
\left\{
y_v \geq 0
\,
\middle|
\,
\sum_{e \in \mathcal{E}_v^\text{in}} y_e + \delta_{sv} =
\sum_{e \in \mathcal{E}_v^\text{out}} y_e + \delta_{tv} \leq 1
\right\}
\end{aligned}
- Bi-convex in \( y_e, \, e \in \mathcal{E} \) and \( x_v, \, v \in \mathcal{V} \)
- For each edge \(e = (u, v)\) introduce the local variables \(x_{eu}\) and \(x_{ev}\). We can now rewrite the SPP GCS problem as:
\begin{aligned}
\min \quad& \sum_{e := (u,v) \in \mathcal{E}} y_e l_e(x_{eu}, x_{ev}) \\
\text{s.t.}
\quad & x_{eu} \in \mathcal{X}_u, \; x_{ev} \in \mathcal{X}_v, \\
\quad & x_{eu} - x_u = 0, \\
\quad & x_{ev} - x_v = 0, \quad \forall e = (u,v) \in \mathcal{E} \\
\quad & (y_e)_{e \in \mathcal{E}_v} \in \Phi_v, \quad \forall v \in \mathcal{V}
\end{aligned}
- The cost now decomposes independently over the edges
- The constraints \( x_{eu} - x_u = 0 \) and \( x_{ev} - x_v = 0 \) are the "consensus constraints"
1. Solve the MICP directly
Three blocks of variables:
- Consensus variables: \( x_v, \, v \in \mathcal{V} \)
- Local variables: \((x_{eu}, x_{ev}), \, e = (u,v) \in \mathcal{E} \)
- Flow variables: \( y_e, \, e \in \mathcal{E} \)
The problem is bi-convex in these
- Form the Augmented Lagrangian by relaxing only the consensus constraints:
\begin{aligned}
L_\rho(\cdot) = \sum_{e = (u,v) \in \mathcal{E}}
\Bigl( y_e l_e(x_{eu}, x_{ev})
+ (\rho/2) \| G(x_u, x_v, x_{eu}, x_{ev}) + \lambda_e\|_2^2
- (\rho/2) \| \lambda_e \|_2^2 \Bigr)
\end{aligned}
- where \( \lambda_e = (\lambda_{eu}, \lambda_{ev}) \in \R^{|\mathcal{X}_u| + |\mathcal{X}_v|} \) are the (scaled) dual variables
- and \(G(x_u, x_v, x_{eu}, x_{ev}) = (x_{eu} - x_u, \ x_{ev} - x_v) \in \R^{ | \mathcal{X}_u | + | \mathcal{X}_v | }\)
- (To be precise, we are really doing the whole machinery with indicator functions over sets)
1. Solve the MICP directly
Update steps:
1) Local Update
- For each edge \( e = (u,v) \in \mathcal{E} \) we update the local variables while keeping the other variables fixed.
- The update step for \( (x_{eu}^{(k+1)}, x_{ev}^{(k+1)}) \) is the solution to the following constrained convex optimization problem:
\begin{aligned}
\min \quad& y_e^{(k)} l_e(x_{eu}, x_{ev}) + (\rho / 2) \| G(x_u^{(k)}, x_v^{(k)}, x_{eu}, x_{ev}) + \lambda_e^{(k)} \|_2^2 \\
\text{s.t.} \quad & x_{eu} \in \mathcal{X}_u \\
\quad & x_{ev} \in \mathcal{X}_v
\end{aligned}
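- (Not from the slides) An illustrative cvxpy sketch of one such local update, assuming the sets \(\mathcal{X}_u, \mathcal{X}_v\) are polytopes \(\{x \mid Ax \leq b\}\) and taking \( l_e \) to be the Euclidean edge length \(\|x_{ev} - x_{eu}\|_2\); all data below is made up:
import cvxpy as cp
import numpy as np

n = 2
# Toy polytopes (assumption): unit box for X_u, the box [2,3] x [0,1] for X_v.
A_u = np.vstack([np.eye(n), -np.eye(n)]); b_u = np.array([1, 1, 0, 0])
A_v = np.vstack([np.eye(n), -np.eye(n)]); b_v = np.array([3, 1, -2, 0])

rho = 1.0
y_e_k = 1.0                       # current (fixed) flow value y_e^(k)
x_u_k = np.array([0.5, 0.5])      # current consensus iterates x_u^(k), x_v^(k)
x_v_k = np.array([2.5, 0.5])
lam_eu_k = np.zeros(n)            # current scaled duals lambda_e^(k)
lam_ev_k = np.zeros(n)

x_eu = cp.Variable(n)
x_ev = cp.Variable(n)

# y_e^(k) * l_e(x_eu, x_ev) + (rho/2) ||G(...) + lambda_e^(k)||^2
cost = (y_e_k * cp.norm(x_ev - x_eu, 2)
        + (rho / 2) * cp.sum_squares(x_eu - x_u_k + lam_eu_k)
        + (rho / 2) * cp.sum_squares(x_ev - x_v_k + lam_ev_k))
constraints = [A_u @ x_eu <= b_u, A_v @ x_ev <= b_v]

cp.Problem(cp.Minimize(cost), constraints).solve()
print("x_eu:", np.round(x_eu.value, 3), " x_ev:", np.round(x_ev.value, 3))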
1. Solve the MICP directly
Update steps:
2) Consensus Update
- For each vertex \( v \in \mathcal{V} \) we update the consensus variable \( x_v \) as:
\begin{aligned}
x_{v}^{(k+1)} :=
\frac{1}{ | \mathcal{E}_v | }
\sum_{e \in \mathcal{E}_v}
\left(
x_{ev}^{(k+1)}
+ \lambda_{ev}^{(k)}
\right)
\end{aligned}
- (The minimization is just an unconstrained quadratic optimization, so this falls right out of that)
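- Concretely, minimizing \(L_\rho\) over \(x_v\) with everything else fixed is the unconstrained quadratic problem below, and setting its gradient to zero gives the averaging formula above:
\begin{aligned}
x_v^{(k+1)} & := \argmin_{x_v} \sum_{e \in \mathcal{E}_v} (\rho/2) \| x_{ev}^{(k+1)} - x_v + \lambda_{ev}^{(k)} \|_2^2 \\
0 & = \rho \sum_{e \in \mathcal{E}_v} \left( x_v^{(k+1)} - x_{ev}^{(k+1)} - \lambda_{ev}^{(k)} \right)
\;\; \Rightarrow \;\;
x_v^{(k+1)} = \frac{1}{|\mathcal{E}_v|} \sum_{e \in \mathcal{E}_v} \left( x_{ev}^{(k+1)} + \lambda_{ev}^{(k)} \right)
\end{aligned}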
1. Solve the MICP directly
Update steps:
3) Discrete SPP update
- Minimizing only wrt flows gives us a discrete SPP:
\begin{aligned}
\min \quad& \sum_{e := (u,v) \in \mathcal{E}} y_e c_e \\
\text{s.t.}
\quad & (y_e)_{e \in \mathcal{E}_v} \in \Phi_v, \quad \forall v \in \mathcal{V}
\end{aligned}
- where \( c_e := l_e(x_{eu}^{(k+1)}, x_{ev}^{(k+1)}) \) is a constant
- NOTE: We are not solving SPP over consensus variables
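- (Not from the slides) Since the edge costs \(c_e\) are fixed at this step (and nonnegative for typical GCS length costs), any off-the-shelf shortest-path routine works; a toy networkx sketch:
import networkx as nx

# Made-up graph and edge costs c_e (fixed at this iteration).
c = {("s", "a"): 1.0, ("s", "b"): 2.5, ("a", "b"): 0.5, ("a", "t"): 3.0, ("b", "t"): 1.0}

G = nx.DiGraph()
for (u, v), cost in c.items():
    G.add_edge(u, v, weight=cost)

path = nx.shortest_path(G, source="s", target="t", weight="weight")  # Dijkstra for c_e >= 0
path_edges = set(zip(path, path[1:]))
y = {e: int(e in path_edges) for e in G.edges}  # recovered binary flows y_e
print(path)
print(y)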
1. Solve the MICP directly
Update steps:
4) Price update
- For each edge \( e \in \mathcal{E} \), the update step for the dual variables is
\begin{aligned}
\lambda^{(k+1)}_e & := \lambda^{(k)}_e + G(x_u^{(k+1)}, x_v^{(k+1)}, x_{eu}^{(k+1)}, x_{ev}^{(k+1)})
\end{aligned}
- \(G(x_u, x_v, x_{eu}, x_{ev}) = (x_{eu} - x_u, \ x_{ev} - x_v) \in \R^{ | \mathcal{X}_u | + | \mathcal{X}_v | }\)
- Can be interpreted as the integral of the error/constraint violation
2. Solve the convex relaxation
- Convex relaxation of the GCS problem:
\newcommand{\set}[1]{\left\{#1\right\}}
\begin{aligned}
\min &\quad \sum_{e \in \mathcal{E}} \tilde{l}_e(z_e, z_e', y_e) \\
\text{s.t.}
\quad & \sum_{e\in \mathcal{E}^\text{out}_s} y_e = 1,
\sum_{e\in \mathcal{E}^\text{in}_t} y_e = 1 \\
\quad & \sum_{e\in \mathcal{E}^\text{in}_v} y_e \leq 1, \quad \forall v \in \mathcal{V} - \set{s,t} \\
\quad & \sum_{e\in \mathcal{E}^\text{in}_v} (z_e', y_e) =
\sum_{e\in \mathcal{E}^\text{out}_v} (z_e, y_e)
\quad \forall v \in \mathcal{V} - \set{s,t} \\
\quad & (z_e,y_e) \in \tilde{\mathcal{X}}_u \\
\quad & (z_e',y_e) \in \tilde{\mathcal{X}}_v,
\quad \forall e = (u,v) \in \mathcal{E}
\end{aligned}
2. Solve the convex relaxation
- Decompose the problem over vertices.
- For each vertex \( v \in \mathcal{V} \) we define auxiliary (local) variables \( z_{ve}, z_{ve}', y_{ve}, \, \forall e \in \mathcal{E}_v\), and reformulate as:
\newcommand{\set}[1]{\left\{#1\right\}}
\begin{aligned}
\min &\quad \sum_{v \in \mathcal{V}} \left( \sum_{e \in \mathcal{E}_v} \tilde{l}_e(z_{ve}, z_{ve}', y_{ve}) \right) \\
\text{s.t.}
\quad & \sum_{e\in \mathcal{E}^\text{out}_s} y_{se} = 1,
\sum_{e\in \mathcal{E}^\text{in}_t} y_{te} = 1 \\
\quad & \sum_{e\in \mathcal{E}^\text{in}_v} y_{ve} \leq 1, \quad \forall v \in \mathcal{V} - \set{s,t} \\
\quad & \sum_{e\in \mathcal{E}^\text{in}_v} (z_{ve}', y_{ve}) =
\sum_{e\in \mathcal{E}^\text{out}_v} (z_{ve}, y_{ve})
\quad \forall v \in \mathcal{V} - \set{s,t} \\
\quad & (z_{ve},y_{ve}) \in \tilde{\mathcal{X}}_u \\
\quad & (z_{ve}',y_{ve}) \in \tilde{\mathcal{X}}_w,
\quad \forall v \in \mathcal{V}, \, \forall e = (u,w) \in \mathcal{E}_v \\
\quad &
(z_{ve}, z_{ve}', y_{ve}) =
(z_{e}, z_{e}', y_{e})
\quad \forall v \in \mathcal{V}, \, \forall e \in \mathcal{E}_v
\end{aligned}
- Constraints are now only enforced over the local variables
- The coupling constraints \( (z_{ve}, z_{ve}', y_{ve}) = (z_e, z_e', y_e) \) ensure that the two programs are equivalent
2. Solve the convex relaxation
- Form the Augmented Lagrangian:
\begin{aligned}
L_\rho(\cdot) =
\sum_{v \in \mathcal{V}} \Bigl( \sum_{e \in \mathcal{E}_v} & \tilde{l}_e(z_{ve}, z_{ve}', y_{ve}) \\
&+ (\rho/2) \| z_{ve} - z_e + \lambda_{ve} \|_2^2 \\
&+ (\rho/2) \| z_{ve}' - z_e' + \lambda_{ve}' \|_2^2 \\
&+ (\sigma/2) \| y_{ve} - y_e + \mu_{ve} \|_2^2 \\
&- (\rho/2) \| \lambda_{ve} \|_2^2
- (\rho/2) \| \lambda_{ve}' \|_2^2
- (\sigma/2) \| \mu_{ve} \|_2^2 \Bigr)
\end{aligned}
2. Solve the convex relaxation
Update steps:
1) Local Update
- For each vertex \( v \in \mathcal{V} \) we update the local variables while keeping the other variables fixed:
\begin{aligned}
\min \quad&
\sum_{e \in \mathcal{E}_v} \Bigl( \tilde{l}_e(z_{ve}, z_{ve}', y_{ve}) \\
\quad \quad & + (\rho/2) \| z_{ve} - z_e + \lambda_{ve} \|_2^2 \\
\quad \quad & + (\rho/2) \| z_{ve}' - z_e' + \lambda_{ve}' \|_2^2 \\
\quad \quad & + (\sigma/2) \| y_{ve} - y_e + \mu_{ve} \|_2^2 \Bigr) \\
\text{s.t.}
\quad & \sum_{e\in \mathcal{E}^\text{in}_v} y_{ve} \leq 1 \\
\quad & \sum_{e\in \mathcal{E}^\text{in}_v} (z_{ve}', y_{ve}) =
\sum_{e\in \mathcal{E}^\text{out}_v} (z_{ve}, y_{ve}) \\
\quad & (z_{ve},y_{ve}) \in \tilde{\mathcal{X}}_u \\
\quad & (z_{ve}',y_{ve}) \in \tilde{\mathcal{X}}_w,
\quad \forall e = (u,w) \in \mathcal{E}_v
\end{aligned}
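- (Not from the slides) An illustrative cvxpy sketch of one such per-vertex update, under simplifying assumptions: each \(\tilde{\mathcal{X}}\) is modeled as the perspective of a polytope \(\{x \mid Ax \leq b\}\), i.e. \(\{(z, y) \mid Az \leq y\,b, \ 0 \leq y \leq 1\}\), the cost \(\tilde{l}_e\) is taken to be \(\|z_{ve}' - z_{ve}\|_2\), and the per-vertex degree/flow-conservation constraints are dropped since the toy vertex has a single incident edge:
import cvxpy as cp
import numpy as np

n = 2
# Toy polytopes (assumption): unit box for X_u, the box [2,3] x [0,1] for X_w.
A_u = np.vstack([np.eye(n), -np.eye(n)]); b_u = np.array([1, 1, 0, 0])
A_w = np.vstack([np.eye(n), -np.eye(n)]); b_w = np.array([3, 1, -2, 0])
rho, sigma = 1.0, 1.0

# Current global iterates and scaled duals for edge e (made-up values).
z_e_k, zp_e_k, y_e_k = np.array([0.5, 0.5]), np.array([2.5, 0.5]), 1.0
lam_k, lamp_k, mu_k = np.zeros(n), np.zeros(n), 0.0

z_ve = cp.Variable(n)
zp_ve = cp.Variable(n)
y_ve = cp.Variable()

cost = (cp.norm(zp_ve - z_ve, 2)
        + (rho / 2) * cp.sum_squares(z_ve - z_e_k + lam_k)
        + (rho / 2) * cp.sum_squares(zp_ve - zp_e_k + lamp_k)
        + (sigma / 2) * cp.square(y_ve - y_e_k + mu_k))
constraints = [A_u @ z_ve <= y_ve * b_u,    # (z_ve, y_ve) in X~_u
               A_w @ zp_ve <= y_ve * b_w,   # (z_ve', y_ve) in X~_w
               y_ve >= 0, y_ve <= 1]

cp.Problem(cp.Minimize(cost), constraints).solve()
print(np.round(z_ve.value, 3), np.round(zp_ve.value, 3), round(float(y_ve.value), 3))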
2. Solve the convex relaxation
Update steps:
2) Consensus Update
- For each edge \( e = (u,w) \in \mathcal{E} \) we solve an unconstrained QP (closed-form solution):
\begin{aligned}
z_e^{(k+1)} & := \argmin_{z_e}
\| z_{ue} - z_e + \lambda_{ue} \|^2_2 +
\| z_{we} - z_e + \lambda_{we} \|^2_2 \\
z_e^{'(k+1)} & := \argmin_{z_e'}
\| z_{ue}' - z_e' + \lambda_{ue}' \|^2_2 +
\| z_{we}' - z_e' + \lambda_{we}' \|^2_2 \\
y_e^{(k+1)} & := \argmin_{y_e}
\| y_{ue} - y_e + \mu_{ue} \|^2_2 +
\| y_{we} - y_e + \mu_{we} \|^2_2
\end{aligned}
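- In closed form, these are just averages of the two local copies (plus their scaled duals) at the endpoints of edge \(e\):
\begin{aligned}
z_e^{(k+1)} & = \tfrac{1}{2} \left( z_{ue} + \lambda_{ue} + z_{we} + \lambda_{we} \right) \\
z_e^{'(k+1)} & = \tfrac{1}{2} \left( z_{ue}' + \lambda_{ue}' + z_{we}' + \lambda_{we}' \right) \\
y_e^{(k+1)} & = \tfrac{1}{2} \left( y_{ue} + \mu_{ue} + y_{we} + \mu_{we} \right)
\end{aligned}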
2. Solve the convex relaxation
Update steps:
3) Price update
- For each vertex \( v \in \mathcal{V}\) and edge \(e \in \mathcal{E}_v\) the update steps for the dual variables are:
\begin{aligned}
\lambda^{(k+1)}_{ve} & := \lambda^{(k)}_{ve} + z_{ve}^{(k+1)} - z_e^{(k+1)} \\
\lambda^{'(k+1)}_{ve} & := \lambda^{'(k)}_{ve} + z_{ve}^{'(k+1)} - z_e^{'(k+1)} \\
\mu^{(k+1)}_{ve} & := \mu^{(k)}_{ve} + y_{ve}^{(k+1)} - y_e^{(k+1)}
\end{aligned}
2. Solve the convex relaxation
Have not implemented this yet...