ADMM and GCS

Meeting with Russ and Pablo

October 23rd 2023

Bernhard Paus Graesdal

Motivation

  • Solve larger GCS instances
  • Leverage parallel computation
  • Custom GCS solver
    • (not depending on Mosek)
    • Amazon told me today that not having Mosek has been a bottleneck for them in using GCS

ADMM Background

  • The general ADMM problem looks like this:
\begin{aligned} \min \quad& f(x) + g(z) \\ \text{s.t.} \quad & Ax + Bz = c \end{aligned}
  • \(f: \R^n \rightarrow \R\) and \(g: \R^m \rightarrow \R\) are convex functions
  • \(A \in \R^{l \times n}\) and \(B \in \R^{l \times m}\) are coefficient matrices, and \(c \in \R^l\)
  • \(x \in \R^n\) and \(z \in \R^m\) are optimization variables

ADMM Background

\begin{aligned} \min \quad& f(x) + g(z) \\ \text{s.t.} \quad & Ax + Bz = c \end{aligned}
  • Form the Augmented Lagrangian
\begin{aligned} L_\rho(x, z, y) = f(x) + g(z) &+ y^\intercal(Ax + Bz - c) + \frac{1}{2}\rho \| Ax + Bz - c\|_2^2 \end{aligned}
  • \(\rho > 0, \ \rho \in \R\) can be seen as a penalty parameter
  • \(\|\cdot\|_2\) denotes the Euclidean norm
  • \(y \in \R^l\) is a vector of dual variables

ADMM Background

\begin{aligned} \min \quad& f(x) + g(z) \\ \text{s.t.} \quad & Ax + Bz = c \end{aligned}
  1. Initialization: Choose initial guesses \(x^{(0)}\), \(z^{(0)}\), and \(y^{(0)}\).
  2. Iteration: For \(k = 0, 1, 2, \ldots\) until convergence, update:
\begin{aligned} L_\rho(x, z, y) = f(x) + g(z) &+ y^\intercal(Ax + Bz - c) + \frac{1}{2}\rho \| Ax + Bz - c\|_2^2 \end{aligned}
\begin{aligned} x^{(k+1)} & := \argmin_x L_\rho(x, z^{(k)}, y^{(k)}) \\ z^{(k+1)} & := \argmin_z L_\rho(x^{(k+1)}, z, y^{(k)}) \\ y^{(k+1)} & := y^{(k)} + \rho \left( Ax^{(k+1)} + Bz^{(k+1)} - c \right) \end{aligned}

  3. Convergence check: Stop the iterations once the stopping criteria are satisfied.

Scaled price variables

  • Introduce a scaled dual variable \( u := (1/\rho) y \) and reformulate the augmented Lagrangian (by completing the square) as:
\begin{aligned} L_\rho(x, z, u) =& f(x) + g(z) - (\rho/2) \|u\|_2^2 + (\rho/2) \| Ax + Bz - c + u\|_2^2 \end{aligned}
  • This gives simpler update steps. For \(k = 0, 1, 2, \ldots\) until convergence, update:
\begin{aligned} x^{(k+1)} & := \argmin_x f(x) + (\rho/2) \| Ax + Bz^{(k)} - c + u^{(k)} \|_2^2 \\ z^{(k+1)} & := \argmin_z g(z) + (\rho/2) \| Ax^{(k+1)} + Bz - c + u^{(k)} \|_2^2 \\ u^{(k+1)} & := u^{(k)} + Ax^{(k+1)} + Bz^{(k+1)} - c \end{aligned}
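  • A minimal numerical sketch of these scaled-form updates (not from the slides): a toy problem with \( f(x) = \tfrac{1}{2}\|x - a\|_2^2 \), \( g(z) = \gamma \|z\|_1 \), \( A = I \), \( B = -I \), \( c = 0 \), so both minimizations have closed forms.

import numpy as np

def admm_lasso_toy(a, gamma=0.1, rho=1.0, num_iters=100):
    """Scaled-form ADMM for min_x 1/2||x - a||^2 + gamma*||z||_1  s.t.  x - z = 0."""
    x = np.zeros_like(a)
    z = np.zeros_like(a)
    u = np.zeros_like(a)  # scaled dual variable u = y / rho
    for _ in range(num_iters):
        # x-update: argmin_x 1/2||x - a||^2 + (rho/2)||x - z + u||^2  (closed form)
        x = (a + rho * (z - u)) / (1.0 + rho)
        # z-update: argmin_z gamma*||z||_1 + (rho/2)||x - z + u||^2  (soft-thresholding)
        v = x + u
        z = np.sign(v) * np.maximum(np.abs(v) - gamma / rho, 0.0)
        # scaled dual update: running sum of the residual x - z
        u = u + x - z
    return x, z

x_opt, z_opt = admm_lasso_toy(np.array([1.5, -0.05, 0.3]))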

ADMM & Constrained Convex Optimization

  • The general ADMM form:
\begin{aligned} \min \quad& f(x) + g(z) \\ \text{s.t.} \quad & Ax + Bz = c \end{aligned}
  • Put the problem into ADMM form, with \( g = \tilde{I}_\mathcal{C} \) the indicator function of \( \mathcal{C} \), \( A = I \), \( B = -I \), and \( c = 0 \):
\begin{aligned} \min \quad& f(x) \\ \text{s.t.} \quad & x \in \mathcal{C} \end{aligned}
\iff
\begin{aligned} \min \quad& f(x) + \tilde{I}_\mathcal{C}(z) \\ \text{s.t.} \quad & x - z = 0 \end{aligned}
  • Update steps now take the form, where \( \Pi_\mathcal{C} \) denotes the Euclidean projection onto \( \mathcal{C} \):
\begin{aligned} x^{(k+1)} & := \argmin_x \left( f(x) + (\rho/2) \| x - z^{(k)} + u^{(k)} \|_2^2 \right) \\ z^{(k+1)} & := \Pi_\mathcal{C}( x^{(k+1)} + u^{(k)}) \\ u^{(k+1)} & := u^{(k)} + x^{(k+1)} - z^{(k+1)} \end{aligned}
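  • A minimal sketch of this projection-based scheme (not from the slides), for the toy problem \( \min \tfrac{1}{2}\|x - a\|_2^2 \) s.t. \( x \in \mathcal{C} = [\text{lo}, \text{hi}]^n \), where the projection is a simple clip:

import numpy as np

def admm_box_constrained(a, lo=0.0, hi=1.0, rho=1.0, num_iters=100):
    """ADMM for min 1/2||x - a||^2 s.t. x in C = [lo, hi]^n, via the x/z/u splitting above."""
    x = np.zeros_like(a)
    z = np.zeros_like(a)
    u = np.zeros_like(a)
    for _ in range(num_iters):
        # x-update: argmin_x 1/2||x - a||^2 + (rho/2)||x - z + u||^2  (closed form)
        x = (a + rho * (z - u)) / (1.0 + rho)
        # z-update: Euclidean projection of (x + u) onto the box C
        z = np.clip(x + u, lo, hi)
        # scaled dual update
        u = u + x - z
    return z  # z is the feasible iterate

print(admm_box_constrained(np.array([1.7, -0.4, 0.6])))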

ADMM & Global Variable Consensus

  • The general ADMM form:
\begin{aligned} \min \quad& f(x) + g(z) \\ \text{s.t.} \quad & Ax + Bz = c \end{aligned}
  • Put the problem into (consensus) ADMM form by giving each term its own local copy \( x_i \) of the global variable \( z \):
\begin{aligned} \min \quad& \sum_{i=1}^N f_i(x) \end{aligned}
\iff
\begin{aligned} \min \quad& \sum_{i=1}^N f_i(x_i) \\ \text{s.t.} \quad & x_i - z = 0, \quad i = 1, \ldots, N \end{aligned}
  • Update steps now take the form:
\begin{aligned} x^{(k+1)}_i & := \argmin_{x_i} \left( f_i(x_i) + (\rho/2) \| x_i - z^{(k)} + u^{(k)}_i \|_2^2 \right) \\ z^{(k+1)} & := \frac{1}{N} \sum^N_{i=1} (x_i^{(k+1)} + u^{(k)}_i ) \\ u^{(k+1)}_i & := u^{(k)}_i + x_i^{(k+1)} - z^{(k+1)} \end{aligned}
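  • A minimal sketch of consensus ADMM (not from the slides): splitting \( \sum_i \tfrac{1}{2}\|A_i x - b_i\|_2^2 \) across blocks, with independent (parallelizable) \( x_i \)-updates and an averaging \( z \)-update.

import numpy as np

def consensus_admm(A_blocks, b_blocks, rho=1.0, num_iters=200):
    """Global variable consensus ADMM for min_x sum_i 1/2||A_i x - b_i||^2."""
    N = len(A_blocks)
    n = A_blocks[0].shape[1]
    xs = [np.zeros(n) for _ in range(N)]
    us = [np.zeros(n) for _ in range(N)]
    z = np.zeros(n)
    for _ in range(num_iters):
        # x_i-updates (independent, could run in parallel):
        #   argmin_{x_i} 1/2||A_i x_i - b_i||^2 + (rho/2)||x_i - z + u_i||^2
        for i in range(N):
            Ai, bi = A_blocks[i], b_blocks[i]
            xs[i] = np.linalg.solve(Ai.T @ Ai + rho * np.eye(n),
                                    Ai.T @ bi + rho * (z - us[i]))
        # z-update: average of (x_i + u_i)
        z = np.mean([xs[i] + us[i] for i in range(N)], axis=0)
        # scaled dual updates
        for i in range(N):
            us[i] = us[i] + xs[i] - z
    return z

rng = np.random.default_rng(0)
A_blocks = [rng.standard_normal((5, 3)) for _ in range(4)]
b_blocks = [rng.standard_normal(5) for _ in range(4)]
print(consensus_admm(A_blocks, b_blocks))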

ADMM & Conic Programming

  • Put the problem into ADMM form:
\begin{aligned} \min \quad& f(x) + g(z) \\ \text{s.t.} \quad & Ax + Bz = c \end{aligned}

  • Split the conic constraint: \( Ax \in K \iff Ax = s_1, \, s_2 \in K, \, s_1 = s_2 \)

  • Move the coupling constraint \( s_1 = s_2 \) into the cost (via the augmented Lagrangian penalty)
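  • With this split, one of the ADMM blocks reduces to a Euclidean projection onto \( K \). A sketch for the second-order cone (just one example of \( K \); not from the slides):

import numpy as np

def project_soc(v):
    """Euclidean projection of v = (t, x) onto the second-order cone {(t, x): ||x||_2 <= t}."""
    t, x = v[0], v[1:]
    norm_x = np.linalg.norm(x)
    if norm_x <= t:
        return v.copy()              # already inside the cone
    if norm_x <= -t:
        return np.zeros_like(v)      # projects onto the origin
    alpha = (t + norm_x) / 2.0
    return np.concatenate(([alpha], alpha * x / norm_x))

# In a conic ADMM, the s_2-update is this projection applied to (s_1 + scaled dual).
print(project_soc(np.array([0.5, 3.0, 4.0])))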

ADMM and GCS

Two paths:

  1. Solve the MICP directly
    1. Easier to start with
    2. Things decompose more nicely
    3. Few (or no) convergence guarantees?
  2. Solve the convex relaxation

1. Solve the MICP directly

\newcommand{\set}[1]{\left\{#1\right\}} \begin{aligned} \min \quad& \sum_{e := (u,v) \in \mathcal{E}} y_e l_e(x_u, x_v) \\ \text{s.t.} \quad & y_v := (y_e)_{e \in \mathcal{E}_v} \\ \quad & y_v \in \Phi_v, \quad \forall v \in \mathcal{V} \\ \quad & x_v \in \mathcal{X}_v, \quad \forall v \in \mathcal{V} \\ \quad & y_e \in \set{0,1} \quad \forall e \in \mathcal{E} \end{aligned}
  • Bi-convex in \( y_e, \, e \in \mathcal{E} \) and \( x_v, \, v \in \mathcal{V} \)
  • Flow polytope:
\newcommand{\set}[1]{\left\{#1\right\}} \begin{aligned} \Phi_v := \left\{ y_v \geq 0 \, \middle| \, \sum_{e \in \mathcal{E}_v^\text{in}} y_e + \delta_{sv} = \sum_{e \in \mathcal{E}_v^\text{out}} y_e + \delta_{tv} \leq 1 \right\} \end{aligned}
  • For each edge \(e = (u, v)\), introduce the local variables \(x_{eu}\) and \(x_{ev}\). We can now rewrite the SPP GCS problem as:
\newcommand{\set}[1]{\left\{#1\right\}} \begin{aligned} \min \quad& \sum_{e := (u,v) \in \mathcal{E}} y_e l_e(x_{eu}, x_{ev}) \\ \text{s.t.} \quad & x_{eu} \in \mathcal{X}_u, \; x_{ev} \in \mathcal{X}_v, \\ \quad & x_{eu} - x_u = 0, \\ \quad & x_{ev} - x_v = 0, \quad \forall e = (u,v) \in \mathcal{E} \\ \quad & (y_e)_{e \in \mathcal{E}_v} \in \Phi_v, \quad \forall v \in \mathcal{V} \end{aligned}
  • The constraints \( x_{eu} - x_u = 0 \) and \( x_{ev} - x_v = 0 \) are "consensus constraints"
  • The cost now decomposes independently over the edges

1. Solve the MICP directly

Three blocks of variables:

  1. Consensus variables: \( x_v, \, v \in \mathcal{V} \)
  2. Local variables: \((x_{eu}, x_{ev}), \, e = (u,v) \in \mathcal{E} \)
  3. Flow variables: \( y_e, \, e \in \mathcal{E} \)

The problem is bi-convex in these blocks of variables.

1. Solve the MICP directly

  • Form the augmented Lagrangian by relaxing only the consensus constraints:
\newcommand{\set}[1]{\left\{#1\right\}} \begin{aligned} L_\rho(\cdot) = \sum_{e = (u,v) \in \mathcal{E}} \Bigl( y_e l_e(x_{eu}, x_{ev}) + (\rho/2) \| G(x_u, x_v, x_{eu}, x_{ev}) + \lambda_e\|_2^2 - (\rho/2) \| \lambda_e \|_2^2 \Bigr) \end{aligned}
  • \( \lambda_e = (\lambda_{eu}, \lambda_{ev}) \in \R^{|\mathcal{X}_u| + |\mathcal{X}_v|} \) are the scaled dual variables
  • \(G(x_u, x_v, x_{eu}, x_{ev}) = (x_{eu} - x_u, \ x_{ev} - x_v) \in \R^{ | \mathcal{X}_u | + | \mathcal{X}_v | }\)
  • (To be precise, we are really doing the whole machinery with indicator functions over the sets)

1. Solve the MICP directly

Update steps:

1) Local Update

  • For each edge \( e = (u,v) \in \mathcal{E} \) we update the local variables while keeping the other variables fixed.
  • The update step for \( (x_{eu}^{(k+1)}, x_{ev}^{(k+1)}) \) is the solution to the following constrained convex optimization problem:
\begin{aligned} \min \quad& y_e^{(k)} l_e(x_{eu}, x_{ev}) + (\rho / 2) \| G(x_u^{(k)}, x_v^{(k)}, x_{eu}, x_{ev}) + \lambda_e^{(k)} \|_2^2 \\ \text{s.t.} \quad & x_{eu} \in \mathcal{X}_u \\ \quad & x_{ev} \in \mathcal{X}_v \end{aligned}
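  • A cvxpy sketch of one edge's local update (not from the slides): it assumes the sets are polytopes \( \mathcal{X} = \{\, x \mid A x \leq b \,\} \) and the edge length is the Euclidean distance \( l_e(x_{eu}, x_{ev}) = \|x_{eu} - x_{ev}\|_2 \); all argument names are illustrative.

import cvxpy as cp

def local_update(y_e, x_u_k, x_v_k, lam_eu, lam_ev, A_u, b_u, A_v, b_v, rho=1.0):
    """Per-edge local update: min y_e * l_e + (rho/2)||G(...) + lambda_e||^2 over X_u x X_v."""
    n = x_u_k.shape[0]
    x_eu = cp.Variable(n)
    x_ev = cp.Variable(n)
    cost = (y_e * cp.norm(x_eu - x_ev, 2)
            + (rho / 2) * cp.sum_squares(x_eu - x_u_k + lam_eu)
            + (rho / 2) * cp.sum_squares(x_ev - x_v_k + lam_ev))
    constraints = [A_u @ x_eu <= b_u, A_v @ x_ev <= b_v]
    cp.Problem(cp.Minimize(cost), constraints).solve()
    return x_eu.value, x_ev.value

  • Each edge's subproblem is independent of the others, so these solves can run in parallel.
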
1. Solve the MICP directly

Update steps:

2) Consensus Update

  • For each vertex \( v \in \mathcal{V} \) we update the consensus variable \( x_v \) as:
\begin{aligned} x_{v}^{(k+1)} := \frac{1}{ | \mathcal{E}_v | } \sum_{e \in \mathcal{E}_v} \left( x_{ev}^{(k+1)} + \lambda_{ev}^{(k)} \right) \end{aligned}
  • (The minimization over \( x_v \) is just an unconstrained quadratic program, so this falls right out of it)
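  • Written out in numpy (a sketch; the \( (e, v) \)-keyed dictionaries are an illustrative data layout, not from the slides):

import numpy as np

def consensus_update(v, incident_edges, x_local, lam):
    """x_v update: average of (x_ev^{k+1} + lambda_ev^{k}) over the edges incident to v.
    x_local[(e, v)] and lam[(e, v)] hold the local copy and scaled dual of vertex v on edge e."""
    return np.mean([x_local[(e, v)] + lam[(e, v)] for e in incident_edges], axis=0)
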
1. Solve the MICP directly

Update steps:

3) Discrete SPP update

  • Minimizing only with respect to the flows gives us a discrete SPP:
\begin{aligned} \min \quad& \sum_{e := (u,v) \in \mathcal{E}} y_e c_e \\ \text{s.t.} \quad & (y_e)_{e \in \mathcal{E}_v} \in \Phi_v, \quad \forall v \in \mathcal{V} \end{aligned}
  • where \( c_e := l_e(x_{eu}^{(k+1)}, x_{ev}^{(k+1)}) \) is a constant
  • NOTE: We are not solving the SPP over the consensus variables
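  • Since the \( c_e \) are fixed nonnegative constants at this point, this update amounts to an ordinary shortest s-t path problem. A sketch using networkx (an assumed dependency; edge_costs is an illustrative dictionary mapping \( (u, v) \) to \( c_e \)):

import networkx as nx

def spp_update(edge_costs, source, target):
    """Flow update: min sum_e y_e c_e over the flow polytope with binary y_e,
    solved as a shortest s-t path; y_e = 1 on the path edges, 0 elsewhere."""
    G = nx.DiGraph()
    for (u, v), c in edge_costs.items():
        G.add_edge(u, v, weight=c)
    # Dijkstra, valid since the costs c_e are nonnegative
    path = nx.shortest_path(G, source, target, weight="weight")
    path_edges = set(zip(path[:-1], path[1:]))
    return {e: (1.0 if e in path_edges else 0.0) for e in edge_costs}
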
1. Solve the MICP directly

Update steps:

4) Price update

  • For each edge \( e \in \mathcal{E} \), the update step for the dual variables is:
\begin{aligned} \lambda^{(k+1)}_e & := \lambda^{(k)}_e + G(x_u^{(k+1)}, x_v^{(k+1)}, x_{eu}^{(k+1)}, x_{ev}^{(k+1)}) \end{aligned}
  • where \(G(x_u, x_v, x_{eu}, x_{ev}) = (x_{eu} - x_u, \ x_{ev} - x_v) \in \R^{ | \mathcal{X}_u | + | \mathcal{X}_v | }\) as before
  • Can be interpreted as the integral of the error/constraint violation
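  • A sketch of this step together with the primal residual it accumulates (same illustrative dictionary layout as above, not from the slides):

import numpy as np

def price_update(edges, x_consensus, x_local, lam):
    """Scaled dual update lam += G(...); also returns the primal residual norm for a stopping check."""
    sq_residual = 0.0
    for (u, v) in edges:
        for w in (u, v):
            g = x_local[((u, v), w)] - x_consensus[w]  # one block of G(x_u, x_v, x_eu, x_ev)
            lam[((u, v), w)] = lam[((u, v), w)] + g
            sq_residual += float(np.dot(g, g))
    return lam, np.sqrt(sq_residual)
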
2. Solve the convex relaxation

  • Convex relaxation of the GCS problem:
\newcommand{\set}[1]{\left\{#1\right\}} \begin{aligned} \min &\quad \sum_{e \in \mathcal{E}} \tilde{l}_e(z_e, z_e', y_e) \\ \text{s.t.} \quad & \sum_{e\in \mathcal{E}^\text{out}_s} y_e = 1, \sum_{e\in \mathcal{E}^\text{in}_t} y_e = 1 \\ \quad & \sum_{e\in \mathcal{E}^\text{in}_v} y_e \leq 1, \quad \forall v \in \mathcal{V} - \set{s,t} \\ \quad & \sum_{e\in \mathcal{E}^\text{in}_v} (z_e', y_e) = \sum_{e\in \mathcal{E}^\text{out}_v} (z_e, y_e) \quad \forall v \in \mathcal{V} - \set{s,t} \\ \quad & (z_e,y_e) \in \tilde{\mathcal{X}}_u \\ \quad & (z_e',y_e) \in \tilde{\mathcal{X}}_v, \quad \forall e = (u,v) \in \mathcal{E} \end{aligned}

2. Solve the convex relaxation

  • Decompose the problem over the vertices.
  • For each vertex \( v \in \mathcal{V} \) we define auxiliary (local) variables \( z_{ve}, z_{ve}', y_{ve}, \, \forall e \in \mathcal{E}_v\).
  • The constraints are now enforced only on the local variables.
  • The coupling constraints (last line) ensure that the two programs are equivalent.
  • Reformulate as:
\newcommand{\set}[1]{\left\{#1\right\}} \begin{aligned} \min &\quad \sum_{v \in \mathcal{V}} \left( \sum_{e \in \mathcal{E}_v} \tilde{l}_e(z_{ve}, z_{ve}', y_{ve}) \right) \\ \text{s.t.} \quad & \sum_{e\in \mathcal{E}^\text{out}_s} y_{se} = 1, \sum_{e\in \mathcal{E}^\text{in}_t} y_{te} = 1 \\ \quad & \sum_{e\in \mathcal{E}^\text{in}_v} y_{ve} \leq 1, \quad \forall v \in \mathcal{V} - \set{s,t} \\ \quad & \sum_{e\in \mathcal{E}^\text{in}_v} (z_{ve}', y_{ve}) = \sum_{e\in \mathcal{E}^\text{out}_v} (z_{ve}, y_{ve}) \quad \forall v \in \mathcal{V} - \set{s,t} \\ \quad & (z_{ve},y_{ve}) \in \tilde{\mathcal{X}}_u \\ \quad & (z_{ve}',y_{ve}) \in \tilde{\mathcal{X}}_w, \quad \forall v \in \mathcal{V}, \, \forall e = (u,w) \in \mathcal{E}_v \\ \quad & (z_{ve}, z_{ve}', y_{ve}) = (z_{e}, z_{e}', y_{e}) \quad \forall v \in \mathcal{V}, \, \forall e \in \mathcal{E}_v \end{aligned}

2. Solve the convex relaxation

  • Form the augmented Lagrangian:
\newcommand{\set}[1]{\left\{#1\right\}} \begin{aligned} L_\rho(\cdot) = \sum_{v \in \mathcal{V}} \Bigl( \sum_{e \in \mathcal{E}_v} & \tilde{l}_e(z_{ve}, z_{ve}', y_{ve}) \\ &+ (\rho/2) \| z_{ve} - z_e + \lambda_{ve} \|_2^2 \\ &+ (\rho/2) \| z_{ve}' - z_e' + \lambda_{ve}' \|_2^2 \\ &+ (\sigma/2) \| y_{ve} - y_e + \mu_{ve} \|_2^2 \\ &- (\rho/2) \| \lambda_{ve} \|_2^2 - (\rho/2) \| \lambda_{ve}' \|_2^2 - (\sigma/2) \| \mu_{ve} \|_2^2 \Bigr) \end{aligned}

2. Solve the convex relaxation

Update steps:

1) Local Update

  • For each vertex \( v \in \mathcal{V} \) we update the local variables while keeping the other variables fixed:
\begin{aligned} \min \quad& \sum_{e \in \mathcal{E}_v} \Bigl( \tilde{l}_e(z_{ve}, z_{ve}', y_{ve}) \\ \quad \quad & + (\rho/2) \| z_{ve} - z_e + \lambda_{ve} \|_2^2 \\ \quad \quad & + (\rho/2) \| z_{ve}' - z_e' + \lambda_{ve}' \|_2^2 \\ \quad \quad & + (\sigma/2) \| y_{ve} - y_e + \mu_{ve} \|_2^2 \Bigr) \\ \text{s.t.} \quad & \sum_{e\in \mathcal{E}^\text{in}_v} y_{ve} \leq 1 \\ \quad & \sum_{e\in \mathcal{E}^\text{in}_v} (z_{ve}', y_{ve}) = \sum_{e\in \mathcal{E}^\text{out}_v} (z_{ve}, y_{ve}) \\ \quad & (z_{ve},y_{ve}) \in \tilde{\mathcal{X}}_u \\ \quad & (z_{ve}',y_{ve}) \in \tilde{\mathcal{X}}_w, \quad \forall e = (u,w) \in \mathcal{E}_v \end{aligned}
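  • A cvxpy sketch of one vertex's subproblem (not from the slides). It assumes every set is a polytope \( \mathcal{X} = \{\, x \mid A x \leq b \,\} \), so the perspective constraint \( (z, y) \in \tilde{\mathcal{X}} \) becomes \( A z \leq b\, y \), and takes \( \tilde{l}_e(z, z', y) = \| z - z' \|_2 \) (the perspective of a Euclidean edge length). All names are illustrative, and \( v \) is assumed to be an intermediate vertex (so the \( s \)/\( t \) special cases are omitted).

import cvxpy as cp

def vertex_local_update(v, in_edges, out_edges, sets, z_glob, zp_glob, y_glob,
                        lam, lamp, mu, rho=1.0, sigma=1.0):
    """One vertex's local update for the convex relaxation (sketch).
    in_edges/out_edges: edge keys e = (u, w) incident to v; sets[u] = (A_u, b_u);
    z_glob/zp_glob/y_glob and lam/lamp/mu: current global copies and scaled duals, keyed by e."""
    edges = list(in_edges) + list(out_edges)
    n = sets[v][0].shape[1]
    z = {e: cp.Variable(n) for e in edges}
    zp = {e: cp.Variable(n) for e in edges}
    y = {e: cp.Variable(nonneg=True) for e in edges}

    cost = 0
    constraints = []
    for e in edges:
        u, w = e
        cost += (cp.norm(z[e] - zp[e], 2)
                 + (rho / 2) * cp.sum_squares(z[e] - z_glob[e] + lam[e])
                 + (rho / 2) * cp.sum_squares(zp[e] - zp_glob[e] + lamp[e])
                 + (sigma / 2) * cp.sum_squares(y[e] - y_glob[e] + mu[e]))
        A_u, b_u = sets[u]
        A_w, b_w = sets[w]
        constraints += [A_u @ z[e] <= y[e] * b_u,   # (z_ve, y_ve) in X~_u
                        A_w @ zp[e] <= y[e] * b_w,  # (z_ve', y_ve) in X~_w
                        y[e] <= 1]
    # degree and flow-conservation constraints at v
    constraints += [sum(y[e] for e in in_edges) <= 1,
                    sum(zp[e] for e in in_edges) == sum(z[e] for e in out_edges),
                    sum(y[e] for e in in_edges) == sum(y[e] for e in out_edges)]

    cp.Problem(cp.Minimize(cost), constraints).solve()
    return ({e: z[e].value for e in edges}, {e: zp[e].value for e in edges},
            {e: y[e].value for e in edges})

  • As with the per-edge updates in the MICP scheme, these per-vertex subproblems are independent and can be solved in parallel.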

2. Solve the convex relaxation

Update steps:

2) Consensus Update

  • For each edge \( e = (u,w) \in \mathcal{E} \) we solve an unconstrained QP (closed-form solution):
\begin{aligned} z_e^{(k+1)} & := \argmin_{z_e} \| z_{ue}^{(k+1)} - z_e + \lambda_{ue}^{(k)} \|^2_2 + \| z_{we}^{(k+1)} - z_e + \lambda_{we}^{(k)} \|^2_2 \\ z_e^{'(k+1)} & := \argmin_{z_e'} \| z_{ue}^{'(k+1)} - z_e' + \lambda_{ue}^{'(k)} \|^2_2 + \| z_{we}^{'(k+1)} - z_e' + \lambda_{we}^{'(k)} \|^2_2 \\ y_e^{(k+1)} & := \argmin_{y_e} \| y_{ue}^{(k+1)} - y_e + \mu_{ue}^{(k)} \|^2_2 + \| y_{we}^{(k+1)} - y_e + \mu_{we}^{(k)} \|^2_2 \end{aligned}

2. Solve the convex relaxation

Update steps:

3) Price update

  • For each vertex \( v \in \mathcal{V}\) and edge \(e \in \mathcal{E}_v\), the update steps for the dual variables are:
\begin{aligned} \lambda^{(k+1)}_{ve} & := \lambda^{(k)}_{ve} + z_{ve}^{(k+1)} - z_e^{(k+1)} \\ \lambda^{'(k+1)}_{ve} & := \lambda^{'(k)}_{ve} + z_{ve}^{'(k+1)} - z_e^{'(k+1)} \\ \mu^{(k+1)}_{ve} & := \mu^{(k)}_{ve} + y_{ve}^{(k+1)} - y_e^{(k+1)} \end{aligned}

2. Solve the convex relaxation

Have not implemented this yet...