Fall 2025, Prof Sarah Dean
"What we do"
\([a_0^\star,\dots, a_{H}^\star](s_t) = \arg\underset{a_0,\dots,a_H}{\min}~ \displaystyle\sum_{k=0}^{H-1} c(s_k, a_k) + c_H(s_H)\)
\(\text{s.t.}~~s_0=s_t,~~s_{k+1} = F(s_k, a_k)\)
\(s_k\in\mathcal S_\mathrm{safe},~~s_H\in\mathcal S_H,~~ a_k\in\mathcal A_\mathrm{safe}\)
today: explicitly consider safety constraints
terminal cost and constraints
"Why we do it"
[Figure: receding-horizon loop. At each time \(t\), observe the state \(s_t\), solve the planning problem below, and apply the first planned action \(a_t = a_0^\star(s_t)\).]
\( \underset{a_0,\dots,a_H }{\min}\) \(\displaystyle\sum_{k=0}^H c(s_k, a_k)\)
\(\text{s.t.}~~s_0=s_t,~~s_{k+1} = F(s_k, a_k)\)
Fact from last lecture: When costs are quadratic and dynamics are linear, MPC selects an action which depends linearly on the state. $$a_t^{MPC}=K_{MPC}s_t$$
LQR Problem
$$ \min ~~\mathbb E\Big[\sum_{t=0}^{T} s_t^\top Qs_t+ a_t^\top Ra_t\Big]\quad\\ \text{s.t}\quad s_{t+1} = F s_t+ Ga_t+w_t$$
We know that \(a^\star_t = \pi_t^\star(s_t)\) where \(\pi_t^\star(s) = K^\star_t s\) and \(K^\star_t = -(R + G^\top P_{t+1} G)^{-1} G^\top P_{t+1} F\), with \(P_t\) given by the backward Riccati recursion.
MPC Problem
$$ \min ~~\sum_{k=0}^{H-1} s_k^\top Qs_k + a_k^\top Ra_k + s_H^\top Q_H s_H \quad \\ \text{s.t}\quad s_0=s,\quad s_{k+1} = F s_k+ Ga_k $$
MPC Policy \(a_t = a^\star_0(s_t)\) where
\(a^\star_0(s) = K_0s\) and \(K_0\) is computed by \(H\) steps of the same backward Riccati recursion, initialized at the terminal weight \(P_H = Q_H\).
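Since both problems above are linear-quadratic, the gains come from a backward Riccati recursion. Below is a minimal numpy sketch of computing the MPC gain \(K_0\); the particular matrices \(F, G, Q, R, Q_H\) and horizon \(H\) are assumed for illustration, not taken from the lecture.

```python
import numpy as np

def mpc_gain(F, G, Q, R, Q_H, H):
    """Backward Riccati recursion for the finite-horizon LQ problem.

    Returns K_0 such that the MPC policy applies a_t = K_0 @ s_t
    (the first action of the H-step plan, replanned at every state).
    """
    P = Q_H                  # terminal cost-to-go
    K = None
    for _ in range(H):
        K = -np.linalg.solve(R + G.T @ P @ G, G.T @ P @ F)   # gain at this stage
        P = Q + F.T @ P @ F + F.T @ P @ G @ K                # cost-to-go update
    return K                 # after H backward steps this is K_0

# assumed double-integrator example, for illustration only
F = np.array([[1.0, 0.1], [0.0, 1.0]])
G = np.array([[0.0], [1.0]])
Q, R, Q_H = np.eye(2), np.eye(1), np.eye(2)
print(mpc_gain(F, G, Q, R, Q_H, H=20))   # a 1x2 gain row [gamma_pos, gamma_vel]
```

The infinite-horizon LQR gain \(K^\star\) is the fixed point of the same recursion.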
Stochastic Optimal Control Problem
$$ \min_{\pi_{0:T}} ~~\underbrace{\mathbb E_w\Big[\sum_{k=0}^{T} c(s_k, \pi_k(s_k)) \Big]}_{J^\pi(s_0)}\quad \text{s.t.}\quad s_0~~\text{given},~~s_{k+1} = F(s_k, \pi_k(s_k),w_k) $$
Dynamic Programming Algorithm: set \(V_{T+1}(s)=0\); for \(k=T,\dots,0\), \(V_k(s) = \min_a \big(c(s,a) + \mathbb E_w[V_{k+1}(F(s,a,w))]\big)\), and \(\pi_k^\star(s)\) is the minimizing action.
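As a concrete (assumed) instance of the dynamic programming recursion, here is a minimal sketch on a small finite state and action space, with transition probabilities standing in for the disturbance \(w\):

```python
import numpy as np

rng = np.random.default_rng(0)
n_s, n_a, T = 5, 3, 10                       # assumed sizes and horizon
P = rng.random((n_s, n_a, n_s))
P /= P.sum(axis=2, keepdims=True)            # P[s, a, s'] transition probabilities
c = rng.random((n_s, n_a))                   # stage cost c(s, a)

V = np.zeros(n_s)                            # V_{T+1} = 0
pi = np.zeros((T + 1, n_s), dtype=int)
for k in range(T, -1, -1):                   # k = T, ..., 0
    Qk = c + P @ V                           # c(s,a) + E_w[V_{k+1}(s')]
    pi[k] = Qk.argmin(axis=1)                # pi_k^*(s)
    V = Qk.min(axis=1)                       # V_k(s)
```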
Claim: MPC policy is linear \(\pi_t^\star(s) = \gamma^\mathsf{pos} \mathsf{pos}_t + \gamma^\mathsf{vel} \mathsf{vel}_t\)
Simulation notebook: MPC_feasibility_example.ipynb
Claim: optimal policy is linear \(\pi_t^\star(s) = \gamma^\mathsf{pos}_t \mathsf{pos}_t + \gamma_t^\mathsf{vel} \mathsf{vel}_t\)
Notice: gains \(\approx\) static for early part of horizon
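To see this numerically, one can store the gain at every step of the backward Riccati recursion and compare early-horizon and late-horizon gains (matrices and horizon again assumed for illustration):

```python
import numpy as np

F = np.array([[1.0, 0.1], [0.0, 1.0]])
G = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

P = Q                          # assumed terminal weight
gains = []
for _ in range(50):            # assumed horizon T = 50
    K = -np.linalg.solve(R + G.T @ P @ G, G.T @ P @ F)
    P = Q + F.T @ P @ F + F.T @ P @ G @ K
    gains.append(K.ravel())
gains = np.array(gains[::-1])  # gains[t] = K_t^* for t = 0, ..., T-1

print(gains[:5])               # early in the horizon: rows nearly identical (static)
print(gains[-5:])              # near the end of the horizon: rows visibly change
```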
The state is position & velocity \(s=[\theta,\omega]\) with \( s_{t+1} = \begin{bmatrix} 1 & 0.1\\ 0 & 1 \end{bmatrix}s_t + \begin{bmatrix} 0\\ 1 \end{bmatrix}a_t\)
Are trajectories safe as long as \(|\theta_0|<1\)?
Goal: stay near origin and be energy efficient
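To probe the question above (whether \(|\theta_0|<1\) alone guarantees the trajectory stays safe), here is a minimal rollout sketch; the linear feedback gain and the grid of initial conditions are assumed for illustration:

```python
import numpy as np

F = np.array([[1.0, 0.1], [0.0, 1.0]])
G = np.array([[0.0], [1.0]])
K = np.array([[-0.5, -1.0]])          # an assumed stabilizing feedback a_t = K s_t
Fcl = F + G @ K                       # closed-loop dynamics

for theta0, omega0 in [(0.9, 0.0), (0.9, 2.0), (-0.9, -2.0)]:
    s = np.array([theta0, omega0])
    peak = abs(theta0)
    for _ in range(200):
        s = Fcl @ s
        peak = max(peak, abs(s[0]))
    # whether |theta_t| ever exceeds 1 depends on the initial velocity, too
    print(theta0, omega0, peak > 1.0)
```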
Definitions:
A state \(s\) is safe if \(s\in\mathcal S_\mathrm{safe}\).
A trajectory of states \((s_0,\dots,s_t)\) is safe if \(s_k\in\mathcal S_\mathrm{safe}\) for all \(0\leq k\leq t\).
(we can analogously define \(\mathcal A_\mathrm{safe}\subseteq \mathcal A\) and require that \(a_k\in\mathcal A_\mathrm{safe}\) for all \(0\leq k\leq t\))
A system \(s_{t+1}=F(s_t)\) is safe if some \(\mathcal S_\mathrm{inv}\subseteq \mathcal S_{\mathrm{safe}}\) is invariant, i.e. \(s\in\mathcal S_\mathrm{inv}\implies F(s)\in\mathcal S_\mathrm{inv}\).
Exercise: Prove that if \(\mathcal S_\mathrm{inv}\) is invariant for dynamics \(F\), then \(s_0\in \mathcal S_\mathrm{inv} \implies s_t\in\mathcal S_\mathrm{inv}\) for all \(t\).
Definition: A Lyapunov function \(V:\mathcal S\to \mathbb R\) for \(F\) is continuous, positive definite (\(V(0)=0\) and \(V(s)>0\) for \(s\neq 0\)), and non-increasing along trajectories: \(V(F(s))\leq V(s)\) for all \(s\).
Claim: if \(V(s)\) is a Lyapunov function for \(F\) then any sublevel set \(\{V(s)\leq c\}\) is invariant.
Example: An invariant set for \(s=[\theta,\omega]\) with \( s_{t+1} = \begin{bmatrix} 0.9 & 0.1\\ 0 & 0.9 \end{bmatrix}s_t \) is the sublevel set \(\{s : s^\top P s \leq c\}\) with \(P=\sum_{t=0}^\infty (F^t)^\top F^t\), since for any \(s\) in this set,
\((Fs)^\top \sum_{t=0}^\infty (F^t)^\top F^t (Fs) = s^\top \sum_{t=1}^\infty (F^t)^\top F^t s \leq s^\top \sum_{t=0}^\infty (F^t)^\top F^t s \leq c.\)
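A quick numerical check of this example (values assumed as above): \(P=\sum_{t\ge 0}(F^t)^\top F^t\) solves the discrete Lyapunov equation \(P = F^\top P F + I\), and points on the boundary of the sublevel set map back into it.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

F = np.array([[0.9, 0.1], [0.0, 0.9]])
P = solve_discrete_lyapunov(F.T, np.eye(2))   # P = F^T P F + I, i.e. sum_t (F^t)^T F^t
c = 1.0

rng = np.random.default_rng(0)
s = rng.standard_normal((2, 1000))
s *= np.sqrt(c / np.einsum("in,ij,jn->n", s, P, s))     # rescale samples onto {V(s) = c}
Fs = F @ s
V_next = np.einsum("in,ij,jn->n", Fs, P, Fs)            # V(F s) for every sample
print(V_next.max() <= c)                                # True: the set maps into itself
```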
$$\min_{a_0,\dots,a_{H}} \quad\sum_{k=0}^{H-1} c(s_{k}, a_{k}) +\textcolor{cyan}{ c_H(s_H)}$$
\(\text{s.t.}\quad s_0 = s,\quad s_{k+1} = F(s_{k}, a_{k})\qquad\qquad\)
\(s_k\in\mathcal S_\mathrm{safe},\quad a_k\in\mathcal A_\mathrm{safe},\quad \textcolor{cyan}{s_H\in\mathcal S_H}\)
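For concreteness, one solve of this constrained problem can be written as a convex program. A minimal cvxpy sketch, where the dynamics, cost weights, and the particular box-shaped \(\mathcal S_\mathrm{safe}\), \(\mathcal A_\mathrm{safe}\), \(\mathcal S_H\) are all assumed for illustration:

```python
import cvxpy as cp
import numpy as np

F = np.array([[1.0, 0.1], [0.0, 1.0]])
G = np.array([[0.0], [1.0]])
Q, R, Q_H, H = np.eye(2), 0.1 * np.eye(1), 10 * np.eye(2), 20
s_t = np.array([0.8, 0.0])                     # current state

s = cp.Variable((2, H + 1))
a = cp.Variable((1, H))
cost = sum(cp.quad_form(s[:, k], Q) + cp.quad_form(a[:, k], R) for k in range(H))
cost += cp.quad_form(s[:, H], Q_H)                                  # terminal cost c_H
constraints = [s[:, 0] == s_t]
for k in range(H):
    constraints += [s[:, k + 1] == F @ s[:, k] + G @ a[:, k]]       # dynamics
    constraints += [cp.abs(s[0, k]) <= 1, cp.abs(a[:, k]) <= 1]     # S_safe, A_safe (boxes)
constraints += [cp.abs(s[:, H]) <= 0.1]                             # terminal set S_H (box)
cp.Problem(cp.Minimize(cost), constraints).solve()
a_mpc = a.value[:, 0]       # apply only the first action, then replan at s_{t+1}
```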
Fact 3: Under the assumption that the terminal set \(\mathcal S_H\subseteq \mathcal S_\mathrm{safe}\) is control invariant, i.e. for every \(s\in\mathcal S_H\) there is some \(a\in\mathcal A_\mathrm{safe}\) with \(F(s,a)\in\mathcal S_H\), MPC is recursively feasible, i.e. able to guarantee safety indefinitely:
Recursive feasibility: feasible at \(s_t\implies\) feasible at \(s_{t+1}\)
Proof: Let \(a_0^\star,\dots,a_{H-1}^\star\) be the feasible plan computed at \(s_t\), with planned states \(s_0^\star,\dots,s_H^\star\) and \(s_H^\star\in\mathcal S_H\). At \(s_{t+1}=F(s_t,a_0^\star)\), the shifted plan \(a_1^\star,\dots,a_{H-1}^\star,\tilde a\) is feasible, where \(\tilde a\in\mathcal A_\mathrm{safe}\) is chosen by control invariance so that \(F(s_H^\star,\tilde a)\in\mathcal S_H\). Hence feasibility at \(s_t\) implies feasibility at \(s_{t+1}\).
$$\min_{a_0,\dots, a_{H}} \quad\sum_{k=0}^{H-1} c(s_{k}, a_{k})\qquad\text{s.t.}\quad s_0 = s_t,\quad s_{k+1} = F(s_{k}, a_{k})$$
\(s_k\in\mathcal S_\mathrm{safe},\quad a_k\in\mathcal A_\mathrm{safe},\quad \textcolor{cyan}{s_H=0}\)
Fact 4: Under the assumptions that the origin is an equilibrium (\(F(0,0)=0\)) and the stage cost \(c\) is positive definite (\(c(0,0)=0\) and \(c(s,a)>0\) otherwise), MPC with this terminal constraint is provably stabilizing, i.e. $$\tilde F(s) = F(s,\pi_{MPC}(s))~~\text{is stable}$$
Proof: Let \(J^\star(s)\) be the objective value for the above with \(s_t=s\). Then the function \(J^\star(s)\) is positive definite and strictly decreasing along the closed loop: shifting the optimal plan by one step and appending \(a=0\) (which keeps the state at the origin) is feasible at \(s_{t+1}\), so \(J^\star(s_{t+1}) \leq J^\star(s_t) - c(s_t, \pi_\mathrm{MPC}(s_t)) < J^\star(s_t)\) whenever \(s_t\neq 0\). Therefore, it is a Lyapunov function for the closed loop dynamics \(F(\cdot, \pi_\mathrm{MPC}(\cdot))\), certifying asymptotic stability.
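The decrease of \(J^\star\) along the closed loop can also be checked numerically. A minimal sketch (dynamics, weights, horizon, and initial state assumed) that re-solves the \(s_H=0\) problem at every step and logs the optimal value:

```python
import cvxpy as cp
import numpy as np

F = np.array([[1.0, 0.1], [0.0, 1.0]])
G = np.array([[0.0], [1.0]])
Q, R, H = np.eye(2), 0.1 * np.eye(1), 15

def solve_mpc(s_now):
    s = cp.Variable((2, H + 1))
    a = cp.Variable((1, H))
    cost = sum(cp.quad_form(s[:, k], Q) + cp.quad_form(a[:, k], R) for k in range(H))
    cons = [s[:, 0] == s_now, s[:, H] == 0]                        # terminal constraint
    cons += [s[:, k + 1] == F @ s[:, k] + G @ a[:, k] for k in range(H)]
    prob = cp.Problem(cp.Minimize(cost), cons)
    prob.solve()
    return a.value[:, 0], prob.value

s_now = np.array([1.0, 0.0])
values = []
for _ in range(20):
    a_now, J = solve_mpc(s_now)
    values.append(J)
    s_now = F @ s_now + G @ a_now           # closed-loop update
print(np.all(np.diff(values) < 1e-6))       # J*(s_t) decreases along the trajectory
```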
Fact 3 and 4 (general): Suppose the terminal set \(\mathcal S_H\subseteq\mathcal S_\mathrm{safe}\) is invariant under some terminal policy \(\pi_H\) with \(\pi_H(s)\in\mathcal A_\mathrm{safe}\) and \(F(s,\pi_H(s))\in\mathcal S_H\) for all \(s\in\mathcal S_H\), and the terminal cost decreases accordingly: \(c_H(F(s,\pi_H(s))) \leq c_H(s) - c(s,\pi_H(s))\) on \(\mathcal S_H\). Then MPC is recursively feasible, i.e. able to guarantee safety indefinitely, and provably stabilizing, i.e. $$\tilde F(s) = F(s,\pi_{MPC}(s))~~\text{is stable}$$
Proof: Recursive feasibility follows as in Fact 3 by shifting the optimal plan from \(s_t\) and appending \(\pi_H(s_H^\star)\). The terminal-cost condition then gives \(J^\star(s_{t+1}) \leq J^\star(s_t) - c(s_t, \pi_\mathrm{MPC}(s_t))\), so \(J^\star(s)\) is positive definite and strictly decreasing. Therefore, the closed loop dynamics \(F(\cdot, \pi_\mathrm{MPC}(\cdot))\) are asymptotically stable.
References: Predictive Control for Linear and Hybrid Systems by Borrelli, Bemporad, and Morari (Cambridge University Press, 2017)
Next time: control from partial observation