Prof Sarah Dean

## Reminders

• Office hours this week moved to Friday 9-10am
• Office hours cancelled next week due to travel
• Feedback on final project proposal this week
• Upcoming paper presentations starting next week
• Project midterm update due 11/11

## Action in a dynamic world

Goal: select actions $$a_t$$ to bring environment to low-cost states

Diagram: a policy $$\pi_t:\mathcal S\to\mathcal A$$ maps the observation $$s_t$$ to an action $$a_t$$, which is applied to the environment $$F$$ with state $$s$$; the tuples $$\{(s_t, a_t, c_t)\}$$ accumulate as data.

## Controlled systems

$$s_{t+1} = F(s_t, a_t, w_t)$$

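Block diagram: the system $$F$$ with state $$s$$ takes the action $$a_t$$ and the disturbance $$w_t$$ as inputs and outputs the state $$s_t$$.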

## Controlled systems

$$s_{t+1} = As_t+B a_t+ w_t$$

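Block diagram: the linear system with state $$s$$, defined by $$A$$ and $$B$$, takes the action $$a_t$$ and the disturbance $$w_t$$ as inputs and outputs the state $$s_t$$.

Linear System: State space $$\mathcal S = \mathbb R^n$$ and actions $$\mathcal A=\mathbb R^m$$ with dynamics defined by $$A\in\mathbb R^{n\times n}$$ and $$B\in\mathbb R^{n\times m}$$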

• $$s_\star$$ is reachable from $$s_0$$ if there exists $$a_{0:t-1} \in\mathcal A^t$$ such that $$s_{t}=s_\star$$ for some $$t$$
• system is controllable if any $$s_\star\in\mathcal S$$ is reachable from any $$s_0 \in\mathcal S$$.

• a linear system is controllable if and only if $$\mathrm{rank}\Big(\underbrace{\begin{bmatrix}B&AB &\dots & A^{n-1}B\end{bmatrix}}_{\mathcal C}\Big) = n$$
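The rank test above can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `controllability_matrix` and `is_controllable` are illustrative, `A` and `B_force` are the matrices of the LQR example used later in these notes, and `B_pos` is a hypothetical alternative input matrix chosen only to show a failing case.

```python
import numpy as np

def controllability_matrix(A, B):
    """Stack the blocks [B, AB, ..., A^{n-1}B] side by side."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])  # next block is A times the previous one
    return np.hstack(blocks)

def is_controllable(A, B):
    """Rank test: controllable iff the controllability matrix has rank n."""
    n = A.shape[0]
    return bool(np.linalg.matrix_rank(controllability_matrix(A, B)) == n)

A = np.array([[0.9, 0.1],
              [0.0, 0.9]])
B_force = np.array([[0.0], [1.0]])  # input enters through the velocity
B_pos = np.array([[1.0], [0.0]])    # hypothetical: input enters through the position only

print(is_controllable(A, B_force))
print(is_controllable(A, B_pos))
```

With `B_pos`, the second state evolves as $$\omega_{t+1} = 0.9\,\omega_t$$ regardless of the input, which is why the rank drops.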

## Optimal Control & Dynamic Programming

Stochastic Optimal Control Problem

$$\min_{\pi_{0:T}} ~~\underbrace{\mathbb E_w\Big[\sum_{k=0}^{T} c(s_k, \pi_k(s_k)) \Big]}_{J^\pi(s_0)}\quad \text{s.t.}\quad s_0~~\text{given},~~s_{k+1} = F(s_k, \pi_k(s_k),w_k)$$

Dynamic Programming Algorithm

• Initialize $$J_{T+1}^\star (s) = 0$$
• For $$k=T,T-1,\dots,0$$:
• Compute $$J_k^\star (s) = \min_{a\in\mathcal A} c(s, a)+\mathbb E_w[J_{k+1}^\star (F(s,a,w))]$$
• Record minimizing argument as $$\pi_k^\star(s)$$
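The backward recursion above can be sketched for a small finite problem. This is an illustrative toy, not taken from the lecture: the five-state chain, the clipped dynamics `F`, the cost `c`, and the noise distribution are all assumptions chosen so that the expectation is a finite sum.

```python
def dynamic_programming(states, actions, noises, probs, F, c, T):
    """Backward induction: returns cost-to-go tables J_k and policies pi_k for k = 0..T."""
    J = {s: 0.0 for s in states}  # initialization J_{T+1}(s) = 0
    values, policy = {}, {}
    for k in range(T, -1, -1):
        J_new, pi_k = {}, {}
        for s in states:
            # stage cost plus expected cost-to-go, for every action
            q = [c(s, a) + sum(p * J[F(s, a, w)] for w, p in zip(noises, probs))
                 for a in actions]
            best = min(range(len(actions)), key=lambda i: q[i])
            J_new[s], pi_k[s] = q[best], actions[best]  # record the minimizing argument
        J = J_new
        values[k], policy[k] = J, pi_k
    return values, policy

# Hypothetical toy problem: a chain of 5 states, target state 2.
states = list(range(5))
actions = [-1, 0, 1]
noises, probs = [-1, 0, 1], [0.25, 0.5, 0.25]
F = lambda s, a, w: min(max(s + a + w, 0), 4)  # clipped random walk
c = lambda s, a: (s - 2) ** 2 + 0.5 * a ** 2
T = 5

values, policy = dynamic_programming(states, actions, noises, probs, F, c, T)
print(policy[0])
```

At the final step the cost-to-go is zero, so the recorded action is the one minimizing the stage cost alone; earlier steps trade off action cost against expected future cost.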

Reference: Ch 1 in Dynamic Programming & Optimal Control, Vol. I by Bertsekas

• Linear dynamics: $$F(s, a, w) = A s+Ba+w$$
• Quadratic costs: $$c(s, a) = s^\top Qs + a^\top Ra$$ where $$Q,R\succ 0$$
• Stochastic and independent noise $$\mathbb E[w_k] = 0$$ and $$\mathbb E[w_kw_k^\top] = \sigma^2 I$$

LQR Problem

$$\min_{\pi_{0:T}} ~~\mathbb E_w\Big[\sum_{k=0}^{T} s_k^\top Qs_k + a_k^\top Ra_k \Big]\quad \text{s.t.}\quad s_{k+1} = A s_k+ Ba_k+w_k,~~a_k=\pi_k(s_k)$$

## LQR Example

$$s_{t+1} = \begin{bmatrix} 0.9 & 0.1\\ & 0.9 \end{bmatrix}s_t + \begin{bmatrix}0\\1\end{bmatrix}a_t + w_t$$

The state is position & velocity $$s=[\theta,\omega]$$, input is a force $$a\in\mathbb R$$.

Goal: stay near origin and be energy efficient

• $$c(s,a) = s^\top \begin{bmatrix} 10 & \\ & 0.1 \end{bmatrix}s + 5a^2$$

## LQR via DP

• $$k=T$$: $$\qquad\min_{a} s^\top Q s+a^\top Ra+0$$
• $$J_T^\star(s) = s^\top Q s$$ and $$\pi_T^\star(s) =0$$
• $$k=T-1$$: $$\quad \min_{a} s^\top Q s+a^\top Ra+\mathbb E_w[(As+Ba+w)^\top Q (As+Ba+w)]$$

DP: $$J_k^\star (s) = \min_{a\in\mathcal A} c(s, a)+\mathbb E_w[J_{k+1}^\star (F(s,a,w))]$$

• $$\mathbb E[(As+Ba+w)^\top Q (As+Ba+w)]$$
• $$=(As+Ba)^\top Q (As+Ba)+\mathbb E[ 2w^\top Q(As+Ba) + w^\top Q w]$$
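• $$=(As+Ba)^\top Q (As+Ba)+\sigma^2\mathrm{tr}( Q )$$, since $$\mathbb E[w]=0$$ and $$\mathbb E[w^\top Qw]=\mathrm{tr}(Q\,\mathbb E[ww^\top])$$ with $$\mathbb E[ww^\top]=\sigma^2 I$$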

## LQR via DP

• Substituting the expectation and expanding, the $$k=T-1$$ step becomes: $$\quad \min_{a} s^\top (Q+A^\top QA) s+a^\top (R+B^\top QB) a+2s^\top A^\top Q Ba+\sigma^2\mathrm{tr}( Q )$$

• $$\min_a a^\top M a + m^\top a + c$$
• $$2Ma_\star + m = 0 \implies a_\star = -\frac{1}{2}M^{-1} m$$
• $$\pi_{T-1}^\star(s)=-\frac{1}{2}(R+B^\top QB)^{-1}(2B^\top QAs)$$
• $$\mathbb E[(As+Ba+w)^\top Q (As+Ba+w)]=(As+Ba)^\top Q (As+Ba)+\sigma^2\mathrm{tr}( Q )$$
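As a worked step (a sketch using the symbols $$M$$, $$m$$, $$c$$ from the bullets above, and assuming $$M\succ 0$$), substituting the minimizer $$a_\star=-\frac{1}{2}M^{-1}m$$ back into the quadratic gives the minimum value:

$$\min_a\; a^\top M a + m^\top a + c \;=\; \tfrac{1}{4}m^\top M^{-1}m - \tfrac{1}{2}m^\top M^{-1}m + c \;=\; c - \tfrac{1}{4}m^\top M^{-1}m$$

With $$M = R+B^\top QB$$ and $$m = 2B^\top QAs$$:

$$\tfrac{1}{4}m^\top M^{-1}m = s^\top A^\top QB(R+B^\top QB)^{-1}B^\top QA\,s$$

so this term enters the resulting cost-to-go with a minus sign.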

## LQR via DP

• Simplifying the minimizer: $$\pi_{T-1}^\star(s)=-(R+B^\top QB)^{-1}B^\top QAs$$
• Substituting it back: $$J_{T-1}^\star(s) = s^\top (Q+A^\top QA - A^\top QB(R+B^\top QB)^{-1}B^\top QA) s +\sigma^2\mathrm{tr}( Q )$$

Claim:  For $$t=0,\dots T$$, the optimal cost-to-go function is quadratic and the optimal policy is linear

• $$J^\star_t (s) = s^\top P_t s + p_t$$ and $$\pi_t^\star(s) = K_t s$$
• Exercise: Using DP and induction, prove the claim for:
• $$P_t = Q+A^\top P_{t+1}A - A^\top P_{t+1}B(R+B^\top P_{t+1}B)^{-1}B^\top P_{t+1}A$$
• $$p_t = p_{t+1} + \sigma^2\mathrm{tr}(P_{t+1})$$
• $$K_t = -(R+B^\top P_{t+1}B)^{-1}B^\top P_{t+1}A$$
• Exercise: Derive expressions for optimal controllers when
1. Time varying cost: $$c_t(s,a) = s^\top Q_t s+a^\top R_t a$$
2. General noise covariance: $$\mathbb E[w_tw_t^\top] = \Sigma_t$$
3. Trajectory tracking: $$c_t(s,a) = \|s-\bar s_t\|_2^2 + \|a\|_2^2$$ for given $$\bar s_t$$
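The recursion in the claim can be run numerically. Below is a minimal sketch, assuming NumPy, the terminal condition $$P_{T+1}=0$$, $$p_{T+1}=0$$ implied by $$J^\star_{T+1}=0$$, and the system and cost of the LQR example; the function name `lqr_backward`, the horizon, and the initial state are illustrative choices. The update is written through the gain, $$P_t = Q + A^\top P_{t+1}A + A^\top P_{t+1}BK_t$$.

```python
import numpy as np

def lqr_backward(A, B, Q, R, T, sigma2=0.0):
    """Backward recursion for P_t, p_t, K_t, starting from P_{T+1} = 0 and p_{T+1} = 0."""
    n = A.shape[0]
    P, p = np.zeros((n, n)), 0.0
    Ks, Ps = [None] * (T + 1), [None] * (T + 1)
    for t in range(T, -1, -1):
        G = R + B.T @ P @ B
        K = -np.linalg.solve(G, B.T @ P @ A)   # K_t uses P_{t+1}
        p = p + sigma2 * np.trace(P)           # p_t = p_{t+1} + sigma^2 tr(P_{t+1})
        P = Q + A.T @ P @ A + A.T @ P @ B @ K  # P_t, with the gain substituted
        Ks[t], Ps[t] = K, P
    return Ks, Ps, p

A = np.array([[0.9, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
Q = np.diag([10.0, 0.1])
R = np.array([[5.0]])
T = 20
Ks, Ps, p0 = lqr_backward(A, B, Q, R, T)

# Noise-free rollout under a_t = K_t s_t; its cost should match s_0^T P_0 s_0.
s = np.array([1.0, 0.0])
predicted = s @ Ps[0] @ s
rollout_cost = 0.0
for t in range(T + 1):
    a = Ks[t] @ s
    rollout_cost += s @ Q @ s + a @ R @ a
    s = A @ s + B @ a
print(predicted, rollout_cost)
```

Comparing the predicted cost-to-go with the cost of an actual rollout is a quick sanity check on both the recursion and the sign conventions.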

## LQR Example

$$s_{t+1} = \begin{bmatrix} 0.9 & 0.1\\ & 0.9 \end{bmatrix}s_t + \begin{bmatrix}0\\1\end{bmatrix}a_t + w_t$$

The state is position & velocity $$s=[\theta,\omega]$$, input is a force $$a\in\mathbb R$$.

Goal: stay near origin and be energy efficient

• $$c(s,a) = s^\top \begin{bmatrix} 10 & \\ & 0.1 \end{bmatrix}s + 5a^2$$

$$\pi_\star(s) \approx -\begin{bmatrix} 7.0\times 10^{-2}& 3.7\times 10^{-2}\end{bmatrix} s$$

## Convexity of Open-Loop LQR

$$\min_{a_{0:T}} ~~\sum_{k=0}^{T} s_k^\top Qs_k + a_k^\top Ra_k \quad \text{s.t}\quad s_{k+1} = A s_k+ Ba_k$$

• Quadratic cost, linear constraints $$\implies$$ Quadratic Program
• Define $$\mathbf s = \begin{bmatrix}s_0 \\ \vdots \\ s_T\end{bmatrix},\quad \mathbf a = \begin{bmatrix}a_0 \\ \vdots \\ a_{T-1}\end{bmatrix}, \quad \bar A = \begin{bmatrix}A \\ &\ddots \\&& A\end{bmatrix}$$ and similarly for $$\bar B,\bar Q,\bar R$$.

In stacked form:

$$\min_{\mathbf a} ~~\mathbf s^\top \bar Q\mathbf s + \mathbf a^\top \bar R\mathbf a\quad \text{s.t.}\quad \mathbf s_{1:T} = \bar A \mathbf s_{0:T-1}+ \bar B\mathbf a$$

## Non-convexity of LQR

Optimizing directly over linear feedback gains, $$a_k = K_k s_k$$, gives

$$\min_{K_{0:T}} ~~\mathbb E_w\Big[\sum_{k=0}^{T} s_k^\top (Q + K_k^\top R K_k)s_k \Big]\quad \text{s.t.}\quad s_{k+1} = (A +BK_k)s_k+w_k$$

Exercise: Prove $$s_{t+1}=A_t s_t + w_t\implies$$ $$s_t = (A_{t-1}A_{t-2}\cdots A_0) s_0 + \sum_{k=0}^{t-2} (A_{t-1}\cdots A_{k+1}) w_{k}+w_{t-1}$$

Example: For a 1D system with $$A=B=1$$, $$\mathbb E[s_2] = (1+K_1)(1+K_0)s_0$$. The state depends on a product of the gains, so the cost is not convex in $$(K_0, K_1)$$.
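A quick numerical illustration of the 1D example: the sketch below uses the noise-free squared state $$s_2^2$$ as a stand-in cost (an assumption made for illustration, not the full LQR objective) and checks that the midpoint of two gain pairs has a higher cost than the average of the endpoints, which a convex function cannot do.

```python
def cost(K0, K1, s0=1.0):
    """Noise-free s_2^2 for the 1D system with A = B = 1 (illustrative stand-in cost)."""
    return ((1 + K1) * (1 + K0) * s0) ** 2

p = (-1.0, 1.0)   # (K0, K1): the first gain cancels the state
q = (1.0, -1.0)   # the second gain cancels the state
mid = tuple((a + b) / 2 for a, b in zip(p, q))

print(cost(*p), cost(*q), cost(*mid))
# Convexity would require cost(mid) <= (cost(p) + cost(q)) / 2.
print(cost(*mid) <= 0.5 * (cost(*p) + cost(*q)))
```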

## System-level Reparametrization

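Block diagram: the linear system $$(A,B)$$ with state $$s$$ and disturbance $$w_t$$ in feedback with the controller $$\mathbf{K}$$, which maps $$s_t$$ to $$a_t$$.

With dynamics and linear feedback

$$s_{t+1} = As_{t}+Ba_{t}+w_{t},\qquad a_t = K_t s_t$$

the closed-loop state and action are (products ordered with the latest index on the left, and an empty product equal to $$I$$)

$$s_{t+1} = \prod_{\ell=0}^{t}(A+BK_{\ell}) s_0 + \sum_{k=0}^{t} \prod_{\ell=k+1}^{t}(A+BK_\ell) w_{k}$$

$$a_t = K_t \prod_{\ell=0}^{t-1}(A+BK_{\ell}) s_0 + K_t \sum_{k=0}^{t-1} \prod_{\ell=k+1}^{t-1}(A+BK_\ell) w_{k}$$

Both are linear in $$(s_0, w_0, w_1, \dots)$$, so they can be written as

$$s_{t} = \Phi_s^{t,t} s_0 + \sum_{k=0}^{t-1} \Phi_s^{t, k}w_{t-1-k}$$

$$a_{t} = \Phi_a^{t,t} s_0 + \sum_{k=0}^{t-1} \Phi_a^{t, k}w_{t-1-k}$$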

Diagram: the closed loop of the system $$(A,B)$$ and the controller $$\mathbf{K}$$ viewed as a single system response $$\mathbf{\Phi}$$ from disturbances to states and actions. Stacking the closed-loop maps over the horizon:

$$\begin{bmatrix} s_{0}\\\vdots \\s_T\end{bmatrix} = \begin{bmatrix} \Phi_s^{0,0}\\ \Phi_s^{1, 1}& \Phi_s^{1,0}\\ \vdots & \ddots & \ddots \\ \Phi_s^{T,T} & \Phi_s^{T,T-1} & \dots & \Phi_s^{T,0} \end{bmatrix} \begin{bmatrix} s_0\\w_0\\ \vdots \\w_{T-1}\end{bmatrix}$$

$$\begin{bmatrix} a_{0}\\\vdots \\a_T\end{bmatrix} = \begin{bmatrix} \Phi_a^{0,0}\\ \Phi_a^{1, 1}& \Phi_a^{1,0}\\ \vdots & \ddots & \ddots \\ \Phi_a^{T,T} & \Phi_a^{T,T-1} & \dots & \Phi_a^{T,0} \end{bmatrix} \begin{bmatrix} s_0\\w_0\\ \vdots \\w_{T-1}\end{bmatrix}$$

$$\mathbf s = \mathbf \Phi_s \mathbf w$$

$$\mathbf a = \mathbf \Phi_a \mathbf w$$

Reparametrized objective:

$$\mathbf s^\top \bar Q\mathbf s + \mathbf a^\top \bar R\mathbf a = \mathbf w^\top (\mathbf \Phi_s^\top \bar Q \mathbf \Phi_s + \mathbf \Phi_a^\top \bar R \mathbf \Phi_a )\mathbf w$$

## System-level Reparametrization

$$\mathbf s_{1:T} = \bar A \mathbf s_{0:T-1}+ \bar B\mathbf a+\mathbf w_{0:T-1}$$

Including the initial state, with $$\mathcal Z$$ the block-downshift operator:

$$\begin{bmatrix}s_0 \\ \vdots \\ s_T\end{bmatrix}= \underbrace{\begin{bmatrix}0\\ A \\ &\ddots \\&& A\end{bmatrix} }_{\mathcal Z \bar A}\begin{bmatrix}s_0 \\ \vdots \\ s_T\end{bmatrix} + \underbrace{\begin{bmatrix}0\\ B \\ &\ddots \\&& B\end{bmatrix} }_{\mathcal Z \bar B} \begin{bmatrix}a_0 \\ \vdots \\ a_{T-1}\end{bmatrix} +\begin{bmatrix} s_0\\w_0\\ \vdots \\w_{T-1}\end{bmatrix}$$

Reparametrized constraints: substituting $$\mathbf s = \mathbf \Phi_s \mathbf w$$ and $$\mathbf a = \mathbf \Phi_a \mathbf w$$,

$$\iff\quad\mathbf \Phi_s\mathbf w = \mathcal Z \bar A \mathbf \Phi_s\mathbf w + \mathcal Z \bar B \mathbf \Phi_a\mathbf w + \mathbf w$$

## System-level Reparametrization

$$s_{t+1} = As_{t}+Ba_{t}+w_{t}$$

Reparametrized constraints (let $$w_{-1}=s_0$$, so that $$\Phi^{t,k}$$ multiplies $$w_{t-1-k}$$):

$$\sum_{k=0}^{t+1} \Phi_s^{t+1, k}w_{t-k} = A\sum_{k=0}^t \Phi_s^{t, k}w_{t-1-k} + B\sum_{k=0}^t \Phi_a^{t, k}w_{t-1-k} + w_{t}$$

Claim: The above equality holds for every disturbance sequence if

$$\Phi_s^{t,0}=I,\quad \Phi_s^{t+1, k+1} = A \Phi_s^{t, k}+B\Phi_a^{t, k} \quad \forall ~t,~k\leq t$$

References: System Level Synthesis by Anderson, Doyle, Low, Matni

## System Level Synthesis

Theorem: For a linear system in feedback with a linear controller over the horizon $$t=0,\dots, T$$:

1. The affine subspace $$\{(I - \mathcal Z \bar A )\mathbf \Phi_s- \mathcal Z \bar B \mathbf \Phi_a = I\}$$ parametrizes all possible system responses.
2. For any block-lower-triangular matrices $$(\mathbf \Phi_s,\mathbf \Phi_a)$$ in the affine subspace, there exists a linear feedback controller achieving this response.

Example: For a 1D system with $$A=B=1$$,

• $$s_1 = (1 + K_0) s_0 + w_0$$
• $$s_2 = (1+K_1)(1+K_0)s_0 + (1+K_1)w_0 + w_1$$

## 1D Example

• Suppose $$K_0 = K_1=-\frac{1}{2}$$. What are $$\mathbf \Phi_s$$ and $$\mathbf \Phi_a$$?
• $$\mathbf \Phi_s = \begin{bmatrix} 1\\ \frac{1}{2} & 1\\ \frac{1}{4} & \frac{1}{2} & 1\end{bmatrix}$$ and $$\mathbf \Phi_a = \begin{bmatrix} -\frac{1}{2} \\ -\frac{1}{4}& -\frac{1}{2}\end{bmatrix}$$
• Is there some $$K_0,K_1$$ such that $$\mathbf \Phi_s = \begin{bmatrix} \frac{1}{2}\\ \frac{1}{4} & \frac{1}{2}\\ \frac{1}{8} & \frac{1}{4} & \frac{1}{2}\end{bmatrix}$$?
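The system responses for the 1D example can be computed and checked against the affine constraint of the theorem. This is a sketch assuming NumPy; it adds a third gain $$K_2=-\frac{1}{2}$$ (an assumption, needed only so that both matrices are square) and uses the closed-loop relation $$\mathbf s = \mathcal Z(\bar A + \bar B \bar K)\mathbf s + \mathbf w$$ with $$\bar K$$ the diagonal matrix of gains.

```python
import numpy as np

A, B = 1.0, 1.0
K = np.array([-0.5, -0.5, -0.5])  # K_0, K_1 from the example; K_2 is an added assumption
N = len(K)

I = np.eye(N)
Z = np.eye(N, k=-1)               # downshift: (Z v)_t = v_{t-1}, (Z v)_0 = 0
K_blk = np.diag(K)

# s = Z (A + B K) s + w  =>  Phi_s = (I - Z (A I + B K_blk))^{-1},  Phi_a = K_blk Phi_s
Phi_s = np.linalg.inv(I - Z @ (A * I + B * K_blk))
Phi_a = K_blk @ Phi_s

# Affine constraint: (I - Z A) Phi_s - Z B Phi_a = I
residual = (I - A * Z) @ Phi_s - B * Z @ Phi_a - I

print(Phi_s)
print(Phi_a)
print(np.abs(residual).max())
```

The diagonal of `Phi_s` is forced to be the identity by the constraint, which is a useful observation for the last question above.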

## System Level LQR

$$\min_{\mathbf \Phi} ~~\mathbb E_w\Big[ \mathbf w^\top(\mathbf \Phi_s^\top \bar Q \mathbf \Phi_s+ \mathbf \Phi_a^\top \bar R \mathbf \Phi_a )\mathbf w \Big]\quad \text{s.t}\quad (I - \mathcal Z \bar A )\mathbf \Phi_s- \mathcal Z \bar B \mathbf \Phi_a = I$$

• $$\mathbb E_w\Big[ \mathbf w^\top(\mathbf \Phi_s^\top \bar Q \mathbf \Phi_s+ \mathbf \Phi_a^\top \bar R \mathbf \Phi_a )\mathbf w \Big]$$
• $$=\mathbb E_w\Big[ \mathrm{tr}((\mathbf \Phi_s^\top \bar Q \mathbf \Phi_s+ \mathbf \Phi_a^\top \bar R \mathbf \Phi_a )\mathbf w\mathbf w^\top) \Big]$$
• $$=\sigma^2\mathrm{tr}(\mathbf \Phi_s^\top \bar Q \mathbf \Phi_s+ \mathbf \Phi_a^\top \bar R \mathbf \Phi_a )$$
• $$=\sigma^2\left\|\begin{bmatrix}\bar Q^{1/2} \\&\bar R^{1/2}\end{bmatrix} \begin{bmatrix} \mathbf \Phi_s\\ \mathbf \Phi_a \end{bmatrix}\right\|_F^2$$
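The last equality, from the trace to the Frobenius norm, can be verified numerically. In the sketch below (assuming NumPy) the matrices standing in for $$\mathbf\Phi_s$$ and $$\mathbf\Phi_a$$ are arbitrary random placeholders rather than feasible system responses, since the identity holds for any matrices of compatible size; `sym_sqrt` is an illustrative helper.

```python
import numpy as np

def sym_sqrt(M):
    """Symmetric square root of a positive semidefinite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(M)
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.T

rng = np.random.default_rng(0)
Q_bar = np.diag([10.0, 0.1, 10.0, 0.1])  # two stacked copies of an example Q
R_bar = np.diag([5.0, 5.0])              # two stacked copies of an example R
Phi_s = rng.standard_normal((4, 4))      # placeholder, not a feasible response
Phi_a = rng.standard_normal((2, 4))      # placeholder, not a feasible response

trace_form = np.trace(Phi_s.T @ Q_bar @ Phi_s + Phi_a.T @ R_bar @ Phi_a)
stacked = np.vstack([sym_sqrt(Q_bar) @ Phi_s, sym_sqrt(R_bar) @ Phi_a])
frob_form = np.linalg.norm(stacked, "fro") ** 2

print(trace_form, frob_form)
```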

## A System Level Perspective

In closed loop, state and input are linear functions of disturbance (in this section $$x_t$$ and $$u_t$$ denote the state and input, written $$s_t$$ and $$a_t$$ elsewhere)

$$x_t = \sum_{k=0}^t A^{k}(Bu_{t-k} + w_{t-k})$$

$$u_t = \sum_{k=0}^t K_kx_{t-k}$$

$$\begin{bmatrix} x_t\\u_t \end{bmatrix} = \sum_{k=0}^t \begin{bmatrix} \Phi_x(k)\\ \Phi_u(k) \end{bmatrix} w_{t-k}$$

Instead of reasoning about a controller $$\mathbf{K}$$, we reason about the interconnection $$\mathbf\Phi$$ directly.

Diagram (captioned "system looks like a line"): the feedback loop of the system $$(A,B)$$ and the controller $$\mathbf{K}$$, with signals $$\mathbf x$$, $$\mathbf u$$, $$\mathbf w$$, redrawn as a single system response $$\mathbf{\Phi}$$ mapping $$\mathbf w$$ to $$(\mathbf x, \mathbf u)$$.

Controller parametrization:

$$\underset{\mathbf u }{\min}~~\displaystyle\lim_{T\to\infty}\mathbb{E}\left[ \frac{1}{T}\sum_{t=0}^T x_t^\top Q x_t + u_t^\top R u_t\right]$$

$$\text{s.t.}~~x_{t+1} = Ax_t + Bu_t + w_t,\quad u_t = \sum_{k=0}^t{\color{Goldenrod} K_k} x_{t-k}$$

System response parametrization:

$$\underset{\color{teal}\mathbf{\Phi}}{\min}~~\left\| \begin{bmatrix} Q^{1/2} &\\& R^{1/2}\end{bmatrix} \begin{bmatrix}\color{teal} \mathbf{\Phi}_x \\ \color{teal} \mathbf{\Phi}_u \end{bmatrix} \right\|_{\mathcal{H}_2}^2$$

$$\text{s.t.}~~ {\color{teal}\mathbf\Phi }\in\mathrm{Affine}(A, B),\quad \begin{bmatrix} \mathbf x\\ \mathbf u\end{bmatrix} = \begin{bmatrix} \mathbf \Phi_x\\ \mathbf \Phi_u\end{bmatrix}\mathbf w$$

To implement resulting controller:

• $$w_{-1}=s_0$$ and $$a_0=\Phi_a^{0, 0}w_{-1}$$
• for $$t=1, \dots, T$$
• $$w_{t-1} = s_{t}-As_{t-1}-Ba_{t-1}$$
• $$a_{t} = \sum_{k=0}^{t} \Phi_a^{t, k}w_{t-k-1}$$
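The implementation steps above can be sketched for a scalar system. The code below assumes NumPy and the 1D example with $$A=B=1$$ and all gains equal to $$-\frac{1}{2}$$ (the third gain is an added assumption); the function name `sls_rollout`, the initial state, and the disturbance values are illustrative. Column $$j$$ of the matrix `Phi_a` multiplies $$w_{j-1}$$, which corresponds to $$\Phi_a^{t,k}$$ with $$k=t-j$$.

```python
import numpy as np

def sls_rollout(A, B, Phi_a, s0, w):
    """Run the controller a_t = sum_j Phi_a[t, j] * w_hat[j], reconstructing w from observed states."""
    T = len(w)
    w_hat = [s0]                       # w_{-1} = s_0
    s, states, actions = s0, [s0], []
    for t in range(T + 1):
        a = sum(Phi_a[t, j] * w_hat[j] for j in range(t + 1))
        actions.append(a)
        if t == T:
            break
        s_next = A * s + B * a + w[t]           # the environment applies the true disturbance
        w_hat.append(s_next - A * s - B * a)    # controller recovers w_t from s_{t+1}
        s = s_next
        states.append(s)
    return np.array(states), np.array(actions)

A, B = 1.0, 1.0
K = np.array([-0.5, -0.5, -0.5])       # third gain is an assumption
N = len(K)
Z = np.eye(N, k=-1)
Phi_s = np.linalg.inv(np.eye(N) - Z @ (A * np.eye(N) + B * np.diag(K)))
Phi_a = np.diag(K) @ Phi_s

states, actions = sls_rollout(A, B, Phi_a, s0=1.0, w=np.array([0.3, -0.2]))
print(states)
print(actions)
print(K * states)  # state feedback a_t = K_t s_t, for comparison
```

In this sketch the disturbance-based controller and the state-feedback controller produce the same actions, as expected when $$\mathbf\Phi_a$$ is built from the gains.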

## System Level LQR

$$\min_{\mathbf \Phi} ~~\left\|\begin{bmatrix}\bar Q^{1/2} \\&\bar R^{1/2}\end{bmatrix} \begin{bmatrix} \mathbf \Phi_s\\ \mathbf \Phi_a \end{bmatrix}\right\|_F^2\quad \text{s.t}\quad \begin{bmatrix} I - \mathcal Z \bar A & - \mathcal Z \bar B\end{bmatrix} \begin{bmatrix} \mathbf \Phi_s\\ \mathbf \Phi_a \end{bmatrix}= I$$

## Recap: System Level LQR

Controller parametrization:

$$\underset{\mathbf a }{\min}~~\displaystyle\mathbb{E}\left[\sum_{t=0}^T s_t^\top Q s_t + a_t^\top R a_t\right]$$

$$\text{s.t.}~~s_{t+1} = As_t + Ba_t + w_t,\quad a_t = {\color{Goldenrod} K_t }s_{t}$$

System response parametrization:

$$\underset{\color{teal}\mathbf{\Phi}}{\min}~~\left\| \begin{bmatrix}\bar Q^{1/2} &\\& \bar R^{1/2}\end{bmatrix} \begin{bmatrix}\color{teal} \mathbf{\Phi}_s \\ \color{teal} \mathbf{\Phi}_a \end{bmatrix} \right\|_{F}^2$$

$$\text{s.t.}~~ \begin{bmatrix} I - \mathcal Z \bar A & - \mathcal Z \bar B\end{bmatrix} \begin{bmatrix}\color{teal} \mathbf{\Phi}_s \\ \color{teal} \mathbf{\Phi}_a \end{bmatrix}= I,\quad \begin{bmatrix} \mathbf s\\ \mathbf a\end{bmatrix} = \begin{bmatrix} \mathbf \Phi_s\\ \mathbf \Phi_a\end{bmatrix}\mathbf w$$

Diagram (captioned "system looks like a line"): the feedback loop of the system $$(A,B)$$ and the controller $$\mathbf{K}$$, with signals $$w_t$$, $$a_t$$, $$s_t$$, redrawn as a single system response $$\mathbf{\Phi}$$.

Infinite Horizon LQR Problem

$$\min_{\pi_{0:T}} ~~\lim_{T\to\infty}\mathbb E_w\Big[\frac{1}{T}\sum_{k=0}^{T} s_k^\top Qs_k + a_k^\top Ra_k \Big]\quad \text{s.t}\quad s_{k+1} = A s_k+ Ba_k+w_k$$

Claim:  The optimal cost-to-go function is quadratic and the optimal policy is linear $$J^\star (s) = s^\top P s,\qquad \pi^\star(s) = K s$$

• $$P = Q+A^\top PA - A^\top PB(R+B^\top PB)^{-1}B^\top PA$$
• Discrete Algebraic Riccati Equation: $$P=\mathrm{DARE}(A,B,Q,R)$$
• $$K = -(R+B^\top PB)^{-1}B^\top PA$$
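One simple way to approximate the solution of the Riccati equation is to iterate the finite-horizon recursion until it stops changing. The sketch below assumes NumPy and the system and cost of the LQR example; `dare_by_iteration`, the iteration cap, and the tolerance are illustrative choices, and fixed-point iteration is a swapped-in technique rather than necessarily how $$\mathrm{DARE}$$ is solved in practice. The gain is computed as $$K=-(R+B^\top PB)^{-1}B^\top PA$$, the stationary analogue of $$K_t$$.

```python
import numpy as np

def dare_by_iteration(A, B, Q, R, iters=10_000, tol=1e-10):
    """Iterate P <- Q + A'PA - A'PB (R + B'PB)^{-1} B'PA until the update is below tol."""
    P = Q.copy()
    for _ in range(iters):
        G = R + B.T @ P @ B
        P_next = Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(G, B.T @ P @ A)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

A = np.array([[0.9, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
Q = np.diag([10.0, 0.1])
R = np.array([[5.0]])

P, K = dare_by_iteration(A, B, Q, R)
dare_residual = Q + A.T @ P @ A + A.T @ P @ B @ K - P  # zero at a fixed point
spectral_radius = np.max(np.abs(np.linalg.eigvals(A + B @ K)))
print(K, spectral_radius)
```

A spectral radius below one for $$A+BK$$ indicates that the steady-state controller stabilizes the closed loop.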

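Block diagram: the linear system $$(A,B)$$ with state $$s$$ and disturbance $$w_t$$ in feedback with the static controller $$\mathbf{K}$$, which maps $$s_t$$ to $$a_t$$.

With dynamics and static feedback

$$s_{t+1} = As_{t}+Ba_{t}+w_{t},\qquad a_t = K s_t$$

the closed-loop state and action are

$$s_{t+1} = (A+BK)^{t+1} s_0 + \sum_{k=0}^{t} (A+BK)^{t-k} w_{k}$$

$$a_{t+1} = K(A+BK)^{t+1} s_0 + \sum_{k=0}^{t} K(A+BK)^{t-k} w_{k}$$

so that, with $$\Phi_s^{k} = (A+BK)^k$$ and $$\Phi_a^{k} = K(A+BK)^k$$,

$$s_{t} = \Phi_s^{t} s_0 + \sum_{k=0}^{t-1} \Phi_s^{k}w_{t-1-k}$$

$$a_{t} = \Phi_a^{t} s_0 + \sum_{k=0}^{t-1} \Phi_a^{k}w_{t-1-k}$$

The system response $$\mathbf{\Phi}$$ is now time-invariant. Stacked over the horizon: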

$$\begin{bmatrix} s_{0}\\\vdots \\s_T\end{bmatrix} = \begin{bmatrix} \Phi_s^{0}\\ \Phi_s^{ 1}& \Phi_s^{0}\\ \vdots & \ddots & \ddots \\ \Phi_s^{T} & \Phi_s^{T-1} & \dots & \Phi_s^{0} \end{bmatrix} \begin{bmatrix} s_0\\w_0\\ \vdots \\w_{T-1}\end{bmatrix}$$

$$\begin{bmatrix} a_{0}\\\vdots \\a_T\end{bmatrix} = \begin{bmatrix} \Phi_a^{0}\\ \Phi_a^{1}& \Phi_a^{0}\\ \vdots & \ddots & \ddots \\ \Phi_a^{T} & \Phi_a^{T-1} & \dots & \Phi_a^{0} \end{bmatrix} \begin{bmatrix} s_0\\w_0\\ \vdots \\w_{T-1}\end{bmatrix}$$

## Sequences & Operators

• Cost depends on the (semi-)infinite sequence $$\mathbf s = (s_0, s_1, s_2,\dots)$$
• Generated by convolution between disturbance sequence $$\mathbf w = (w_{-1}, w_0, w_1,\dots)$$ and (semi-)infinite operator $$\mathbf \Phi_s = (\Phi_s^0, \Phi_s^1,\dots)$$
• We represent this convolution with the notation $$\mathbf s = \mathbf \Phi_s\mathbf w$$
• Concretely,
• semi-infinite vectors and Toeplitz matrices: $$\begin{bmatrix} s_{0}\\\vdots \\s_t\\\vdots \end{bmatrix} = \begin{bmatrix} \Phi_s^{0}\\ \Phi_s^{ 1}& \Phi_s^{0}\\ \vdots & \ddots & \ddots \\ \Phi_s^{t} & \Phi_s^{t-1} & \dots & \Phi_s^{0} \\ \vdots & & \ddots &&\ddots \end{bmatrix} \begin{bmatrix} s_0\\w_0\\ \vdots \\w_{t-1} \\\vdots \end{bmatrix}$$
• frequency domain:
• define time shift operator $$z$$ such that $$z(s_0, s_1,s_2 \dots) = (s_1, s_2,\dots)$$
• represent $$\mathbf s(z) = \sum_{t=0}^\infty z^{-t}s_t$$, $$\mathbf \Phi_s(z) = \sum_{t=0}^\infty z^{-t}\Phi_s^t$$, and $$\mathbf w(z) = \sum_{t=0}^\infty z^{-t}w_{t-1}$$ (recall $$w_{-1}=s_0$$)
• multiplication of power series: $$\mathbf \Phi_s(z) \mathbf w(z) = (\sum_{t=0}^\infty z^{-t}\Phi_s^t)(\sum_{t=0}^\infty z^{-t}w_{t-1}) = \sum_{t=0}^\infty z^{-t} \sum_{k=0}^t \Phi_s^k w_{t-1-k} = \mathbf s(z)$$
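The convolution view can be checked on a scalar system. This sketch assumes NumPy, a 1D system with $$A=B=1$$ and static gain $$K=-\frac{1}{2}$$ (so $$\Phi_s^k = (A+BK)^k$$), and a randomly drawn disturbance; the horizon and seed are arbitrary. It compares the truncated convolution of $$\mathbf\Phi_s$$ with $$(w_{-1}, w_0, \dots)$$ against a direct simulation.

```python
import numpy as np

A, B, K = 1.0, 1.0, -0.5
T = 6
rng = np.random.default_rng(0)
s0 = 1.0
w = rng.standard_normal(T)

w_hat = np.concatenate([[s0], w])         # (w_{-1}, w_0, ..., w_{T-1}) with w_{-1} = s_0
phi_s = (A + B * K) ** np.arange(T + 1)   # impulse response (Phi_s^0, ..., Phi_s^T)

# Convolution: s_t = sum_k Phi_s^k * w_hat[t - k], truncated to the first T + 1 entries
s_conv = np.convolve(phi_s, w_hat)[: T + 1]

# Direct simulation of s_{t+1} = (A + B K) s_t + w_t
s_sim = [s0]
for t in range(T):
    s_sim.append((A + B * K) * s_sim[-1] + w[t])
s_sim = np.array(s_sim)

print(np.max(np.abs(s_conv - s_sim)))
```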

Controller parametrization:

$$\underset{\mathbf a }{\min}~~\displaystyle\lim_{T\to\infty}\mathbb{E}\left[\frac{1}{T}\sum_{t=0}^T s_t^\top Q s_t + a_t^\top R a_t\right]$$

$$\text{s.t.}~~s_{t+1} = As_t + Ba_t + w_t,\quad a_t = {\color{Goldenrod} K}s_{t}$$

System response parametrization:

$$\underset{\color{teal}\mathbf{\Phi}}{\min}~~\left\| \begin{bmatrix}Q^{1/2} &\\& R^{1/2}\end{bmatrix} \begin{bmatrix}\color{teal} \mathbf{\Phi}_s \\ \color{teal} \mathbf{\Phi}_a \end{bmatrix} \right\|_{\mathcal H_2}^2$$

$$\text{s.t.}~~ \begin{bmatrix} zI - A & - B\end{bmatrix} \begin{bmatrix}\color{teal} \mathbf{\Phi}_s \\ \color{teal} \mathbf{\Phi}_a \end{bmatrix}= I,\quad \begin{bmatrix} \mathbf s\\ \mathbf a\end{bmatrix} = \begin{bmatrix} \mathbf \Phi_s\\ \mathbf \Phi_a\end{bmatrix}\mathbf w$$

## Infinite Horizon LQR

Exercise: Using the frequency domain notation, derive the expression for the SLS cost and constraints, where we define the $$\mathcal H_2$$ norm:

$$\|\mathbf \Phi\|_{\mathcal H_2}^2 = \sum_{t=0}^\infty \|\Phi^t\|_F^2$$

## Recap

• Finite-horizon LQR via dynamic programming
• $$\pi_t^\star(s) = K_t s$$
• System response parametrization
• $$\mathbf s = \mathbf \Phi_s \mathbf w, \quad\mathbf a = \mathbf \Phi_a \mathbf w$$
• Steady-state controllers and infinite horizons
• $$\pi^\star(s) = Ks$$

References: System Level Synthesis by Anderson, Doyle, Low, Matni and Ch 2 in Machine Learning in Feedback Systems by Sarah Dean
