Fall 2025, Prof Sarah Dean
"What we do"
"What we do" (simplified)
"Why we do it"
Suppose \(\mathbb E[y_t|x_t] = \Theta_\star^\top x_t\) and \(y_t\) has bounded variance. Define \(V=\sum_{k=1}^N x_kx_k^\top \). Then with high probability,
$$\|\Theta_\star-\hat\Theta\|_{V}^2 = \operatorname{tr}\big((\Theta_\star-\hat\Theta)^\top V(\Theta_\star-\hat\Theta)\big)\leq \beta $$
Exploration ensures persistence of excitation: \(V=\sum_{k=1}^N x_kx_k^\top \succeq (N/H)\mu I\), i.e. \(\lambda_{\min}(V)\geq N\mu/H\). Then by the definition of the minimum eigenvalue, $$\|\Theta_\star-\hat\Theta\|_F\leq\sqrt{ \frac{\beta}{\lambda_{\min}(V)}} \leq \sqrt{ \frac{\beta H}{ \mu }} \frac{1}{\sqrt{N}}$$
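The \(1/\sqrt N\) rate above can be checked numerically. A minimal sketch, assuming Gaussian covariates, a hypothetical true parameter \(\Theta_\star\), and noise level \(0.1\) (all illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
Theta_star = np.array([[1.0], [-0.5], [0.25]])  # hypothetical true parameter

def ls_error(N):
    """Frobenius error of the least-squares estimate from N samples."""
    X = rng.normal(size=(N, d))                         # covariates x_k
    y = X @ Theta_star + 0.1 * rng.normal(size=(N, 1))  # bounded-variance noise
    V = X.T @ X                                         # V = sum_k x_k x_k^T
    Theta_hat = np.linalg.solve(V, X.T @ y)             # least-squares estimate
    return np.linalg.norm(Theta_star - Theta_hat)

# averaged over trials, the error shrinks roughly like 1/sqrt(N)
errs = [np.mean([ls_error(N) for _ in range(20)]) for N in (100, 10000)]
```

Growing \(N\) by a factor of 100 should shrink the error by roughly a factor of 10, matching the \(\sqrt{\beta H/\mu}\,/\sqrt{N}\) bound.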
The (steady-state) Kalman filter (where \(\tilde F=F-LHF\) and \(\tilde G=G-LHG\)) $$ \hat s_{t+1} = \tilde F\hat s_t + \tilde G a_t + Ly_{t+1},\quad \hat y_t = H\hat s_t$$
"Unrolling" shows linear model $$\hat s_{t} = \tilde F^{L}\hat s_{t-L}+ \sum_{k=1}^{L} \tilde F^{k-1} (Ly_{t-k+1}+ \tilde Ga_{t-k}), \quad \hat y_{t+1} = H(F\hat s_t + Ga_t)$$
For truncated history, \(\mathbb E[\hat s_{t-L}\mid x_t] = 0\) (due to the zero-mean initial state and exploration actions), so the \(\tilde F^{L}\hat s_{t-L}\) term vanishes in expectation
"Unrolling" (steady-state) Kalman filter $$\mathbb E[y_{t+1}|x_t] = H F \Big(\sum_{k=1}^{L} \tilde F^{k-1} (Ly_{t-k+1}+ \tilde Ga_{t-k})\Big) + H Ga_t$$
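As a sanity check on the unrolled form, a small numerical sketch can compare the recursive filter to the unrolled sum. The matrices below (and the gain `Lg`, standing in for the Kalman gain \(L\)) are illustrative assumptions, not an optimized filter:

```python
import numpy as np

rng = np.random.default_rng(1)
# hypothetical system matrices; Lg stands in for the Kalman gain L
F = np.array([[0.9, 0.1], [0.0, 0.8]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])
Lg = np.array([[0.5], [0.2]])

Ft = F - Lg @ H @ F   # tilde F = F - L H F
Gt = G - Lg @ H @ G   # tilde G = G - L H G

T = 30
a = rng.normal(size=(T + 1, 1, 1))   # actions a_0, ..., a_T
y = rng.normal(size=(T + 1, 1, 1))   # observations y_0, ..., y_T

# recursive filter: hat s_{t+1} = Ft hat s_t + Gt a_t + L y_{t+1}, with hat s_0 = 0
s = np.zeros((2, 1))
for t in range(T):
    s = Ft @ s + Gt @ a[t] + Lg @ y[t + 1]

# unrolled over the full history, so the dropped Ft^T hat s_0 term is exactly zero
s_unrolled = sum(
    np.linalg.matrix_power(Ft, k - 1) @ (Lg @ y[T - k + 1] + Gt @ a[T - k])
    for k in range(1, T + 1)
)

pred = H @ (F @ s + G @ a[T])  # hat y_{T+1} = H (F hat s_T + G a_T)
```

The recursive state `s` and the unrolled sum `s_unrolled` agree exactly here; with a truncated window \(L < T\) they would differ by the \(\tilde F^{L}\hat s_{t-L}\) term, which is zero in expectation.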
Thus \(\Theta_\star\) depends on \(F,G,H\) and the Kalman gain \(L\)
Certainty Equivalence is Efficient for Linear Quadratic Control. Mania, Tu, Recht, 2019.
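The certainty-equivalence recipe (solve the LQR problem as if the estimated model were exact) can be sketched as follows. The system matrices, noise level, and Riccati value iteration here are illustrative assumptions, not the paper's setup:

```python
import numpy as np

# hypothetical true system (a discretized double integrator) and quadratic costs
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)

rng = np.random.default_rng(2)
A_hat = A + 0.01 * rng.normal(size=A.shape)  # estimates with small error
B_hat = B + 0.01 * rng.normal(size=B.shape)

def lqr_gain(A, B, iters=500):
    """LQR gain via Riccati value iteration, for the policy u_t = -K x_t."""
    P = Q.copy()
    for _ in range(iters):
        BtPA = B.T @ P @ A
        P = A.T @ P @ A - BtPA.T @ np.linalg.solve(R + B.T @ P @ B, BtPA) + Q
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

K_star = lqr_gain(A, B)        # gain from the true model
K_ce = lqr_gain(A_hat, B_hat)  # certainty-equivalent gain from the estimates
gap = np.linalg.norm(K_star - K_ce)
stable = max(abs(np.linalg.eigvals(A - B @ K_ce))) < 1  # true closed loop
```

When the estimation error is small, the certainty-equivalent gain stays close to the optimal one and still stabilizes the true system, which is the regime the paper's efficiency result concerns.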
Next time: Policy optimization