CS 4/5789: Introduction to Reinforcement Learning
Lecture 10
Prof. Sarah Dean
MW 2:45-4pm
110 Hollister Hall
Agenda
0. Announcements & Recap
1. MBRL with Query Model
2. Tabular Sample Complexity
3. LQR Sample Complexity
Announcements
No lecture on Monday (Feb Break)
HW1 due 3/7
5789 Paper Review Assignment (weekly pace suggested)
Office hours after lecture M (110 Hollister) and W (416A Gates)
Recap
Consider features and labels \((x,y)\sim \mathcal D\) with \(y=f_\star(x) + w\)
- Tabular function estimation
\(\forall~~x,~~|\widehat f (x) - f_\star(x)| \leq \epsilon,\qquad N \gtrsim \frac{|\mathcal X|}{\epsilon^2}\)
- Parameter estimation \(f_\star = f_{\theta_\star}\), \(\widehat f = f_{\widehat \theta}\)
\(\|\widehat \theta - \theta_\star\| \leq \epsilon,\qquad N \gtrsim \frac{d}{\epsilon^2}\)
- Prediction error analysis
\(\mathbb E[\ell(\widehat f(x), f_\star(x))] \leq \epsilon,\qquad N \gtrsim \frac{1}{\epsilon^2}\)
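A minimal simulation sketch of the tabular case, to make the \(N \gtrsim |\mathcal X|/\epsilon^2\) scaling concrete. It is illustrative only and not from the lecture: the function names, the particular \(f_\star\), the uniform sampling of \(x\), and the Gaussian noise \(w\) are all assumptions. The estimator averages the labels seen at each \(x\), so the worst-case error should shrink roughly like \(\sqrt{|\mathcal X|/N}\).

```python
import numpy as np


def tabular_estimate(xs, ys, num_x):
    """Estimate f(x) for each x in {0, ..., num_x - 1} by averaging its labels."""
    sums = np.bincount(xs, weights=ys, minlength=num_x)
    counts = np.bincount(xs, minlength=num_x)
    # Any x that was never sampled gets an estimate of 0 (an arbitrary choice).
    return np.divide(sums, counts, out=np.zeros(num_x), where=counts > 0)


def max_error(n_samples, num_x=10, noise_std=1.0, seed=0):
    """Worst-case error max_x |f_hat(x) - f_star(x)| from n_samples noisy labels."""
    rng = np.random.default_rng(seed)
    # Hypothetical ground truth f_star, chosen only for illustration.
    f_star = np.linspace(-1.0, 1.0, num_x)
    # Assumption: x is drawn uniformly, and w is zero-mean Gaussian noise.
    xs = rng.integers(0, num_x, size=n_samples)
    ys = f_star[xs] + noise_std * rng.normal(size=n_samples)
    f_hat = tabular_estimate(xs, ys, num_x)
    return float(np.max(np.abs(f_hat - f_star)))


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, max_error(n))
```

Under these assumptions, increasing \(N\) by a factor of 100 should reduce the printed error by roughly a factor of 10, consistent with the \(1/\epsilon^2\) dependence.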

Infinite horizon Tabular MDP \(\mathcal M = \{\mathcal S, \mathcal A, P, r, \gamma\}\)
Finite horizon continuous MDP \(\mathcal M = \{\mathbb R^{n_s},\mathbb R^{n_a}, f, c, H, \mu_0\}\)

