CS 4/5789: Introduction to Reinforcement Learning

Lecture 10

Prof. Sarah Dean

MW 2:45-4pm
110 Hollister Hall



0. Announcements & Recap

1. MBRL with Query Model

2. Tabular Sample Complexity

3. LQR Sample Complexity



No lecture on Monday (Feb Break)


HW1 due 3/7


5789 Paper Review Assignment (weekly pace suggested)


Office hours after lecture M (110 Hollister) and W (416A Gates)


Consider features and labels \((x,y)\sim \mathcal D\) with \(y=f_\star(x) + w\)

  • Tabular function estimation
    \(\forall~~x,~~|\widehat f (x) - f_\star(x)| \leq \epsilon,\qquad N \gtrsim \frac{|\mathcal X|}{\epsilon^2}\)
  • Parameter estimation \(f_\star = f_{\theta_\star}\), \(\widehat f = f_{\widehat \theta}\)
    \(\|\widehat \theta - \theta_\star\| \leq \epsilon,\qquad N \gtrsim \frac{d}{\epsilon^2}\)
  • Prediction error analysis
    \(\mathbb E[\ell(\widehat f(x), f_\star(x))] \leq \epsilon,\qquad N \gtrsim \frac{1}{\epsilon^2}\)
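As a minimal sketch of the tabular case above, the following Python snippet estimates \(f_\star\) on a finite input set by averaging the noisy labels observed for each input. The uniform distribution over inputs, the Gaussian noise, the table size of 10, and the names `estimate_tabular` and `max_error` are assumptions made for illustration; they do not come from the lecture.

```python
import random


def estimate_tabular(samples, num_inputs):
    """Average the observed labels y for each discrete input x."""
    sums = [0.0] * num_inputs
    counts = [0] * num_inputs
    for x, y in samples:
        sums[x] += y
        counts[x] += 1
    # Inputs that were never observed default to 0.0 (arbitrary choice for this sketch).
    return [s / c if c > 0 else 0.0 for s, c in zip(sums, counts)]


def max_error(num_samples, f_star, noise_std, rng):
    """Draw (x, y) pairs with y = f_star(x) + w and return max_x |f_hat(x) - f_star(x)|."""
    samples = []
    for _ in range(num_samples):
        x = rng.randrange(len(f_star))              # assumed: x uniform over the input set
        y = f_star[x] + rng.gauss(0.0, noise_std)   # assumed: Gaussian noise w
        samples.append((x, y))
    f_hat = estimate_tabular(samples, len(f_star))
    return max(abs(a - b) for a, b in zip(f_hat, f_star))


rng = random.Random(0)
f_star = [rng.uniform(-1.0, 1.0) for _ in range(10)]   # hypothetical true function, |X| = 10
err_small = max_error(100, f_star, 1.0, rng)
err_large = max_error(100_000, f_star, 1.0, rng)
print(f"N=100: {err_small:.3f}   N=100000: {err_large:.3f}")
```

With about \(N/|\mathcal X|\) samples per input, each average has standard deviation of roughly \(\sigma\sqrt{|\mathcal X|/N}\), where \(\sigma\) is the noise standard deviation; this is consistent with the \(N \gtrsim |\mathcal X|/\epsilon^2\) scaling.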

Infinite-horizon tabular MDP \(\mathcal M = \{\mathcal S, \mathcal A, P, r, \gamma\}\)

Finite-horizon continuous MDP \(\mathcal M = \{\mathbb R^{n_s},\mathbb R^{n_a}, f, c, H, \mu_0\}\)
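The two MDP tuples can be sketched as simple Python data structures. This is only an illustration under stated assumptions: the class names, the nested-list layout of \(P\) and \(r\), deterministic dynamics \(s_{t+1} = f(s_t, a_t)\), scalar states and actions, and a total cost summed over \(t = 0, \dots, H-1\) are choices made here, not definitions taken from the lecture.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TabularMDP:
    P: List[List[List[float]]]  # P[s][a][s2]: probability of reaching s2 from (s, a)
    r: List[List[float]]        # r[s][a]: reward for taking action a in state s
    gamma: float                # discount factor

    def sample_next_state(self, s, a, rng):
        """Draw s' ~ P(. | s, a)."""
        next_states = list(range(len(self.P[s][a])))
        return rng.choices(next_states, weights=self.P[s][a])[0]


@dataclass
class ContinuousMDP:
    f: Callable[[float, float], float]         # assumed deterministic dynamics s' = f(s, a)
    c: Callable[[float, float], float]         # per-step cost c(s, a)
    H: int                                     # horizon
    sample_s0: Callable[[random.Random], float]  # draws s_0 ~ mu_0

    def rollout_cost(self, policy, rng):
        """Sum the costs incurred by `policy` over H steps."""
        s = self.sample_s0(rng)
        total = 0.0
        for _ in range(self.H):
            a = policy(s)
            total += self.c(s, a)
            s = self.f(s, a)
        return total


rng = random.Random(0)

# Hypothetical two-state, one-action tabular MDP: state 0 always moves to state 1.
tabular = TabularMDP(P=[[[0.0, 1.0]], [[0.0, 1.0]]], r=[[0.0], [1.0]], gamma=0.9)
next_state = tabular.sample_next_state(0, 0, rng)

# Hypothetical scalar system with quadratic cost, started at s_0 = 1.
continuous = ContinuousMDP(
    f=lambda s, a: s + a,
    c=lambda s, a: s**2 + a**2,
    H=3,
    sample_s0=lambda rng: 1.0,
)
total_cost = continuous.rollout_cost(lambda s: -s, rng)
print(next_state, total_cost)
```

Here the policy \(a = -s\) drives the scalar state to zero after one step, so only the first step contributes cost.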
