## CS 4/5789: Introduction to Reinforcement Learning

### Lecture 10

Prof. Sarah Dean

MW 2:45-4pm
110 Hollister Hall

## Agenda

0. Announcements & Recap

1. MBRL with Query Model

2. Tabular Sample Complexity

3. LQR Sample Complexity

## Announcements

No lecture on Monday (Feb Break)

HW1 due 3/7

5789 Paper Review Assignment (weekly pace suggested)

Office hours after lecture M (110 Hollister) and W (416A Gates)

## Recap

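Consider features and labels $$(x,y)\sim \mathcal D$$ with $$y=f_\star(x) + w$$, where $$w$$ is noise. Each guarantee below is paired with the number of samples $$N$$ that suffices to achieve it.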

• Tabular function estimation
$$\forall~~x,~~|\widehat f (x) - f_\star(x)| \leq \epsilon,\qquad N \gtrsim \frac{|\mathcal X|}{\epsilon^2}$$
• Parameter estimation $$f_\star = f_{\theta_\star}$$, $$\widehat f = f_{\widehat \theta}$$
$$\|\widehat \theta - \theta_\star\| \leq \epsilon,\qquad N \gtrsim \frac{d}{\epsilon^2}$$
• Prediction error analysis
$$\mathbb E[\ell(\widehat f(x), f_\star(x))] \leq \epsilon,\qquad N \gtrsim \frac{1}{\epsilon^2}$$
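The tabular bound above can be illustrated with a small simulation. This is a minimal sketch under assumptions not stated on the slide: $$\mathcal D$$ draws $$x$$ uniformly from a finite set, $$w$$ is standard Gaussian noise, and $$\widehat f(x)$$ is the average of the labels observed at $$x$$. The names `estimate_tabular`, `sample`, `f_star`, and `num_x` are hypothetical.

```python
import numpy as np

def estimate_tabular(xs, ys, num_x):
    """Estimate f(x) for each x in {0, ..., num_x - 1} by averaging its labels."""
    counts = np.bincount(xs, minlength=num_x)            # number of samples per x
    sums = np.bincount(xs, weights=ys, minlength=num_x)  # sum of labels per x
    # An x that was never sampled gets the estimate 0 (an arbitrary default).
    return sums / np.maximum(counts, 1)

rng = np.random.default_rng(0)
num_x = 10                                   # |X|, assumed small and finite
f_star = rng.uniform(-1, 1, size=num_x)      # hypothetical ground-truth table

def sample(n):
    """Draw n pairs (x, y) with x uniform over X and y = f_star(x) + w."""
    xs = rng.integers(0, num_x, size=n)
    ys = f_star[xs] + rng.normal(0.0, 1.0, size=n)  # assumed Gaussian noise
    return xs, ys

for n in (1_000, 100_000):
    xs, ys = sample(n)
    f_hat = estimate_tabular(xs, ys, num_x)
    # Worst-case error over x, the quantity bounded by epsilon on the slide.
    print(n, np.max(np.abs(f_hat - f_star)))
```

Under these assumptions each $$x$$ receives about $$N/|\mathcal X|$$ samples, so the per-entry error shrinks roughly like $$\sqrt{|\mathcal X|/N}$$, which matches the shape of $$N \gtrsim |\mathcal X|/\epsilon^2$$.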

Infinite horizon Tabular MDP $$\mathcal M = \{\mathcal S, \mathcal A, P, r, \gamma\}$$

Finite horizon continuous MDP $$\mathcal M = \{\mathbb R^{n_s},\mathbb R^{n_a}, f, c, H, \mu_0\}$$
