CS 4/5789: Introduction to Reinforcement Learning

Prof. Sarah Dean

MW 2:45-4pm
110 Hollister Hall

Agenda

0. Announcements & Recap

1. Motivation & Interactive Demo

2. Formal Setting

3. Balancing Exploration and Exploitation

Announcements

HW2 due Monday 3/28

5789 Paper Review Assignment (weekly pace suggested)

Prelims will be graded within a week

Recap

MDPs, Policies, Distributions
Value and Q functions
Optimal Policies: VI, PI, DP, and LQR
Approximate policies & properties like stability, reachability, observations, robustness

Model-based RL: tabular & parametric settings
Learning Q functions: rollout & Bellman-based supervision, Conservative Policy Iteration
Policy Optimization: Random Search, REINFORCE, Actor-Critic, and Natural PG