CS 4/5789: Introduction to Reinforcement Learning

Lecture 9

Prof. Sarah Dean

MW 2:45-4pm
110 Hollister Hall



0. Announcements & Recap

1. Types of Feedback

2. Supervised Learning

3. Estimation and Prediction



Check participation grades: PollEV (out of 7) on Canvas


HW1 released Friday, due 3/7


5789: Papers posted on Canvas (suggestions welcome), 10 reviews due by last day of class (weekly pace)


Office hours after lecture M (110 Hollister) and W (416A Gates)

Recap: Unit 1

  • MDP and Optimal Control
    • States, Actions, Transitions/Dynamics, Reward/Cost, Discount Factor/Horizon
    • Value & Q functions, Bellman Equation, Policy Evaluation
  • Optimal policies: Value Iteration, Policy Iteration, Dynamic Programming, LQR
  • Approximate policies: Linearization and iLQR, PID
  • Other properties: Stability, Reachability, Observations, Robustness
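As a refresher on the Bellman Equation and Value Iteration items above, here is a minimal sketch in plain Python. The two-state MDP, the names `P`, `r`, `gamma`, and the function names are hypothetical, chosen only for illustration; they do not come from the course materials. The sketch assumes a discounted infinite-horizon setting with reward maximization.

```python
# Minimal value iteration sketch on a made-up 2-state, 2-action MDP.
# Everything here (MDP, names, tolerances) is an illustrative assumption.

def q_values(P, r, gamma, V):
    """Q[s][a] = r(s,a) + gamma * E_{s' ~ P(s,a)}[V(s')]."""
    return [
        [r[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
         for a in range(len(P[s]))]
        for s in range(len(P))
    ]

def value_iteration(P, r, gamma, tol=1e-8, max_iters=10_000):
    """Repeat the Bellman optimality update until V stops changing."""
    V = [0.0] * len(P)
    for _ in range(max_iters):
        V_new = [max(q_s) for q_s in q_values(P, r, gamma, V)]
        converged = max(abs(x - y) for x, y in zip(V_new, V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value estimate.
    Q = q_values(P, r, gamma, V)
    policy = [max(range(len(Q[s])), key=lambda a: Q[s][a])
              for s in range(len(P))]
    return V, policy

# Hypothetical MDP: action 0 = stay, action 1 = switch (deterministic).
# P[s][a] is a list of (probability, next_state) pairs.
P = [
    [[(1.0, 0)], [(1.0, 1)]],  # from state 0
    [[(1.0, 1)], [(1.0, 0)]],  # from state 1
]
# Reward 1 only for staying in state 1.
r = [
    [0.0, 0.0],
    [1.0, 0.0],
]
gamma = 0.9

V, policy = value_iteration(P, r, gamma)
```

For this toy problem the values converge to roughly V(0) = 9 and V(1) = 10, with the greedy policy switching out of state 0 and staying in state 1. That matches the closed form V(1) = 1/(1 - 0.9) and V(0) = 0.9 · V(1).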
