CS 4/5789: Introduction to Reinforcement Learning

Lecture 9

Prof. Sarah Dean

MW 2:45-4pm
110 Hollister Hall

Agenda

0. Announcements & Recap

1. Types of Feedback

2. Supervised Learning

3. Estimation and Prediction

Announcements

Check participation grades: PollEV (out of 7) on Canvas

HW1 released Friday, due 3/7

5789: Papers posted on Canvas (suggestions welcome), 10 reviews due by last day of class (weekly pace)

Office hours after lecture M (110 Hollister) and W (416A Gates)

Recap: Unit 1

  • MDP and Optimal Control
    • States, Actions, Transitions/Dynamics, Reward/Cost, Discount Factor/Horizon
    • Value & Q functions, Bellman Equation, Policy Evaluation
  • Optimal policies: Value Iteration, Policy Iteration, Dynamic Programming, LQR
  • Approximate policies: Linearization and iLQR, PID
  • Other properties: Stability, Reachability, Observations, Robustness
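
As a refresher on the recap items above, here is a minimal sketch of Value Iteration for a tabular, infinite-horizon discounted MDP. The array layout (`P` of shape states x actions x states, `R` of shape states x actions), the function name `value_iteration`, the stopping tolerance, and the two-state example are all assumptions made for illustration, not notation taken from the lecture.

```python
import numpy as np

def value_iteration(P, R, gamma, tol=1e-8, max_iters=10_000):
    """Repeatedly apply the Bellman optimality update until V stops changing.

    P: array (S, A, S), P[s, a, s2] = probability of moving from s to s2 under a.
    R: array (S, A), immediate reward for taking action a in state s.
    gamma: discount factor in [0, 1).
    """
    num_states = P.shape[0]
    V = np.zeros(num_states)
    for _ in range(max_iters):
        # Q[s, a] = R[s, a] + gamma * sum_s2 P[s, a, s2] * V[s2]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value estimate.
    Q = R + gamma * (P @ V)
    return V, Q.argmax(axis=1)

# Hypothetical two-state example: state 1 pays reward 1 forever;
# in state 0, action 0 stays put and action 1 moves to state 1.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, :, 1] = 1.0
R = np.array([[0.0, 0.0],
              [1.0, 1.0]])

V, policy = value_iteration(P, R, gamma=0.9)
print(V, policy)
```

In this example the greedy policy in state 0 is to move to state 1, and the value of state 1 approaches 1 / (1 - 0.9) = 10.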
