0. Announcements & Recap

1. DAgger Algorithm

2. Online Learning

3. Analysis with PDL

5789 Paper Review Assignment (weekly pace *suggested*)

HW 3 due Monday 4/25

Final exam Monday 5/16 at 7pm

**Supervised Learning**

**Policy**

**Dataset of expert trajectory**

\((x, y)\)

...

**\(\pi(x) = y\)**

**expert trajectory**

**learned policy**

No training data of "recovery" behavior

**learned policy**

**query expert and append trajectory**

**retrain**

Idea: interact with expert to ask what they would do

**Supervised Learning**

**Policy**

**Dataset**

\(\mathcal D = \{(x_i, y_i)\}_{i=1}^M\)

...

**\(\pi(x) = y\)**
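
One common way to read the Supervised Learning box is as empirical risk minimization over the dataset. The slide does not state the objective, so the policy class \(\Pi\), the loss \(\ell\), and the name \(\hat\pi\) below are assumptions introduced for illustration:

\[
\hat\pi = \arg\min_{\pi \in \Pi} \frac{1}{M} \sum_{i=1}^{M} \ell\big(\pi(x_i),\, y_i\big)
\]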

**Execute**

**Query Expert**

\(\pi^*(s_0), \pi^*(s_1), \ldots\)

\(s_0, s_1, s_2, \ldots\)

**Aggregate**

\((x_i = s_i, y_i = \pi^*(s_i))\)
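
The Execute, Query Expert, Aggregate, and retrain steps can be sketched as a short loop. This is a minimal illustration, not the course's reference implementation: the 1-D dynamics, the linear expert `expert_policy`, the least-squares learner `fit_policy`, and all parameter values are hypothetical choices made so the example runs with NumPy alone.

```python
import numpy as np

def expert_policy(s):
    # Hypothetical expert pi*: push the state back toward zero.
    return -0.5 * s

def rollout(policy, horizon, rng):
    # Execute: run `policy` in a toy 1-D system (assumed dynamics)
    # and record the states s_0, s_1, ... that it visits.
    s = rng.normal()
    states = []
    for _ in range(horizon):
        states.append(s)
        s = s + policy(s) + 0.1 * rng.normal()
    return np.array(states)

def fit_policy(X, Y):
    # Supervised learning: least-squares fit of a linear policy a = theta * s.
    theta = float(np.dot(X, Y) / np.dot(X, X))
    return (lambda s: theta * s), theta

def dagger(num_iters=5, horizon=20, seed=0):
    rng = np.random.default_rng(seed)
    # Initial dataset: one expert trajectory, labeled by the expert.
    X = rollout(expert_policy, horizon, rng)
    Y = expert_policy(X)
    policy, theta = fit_policy(X, Y)
    for _ in range(num_iters):
        states = rollout(policy, horizon, rng)  # Execute the learned policy
        labels = expert_policy(states)          # Query Expert on visited states
        X = np.concatenate([X, states])         # Aggregate (x_i = s_i, ...
        Y = np.concatenate([Y, labels])         # ... y_i = pi*(s_i))
        policy, theta = fit_policy(X, Y)        # Retrain on the full dataset
    return theta, len(X)
```

Calling `dagger()` returns the fitted coefficient and the size of the aggregated dataset. The point of the sketch is that the states come from the learned policy's own rollouts while the labels always come from the expert, so the dataset grows to cover the states the learner actually reaches.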

[Pan et al., RSS 2018]

Goal: map image to command

Approach: Use Model Predictive Controller as the expert!

\(\pi(\text{image}) = (\text{steering}, \text{throttle})\)

