Prof. Sarah Dean
MW 2:45-4pm
110 Hollister Hall
0. Announcements & Recap
1. Linear Contextual Bandits
2. Interactive Demo
3. LinUCB Algorithm
My office hours today are cancelled
Prelim corrections due tomorrow - please list collaborators
5789 Paper Review Assignment (weekly pace suggested)
HW 3 released tonight, due in 2 weeks
Final exam Monday 5/16 at 7pm
A simplified setting for studying exploration
Explore-then-Commit
Set exploration $N \approx T^{2/3}$,
$R(T) \lesssim T^{2/3}$
Upper Confidence Bound
For $t=1,\dots,T$:
$R(T) \lesssim \sqrt{T}$
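The two recap algorithms can be sketched on a simulated Bernoulli bandit. This is a minimal illustration, not the lecture's reference code: the function names, the per-arm exploration budget of roughly $T^{2/3}$ pulls, and the bonus $\sqrt{2\log T / N_a}$ are assumptions chosen for the example, with constants and log factors ignored.

```python
import math
import random

def explore_then_commit(means, T, seed=0):
    """Pull every arm N times, then commit to the best empirical arm."""
    rng = random.Random(seed)
    K = len(means)
    # Assumed schedule: about T^(2/3) pulls per arm, capped to fit in T rounds
    N = max(1, min(T // K, round(T ** (2 / 3))))
    counts, totals, regret = [0] * K, [0.0] * K, 0.0
    best = max(means)
    committed = None
    for t in range(T):
        if t < N * K:
            a = t % K  # exploration phase: cycle through the arms
        else:
            if committed is None:
                # Choose once, from exploration data only
                committed = max(range(K), key=lambda i: totals[i] / counts[i])
            a = committed
        reward = 1.0 if rng.random() < means[a] else 0.0
        counts[a] += 1
        totals[a] += reward
        regret += best - means[a]  # pseudo-regret uses the true means
    return regret, counts

def ucb(means, T, seed=0):
    """Pull the arm with the largest upper confidence bound each round."""
    rng = random.Random(seed)
    K = len(means)
    counts, totals, regret = [0] * K, [0.0] * K, 0.0
    best = max(means)
    for t in range(T):
        if t < K:
            a = t  # pull each arm once so every empirical mean is defined
        else:
            # Empirical mean plus an assumed bonus sqrt(2 log T / N_a)
            a = max(
                range(K),
                key=lambda i: totals[i] / counts[i]
                + math.sqrt(2 * math.log(T) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[a] else 0.0
        counts[a] += 1
        totals[a] += reward
        regret += best - means[a]
    return regret, counts

if __name__ == "__main__":
    for name, algo in [("ETC", explore_then_commit), ("UCB", ucb)]:
        regret, counts = algo([0.2, 0.8], T=2000)
        print(name, "pseudo-regret:", round(regret, 1), "pulls per arm:", counts)
```

Running both for increasing T and plotting the pseudo-regret is one way to compare the two growth rates empirically.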
A (less) simplified setting for studying exploration
Ex: a machine's make and model affect rewards, so context x=(•,•,•,•,•,•,•,•)
Explore-then-Commit
Set exploration $N \approx T^{2/3}$,
we showed $R(T) \lesssim T^{2/3}$ using prediction error guarantees $\mathbb{E}_{x\sim\mathcal{D}}\left[\,|\hat\mu_a(x)-\mu_a(x)|\,\right]$
For context-dependent confidence bounds, we need to understand
$\mathbb{E}\left[\,|\hat\mu_a(x)-\mu_a(x)|\;\middle|\;x\,\right]$
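A small simulation can show why the error averaged over contexts differs from the error conditioned on a specific context. The sketch below is illustrative only: it assumes a single arm, two discrete contexts with made-up probabilities and mean rewards, and a per-context empirical mean as the estimator. None of these choices come from the lecture.

```python
import random

def conditional_errors(n_samples=200, n_trials=300, seed=0):
    """Estimate E[|mu_hat(x) - mu(x)| | x] per context and its average over D."""
    rng = random.Random(seed)
    p_context = {"common": 0.95, "rare": 0.05}  # hypothetical distribution D
    mu = {"common": 0.7, "rare": 0.3}           # hypothetical true mean rewards
    err_sum = {x: 0.0 for x in mu}
    for _ in range(n_trials):
        totals = {x: 0.0 for x in mu}
        counts = {x: 0 for x in mu}
        for _ in range(n_samples):
            # Draw a context from D, then a Bernoulli reward with mean mu[x]
            x = "common" if rng.random() < p_context["common"] else "rare"
            totals[x] += 1.0 if rng.random() < mu[x] else 0.0
            counts[x] += 1
        for x in mu:
            # Empirical mean; fall back to 0.5 if the context was never seen
            mu_hat = totals[x] / counts[x] if counts[x] else 0.5
            err_sum[x] += abs(mu_hat - mu[x])
    cond = {x: err_sum[x] / n_trials for x in mu}    # error given x
    avg = sum(p_context[x] * cond[x] for x in mu)    # error averaged over D
    return cond, avg

if __name__ == "__main__":
    cond, avg = conditional_errors()
    print("error given x:", cond)
    print("error averaged over D:", avg)
```

The averaged error is dominated by the common context, so it can be small even when the estimate for a rarely seen context is poor. A context-dependent confidence bound has to track the latter.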