Assistant Professor of Computer Science at Cornell
Prof. Sarah Dean
MW 2:45-4pm
255 Olin Hall
1. Recap: Multi-Armed Bandits
2. Explore-then-Commit
3. UCB Algorithm
4. UCB Analysis
A simplified setting for studying exploration
Multi-Armed Bandits
Explore-then-Commit
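After pulling each of the \(K\) arms \(N\) times, with probability at least \(1-\delta\) every arm \(a\) satisfies \( \mu_{a} \in\left[ \hat \mu_{a} \pm c\sqrt{\frac{\log(K/\delta)}{N}}\right]\)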
Explore-then-Commit (Interactive Demo)
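As a rough stand-in for the interactive demo, here is a minimal simulation sketch of Explore-then-Commit. It assumes Bernoulli rewards and measures pseudo-regret against the true means; the function names `explore_then_commit` and `confidence_width`, the `means` argument, and the default `c=1.0` are illustrative choices, not taken from the slides.

```python
import math
import random


def confidence_width(N, K, delta, c=1.0):
    """Half-width c * sqrt(log(K / delta) / N) of the confidence interval.

    The constant c is left unspecified on the slide; c=1.0 is an assumption.
    """
    return c * math.sqrt(math.log(K / delta) / N)


def explore_then_commit(means, T, N, seed=0):
    """Simulate Explore-then-Commit on Bernoulli arms (assumed reward model).

    means: true mean of each arm (hypothetical; unknown to the learner)
    T: horizon, N: number of exploration pulls per arm
    Returns (committed_arm, pseudo_regret).
    """
    rng = random.Random(seed)
    K = len(means)
    if N * K > T:
        raise ValueError("exploration budget N * K exceeds horizon T")
    best = max(means)
    sums = [0.0] * K
    regret = 0.0

    # Exploration phase: pull every arm N times.
    for a in range(K):
        for _ in range(N):
            sums[a] += 1.0 if rng.random() < means[a] else 0.0
            regret += best - means[a]

    # Commit phase: play the empirically best arm for the remaining rounds.
    mu_hat = [s / N for s in sums]
    a_hat = max(range(K), key=lambda a: mu_hat[a])
    regret += (T - N * K) * (best - means[a_hat])
    return a_hat, regret
```

For example, `explore_then_commit([0.2, 0.8], T=3000, N=round(3000 ** (2 / 3)))` explores each arm about 208 times before committing, so nearly all of the regret comes from the exploration phase.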
UCB
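A minimal sketch of the UCB idea: at every round, pull the arm whose upper confidence bound is largest. The bonus \(\sqrt{\log(KT/\delta)/N_a}\) used here is one common choice and is an assumption, as are the Bernoulli reward model, the function name `ucb`, and the `means` argument.

```python
import math
import random


def ucb(means, T, delta=0.05, seed=0):
    """Simulate a UCB strategy on Bernoulli arms (assumed reward model).

    means: true mean of each arm (hypothetical; unknown to the learner)
    Returns (pull_counts, pseudo_regret).
    """
    rng = random.Random(seed)
    K = len(means)
    best = max(means)
    counts = [0] * K
    sums = [0.0] * K
    regret = 0.0
    log_term = math.log(K * T / delta)  # assumed form of the bonus

    def upper_bound(a):
        # Empirical mean plus a bonus that shrinks as the arm is pulled more.
        return sums[a] / counts[a] + math.sqrt(log_term / counts[a])

    for t in range(T):
        if t < K:
            a = t  # pull each arm once so every estimate is defined
        else:
            a = max(range(K), key=upper_bound)
        reward = 1.0 if rng.random() < means[a] else 0.0
        counts[a] += 1
        sums[a] += reward
        regret += best - means[a]
    return counts, regret
```

Unlike Explore-then-Commit, there is no separate exploration phase: an arm that has been pulled rarely keeps a wide interval, so it is revisited until its upper bound drops below that of the leading arm.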
UCB Analysis
[Figure: confidence intervals for the chosen arm \(a_t\) and the optimal arm \(a_\star\), with the gap \(\mu_\star - \mu_{a_t}\) marked]
Claim: sub-optimality at \(t\) is bounded by the width of \(a_t\)'s confidence interval
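One way to see the claim, sketched under the assumption that all confidence intervals hold at round \(t\). The symbols \(\hat\mu_{a,t}\) (empirical mean of arm \(a\) before round \(t\)) and \(w_{a,t}\) (half-width of its interval) are notation introduced here, not from the slides.

```latex
\begin{align*}
\mu_\star &\le \hat\mu_{a_\star,t} + w_{a_\star,t}
  && \text{(the interval for } a_\star \text{ holds)} \\
          &\le \hat\mu_{a_t,t} + w_{a_t,t}
  && \text{(UCB chose } a_t \text{, so its upper bound is largest)} \\
\mu_{a_t} &\ge \hat\mu_{a_t,t} - w_{a_t,t}
  && \text{(the interval for } a_t \text{ holds)} \\
\Rightarrow\quad \mu_\star - \mu_{a_t} &\le 2\, w_{a_t,t}
  && \text{(full width of } a_t\text{'s interval)}
\end{align*}
```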
Explore-then-Commit: explore for \(N \approx T^{2/3}\), then commit to the empirically best arm. \(R(T) \lesssim T^{2/3}\)
Upper Confidence Bound: for \(t=1,...,T\), pull the arm with the largest upper confidence bound. \(R(T) \lesssim \sqrt{T}\)
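A rough sketch of where the two rates come from, assuming rewards in \([0,1]\), treating \(K\), \(c\), and the log factor as constants, and conditioning on all confidence intervals holding. The symbol \(N_a\) (total pulls of arm \(a\)) is notation introduced here.

```latex
\begin{align*}
\text{ETC:}\quad R(T) &\lesssim \underbrace{NK}_{\text{explore}}
   + \underbrace{T \cdot 2c\sqrt{\tfrac{\log(K/\delta)}{N}}}_{\text{commit}},
   \qquad N \approx T^{2/3} \;\Rightarrow\; R(T) \lesssim T^{2/3} \\
\text{UCB:}\quad R(T) &\lesssim \sum_{t=1}^{T} \frac{1}{\sqrt{N_{a_t}\text{ at round } t}}
   = \sum_{a=1}^{K} \sum_{n=1}^{N_a} \frac{1}{\sqrt{n}}
   \le 2\sum_{a=1}^{K} \sqrt{N_a}
   \le 2\sqrt{KT}
\end{align*}
```

The ETC choice of \(N\) balances the two terms: a larger \(N\) costs more during exploration but shrinks the interval that governs the commit phase. The last UCB step uses the Cauchy-Schwarz inequality together with \(\sum_a N_a = T\).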
Example: online advertising
"Arms" are different job ads: Journalism, Programming
But consider different users: CS Major, English Major
Example: online shopping
"Arms" are various products
But what about search queries, browsing history, items in cart?
Example: social media feeds
"Arms" are various posts: images, videos
Personalized to each user based on demographics, behavioral data, etc.