Online Convex Optimization
Learning, Duality, and Algorithms
Victor Sanches Portella
Advisor: Marcel K. de Carli Silva
IME - USP
May, 2019
Online Convex Optimization
Online Convex Optimization (OCO)
At each round $t$:
- Player chooses a point $x_t \in X$
- Enemy chooses a CONVEX function $f_t$
- Choices are made SIMULTANEOUSLY
- Player suffers a loss $f_t(x_t)$
- Player and Enemy see $x_t$ and $f_t$
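A minimal sketch of this protocol as a Python loop; the `player` and `enemy` objects and their method names are illustrative assumptions, not part of the talk:

```python
def play_oco(player, enemy, T):
    """Run T rounds of the OCO protocol and return the Player's losses."""
    losses = []
    for t in range(T):
        x_t = player.choose_point()    # Player commits to a point in X
        f_t = enemy.choose_function()  # Enemy picks a convex loss, simultaneously
        losses.append(f_t(x_t))        # Player suffers the loss f_t(x_t)
        player.observe(f_t)            # both sides see the round's choices
        enemy.observe(x_t)
    return losses
```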
Formalizing Online Convex Optimization
An Online Convex Optimization Problem is a pair $(X, \mathcal{F})$:
- $X$: a convex set, the Player's choices
- $\mathcal{F}$: a set of convex functions on $X$, the Enemy's choices
- Play proceeds for rounds $t = 1, \dots, T$
Experts' Problem
- Player chooses probabilities $p_t$ over the Experts
- Enemy chooses a cost for each Expert
- Loss: the expected cost $\langle p_t, y_t \rangle$

Example with 4 experts:
Probabilities: (0.5, 0.1, 0.3, 0.1)
Costs: (1, 0, -1, 1)
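For concreteness, a tiny computation of the slide's example round (NumPy is an assumed dependency):

```python
import numpy as np

# Expected cost in the experts' problem: loss = <p, y>.
p = np.array([0.5, 0.1, 0.3, 0.1])   # Player's probabilities over 4 experts
y = np.array([1.0, 0.0, -1.0, 1.0])  # Enemy's costs for each expert
loss = p @ y                          # 0.5 - 0.3 + 0.1 = 0.3
print(loss)
```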
Online Regression

Online Linear Regression
- Player chooses a Regression Function $x \mapsto \langle w_t, x \rangle$
- Enemy chooses a Query & Answer pair $(a_t, b_t)$
- Loss: $(\langle w_t, a_t \rangle - b_t)^2$
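A sketch of one round, assuming the squared loss above; the concrete numbers are mine, not from the talk:

```python
import numpy as np

# One round of online linear regression with squared loss.
w_t = np.array([0.2, -0.5])      # Player's regression function x -> <w_t, x>
a_t = np.array([1.0, 2.0])       # Enemy's query
b_t = 0.5                        # Enemy's answer
prediction = w_t @ a_t           # 0.2 - 1.0 = -0.8
loss = (prediction - b_t) ** 2   # (-0.8 - 0.5)^2 = 1.69
```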
Regret
$\mathrm{Regret}_T(u) = \underbrace{\sum_{t=1}^T f_t(x_t)}_{\text{Player's Loss}} - \underbrace{\sum_{t=1}^T f_t(u)}_{\text{Cost of always choosing } u}$
Goal: sublinear Regret, that is, $\mathrm{Regret}_T(u) = o(T)$
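The definition translates directly to code; a minimal sketch, with the function and argument names chosen here for illustration:

```python
def regret(fs, xs, u):
    """Regret_T(u): the Player's total loss on its iterates xs,
    minus the cost of always choosing the fixed point u."""
    return sum(f(x) for f, x in zip(fs, xs)) - sum(f(u) for f in fs)
```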
Player Strategies
Sublinear regret under mild conditions
Focus of this talk: algorithms for the Player
Hopefully efficiently implementable
Algorithms We Shall See
Adaptive FTRL
Cumulative Loss
Experts
Follow the Leader
Player plays the best point in hindsight:
$x_{t+1} \in \arg\min_{x \in X} \sum_{i=1}^{t} f_i(x)$
UNSTABLE! The Enemy can make the leader flip back and forth, forcing linear regret
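A minimal demo of the instability, under assumed details: $X = [-1, 1]$ and linear losses $f_t(x) = g_t x$ whose signs alternate after a small first cost, so FTL keeps jumping between the endpoints:

```python
# FTL on X = [-1, 1] with linear losses f_t(x) = g_t * x.
# After a small first cost, g alternates sign, so the leader flips
# every round: FTL pays ~1 per round while the fixed point x = 0 pays 0.
T = 100
g = [0.5] + [(-1.0) ** t for t in range(1, T)]
x, total = 0.0, 0.0
for t in range(T):
    total += g[t] * x              # suffer this round's loss
    G = sum(g[: t + 1])            # cumulative loss coefficient
    x = -1.0 if G > 0 else 1.0     # FTL: argmin of G * x over [-1, 1]
print(total)                       # ~T - 1: linear regret
```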
Adding Regularization
Player stabilizes the iterates with a Fixed Regularizer $R$:
$x_{t+1} \in \arg\min_{x \in X} \; R(x) + \sum_{i=1}^{t} f_i(x)$
FTRL: Follow the Regularized Leader
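A hedged sketch of an FTRL step under extra assumptions not in the slide: linearized losses $\langle g_i, x\rangle$, quadratic regularizer $R(x) = \|x\|^2/(2\eta)$, and $X$ the unit L2-ball, where the argmin has a closed form:

```python
import numpy as np

def project_l2_ball(y):
    """Euclidean projection onto the unit L2-ball."""
    norm = np.linalg.norm(y)
    return y if norm <= 1.0 else y / norm

def ftrl_step(grads, eta):
    """FTRL with R(x) = ||x||^2 / (2*eta) and linear losses <g_i, x>:
    argmin over the unit ball is Proj(-eta * sum of the g_i)."""
    return project_l2_ball(-eta * np.sum(grads, axis=0))
```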
Adding Adaptive Regularization
At round $t$, use the regularizer $R_t = r_1 + \dots + r_t$
Each Regularizer Increment $r_i$ is a Convex Function
AdaFTRL: $x_{t+1} \in \arg\min_{x \in X} \; R_t(x) + \sum_{i=1}^{t} f_i(x)$
Efficiently computable? Not clear in general
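One case where the AdaFTRL argmin is computable, sketched under my own assumptions (linear losses, quadratic increments $r_i(x) = (\sigma_i/2)\|x\|^2$, unit-ball domain):

```python
import numpy as np

def ada_ftrl_step(grads, sigmas):
    """AdaFTRL with linear losses <g_i, x> and increments
    r_i(x) = (sigma_i / 2) * ||x||^2 on the unit L2-ball.
    The argmin is Proj(-sum(g_i) / sum(sigma_i))."""
    G = np.sum(grads, axis=0)
    sigma = np.sum(sigmas)
    y = -G / sigma
    norm = np.linalg.norm(y)
    return y if norm <= 1.0 else y / norm
```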
Adaptive Online Mirror Descent
Online (sub)Gradient Descent
Round $t$: take a (sub)gradient step and project back onto $X$:
$x_{t+1} = \Pi_X\big(x_t - \eta \nabla f_t(x_t)\big)$, where $\Pi_X$ is the Euclidean projection
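A minimal sketch, assuming the losses are given through a (sub)gradient oracle and a projection oracle; the function names are mine:

```python
def ogd(grad, project, x0, eta, T):
    """Online (sub)gradient descent: step along -grad(t, x), then
    project back onto X via the Euclidean projection `project`."""
    x = x0
    iterates = [x0]
    for t in range(T):
        x = project(x - eta * grad(t, x))
        iterates.append(x)
    return iterates
```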
Another Perspective
What is $\nabla f(x)$?
The derivative $Df(x)$ is a functional: $Df(x)[d]$ is the directional derivative of $f$ at $x$ in the direction $d$
The gradient represents this functional as a point (Riesz Repr. Theorem):
$\langle \nabla f(x), d \rangle = Df(x)[d]$ for every direction $d$
The Online Gradient Descent update subtracts a functional from a point, implicitly using this representation
Avoiding Inner-Products
What if we make other choices for the map taking functionals back to points?
Mirror Maps
A Mirror Map $\Phi$ is strictly convex and differentiable on the interior of its domain
For every functional $\theta$ there is a point $x$ such that $\nabla\Phi(x) = \theta$
Bregman Projections onto $X$ are attained, by the Bregman Projector $\Pi_X^\Phi$
Online Mirror Descent
Round $t$: map to the Dual, take the gradient step there, map back to the Primal, and project:
$x_{t+1} = \Pi_X^\Phi\big((\nabla\Phi)^{-1}(\nabla\Phi(x_t) - \eta \nabla f_t(x_t))\big)$
Primal $\to$ Dual via $\nabla\Phi$; Dual $\to$ Primal via $(\nabla\Phi)^{-1}$; then the Bregman Projection onto $X$
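A hedged instance: with the negative-entropy mirror map on the probability simplex (my choice for concreteness, not the talk's running example), the OMD step collapses to a multiplicative update:

```python
import numpy as np

def omd_entropy_step(p, grad, eta):
    """OMD with the negative-entropy mirror map Phi(p) = sum p_i log p_i
    on the simplex: the dual step and Bregman (KL) projection reduce
    to an exponentiated-gradient update followed by normalization."""
    w = p * np.exp(-eta * grad)   # dual gradient step, mapped back to the primal
    return w / np.sum(w)          # Bregman projection onto the simplex
```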
Adaptive?
Adaptive!
Adaptive Online Mirror Descent
First round: play a minimizer of the first mirror map over $X$
Round $t$: run the mirror step with $R_t = r_1 + \dots + r_t$, built from Mirror Map Increments $r_i$
Lazy Online Mirror Descent
Keep the unprojected dual iterate; apply the Bregman Projection only to compute the played point

Classic Online Mirror Descent vs. Lazy
First round: both start at the same point
For each later round: Eager steps from the projected point $x_t$; Lazy steps from the unprojected dual iterate
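A sketch of the lazy update in the Euclidean case (mirror map $\frac{1}{2}\|\cdot\|^2$, an assumption for concreteness), where it is essentially dual averaging with projection:

```python
import numpy as np

def lazy_omd(grad, project, x0, eta, T):
    """Lazy OMD with the Euclidean mirror map: accumulate gradient
    steps in an unprojected iterate y; project only to produce plays."""
    y = np.asarray(x0, dtype=float)
    plays = []
    for t in range(T):
        x = project(y)              # play the projected point
        plays.append(x)
        y = y - eta * grad(t, x)    # lazy: step from y, not from x
    return plays
```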
LOMD as FTRL
Lazy OMD can be rewritten as FTRL with the mirror map playing the role of the regularizer
[Diagram: the projection moves inside/outside the FTRL argmin]

EOMD as FTRL
Eager OMD can also be rewritten as FTRL, with regularizers that change from round to round
[Diagram: the projection moves inside/outside the FTRL argmin]
EOMD vs LOMD
Eager = Lazy
A Genealogy of Algorithms
Connection Among the Main Algorithms
AdaReg
Second Order Algorithms?
A Bird's-eye View
Future Directions

Generalizations and Special Cases
- Limited Feedback: Bandit, two-point Bandit feedback
- Special Cases: Combinatorial settings (e.g., the Hypercube), the L2-Ball, other specific settings
- Drop or Add Hypotheses: Convexity, adversarial enemies, side information
- Change Metric: Policy Regret, Raw Loss
OCO in Other Areas
- Quantum Computing
- Approximate Maximum Flow
- Robust Optimization
- Competitive Analysis
- Spectral Sparsification
- SDP Solver
- Oracle Boosting
Ideas: new settings, a variational perspective