When Online Learning meets Stochastic Calculus
joint work with Nick Harvey and Christopher Liaw
Victor Sanches Portella
cs.ubc.ca/~victorsp
Prediction with Experts' Advice
Player
Adversary
\(n\) Experts
[Figure: each round, the player picks probabilities over the experts (e.g. 0.5, 0.1, 0.3, 0.1) and the adversary picks costs in \([-1,1]\) (e.g. 1, -1, 0.5, -0.3)]
Player's loss:
Adversary knows the strategy of the player
Performance Measure - Regret
Loss of Best Expert
Player's Loss
Optimal!
For random \(\pm 1\) costs
Multiplicative Weights Update:
(Hedge)
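As a concrete sketch of MWU/Hedge (assuming costs in \([-1,1]\) and a known horizon \(T\); the learning rate below is the textbook tuning, not necessarily the one used in this talk):

```python
import math
import random

def hedge(costs, eta):
    """Multiplicative Weights Update (Hedge) on a T x n list of cost vectors.

    Returns the player's total loss and the loss of the best expert.
    """
    n = len(costs[0])
    weights = [1.0] * n
    expert_loss = [0.0] * n
    player_loss = 0.0
    for round_costs in costs:
        z = sum(weights)
        probs = [w / z for w in weights]        # play the normalized weights
        player_loss += sum(p * c for p, c in zip(probs, round_costs))
        for i, c in enumerate(round_costs):
            weights[i] *= math.exp(-eta * c)    # multiplicative update
            expert_loss[i] += c
    return player_loss, min(expert_loss)

# Random +-1 costs: regret is on the order of sqrt(T ln n),
# and never exceeds the deterministic bound sqrt(2 T ln n).
random.seed(0)
T, n = 10_000, 10
costs = [[random.choice((-1, 1)) for _ in range(n)] for _ in range(T)]
eta = math.sqrt(2 * math.log(n) / T)            # tuned for a known horizon T
loss, best = hedge(costs, eta)
regret = loss - best
```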
Why Learning with Experts?
Boosting in ML
Understanding sequential prediction & online learning
Universal Optimization
TCS, Learning theory, SDPs...
Quantile Regret
Best Expert
Best Experts
\(\varepsilon\)-fraction
MWU:
Needs knowledge of \(\varepsilon\)
We design an algorithm with \(\sqrt{T \ln(1/\varepsilon)}\) quantile regret
for all \(\varepsilon\) simultaneously, with the best known leading constant
Loss of
top \(\varepsilon n \) expert
\(\varepsilon\)-Quantile Regret
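One way to write this down (notation chosen here for illustration, not taken verbatim from the talk): writing \(L_T^{(k)}\) for the \(k\)-th smallest cumulative loss among the \(n\) experts, the \(\varepsilon\)-quantile regret is

```latex
\[
  \mathrm{Regret}_T(\varepsilon)
  \;=\;
  \sum_{t=1}^{T} \langle p_t, \ell_t \rangle
  \;-\;
  L_T^{(\lceil \varepsilon n \rceil)} ,
\]
```

so \(\varepsilon = 1/n\) recovers the usual regret against the single best expert, while larger \(\varepsilon\) only asks to compete with the top \(\varepsilon\)-fraction of experts.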
Continuous OL via Stochastic Calculus
Algorithms design guided by
PDEs and (Stochastic) Calculus tools
Main Goal of this Talk: describe the main ideas of the
continuous time model and tools
Continuous Experts' Problem
Modeling Online Learning in Continuous Time
Analysis often becomes clean
Sandbox for design of optimization algorithms
Gradient flow is useful for smooth optimization
Key Question: How to model non-smooth (online) optimization in continuous time?
Why go to continuous time?
Modeling Adversarial Costs in Continuous Time
Total loss of expert \(i\):
Useful perspective: \(L(i)\) is a realization of a random walk; in continuous time, a realization of a Brownian Motion
Probability 1 = Worst-case
Discrete Time
Continuous Time
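The random-walk-to-Brownian-motion picture can be sanity-checked numerically: rescaling a \(\pm 1\) random walk by \(1/\sqrt{T}\) gives a path whose quadratic variation is exactly 1, matching \(\mathrm{d}B(t)^2 = \mathrm{d}t\). A minimal sketch of this Donsker-style scaling:

```python
import math
import random

def scaled_random_walk(T, seed=0):
    """Partial sums of +-1 steps, scaled by 1/sqrt(T).

    As T grows, the path t/T -> S_t / sqrt(T) converges in distribution
    to a Brownian motion on [0, 1] (Donsker's invariance principle).
    """
    rng = random.Random(seed)
    s, path = 0.0, [0.0]
    for _ in range(T):
        s += rng.choice((-1, 1))
        path.append(s / math.sqrt(T))
    return path

path = scaled_random_walk(10_000)
# Each increment is +-1/sqrt(T), so the quadratic variation
# sum of squared increments is exactly T * (1/T) = 1.
```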
The Continuous Time Model
Discrete time
Continuous time
Cumulative loss
Player's cumulative loss
Player's loss per round
[Freund '09]
Regret Vector
Regret
MWU in Continuous Time
Potential based players
Multiplicative Weights Update
LogSumExp
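Concretely, the LogSumExp potential and its gradient (a standard identity):

```latex
\[
  \Phi(R) \;=\; \frac{1}{\eta}\,\ln\!\Big(\sum_{i=1}^{n} e^{\eta R_i}\Big),
  \qquad
  \frac{\partial \Phi}{\partial R_i}
  \;=\;
  \frac{e^{\eta R_i}}{\sum_{j=1}^{n} e^{\eta R_j}} ,
\]
```

so a potential-based player that plays \(\nabla\Phi\) of the regret vector \(R\) is exactly MWU/Hedge.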
NormalHedge
First algorithm for quantile regret
Very clean Continuous time analysis
[Freund '09]
A Peek Into the Analysis
Ito's Lemma
(Fundamental Theorem of Stochastic Calculus)
\(B(t)\) is very non-smooth \(\implies\) second-order terms matter
Ito's Lemma
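For reference, the one-dimensional form of Ito's Lemma for a smooth potential \(\Phi(t,x)\) evaluated along a Brownian motion (the standard statement, not specific to this talk):

```latex
\[
  \mathrm{d}\,\Phi\big(t, B(t)\big)
  \;=\;
  \partial_t \Phi\,\mathrm{d}t
  \;+\;
  \partial_x \Phi\,\mathrm{d}B(t)
  \;+\;
  \tfrac{1}{2}\,\partial_{xx} \Phi\,\mathrm{d}t .
\]
```

The final \(\tfrac{1}{2}\partial_{xx}\Phi\,\mathrm{d}t\) term has no analogue in the classical chain rule; it survives precisely because \(B\) has nonvanishing quadratic variation (\(\mathrm{d}B(t)^2 = \mathrm{d}t\)).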
Idea: Pick \(\Phi\) as to make Ito's Lemma simpler for
Idea: Use stochastic calculus to guide the algorithm design
Potential based players
Smooth
Non-smooth
A Peek Into the Analysis
Potential based players
For all \(\varepsilon\)
Ito's Lemma suggests \(\Phi\) that satisfy the Backwards Heat Equation
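The Backwards Heat Equation referred to here is, in one dimension (standard form; the talk's \(\Phi\) may differ in constants):

```latex
\[
  \partial_t \Phi(t,x) \;+\; \tfrac{1}{2}\,\partial_{xx} \Phi(t,x) \;=\; 0 .
\]
```

Plugging such a \(\Phi\) into Ito's Lemma kills the \(\mathrm{d}t\) terms, leaving \(\mathrm{d}\Phi = \partial_x\Phi\,\mathrm{d}B(t)\), so \(\Phi(t,B(t))\) is drift-free. One familiar solution is \(\Phi(t,x) = \exp(\eta x - \eta^2 t/2)\), the exponential martingale behind MWU-style potentials.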
Using this potential*, we get
Best leading constant
Discrete time analysis is IDENTICAL to continuous time analysis
Discrete Ito's Lemma
*(with a slightly larger constant in the BHE)
Other Results Using
Stochastic Calculus
Fixed Time vs Anytime Regret
Question:
Do the minimax regrets with and without knowledge of \(T\) differ?
fixed-time
anytime
[Harvey, Liaw, Perkins, Randhawa '23]
n = 2
anytime
fixed-time
[Cover '67]
Back. Heat Eq.
Efficient version via SC
<
[Greenstreet, VSP, Harvey '22]
Heat Eq.
?
In Continuous Time, both are equal if Brownian Motions are independent.
[VSP, Liaw, Harvey '22]
Large n
What about expected regret?
Question:
What is the expected regret in the anytime setting
even without independent experts?
[VSP, Liaw, Harvey '22]:
High expected regret \(\implies\) lower bound
In the language of martingales:
Nearly tight bounds.
asymptotically!
For a martingale \(X_t\), find upper and lower bounds on \(\sup_\tau \mathbb{E}\lVert X_\tau \rVert_\infty\), where \(\tau\) ranges over stopping times
Evidence that
anytime = fixed-time
Online Linear Optimization
Player
Adversary
Unconstrained
Linear functions
Player's loss:
Loss of Fixed \(u\)
Player's Loss
Parameter-Free Online Linear Optimization
Goal:
No knowledge of \(\lVert u \rVert\)
Small regret if \(\lVert g_t\rVert\) small
[Zhang, Yang, Cutkosky, Paschalidis '24]:
Parameter-free and Adaptive algorithm
Backwards Heat Equation
Parameter free and adaptive algorithms matching lower bounds
(even up to leading constant)
Potential-based player satisfying
+ refined discretization
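For flavor, here is one classical parameter-free one-dimensional strategy, the Krichevsky-Trofimov coin-betting player. This sketches the general coin-betting idea, not the specific algorithm of [Zhang, Yang, Cutkosky, Paschalidis '24]; `kt_bettor` and its arguments are names chosen here for illustration:

```python
def kt_bettor(gradients, initial_wealth=1.0):
    """Krichevsky-Trofimov coin betting: a classical parameter-free player
    for 1-D online linear optimization with gradients g_t in [-1, 1].

    Plays x_t = beta_t * wealth_{t-1}; needs no knowledge of the comparator
    norm ||u||, which is the "parameter-free" property.
    """
    wealth = initial_wealth
    coin_sum = 0.0             # running sum of coin outcomes c_s = -g_s
    plays = []
    for t, g in enumerate(gradients, start=1):
        beta = coin_sum / t    # KT betting fraction, always |beta| < 1
        x = beta * wealth
        plays.append(x)
        wealth += (-g) * x     # wealth_t = wealth_{t-1} + c_t * x_t > 0
        coin_sum += -g
    return plays, wealth

# A gradient sequence that always rewards positive plays: wealth compounds.
plays, final_wealth = kt_bettor([-1.0] * 50)
```

Since \(|\beta_t| < 1\) and \(|c_t| \le 1\), the wealth stays strictly positive, and a standard argument converts high final wealth into a regret bound against every comparator \(u\) simultaneously.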
Conclusion and Open Questions
Continuous Time Model for Experts and OLO
Thanks!
[VSP, Liaw, Harvey '22] Continuous prediction with experts' advice.
[Zhang, Yang, Cutkosky, Paschalidis '24] Improving adaptive online learning using refined discretization.
[Freund '09] A method for hedging in continuous time.
[Harvey, Liaw, Perkins, Randhawa '23] Optimal anytime regret with two experts.
[Greenstreet, VSP, Harvey '22] Efficient and optimal fixed-time regret with two experts.
[Harvey, Liaw, VSP '22] On the expected infinity-norm of high-dimensional martingales.
Improve LB for anytime experts? Or better upper bounds?
High-dim continuous time OLO?
Hopefully this model can be helpful in more developments in OL and optimization!
Application to offline non-smooth optimization?
Motivating Problem - Fixed Time vs Anytime
MWU regret
when \(T\) is known
when \(T\) is not known
anytime
fixed-time
Does knowing \(T\) give the player an advantage?
With stochastic calculus:
Optimal lower bound for 2 experts + optimal algorithm [Harvey, Liaw, Perkins, Randhawa '23]
Continuous anytime algorithms for independent experts + improved algorithms for quantile regret! [VSP, Liaw, Harvey '22]
MWU in Continuous Time
Potential based players
MWU!
Same regret bound as discrete time!
Idea: Use stochastic calculus to guide the algorithm design
LogSumExp
Regret bounds
when \(T\) is known
when \(T\) is not known
anytime
fixed-time
with prob. 1
The Joys of Stochastic Calculus
Optimal anytime lower bound for 2 experts + optimal algorithm [Harvey, Liaw, Perkins, Randhawa '23]
Efficient optimal algorithms for fixed-time 2 experts [Greenstreet, VSP, Harvey '22]
Best known algorithms for quantile regret + better anytime algorithms in continuous time [VSP, Liaw, Harvey '22]
Optimal parameter-free algorithms for online linear optimization [Zhang, Yang, Cutkosky, Paschalidis '24]
Simple continuous-time analysis of NormalHedge [Freund '09]
A Peek Into the Analysis
Potential based players
Matches fixed-time!
Ito's Lemma suggests \(\Phi\) that satisfy the Backwards Heat Equation
This new anytime algorithm has good regret!
Does not translate easily to discrete time
need correlation between experts
Takeaway: anytime lower bounds for (continuous) experts need dependent experts
A One Dimensional Continuous Time Model
Discrete Regret
Continuous Regret
Theorem:
If \(\Phi\) satisfies the BHE and
Going to higher dim:
Continuous time analogue
of
Learn direction and scale separately
Use refined discretization
Discretizing:
Why Continuous Time?
Beyond i.i.d. Experts
Discrete time analysis is IDENTICAL to continuous time analysis
Discrete Ito's Lemma
Improved anytime algorithms with quantile regret bounds
Design guided by continuous time setting
ISMP
By Victor Sanches Portella