joint work with Nick Harvey (UBC) and Christopher Liaw (Google)
Victor Sanches Portella
ime.usp.br/~victorsp
Player
Adversary
\(n\) Experts
Probabilities: 0.5, 0.1, 0.3, 0.1
Costs: 1, -1, 0.5, -0.3
Player's loss:
Adversary knows the strategy of the player
Total player's loss
Can always be \(= T\)
Compare with offline optimum
Almost the same as Attempt #1
Restrict the offline optimum
Attempt #1
Attempt #2
Attempt #3
Loss of Best Expert
Player's Loss
Goal:
Idea: Pick the best expert at each round
where \(i\) minimizes
Can fail badly
Player loses \(T -1\)
Best expert loses \(T/2\)
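The failure above can be reproduced in a few lines. The exact alternating loss sequence below is an illustration (one standard construction); up to tie-breaking it matches the slide's figures, with the player losing roughly \(T\) while the best expert loses roughly \(T/2\).

```python
# Follow-the-Leader ("pick the best expert so far") on a classic
# alternating-cost instance with n = 2 experts.
T = 100
totals = [0.0, 0.0]          # cumulative loss of each expert
player_loss = 0.0
for t in range(1, T + 1):
    leader = min(range(2), key=lambda i: totals[i])   # best expert so far
    loss = [0.5, 0.0] if t == 1 else ([1.0, 0.0] if t % 2 == 1 else [0.0, 1.0])
    player_loss += loss[leader]   # FTL keeps landing on the bad expert
    totals = [a + b for a, b in zip(totals, loss)]
print(player_loss, min(totals))  # → 99.5 49.5
```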
\(\eta_t\): step-size at round \(t\)
\(\ell_t\): loss vector at round \(t\)
Sublinear Regret!
Optimal dependency on \(T\)
Can we improve the dependency on \(n\)?
Yes, and by a lot
Normalization
Optimal!
For random \(\pm 1\) costs
Multiplicative Weights Update:
(Hedge)
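For reference, a minimal sketch of the MWU/Hedge weights in code (standard exponential weights; the max-shift is only a numerical-stability detail, not part of the slides):

```python
import math

def hedge_weights(cum_losses, eta):
    """MWU / Hedge: play expert i with probability p_i ∝ exp(-eta * L_i),
    where L_i is expert i's cumulative loss and eta is the step-size."""
    m = min(cum_losses)                      # shift to avoid overflow
    ws = [math.exp(-eta * (L - m)) for L in cum_losses]
    z = sum(ws)
    return [w / z for w in ws]

# eta ≈ sqrt(2 ln(n) / T) gives the O(sqrt(T ln n)) regret bound.
p = hedge_weights([10.0, 12.0, 10.0], eta=0.5)
```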
MWU is also Mirror Descent
Potential based players
LogSumExp
Boosting in ML
Understanding sequential prediction & online learning
Universal Optimization
TCS, Learning theory, SDPs...
Best Expert
Best Experts
\(\varepsilon\)-fraction
MWU:
Needs knowledge of \(\varepsilon\)
We design an algorithm with \(\sqrt{T \ln(1/\varepsilon)}\) quantile regret
for all \(\varepsilon\) and best known leading constant
Loss of
top \(\varepsilon n \) expert
\(\varepsilon\)-Quantile Regret
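In code, the \(\varepsilon\)-quantile benchmark can be sketched as follows (the \(\lceil \varepsilon n \rceil\) ranking convention is one common choice, assumed here):

```python
import math

def quantile_regret(player_loss, expert_losses, eps):
    """epsilon-quantile regret: player's loss minus the loss of the
    expert ranked ceil(eps * n) from the top (rank 1 = best expert)."""
    n = len(expert_losses)
    k = max(1, math.ceil(eps * n))           # rank of the comparator
    benchmark = sorted(expert_losses)[k - 1]
    return player_loss - benchmark
```

With \(\varepsilon = 1/n\) this recovers ordinary regret against the best expert.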
Algorithm design guided by
PDEs and (Stochastic) Calculus tools
Main Goal of this Talk: describe the main ideas of the
continuous time model and tools
Analysis often becomes clean
Sandbox for design of optimization algorithms
Gradient flow is useful for smooth optimization
Key Question: How to model non-smooth (online) optimization in continuous time?
Why go to continuous time?
Total loss of expert \(i\):
Useful perspective: \(L(i)\) is a realization of a random walk
realization of a Brownian Motion
Probability 1 = Worst-case
Discrete Time
Continuous Time
Discrete time
Continuous time
Cumulative loss
Player's cumulative loss
Player's loss per round
[Freund '09]
Regret Vector
Regret
Goal: Prob. 1 bounds on Regret
Potential based players
Multiplicative Weights Update
LogSumExp
NormalHedge
First algorithm for quantile regret
Very clean Continuous time analysis
[Freund '09]
Ito's Lemma
(Fundamental Theorem of Stochastic Calculus)
\(B(t)\) is very non-smooth \(\implies\) second-order terms matter
Ito's Lemma
Idea: Pick \(\Phi\) so as to make Ito's Lemma simpler for
Idea: Use stochastic calculus to guide the algorithm design
Potential based players
Smooth
Non-smooth
Using Ito's Lemma on potential \(\Phi(t, R_t)\) for 1 dimension*
\(=0 \) if \(p_t \propto \partial_x \Phi(t, R_t)\)
Potential does not change if this \(= 0\)
Ito's Lemma suggests \(\Phi\) that satisfy the Backwards Heat Equation
* Simplified, not quite correct
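As a quick sanity check, one solution of the Backwards Heat Equation is the exponential-weights summand \(\Phi(t,x) = e^{\eta x - \eta^2 t/2}\). The sketch below verifies \(\partial_t \Phi + \tfrac{1}{2}\partial_{xx}\Phi = 0\) numerically in pure Python (finite differences; \(\eta = 0.7\) is an arbitrary choice):

```python
import math

ETA = 0.7   # arbitrary step-size for this check

def phi(t, x):
    # One summand of the exponential-weights potential.  Summing over
    # experts gives a LogSumExp-style potential; normalizing its
    # x-gradient recovers MWU.
    return math.exp(ETA * x - ETA * ETA * t / 2.0)

def bhe_residual(t, x, h=1e-4):
    # Finite-difference residual of the Backwards Heat Equation
    #   d/dt Phi + (1/2) d^2/dx^2 Phi = 0.
    dt = (phi(t + h, x) - phi(t - h, x)) / (2.0 * h)
    dxx = (phi(t, x + h) - 2.0 * phi(t, x) + phi(t, x - h)) / (h * h)
    return dt + 0.5 * dxx

r = bhe_residual(1.0, 0.3)   # ~0 up to discretization error
```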
Using Ito's Lemma on potential \(\Phi(t, R_t)\) for \(d\) dimensions
\(=0 \) if \(p_t \propto \nabla_x \Phi(t, R_t)\)
"Covariance" of \(R_i\) and \(R_j\)
Do dependencies between \(L_i\) and \(L_j\) matter?
YES, and it seems hard (or impossible?) to discretize otherwise
Different intuition from the discrete case (?)
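Written out, the \(d\)-dimensional statement behind this slide is the standard form of Ito's Lemma (a sketch; the talk's simplified version absorbs normalization terms):

\[
d\Phi(t, R_t) = \partial_t \Phi \, dt + \nabla_x \Phi(t, R_t) \cdot dR_t + \frac{1}{2} \sum_{i,j=1}^{d} \partial_{x_i x_j} \Phi(t, R_t) \, d\langle R_i, R_j \rangle_t
\]

The quadratic covariation terms \(d\langle R_i, R_j \rangle_t\) are exactly where dependencies between the experts' loss processes enter.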
Potential based players
For all \(\varepsilon\)
Ito's Lemma suggests \(\Phi\) that satisfy the Backwards Heat Equation
Using this potential*, we get
Best leading constant
Discrete time analysis is IDENTICAL to continuous time analysis
Discrete Ito's Lemma
*(with a slightly bigger constant in the BHE)
Question:
Are the minimax regrets with and without knowledge of \(T\) different?
fixed-time
anytime
[Harvey, Liaw, Perkins, Randhawa '23]
n = 2
anytime
fixed-time
[Cover '67]
Back. Heat Eq.
Efficient version via SC
<
[Greenstreet, VSP, Harvey '22]
Heat Eq.
?
In Continuous Time, both are equal if Loss Processes are independent.
[VSP, Liaw, Harvey '22]
Large n
Question:
What is the expected regret in the anytime setting
even without independent experts?
[VSP, Liaw, Harvey '25]:
High expected regret \(\implies\) lower bound
In the language of martingales:
Nearly tight bounds.
asymptotically!
For a martingale \(X_t\), find upper and lower bounds on \(\sup_{\tau} \mathbb{E}\lVert X_\tau \rVert_\infty\), where \(\tau\) is a stopping time
Evidence that
anytime = fixed-time
Player
Adversary
Unconstrained
Linear functions
Player's loss:
Loss of Fixed \(u\)
Player's Loss
Goal:
No knowledge of \(\lVert u \rVert\)
Small regret if \(\lVert g_t\rVert\) small
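For contrast, a classic parameter-free baseline is coin betting. The sketch below is a Krichevsky-Trofimov-style bettor in the spirit of Orabona and Pal, shown only to illustrate playing without knowledge of \(\lVert u \rVert\); it is not the BHE-based algorithm of the talk, and it assumes one dimension with \(\lvert g_t \rvert \le 1\).

```python
def kt_bettor(gradients, initial_wealth=1.0):
    """Krichevsky-Trofimov coin-betting player for 1-d online linear
    optimization (illustrative baseline, NOT the talk's algorithm).
    Assumes |g_t| <= 1; needs no prior bound on the comparator u."""
    wealth = initial_wealth
    s = 0.0                       # running sum of -g_1, ..., -g_{t-1}
    plays = []
    for t, g in enumerate(gradients, start=1):
        beta = s / t              # KT betting fraction
        x = beta * wealth         # point played in round t
        plays.append(x)
        wealth -= g * x           # wealth after seeing g_t
        s -= g
    return plays, wealth

plays, w = kt_bettor([-1.0] * 10)   # all gradients point one way
```

The resulting guarantee, roughly \(\lVert u \rVert \sqrt{T \log(1 + \lVert u \rVert T)}\) regret for every \(u\) simultaneously, is the shape of bound the slides call parameter-free.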
[Zhang, Yang, Cutkosky, Paschalidis '24]:
Parameter-free and Adaptive algorithm
Backwards Heat Equation
Parameter free and adaptive algorithms matching lower bounds
(even up to leading constant)
Potential based player satisfying
+ refined discretization
Continuous Time Model for Experts and OLO
Thanks!
[VSP, Liaw, Harvey '22] Continuous prediction with experts' advice.
[Zhang, Yang, Cutkosky, Paschalidis '24] Improving adaptive online learning using refined discretization.
[Freund '09] A method for hedging in continuous time.
[Harvey, Liaw, Perkins, Randhawa '23] Optimal anytime regret with two experts.
[Greenstreet, VSP, Harvey '22] Efficient and optimal fixed-time regret with two experts.
[Harvey, Liaw, VSP '22] On the expected infinity-norm of high-dimensional martingales.
Improve LB for anytime experts? Or better upper-bounds?
?
High-dim continuous time OLO?
?
Hopefully this model can help with further developments in online learning and optimization!
Application to offline non-smooth optimization?
?
Loss of Best Expert
Player's Loss
Optimal!
For random \(\pm 1\) costs
Multiplicative Weights Update:
(Hedge)
MWU regret
when \(T\) is known
when \(T\) is not known
anytime
fixed-time
Does knowing \(T\) give the player an advantage?
[Harvey, Liaw, Perkins, Randhawa '23]: Optimal lower bound 2 experts + optimal algorithm
[VSP, Liaw, Harvey '22]: Continuous anytime algorithms for independent experts + improved algorithms for quantile regret!
With stochastic calculus:
Potential based players
MWU!
Same regret bound as discrete time!
Idea: Use stochastic calculus to guide the algorithm design
LogSumExp
Regret bounds
when \(T\) is known
when \(T\) is not known
anytime
fixed-time
with prob. 1
+ better anytime algorithms in continuous time
[Harvey, Liaw, Perkins, Randhawa '23]: Optimal anytime lower bound 2 experts + optimal algorithm
[Greenstreet, VSP, Harvey '22]: Efficient optimal algorithms for fixed-time 2 experts
[VSP, Liaw, Harvey '22]: Best known algorithms for quantile regret
[Zhang, Yang, Cutkosky, Paschalidis '24]: Optimal parameter-free algorithms for online linear optimization
[Freund '09]: Simple continuous time analysis of NormalHedge
Potential based players
MWU!
Same regret bound as discrete time!
Idea: Use stochastic calculus to guide the algorithm design
LogSumExp
Regret bounds
when \(T\) is known
when \(T\) is not known
anytime
fixed-time
with prob. 1
Potential based players
Matches fixed-time!
Ito's Lemma suggests \(\Phi\) that satisfy the Backwards Heat Equation
This new anytime algorithm has good regret!
Does not translate easily to discrete time
needs correlation between experts
Takeaway: Anytime lower bounds for (continuous) experts
need dependent experts
Discrete Regret
Continuous Regret
Theorem:
If \(\Phi\) satisfies the BHE and
Going to higher dim:
Continuous time analogue of
Learn direction and scale separately
Use refined discretization
Discretizing: