Huang Fang, Department of Computer Science
Supervisor: Michael P. Friedlander
October 13th, 2021
For the composite optimization problem
$$\min_{x \in \mathbb{R}^n} \; F(x) := \underbrace{f(x)}_{\text{data fitting}} + \underbrace{g(x)}_{\text{regularizer}},$$
greedy coordinate descent (GCD) updates one coordinate per iteration, chosen by one of several coordinate selection rules (GS-s, GS-r, GS-q; defined below).

[Figure: coordinates selected by GCD across iterations 1-4; animation frames omitted]

[FFSF, AISTATS'20]
We provide a theoretical characterization of GCD's screening ability: for sparse optimization problems of this form, GCD provably identifies the coordinates that are zero at the solution within finitely many iterations (illustrated in the sketch below).
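A minimal sketch of GCD on an $\ell_1$-regularized least-squares instance; the quadratic loss, the value of `lam`, and the GS-r-style selection are illustrative assumptions rather than the exact setup of [FFSF, AISTATS'20]. Running it, the iterates' support quickly settles on the solution's support, which is the screening behaviour described above.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*|.|: shrink z toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def gcd_lasso(A, b, lam, iters=300):
    """Greedy coordinate descent on F(x) = 0.5*||Ax - b||^2 + lam*||x||_1.

    Each iteration picks the coordinate with the largest proximal step
    (a GS-r-style rule) and updates only that coordinate.
    """
    n = A.shape[1]
    L = (A ** 2).sum(axis=0)          # coordinate-wise Lipschitz constants
    x = np.zeros(n)
    r = A @ x - b                     # residual Ax - b, kept up to date
    for _ in range(iters):
        grad = A.T @ r                # gradient of the smooth part
        d = soft_threshold(x - grad / L, lam / L) - x  # prox step per coordinate
        i = int(np.argmax(np.abs(d)))                  # greedy selection
        r += A[:, i] * d[i]
        x[i] += d[i]
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[:5] = 1.0
b = A @ x_true
x = gcd_lasso(A, b, lam=0.1)
print("nonzeros at the end:", int((np.abs(x) > 1e-8).sum()))
```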
Learning a sparse representation over an atomic set $\mathcal{A}$: write
$$x = \sum_{a \in \mathcal{A}} c_a\, a, \qquad c_a \ge 0,$$
such that only a few of the coefficients $c_a$ are nonzero.

Our contribution: how to identify the atoms with nonzero coefficients at the solution during the optimization process.

[FFF21, submitted]
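A toy illustration of the setup (my own example, not from [FFF21]): with atoms $\mathcal{A} = \{\pm e_1, \dots, \pm e_n\}$, a sparse vector is a nonnegative combination of a few atoms, and the gauge induced by $\mathcal{A}$ is the $\ell_1$ norm.

```python
import numpy as np

n = 5
atoms = np.hstack([np.eye(n), -np.eye(n)])    # atoms ±e_i as columns, shape (n, 2n)

x = np.array([2.0, 0.0, -1.5, 0.0, 0.0])      # a sparse point to represent

# Nonnegative coefficients: positive parts weight +e_i, negative parts -e_i.
c = np.concatenate([np.maximum(x, 0.0), np.maximum(-x, 0.0)])

assert np.allclose(atoms @ c, x)              # x = sum_a c_a * a with c_a >= 0
print("atoms used:", int((c > 0).sum()), "of", atoms.shape[1])
print("gauge value (the l1 norm here):", c.sum())
```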
Play a game for $T$ rounds; for $t = 1, \dots, T$: the player picks an action $x_t$, the adversary reveals a loss function $f_t$, and the player suffers $f_t(x_t)$.

The goal of an online learning algorithm is to obtain sublinear regret:
$$\mathrm{Regret}(T) \;=\; \underbrace{\sum_{t=1}^{T} f_t(x_t)}_{\text{player's loss}} \;-\; \underbrace{\min_{x}\sum_{t=1}^{T} f_t(x)}_{\text{competitor's loss}}.$$

Our contribution: fix the divergence issue of mirror descent (MD) under dynamic learning rates and obtain $O(\sqrt{T})$ regret.
[FHPF, ICML'20]
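A small numerical illustration of the regret definition above, with toy quadratic losses of my own choosing: online gradient descent with step sizes $\eta_t = 1/\sqrt{t}$, measured against the best fixed action in hindsight.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 1000
targets = rng.standard_normal(T)      # round-t loss: f_t(x) = 0.5*(x - targets[t])^2

x, player_loss = 0.0, 0.0
for t in range(1, T + 1):
    z = targets[t - 1]
    player_loss += 0.5 * (x - z) ** 2  # suffer f_t(x_t)
    x -= (x - z) / np.sqrt(t)          # online gradient step, eta_t = 1/sqrt(t)

best = targets.mean()                  # best fixed action in hindsight
competitor_loss = 0.5 * ((best - targets) ** 2).sum()
print("Regret(T) =", player_loss - competitor_loss)   # sublinear in T
```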
[Figure: mirror descent alternates between the primal space and the dual space through the mirror map, taking the gradient step in the dual and returning via a Bregman projection. Figure credit: Victor Portella.]
[FHPF, ICML'20]
With stabilization, OMD can obtain $O(\sqrt{T})$ regret.
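For intuition about the primal-dual mechanics in the figure above, here is a minimal *vanilla* entropic OMD step on the simplex (exponentiated gradient); this is the standard method, not the stabilized variant, whose details are in [FHPF, ICML'20].

```python
import numpy as np

def omd_entropy_step(x, grad, eta):
    """One OMD step with the negative-entropy mirror map on the simplex.

    Map to the dual space (grad Phi(x) = log x, up to a constant), take the
    gradient step there, map back with exp, then Bregman-project onto the
    simplex, which for the KL divergence is plain normalization.
    """
    y = np.log(x) - eta * grad
    w = np.exp(y)
    return w / w.sum()

x = np.full(4, 0.25)                    # uniform start on the simplex
g = np.array([1.0, 0.0, 0.0, -1.0])     # a loss gradient
print(omd_entropy_step(x, g, eta=0.5))  # mass moves toward low-loss coordinates
```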
Some discrepancies between theory and practice: classical analyses give much slower rates for nonsmooth objectives than for smooth ones, yet in practice SGD behaves similarly on both.

Our assumption:
$$f_i(x) = \ell(h_i(x)),$$
where $\ell$ is a nonnegative, convex, 1-smooth loss function with minimum value $0$, and the $h_i$'s are Lipschitz continuous.

We prove fast convergence of the stochastic subgradient method under interpolation.
[FFF, ICLR'21]
Acknowledgements: my advisor, committee members, collaborators, university examiners, and the external reviewer.
The GS-s rule: $i_t \in \arg\max_i\, \min_{s \in \partial_i g(x)} |\nabla_i f(x) + s|$.

The GS-r rule: $i_t \in \arg\max_i\, |d_i|$.

The GS-q rule: $i_t \in \arg\min_i\, \min_{d}\big\{\nabla_i f(x)\, d + \tfrac{L}{2} d^2 + g_i(x_i + d) - g_i(x_i)\big\}$,

where $d_i = \arg\min_{d}\big\{\nabla_i f(x)\, d + \tfrac{L}{2} d^2 + g_i(x_i + d) - g_i(x_i)\big\}$.
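A sketch of the three rules for $g = \lambda\|\cdot\|_1$; the simplifications (a single global constant $L$ for every coordinate, $\ell_1$ regularizer) are mine.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def select_coordinate(x, grad, L, lam, rule):
    """Coordinate selection for F(x) = f(x) + lam*||x||_1, global constant L."""
    # Coordinate-wise proximal step d_i and its model decrease q_i.
    d = soft_threshold(x - grad / L, lam / L) - x
    q = grad * d + 0.5 * L * d ** 2 + lam * (np.abs(x + d) - np.abs(x))
    if rule == "GS-s":
        # Minimum-norm subgradient of F along each coordinate.
        s = np.where(x != 0.0,
                     grad + lam * np.sign(x),
                     np.maximum(np.abs(grad) - lam, 0.0))
        return int(np.argmax(np.abs(s)))
    if rule == "GS-r":
        return int(np.argmax(np.abs(d)))     # largest coordinate step
    if rule == "GS-q":
        return int(np.argmin(q))             # best model decrease
    raise ValueError(rule)

x = np.array([0.5, 0.0, -1.0])
grad = np.array([-2.0, 0.3, 1.5])
for rule in ("GS-s", "GS-r", "GS-q"):
    print(rule, "picks coordinate",
          select_coordinate(x, grad, L=1.0, lam=0.4, rule=rule))
```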
A key property (the matrix AM-GM norm inequality): for all PSD matrices $A_1, \dots, A_n$,
$$\Big\|\frac{1}{n!}\sum_{\sigma \in S_n}\,\prod_{j=1}^{n} A_{\sigma(j)}\Big\| \;\le\; \Big\|\Big(\frac{1}{n}\sum_{i=1}^{n} A_i\Big)^{\!n}\Big\|.$$
However, the matrix AM-GM inequality conjecture is false [LL20, S20].
The gauge of $\mathcal{A}$ and its polar:
$$\gamma_{\mathcal{A}}(x) = \inf\{\lambda \ge 0 : x \in \lambda\,\mathrm{conv}(\mathcal{A})\}, \qquad \gamma^{\circ}_{\mathcal{A}}(z) = \sup_{a \in \mathcal{A}} \langle a, z\rangle,$$
where $\gamma^{\circ}_{\mathcal{A}}$ is the support function of $\mathcal{A}$.

The primal-dual relationship (polar inequality): $\langle x, z\rangle \le \gamma_{\mathcal{A}}(x)\,\gamma^{\circ}_{\mathcal{A}}(z)$.

This allows us to do screening based on the dual variable: an atom $a$ can carry a nonzero coefficient at the solution only if $\langle a, z^\star\rangle = \gamma^{\circ}_{\mathcal{A}}(z^\star)$, i.e., only if it is aligned with the dual solution $z^\star$.
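A toy illustration of dual-based screening for the $\ell_1$ case (atoms $\pm e_i$, so the polar is the $\ell_\infty$ norm); the tolerance and the example vector are my own illustrative choices.

```python
import numpy as np

def screen_atoms(z, tol=1e-8):
    """Atoms {±e_i}: the polar gamma°(z) = ||z||_inf, and an atom a = ±e_i can
    carry a nonzero coefficient only if <a, z> attains gamma°(z)."""
    polar = np.max(np.abs(z))                   # support-function value
    return np.flatnonzero(np.abs(z) >= polar - tol)

z = np.array([0.3, -1.0, 1.0, 0.2])
print("coordinates that may be active:", screen_atoms(z))   # -> [1 2]
```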
Bregman divergence: $D_{\Phi}(x, y) = \Phi(x) - \Phi(y) - \langle \nabla\Phi(y),\, x - y\rangle$.

Properties of the mirror map $\Phi$: differentiable and strictly convex, so that $\nabla\Phi$ is invertible and carries points between the primal and dual spaces.

Examples: $\Phi(x) = \tfrac12\|x\|_2^2$ gives $D_{\Phi}(x, y) = \tfrac12\|x - y\|_2^2$; the negative entropy $\Phi(x) = \sum_i x_i\log x_i$ gives the KL divergence on the simplex.
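Both examples in code, as a quick check (with the usual convention $0 \log 0 = 0$):

```python
import numpy as np

def bregman_euclidean(x, y):
    """D_Phi for Phi(x) = 0.5*||x||^2, which equals 0.5*||x - y||^2."""
    return 0.5 * np.sum((x - y) ** 2)

def bregman_entropy(x, y):
    """D_Phi for the negative entropy Phi(x) = sum_i x_i log x_i: the
    (unnormalized) KL divergence sum x log(x/y) - sum x + sum y."""
    d = np.zeros_like(x)
    mask = x > 0
    d[mask] = x[mask] * np.log(x[mask] / y[mask])   # convention 0*log 0 = 0
    return d.sum() - x.sum() + y.sum()

x = np.array([0.7, 0.2, 0.1])
y = np.full(3, 1.0 / 3.0)
print("Euclidean:", bregman_euclidean(x, y))
print("KL:", bregman_entropy(x, y))
```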
Known convergence rates of SGD (IC = interpolation condition):

|  | smooth | nonsmooth | smooth + IC | nonsmooth + IC |
|---|---|---|---|---|
| convex | $O(1/\sqrt{T})$ | $O(1/\sqrt{T})$ | $O(1/T)$ | $O(1/T)$ |
| strongly convex | $O(1/T)$ | $O(1/T)$ | $O(\rho^T)$ | $O(\rho^T)$ |

Under our assumption, the nonsmooth + IC rates match the smooth + IC rates.
Assume $f_i(x) = \ell(h_i(x))$, where $h_i$ is $L$-Lipschitz continuous and $\ell$ is a nonnegative, convex, 1-smooth, one-dimensional function with minimum value $0$. Then, for every $g \in \partial f_i(x)$,
$$\|g\|^2 \;\le\; 2 L^2 f_i(x) \qquad \text{(the generalized growth condition).}$$
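A short derivation of the growth condition from the assumption above (the self-bounding step $\ell'(t)^2 \le 2\ell(t)$ is the standard consequence of $\ell$ being 1-smooth, nonnegative, with minimum value $0$):

```latex
% Chain rule: any g \in \partial f_i(x) has the form
%   g = \ell'(h_i(x)) \, u, \qquad u \in \partial h_i(x), \quad \|u\| \le L,
% because h_i is L-Lipschitz.  Since \ell is 1-smooth, nonnegative, and
% attains minimum value 0, it is self-bounding: \ell'(t)^2 \le 2\,\ell(t).
\|g\|^2
  \;\le\; \ell'(h_i(x))^2 \, L^2
  \;\le\; 2\,\ell(h_i(x)) \, L^2
  \;=\; 2 L^2 f_i(x).
```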
High-probability error bounds:

Conjecture: variance-reduced SGD methods (SAG, SVRG, SAGA) combined with the generalized Freedman inequality admit a simple proof of exponential tail bounds.