model
number of clients
local dataset
loss function
Primal problem
Dual problem
conjugate function
local dual model
FedAvg [McMahan et al.'17]
SCAFFOLD [Karimireddy et al.'20]
An extension of [Necoara et al.'17]
Each selected client computes its dual gradient and uploads it to the server
The server adjusts the gradients (to preserve feasibility) and broadcasts them to the selected clients
Each selected client locally updates its dual model
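The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the local loss is assumed to be the quadratic f_i(w) = 0.5 * ||w - a_i||^2, the dual feasible set is assumed to be {y : sum_i y_i = 0}, and the server's adjustment is assumed to be a mean subtraction over the selected clients. The names `dual_gradient`, `fed_dual_round`, and `run` are hypothetical.

```python
import numpy as np

def dual_gradient(y_i, a_i):
    # Assumed local loss: f_i(w) = 0.5 * ||w - a_i||^2.
    # Its conjugate is f_i^*(y) = 0.5 * ||y||^2 + <a_i, y>, with gradient y + a_i.
    return y_i + a_i

def fed_dual_round(y, a, selected, step_size):
    # Step 1: each selected client computes its dual gradient and uploads it.
    grads = np.stack([dual_gradient(y[i], a[i]) for i in selected])
    # Step 2: the server subtracts the mean, so the adjusted gradients sum to
    # zero and the update stays in the assumed feasible set {y : sum_i y_i = 0}.
    adjusted = grads - grads.mean(axis=0)
    # Step 3: each selected client updates its local dual model.
    y = y.copy()
    y[selected] -= step_size * adjusted
    return y

def run(num_clients=10, dim=3, clients_per_round=3, rounds=300, seed=0):
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(num_clients, dim))  # synthetic local data (assumption)
    y = np.zeros((num_clients, dim))         # feasible starting point
    for _ in range(rounds):
        selected = rng.choice(num_clients, size=clients_per_round, replace=False)
        y = fed_dual_round(y, a, selected, step_size=1.0)
    return y, a

if __name__ == "__main__":
    y, a = run()
    print("sum of dual models:", y.sum(axis=0))
    print("max deviation from mean of a_i:", np.abs(y + a - a.mean(axis=0)).max())
```

In this toy setting, each client's primal estimate is y_i + a_i. With partial participation, all of these estimates should approach the mean of the a_i, which is the minimizer of the summed quadratic losses.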
Strong convexity
Smoothness
Data heterogeneity
Theorem
[Necoara et al.'17]
and
[Fan et al.'22]
Each selected client approximately computes its dual gradient and uploads it to the server
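One plausible way to compute the dual gradient approximately is sketched below. It relies on the standard identity that the gradient of the conjugate f^* at y is the minimizer of f(w) - <w, y>, and runs a fixed number of gradient steps on that inner problem. The inner solver, the step count, the learning rate, and the name `approx_dual_gradient` are all assumptions for illustration; the source does not state how the approximation is carried out.

```python
import numpy as np

def approx_dual_gradient(y_i, grad_f, w_init, num_steps=50, lr=0.5):
    # grad f^*(y) = argmin_w { f(w) - <w, y> }.
    # Approximate the minimizer with a few plain gradient steps (assumed solver).
    w = np.array(w_init, dtype=float)  # copy so the caller's array is untouched
    for _ in range(num_steps):
        w -= lr * (grad_f(w) - y_i)
    return w

if __name__ == "__main__":
    # Toy check with the assumed quadratic f(w) = 0.5 * ||w - a||^2,
    # for which the exact dual gradient is a + y.
    a = np.array([1.0, -2.0])
    y = np.array([0.5, 0.25])
    g = approx_dual_gradient(y, lambda w: w - a, np.zeros(2))
    print("approximation error:", np.abs(g - (a + y)).max())
```

Using fewer inner steps lowers local computation but makes the uploaded gradient less accurate.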
Theorem
and
[Fan et al.'22]
with
Random coordinate descent with Nesterov's acceleration has been widely studied [Nesterov'12; Lee & Sidford'13; Allen-Zhu et al.'16; Lu et al.'18]
Previously applied only to unconstrained problems. We extend [Lu et al.'18] to linearly constrained problems.
Theorem
and
[Fan et al.'22]
with
i.i.d.
non-i.i.d.