Key requirement
1. Data owners with similar data should receive similar valuation.
2. Data owners with unrelated data should receive low valuation.
The Shapley value measures each player's contribution in a cooperative game.
Advantage
It satisfies many desired fairness axioms.
Drawback
Computing utilities requires retraining the model.
performance of the model
player i
utility created by players in S
marginal utility gain
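The four labels above annotate the terms of the Shapley value formula, which did not survive in this text. For reference, the standard definition is written out here. The symbols are assumptions, not taken from the slide: N is the number of players, U(S) is the utility created by the players in S, and \phi_i is the value assigned to player i.

```latex
\phi_i(U) \;=\; \frac{1}{N} \sum_{S \subseteq [N] \setminus \{i\}}
  \binom{N-1}{|S|}^{-1}
  \underbrace{\bigl[\, U(S \cup \{i\}) - U(S) \,\bigr]}_{\text{marginal utility gain}}
```

Each U(S) corresponds to the performance of a model trained on the data of the players in S, which is why exact evaluation requires retraining once per subset.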
[Wang et al.'20] propose computing the Shapley value in each communication round, which eliminates the need to retrain the model.
Fairness
Symmetry
Zero contribution
Additivity
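The three axioms can be checked numerically on a small game. The following is a minimal sketch, not code from the cited works: `shapley_values`, `toy_utility`, and `other_utility` are hypothetical names, and the toy game is invented so that players 0 and 1 are interchangeable and player 2 contributes nothing.

```python
from itertools import combinations
from math import comb

def shapley_values(n, utility):
    """Exact Shapley values for players 0..n-1; utility maps a frozenset to a float."""
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of each subset of this size in the Shapley average.
            weight = 1.0 / (n * comb(n - 1, size))
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (utility(s | {i}) - utility(s))
        values.append(total)
    return values

# Hypothetical toy game: only players 0 and 1 create utility, and they are interchangeable.
def toy_utility(s):
    return float(len(s & {0, 1})) ** 0.5

# A second hypothetical game, used to check additivity.
def other_utility(s):
    return float(len(s))

phi = shapley_values(3, toy_utility)
phi_other = shapley_values(3, other_utility)
phi_sum = shapley_values(3, lambda s: toy_utility(s) + other_utility(s))

symmetry_ok = abs(phi[0] - phi[1]) < 1e-12           # interchangeable players, equal value
zero_contribution_ok = abs(phi[2]) < 1e-12           # player 2 never changes the utility
additivity_ok = all(abs(phi_sum[i] - (phi[i] + phi_other[i])) < 1e-12 for i in range(3))
```

The enumeration visits every subset, so the cost grows exponentially with the number of players; the sketch is only meant for games of a handful of players.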
model
number of clients
local dataset
loss function
[McMahan et al.'17]
Test data set (server)
Problem: In round t, the server only has
[Wang et al.'20]
Clients with identical local datasets may receive very different valuations.
[Figure: relative difference in valuation between clients with the same local datasets; axes: relative difference, empirical probability]
Utility matrix
This matrix is only partially observed; fair valuation becomes possible if the missing values can be recovered.
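One generic way to recover missing entries is low-rank matrix completion. The sketch below uses alternating least squares and assumes the utility matrix is approximately low rank; it is not necessarily the algorithm used in the cited work. The function `complete_low_rank`, the 4-by-4 example matrix (rows as rounds, columns as client subsets), and the mask are all hypothetical.

```python
import numpy as np

def complete_low_rank(M, mask, rank=1, reg=1e-6, iters=100, seed=0):
    """Fit a rank-`rank` factorization to the entries of M where mask is True."""
    rng = np.random.default_rng(seed)
    T, C = M.shape
    W = rng.uniform(0.5, 1.5, size=(T, rank))   # row (round) factors
    H = rng.uniform(0.5, 1.5, size=(C, rank))   # column (subset) factors
    eye = reg * np.eye(rank)
    for _ in range(iters):
        # Update each row factor by ridge regression on its observed entries.
        for t in range(T):
            obs = mask[t]
            W[t] = np.linalg.solve(H[obs].T @ H[obs] + eye, H[obs].T @ M[t, obs])
        # Update each column factor the same way.
        for c in range(C):
            obs = mask[:, c]
            H[c] = np.linalg.solve(W[obs].T @ W[obs] + eye, W[obs].T @ M[obs, c])
    return W @ H.T

# Hypothetical rank-1 utility matrix: utilities shrink over rounds, scale with the subset.
truth = np.outer([1.0, 0.5, 0.25, 0.125], [1.0, 2.0, 2.0, 3.0])
mask = np.ones_like(truth, dtype=bool)
for t, c in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    mask[t, c] = False                          # entries the server never observed

recovered = complete_low_rank(np.where(mask, truth, 0.0), mask)
max_error = float(np.max(np.abs(recovered[~mask] - truth[~mask])))
```

With the completed matrix in hand, every client has a utility entry in every round, including rounds in which it was not selected.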
Theorem
If the loss function is smooth and strongly convex, then
[Fan et al.'22]
[Udell & Townsend'19]
[Figure: relative difference in valuation between clients with the same local datasets; axes: relative difference, empirical CDF]
local models
local embeddings
number of training samples
Only embeddings will be communicated between server and clients.
[Liu et al.'22]
Server selects a mini-batch
Each client m computes local embeddings
Server computes gradient
Each client m updates local model
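The four steps above can be sketched as one training loop. This is a minimal illustration under stated assumptions, not the protocol of [Liu et al.'22]: each client holds a block of features and a linear local model, the server holds the labels, predictions are the sum of the client embeddings, and the loss is squared error. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dims = 64, [2, 3]                              # training samples, features per client
X = [rng.normal(size=(n, d)) for d in dims]       # each client's private feature block
true_theta = [rng.normal(size=d) for d in dims]
y = sum(Xm @ tm for Xm, tm in zip(X, true_theta)) # labels held by the server (assumption)
theta = [np.zeros(d) for d in dims]               # local models, one per client

def full_loss():
    pred = sum(Xm @ tm for Xm, tm in zip(X, theta))
    return float(np.mean((pred - y) ** 2))

initial_loss = full_loss()
lr, batch_size = 0.05, 16
for t in range(300):
    # Step 1: the server selects a mini-batch of sample indices.
    batch = rng.choice(n, size=batch_size, replace=False)
    # Step 2: each client computes local embeddings for the batch and sends them.
    embeddings = [Xm[batch] @ tm for Xm, tm in zip(X, theta)]
    # Step 3: the server computes the gradient of the loss w.r.t. the embeddings.
    residual = sum(embeddings) - y[batch]
    grad_embedding = 2.0 * residual / batch_size
    # Step 4: each client updates its local model by backpropagating that gradient.
    for m in range(len(theta)):
        theta[m] -= lr * X[m][batch].T @ grad_embedding
final_loss = full_loss()
```

Raw features never leave a client in this loop; only the per-sample embeddings go to the server and the embedding gradients come back.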
Problem: In round t, the server only has
Embedding matrix
Theorem
If the loss function is smooth, then
[Fan et al.'22]