Nov 14, 2025
Adam Wei
Data Protection
Data Misuse
Data Extraction
1. Are privacy attacks on robot models possible? (easier direction)
2. How can we defend against these attacks? (harder direction)
Part 1
Anatomy of a Privacy Attack
Black box: access to model inputs and outputs only
White box: access to model weights, gradients, and internal activations
More access \(\implies\) stronger attacks
Robotics?
Membership Inference
Was the model trained on a specific datapoint?
Reconstruction
Can I reconstruct the training dataset?
Property Inference
Given a partial datapoint, can I infer its other properties?
Model Extraction
Can I extract the model parameters, architecture, etc.?
Deep Leakage from Gradients
Generative Model Inversion
Part 2
Deep Leakage from Gradients (DLG)
Federated learning setup: a central server broadcasts the current weights \(W_t\) to each client; client \(i\) computes its local gradient \(\nabla_{W_t} \mathcal{L}_i\) on private data and sends it back; the server aggregates:
\(W_{t+1} = W_t -\frac{\eta}{N}\sum_i\nabla_{W_t} \mathcal L_i\)
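One round of this update can be sketched in a few lines (a minimal sketch, assuming a linear model with squared loss; the model, clients, and step size are all illustrative):

```python
import numpy as np

# Minimal FedSGD round. Assumption: a linear model with squared loss;
# the dimensions, clients, and step size eta are illustrative.
rng = np.random.default_rng(0)
d, eta, N = 4, 0.5, 3
W = rng.normal(size=d)                                   # current weights W_t

# Each client holds a private datapoint (x_i, y_i).
clients = [(rng.normal(size=d), rng.normal()) for _ in range(N)]

def client_grad(W, x, y):
    # gradient of L_i = 0.5*(W@x - y)^2 with respect to W
    return (W @ x - y) * x

# Clients send only gradients; the server never sees raw data.
grads = [client_grad(W, x, y) for x, y in clients]
W_next = W - (eta / N) * np.sum(grads, axis=0)           # W_{t+1}
print(W_next)
```

The point of the setup is that only gradients cross the network; the next part shows that this alone can leak the data.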
Goal: reconstruct datapoints that would produce the same gradient
Algorithm: given \(W\), \(\nabla W\), and \(f_W\):
Initialize \((x',y')\sim \mathcal N(0,I)\)
Repeat until convergence:
\(\nabla W' = \nabla_W \mathcal L(f_W(x'), y')\)
\(D = \lVert \nabla W' - \nabla W \rVert_2^2\)
\((x', y') \leftarrow (x', y') - \eta \nabla_{(x', y')} D\)
In practice, solve with L-BFGS (approximates curvature without explicit Hessian computation)
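The attack can be sketched end to end on a toy problem (assumptions: a one-layer linear model with squared loss stands in for the real network, and SciPy's L-BFGS plays the role of the optimizer; all names and dimensions are illustrative):

```python
import numpy as np
from scipy.optimize import minimize

# Toy DLG. Assumption: f_W(x) = W @ x with squared loss stands in for
# the real network; dimensions and the datapoint are illustrative.
rng = np.random.default_rng(0)
d = 5
W = rng.normal(size=d)                        # known model weights
x_true, y_true = rng.normal(size=d), 1.7      # the private datapoint

def grad_W(x, y):
    # For L = 0.5*(W@x - y)^2:  dL/dW = (W@x - y) * x
    return (W @ x - y) * x

g_observed = grad_W(x_true, y_true)           # the leaked gradient

def D(v):
    # gradient-matching objective D = ||grad W' - grad W||_2^2
    return float(np.sum((grad_W(v[:d], v[d]) - g_observed) ** 2))

# L-BFGS from random inits (x', y') ~ N(0, I); restarts guard against
# local minima of the nonconvex objective.
best = min(
    (minimize(D, rng.normal(size=d + 1), method="L-BFGS-B") for _ in range(10)),
    key=lambda r: r.fun,
)
print(best.fun)   # small: some (x', y') reproduces the observed gradient
```

Note that for this toy model several \((x', y')\) produce the same gradient, so the recovered point need not equal \((x_{\text{true}}, y_{\text{true}})\) exactly; driving \(D\) to zero is the attack's success criterion.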
iDLG: exploits the structure of the classification gradient to recover the ground-truth label analytically
Many other variants exist, e.g., optimize a latent \(z\) instead of \(x'\), where \(x' = G(z)\) for a pretrained generator \(G\)
Defense: add zero-mean noise to gradients before sharing them
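A minimal sketch of this defense (assuming DP-SGD-style norm clipping plus Gaussian noise; the `clip` and `sigma` values are illustrative):

```python
import numpy as np

# Sketch of the noise defense. Assumption: DP-SGD-style clipping plus
# zero-mean Gaussian noise; clip and sigma values are illustrative.
rng = np.random.default_rng(1)

def noisy_gradient(g, clip=1.0, sigma=0.1):
    # Bound the gradient's norm, then add zero-mean noise before sharing,
    # so the attacker's gradient-matching target is perturbed.
    g = g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))
    return g + rng.normal(scale=sigma * clip, size=g.shape)

g = np.array([3.0, 4.0])        # ||g|| = 5.0, clipped down to 1.0
print(noisy_gradient(g))
```

The trade-off: more noise degrades the reconstruction, but also slows and noises the training itself.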
Part 3
Generative Model Inversion
1. Train a WGAN (with some extra tricks...)
2. Let \(C(x)\) be the likelihood of the desired output given \(x\)
3. Solve \(\hat z = \arg\min_z \underbrace{-D(G(z))}_{\text{Realism}} \underbrace{-\,\lambda \log C(G(z))}_{\text{Likelihood of target output}}\)
4. \(\hat x = G(\hat z)\)
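A toy sketch of the inversion loop (assumptions: a linear map `A` stands in for the generator \(G\), a linear-softmax classifier for \(C\), and a Gaussian prior on \(z\) replaces the WGAN realism term; `lam` and the step size are illustrative):

```python
import numpy as np

# Toy model inversion. Assumptions: linear "generator" G(z) = A @ z,
# linear-softmax "classifier" C, Gaussian prior on z in place of the
# WGAN realism term; lam and lr are illustrative.
rng = np.random.default_rng(0)
dz, dx, k = 4, 6, 3
A = rng.normal(size=(dx, dz))           # generator weights
Wc = rng.normal(size=(k, dx))           # classifier logits: Wc @ x
target = 2                              # class whose inputs we invert

def log_C(x):
    # log-likelihood of the target class under softmax
    s = Wc @ x
    m = s.max()
    return s[target] - (m + np.log(np.sum(np.exp(s - m))))

z, lam, lr = rng.normal(size=dz), 10.0, 1e-3
for _ in range(5000):                   # ascend -0.5*||z||^2 + lam*log_C(G(z))
    s = Wc @ (A @ z)
    p = np.exp(s - s.max()); p /= p.sum()
    grad_logC = A.T @ ((np.eye(k)[target] - p) @ Wc)
    z += lr * (-z + lam * grad_logC)

x_hat = A @ z                           # step 4: decode the optimized latent
print(np.argmax(Wc @ x_hat), np.exp(log_C(x_hat)))
```

The prior term keeps \(\hat z\) in a plausible region (the job of the WGAN critic in the real attack), while the likelihood term pulls \(G(\hat z)\) toward inputs the classifier assigns to the target class.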
Part 4
Robotics & Generative Modeling
Typical generative models: conditioning variables are known; outputs are sensitive
Robotics: outputs may be known; a (subset) of conditioning variables are sensitive!
Approaches