CombΘ Seminar
Victor Sanches Portella
November, 2024
cs.ubc.ca/~victorsp
Postdoc supervised by prof. Yoshiharu Kohayakawa
Interests
ML Theory
Optimization
Randomized Algs
Informal Goal: Output should not reveal (too much) about any single individual
Not considering protections against security breaches
[Diagram: data → Data Analysis → Output]
Output should have information about the population
This has more to do with "confidentiality" than "privacy"
Summary: License plates were anonymized using MD5
Easy to de-anonymize due to license plate structure
By Vijay Pandurangan
https://www.vijayp.ca/articles/blog/2014-06-21_on-taxis-and-rainbows--f6bc289679a1.html
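To see why hashing structured identifiers fails, here is a minimal Python sketch of the brute-force attack (the plate format below is hypothetical, not the actual NYC medallion scheme):

```python
import hashlib
import string
from itertools import product

def md5_hex(s: str) -> str:
    return hashlib.md5(s.encode()).hexdigest()

# Hypothetical plate format "digit-letter-digit-digit": only 26,000 possibilities,
# so we can precompute the hash of every plate (a lookup / "rainbow" table).
lookup = {
    md5_hex(f"{d1}{letter}{d2}{d3}"): f"{d1}{letter}{d2}{d3}"
    for d1, letter, d2, d3 in product(string.digits, string.ascii_uppercase,
                                      string.digits, string.digits)
}

# An "anonymized" record from the released dataset: the plate is recovered instantly.
anonymized = md5_hex("7B42")
print(lookup[anonymized])  # -> 7B42
```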
Privacy is quite delicate to get right
Hard to take into account side information
"Anonymization" is hard to define and implement properly
Different use cases require different levels of protection
[Diagram: the two outputs, Output 1 and Output 2, should be indistinguishable]
Differential Privacy
Anything learned with an individual in the dataset
can (likely) be learned without
\(\mathcal{M}\) needs to be randomized to satisfy DP
Adversary with full information of all but one individual can infer membership
Any pair of neighboring datasets: they differ in one entry
Definition: \(\mathcal{M}\) is \((\varepsilon, \delta)\)-Differentially Private if, for every pair of neighboring datasets \(X, X'\) and every (measurable) set \(S\) of outputs,
\(\Pr[\mathcal{M}(X) \in S] \leq e^{\varepsilon}\, \Pr[\mathcal{M}(X') \in S] + \delta\)
\(\varepsilon \equiv \) "Privacy leakage", in theory a constant \(\leq 1\)
\(\delta \equiv \) "Chance of failure", usually \(o(1/|X|)\)
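As a sanity check of the definition, here is a small Python simulation (my own illustration, not from the talk) comparing \(\Pr[\mathcal{M}(X) \in S]\) with \(e^{\varepsilon}\Pr[\mathcal{M}(X') \in S] + \delta\) for the Laplace mechanism on a counting query, where \(\delta = 0\):

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 1.0  # privacy parameter; a counting query has sensitivity 1

def laplace_count(data):
    # Laplace mechanism: add Laplace(1/eps) noise to the count of 1's.
    return sum(data) + rng.laplace(scale=1.0 / eps)

X  = [1, 0, 1, 1, 0, 1]   # dataset
X_ = [1, 0, 1, 1, 0, 0]   # neighboring dataset (one entry changed)

# Event S = "output exceeds 3.5"; estimate both probabilities by simulation.
trials = 200_000
p  = np.mean([laplace_count(X)  > 3.5 for _ in range(trials)])
p_ = np.mean([laplace_count(X_) > 3.5 for _ in range(trials)])
print(p, np.exp(eps) * p_)  # empirically, p <= e^eps * p_  (here delta = 0)
```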
Worst case: No assumptions on the adversary
Immune to post-processing: further computation on the output cannot weaken the privacy guarantee
Composable: DP guarantees of different algorithms compose nicely, even if done in sequence and adaptively
Online Learning
Adaptive Data Analysis and Generalization in ML
Robust statistics
Proof uses Ramsey theory :)
Goal: an \((\varepsilon, \delta)\)-DP \(\mathcal{M}\) such that \(\mathcal{M}(X)\) approximates the mean: \(\lVert \mathcal{M}(X) - \bar{x} \rVert\) is small, where \(\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i\)
Algorithm: \(\mathcal{M}(X) = \bar{x} + Z\) with \(Z\) Gaussian or Laplace noise
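A minimal Python sketch of this recipe for data points in \([-1,1]^d\), using the standard Gaussian-mechanism calibration (the constants here are illustrative, not taken from the talk):

```python
import numpy as np

def private_mean(X, eps, delta, rng):
    """Gaussian mechanism for the mean of points in [-1, 1]^d (sketch).

    Replacing one row changes the mean by at most 2*sqrt(d)/n in l2 norm, so
    Gaussian noise of scale ~ sqrt(d log(1/delta)) / (eps * n) per coordinate
    gives (eps, delta)-DP (standard analysis, assumes eps <= 1).
    """
    n, d = X.shape
    sensitivity = 2.0 * np.sqrt(d) / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return X.mean(axis=0) + rng.normal(scale=sigma, size=d)

rng = np.random.default_rng(1)
X = np.sign(rng.standard_normal((1000, 50)))   # 1000 points in {-1, 1}^50
print(np.max(np.abs(private_mean(X, eps=0.5, delta=1e-6, rng=rng) - X.mean(axis=0))))
```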
OPTIMAL?
Theorem
For \(Z \sim \mathcal{N}(0, \sigma^2 I)\) with \(\sigma = \Theta\big(\tfrac{\sqrt{d \log(1/\delta)}}{\varepsilon n}\big)\),
\(\mathcal{M}(X) = \bar{x} + Z\) is \((\varepsilon, \delta)\)-DP and \(\lVert \mathcal{M}(X) - \bar{x} \rVert_\infty = \tilde{O}\big(\tfrac{\sqrt{d}}{\varepsilon n}\big)\) with high probability
Assume \(\mathcal{M}\) is accurate \(\Rightarrow\) an adversary can detect some \(x_i\) with high probability
Feed to \(\mathcal{M}\) a marked input \(X\)
\((\varepsilon,\delta)\)-DP implies the adversary still detects \(x_i\) on \(\mathcal{M}(X')\) with non-negligible probability, where \(X' = X - \{x_i\} + \{z\}\)
CONTRADICTION
The Movie Owner's movie may leak! Can we detect (at least) one of the leakers?
Idea: Mark some of the scenes (Fingerprinting)
\(d\) scenes
\(n\) copies of the movie
1 = marked scene
0 = unmarked scene
Code usually randomized
We can do this with \(d = 2^n\) scenes. Can \(d\) be smaller?
Example of pirating: on scenes where the colluders' copies differ, the pirated copy can have \(0\) or \(1\); on scenes where all copies agree (say, all \(1\)), it can only have \(1\)
Goal of fingerprinting: given a (pirated) copy of the movie, trace back one of the colluders, with probability of false positive \(o(1/n)\)
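A small Python sketch (my own illustration) of the setup: a randomized binary code marks scenes in each copy, and colluders can only alter scenes on which their copies disagree (the marking assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 16                                # n copies of the movie, d scenes
code = rng.integers(0, 2, size=(n, d))      # 1 = marked scene, 0 = unmarked (randomized code)

def pirate(copies):
    """Colluders combine their copies: where all copies agree they must keep
    that bit; where they disagree they may output 0 or 1 arbitrarily."""
    agree = copies.min(axis=0) == copies.max(axis=0)
    out = rng.integers(0, 2, size=copies.shape[1])   # arbitrary choices...
    out[agree] = copies[0, agree]                    # ...except on agreement scenes
    return out

coalition = [1, 3]                          # pirates holding copies 1 and 3
pirated = pirate(code[coalition])
print(code[coalition])
print(pirated)
```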
Assume \(\mathcal{M}\) is accurate \(\Rightarrow\) treating its output as a pirated movie, the tracing adversary detects some \(x_i\) with high probability
Feed to \(\mathcal{M}\) a marked input \(X\) given by a fingerprinting code
\((\varepsilon,\delta)\)-DP implies the adversary still detects \(x_i\) on \(\mathcal{M}(X')\) with non-negligible probability, where \(X' = X - \{x_i\} + \{z\}\)
CONTRADICTION: this breaks the false positive guarantee, since \(x_i\) is not in \(X'\)
FP codes exist with \(d = \tilde{O}(n^2)\) [Tardos '08]
The Ugly:
Black-box use of FP codes makes it hard to adapt them to other settings
The Bad:
Very restricted to binary inputs
The Good:
Leads to optimal lower bounds for a variety of problems
Idea: For some distribution on the input,
the output is highly correlated with the input
Lemma (A 1D Fingerprinting Lemma, [Bun, Steinke, Ullman '16])
\(\mathcal{M} \colon [-1,1]^n \to [-1,1]\)
\(p \sim \mathrm{Unif}([-1,1])\)
\(x_1, \dotsc, x_n \in \{\pm 1\}\) random such that \(\mathbb{E}[x_i] = p\)
"Correlation" between \(x_i\) and \(\mathcal{M}(X)\): \(\mathcal{A}(x_i, \mathcal{M}(X))\)
If \(\mathcal{M}\) is accurate, then \(\sum_i \mathbb{E}\big[\mathcal{A}(x_i, \mathcal{M}(X))\big]\) is large
If \(z\) is independent of \(X\), then \(\mathbb{E}\big[\mathcal{A}(z, \mathcal{M}(X))\big]\) is small
("Large" and "small" depend on the distribution of \(X\) and \(p\))
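A quick Python simulation of this dichotomy, taking \(\mathcal{A}(x_i, \mathcal{M}(X)) = \mathcal{M}(X)\,(x_i - p)\) as the correlation statistic (one common choice; the exact statistic on the slide may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 50, 20_000

def correlation(mechanism):
    """Average of sum_i A(x_i, M(X)) with A(x_i, M(X)) = M(X) * (x_i - p)."""
    total = 0.0
    for _ in range(trials):
        p = rng.uniform(-1, 1)                                  # p ~ Unif([-1, 1])
        x = np.where(rng.random(n) < (1 + p) / 2, 1.0, -1.0)    # E[x_i] = p
        total += mechanism(x) * np.sum(x - p)
    return total / trials

accurate    = correlation(lambda x: x.mean())             # M(X) = empirical mean
independent = correlation(lambda x: rng.uniform(-1, 1))   # output ignores X
print(accurate, independent)   # roughly 2/3 vs. roughly 0
```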
Fingerprinting Lemma leads to a kind of fingerprinting code
Bonus: quite transparent and easy to describe
Key Idea: Make \(\tilde{O}(n^2)\) independent copies
\(\mathcal{M} \colon ([-1,1]^d)^n \to [-1,1]^d\)
\(p \sim \mathrm{Unif}([-1,1]^d)\)
\(x_1, \dotsc, x_n \in \{\pm 1\}^d\) random such that \(\mathbb{E}[x_i] = p\)
Works for \(d = \Omega(n^2 \log n)\)
Correlation \(\mathcal{A}(x_i, \mathcal{M}(X))\):
If \(\mathcal{M}\) is accurate, i.e., \(\mathbb{E}\big[\lVert \mathcal{M}(X) - p\rVert_2^2\big] \leq d/6\), the correlation is high
If \(\mathcal{M}\) is \((\varepsilon, \delta)\)-DP, the correlation is low
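A sketch (my own, illustrative) of why \(d\) on the order of \(n^2\) is enough to trace: for an accurate but non-private mechanism, each member's total correlation is about \(d/n\) while an outsider's is only about \(\sqrt{d}\), so a threshold separates them once \(d \gg n^2\):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
d = 20 * n * n                  # d >> n^2: the "signal" d/n then dominates the "noise" sqrt(d)

p = rng.uniform(-1, 1, size=d)                               # p ~ Unif([-1, 1]^d)
X = np.where(rng.random((n, d)) < (1 + p) / 2, 1.0, -1.0)    # E[x_ij] = p_j
answer = X.mean(axis=0)         # an accurate (and totally non-private) mechanism

scores = (X - p) @ answer       # score_i = sum_j (x_ij - p_j) * answer_j
z = np.where(rng.random(d) < (1 + p) / 2, 1.0, -1.0)         # a person NOT in the dataset
z_score = (z - p) @ answer

# Every member scores about 2d/(3n), the outsider only on the order of sqrt(d),
# so thresholding the score traces members with few false positives.
print(scores.min(), z_score)
```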
Lemma (Gaussian Fingerprinting Lemma)
\(\mathcal{M}\colon \mathbb{R}^n \to \mathbb{R}\)
\(\mu \sim \mathcal{N}(0, 1/2)\)
\(x_1, \dotsc, x_n \sim \mathcal{N}(\mu,1)\)
One advantage of lemmas over codes:
Easier to extend to different settings
Implies similar lower bounds for privately estimating the mean of a Gaussian
Work done in collaboration with Nick Harvey
Unknown Covariance Matrix on \(\mathbb{R}^d\)
Design an \((\varepsilon, \delta)\)-differentially private \(\mathcal{M}\) to estimate \(\Sigma\)
Goal: \(\mathcal{M}(X)\) close to \(\Sigma\)
Known algorithmic results: there exists an \((\varepsilon, \delta)\)-DP \(\mathcal{M}\) that achieves this goal given enough samples
In its sample complexity, one term is required even without privacy and another is required even for \(d = 1\). Is this tight?
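For intuition on the algorithmic side, here is a naive Python sketch (not the algorithm behind the results above): clip the data to \(\ell_2\) norm \(B\) (an assumption added here for bounded sensitivity) and apply the Gaussian mechanism to the empirical second-moment matrix.

```python
import numpy as np

def private_covariance(X, eps, delta, B, rng):
    """Naive (eps, delta)-DP covariance estimate via the Gaussian mechanism.

    Rows are clipped to l2 norm B, so replacing one row changes the matrix
    (1/n) * sum_i x_i x_i^T by at most 2 B^2 / n in Frobenius norm
    (standard analysis, assumes eps <= 1).
    """
    n, d = X.shape
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xc = X * np.minimum(1.0, B / norms)                 # clip rows to norm <= B
    sensitivity = 2.0 * B**2 / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    G = rng.normal(scale=sigma, size=(d, d))
    noise = np.triu(G) + np.triu(G, 1).T                # symmetric: iid noise on the upper triangle, mirrored
    return (Xc.T @ Xc) / n + noise

rng = np.random.default_rng(2)
X = rng.multivariate_normal(np.zeros(3), np.diag([1.0, 2.0, 0.5]), size=5000)
print(np.round(private_covariance(X, eps=0.5, delta=1e-6, B=4.0, rng=rng), 2))
```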
Unknown Covariance Matrix on \(\mathbb{R}^d\)
To get a Fingerprinting Lemma, we need a random \(\Sigma\)
Most FPLs are for \(d = 1\) and are then boosted with independent copies, which leads to limited lower bounds for covariance estimation [Kamath, Mouzakis, Singhal '22]
We can use diagonally dominant matrices (dominant diagonal, small off-diagonal entries), but the trivial estimator \(0\) already has error \(O(1)\), so we can't lower bound the accuracy of algorithms with \(\omega(1)\) error
Theorem
For any \((\varepsilon, \delta)\)-DP algorithm \(\mathcal{M}\) that estimates \(\Sigma\) to the required accuracy, we obtain a lower bound on the number of samples, with the parameters allowed to take nearly their highest reasonable values
Previous lower bounds required the parameter regime of [Kamath et al. '22] OR that of [Narayanan '23]; our result covers both regimes
Main Contribution: Fingerprinting Lemma without independence
Wishart Distribution
Our results use a very natural distribution: \(\Sigma = A A^{\mathsf{T}}\), where \(A\) is a \(d \times 2d\) random Gaussian matrix
A natural distribution over PSD matrices
Entries are highly correlated
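A tiny Python sketch of this distribution; the normalization \(1/(2d)\) below is my choice to keep \(\Sigma\) on a constant scale and may differ from the talk's:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 4
A = rng.standard_normal((d, 2 * d))      # d x 2d random Gaussian matrix
Sigma = A @ A.T / (2 * d)                # Wishart-distributed PSD matrix (normalized)

print(np.linalg.eigvalsh(Sigma))         # all eigenvalues are nonnegative (PSD)
print(np.round(Sigma, 2))                # entries are statistically dependent: all built from the same rows of A
```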
Lemma (Gaussian Fingerprinting Lemma)
\(\mu \sim \mathcal{N}(0, 1/2)\)
\(x_1, \dotsc, x_n \sim \mathcal{N}(\mu,1)\)
Claim 1
Claim 2
Stein's Lemma: for \(x \sim \mathcal{N}(\mu, \sigma^2)\), \(\mathbb{E}[(x - \mu)\, g(x)] = \sigma^2\, \mathbb{E}[g'(x)]\)
Follows from integration by parts (spelled out after this outline)
Fingerprinting Lemma
Need to Lower Bound
\(\Sigma \sim\) Wishart leads to elegant analysis
Stein-Haff Identity
"Move the derivative" from \(g\) to \(p\) with integration by parts
Stokes' Theorem
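To make the "integration by parts" step behind Stein's Lemma concrete, here is the textbook one-dimensional computation (written for \(\sigma^2 = 1\), matching \(x_i \sim \mathcal{N}(\mu, 1)\) above, and for, say, bounded \(g\) with bounded derivative). Let \(\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-(x-\mu)^2/2}\), so \(\varphi'(x) = -(x-\mu)\,\varphi(x)\). Then
\[
\mathbb{E}[(x-\mu)\,g(x)]
= \int_{\mathbb{R}} (x-\mu)\, g(x)\, \varphi(x)\, \mathrm{d}x
= -\int_{\mathbb{R}} g(x)\, \varphi'(x)\, \mathrm{d}x
= \Big[-g(x)\,\varphi(x)\Big]_{-\infty}^{+\infty} + \int_{\mathbb{R}} g'(x)\, \varphi(x)\, \mathrm{d}x
= \mathbb{E}[g'(x)],
\]
where the boundary term vanishes because \(\varphi\) decays faster than \(g\) grows.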
Differential Privacy is a mathematically formal definition of private algorithms
Interesting connections to other areas of theory
Fingerprinting Codes lead to many optimal lower bounds for DP
Fingerprinting Lemmas are more versatile for lower bounds and can be adapted to other settings
New Fingerprinting Lemma escaping the need to bootstrap a 1D result to higher dimensions with independent copies
Thanks!
CombΘ Seminar
Victor Sanches Portella
November, 2024
cs.ubc.ca/~victorsp