Filter Out Unwanted Information in Data

Wenjie Zheng

29 September 2022

@Oxford

Name          Var1  Var2
Bruce Wayne   AAA   BBB
Peter Parker  CCC   DDD
Wenjie Zheng  EEE   FFF
...           ...   ...

Name        Var1  Var2
Batman      AAA   BBB
Spider-Man  CCC   DDD
...         ...   ...

Name   Var1  Var2
Poker  EEE   FFF
...    ...   ...

Census

Villain

Justice League
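The tables above share the columns Var1 and Var2, which is enough to link records across them. A minimal sketch of that linkage, assuming the first table is the census and the other two list aliases (the function name `link_records` and the merged `aliases` list are hypothetical, introduced only for illustration):

```python
# Census rows: real names with two attribute columns.
census = [
    {"Name": "Bruce Wayne", "Var1": "AAA", "Var2": "BBB"},
    {"Name": "Peter Parker", "Var1": "CCC", "Var2": "DDD"},
    {"Name": "Wenjie Zheng", "Var1": "EEE", "Var2": "FFF"},
]

# Alias rows from the two other tables, merged into one list (assumption).
aliases = [
    {"Name": "Batman", "Var1": "AAA", "Var2": "BBB"},
    {"Name": "Spider-Man", "Var1": "CCC", "Var2": "DDD"},
    {"Name": "Poker", "Var1": "EEE", "Var2": "FFF"},
]


def link_records(census_rows, released_rows, keys=("Var1", "Var2")):
    """Map each released name to the census name with identical key columns."""
    index = {tuple(row[k] for k in keys): row["Name"] for row in census_rows}
    linked = {}
    for row in released_rows:
        signature = tuple(row[k] for k in keys)
        if signature in index:
            linked[row["Name"]] = index[signature]
    return linked


matches = link_records(census, aliases)
```

Here `matches` pairs every alias with a census name, even though no table contains both.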

Scenario 1

Income  Expense  Debt  Score
XXXX    1234.5   0     ?
XXXXX   9876     333   ?

Private

Scenario 2

Gender  Smoking  Insurance Premium
Male    No       ?
Female  Yes      ?

Unfair

Scenario 3

Individual

Private

Unfair

Non-distributional

Distributional

Differential Privacy

Privacy Funnel

A Flawed Method

Income  Expense  Debt
XXXX    1234.5   0
XXXXX   9876     333

Gender  Smoking
Male    No
Female  Yes

Income  Expense  Debt
XXXX    1234.5   0
XXXXX   9876     333
S
X
I(S; Y) = 0
Y
\max_Y I(X; Y)
\max_Y I(X; Y) - \beta I(S;Y)
  • Discrete: submodular optimization
  • Gaussian: semi-definite programming
  • Continuous: variational method
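For the discrete case, the objective itself is easy to evaluate once a candidate Y is fixed. A minimal sketch, assuming Y is generated from X alone through a channel p(y|x) (a Markov-chain assumption S -> X -> Y that the slides do not state) and using hypothetical helper names; it only scores a given channel and does not perform the submodular optimization:

```python
import numpy as np


def mutual_information(joint):
    """Mutual information in nats of a two-dimensional joint pmf."""
    joint = np.asarray(joint, dtype=float)
    p_a = joint.sum(axis=1, keepdims=True)  # marginal of the row variable
    p_b = joint.sum(axis=0, keepdims=True)  # marginal of the column variable
    product = p_a @ p_b
    mask = joint > 0  # skip zero cells, where p * log p is taken as 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / product[mask])))


def funnel_objective(p_sx, p_y_given_x, beta):
    """Evaluate I(X;Y) - beta * I(S;Y) for a channel with rows x, columns y."""
    p_sx = np.asarray(p_sx, dtype=float)
    channel = np.asarray(p_y_given_x, dtype=float)
    p_x = p_sx.sum(axis=0)
    p_xy = p_x[:, None] * channel  # joint pmf of (X, Y)
    p_sy = p_sx @ channel          # joint pmf of (S, Y) under S -> X -> Y
    return mutual_information(p_xy) - beta * mutual_information(p_sy)


# Hypothetical joint pmf of a binary S (rows) and a binary X (columns).
p_sx = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

release_everything = np.eye(2)              # Y = X
release_nothing = np.array([[1.0, 0.0],
                            [1.0, 0.0]])    # Y is constant

score_full = funnel_objective(p_sx, release_everything, beta=0.0)
score_none = funnel_objective(p_sx, release_nothing, beta=2.0)
```

Releasing a constant Y scores zero for any beta, while releasing X unchanged keeps all of I(X; Y) but pays the full penalty beta * I(S; X).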

\((S, X)\) : Gaussian...

Y := X - \text{proj}_S(X)
Y := X - S S^+ X
I(S; Y) = 0
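One way to see why this Y carries no linear information about S, assuming S and X are data matrices with one row per individual and centered columns (an assumption; the slides do not state the layout) and S^+ is the Moore-Penrose pseudoinverse:

```latex
\begin{align*}
S^\top Y &= S^\top X - S^\top S S^+ X \\
         &= S^\top X - (S S^+ S)^\top X && \text{since } S S^+ \text{ is symmetric} \\
         &= S^\top X - S^\top X = 0     && \text{since } S S^+ S = S
\end{align*}
```

With centered columns this says the sample cross-covariance between S and Y is zero. For jointly Gaussian variables, zero covariance implies independence, which is what I(S; Y) = 0 requires.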
\max_Y I(X; Y)
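The projection Y := X - S S^+ X can be checked numerically. A minimal NumPy sketch, assuming rows are individuals, columns are centered, and the synthetic data and the name `remove_linear_info` are hypothetical choices made for illustration:

```python
import numpy as np


def remove_linear_info(S, X):
    """Return Y = X - S S^+ X, the residual of X after projecting onto col(S)."""
    S = np.asarray(S, dtype=float)
    X = np.asarray(X, dtype=float)
    # pinv(S) @ X holds the least-squares coefficients of X regressed on S.
    return X - S @ (np.linalg.pinv(S) @ X)


rng = np.random.default_rng(0)
n = 1000

# Synthetic sensitive attributes S and data X that depends linearly on S.
S = rng.normal(size=(n, 2))
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.0, -1.0, 2.0]])  # hypothetical dependence of X on S
X = S @ mixing + rng.normal(size=(n, 3))

# Center the columns so that a zero cross-product means zero covariance.
S = S - S.mean(axis=0)
X = X - X.mean(axis=0)

Y = remove_linear_info(S, X)
cross = S.T @ Y  # sample cross-product between S and Y, numerically zero
```

Applying `remove_linear_info(S, Y)` a second time leaves Y unchanged, since the component of X lying in the column space of S has already been removed.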
I(S; Y) = 0        Y := X - S S^+ X
I(S; Y|Z) = 0      Y := ?

Working on
