CPSC 532S - Danica Sutherland
James, Amin, James
Spring 2022
P = Q?
Steps to the standard two-sample test:
1) Choose a significance level alpha as input to the test
2) Compute the test statistic from the two samples
3) Compute the p-value
4) Reject the null hypothesis (P = Q) if the p-value is less than alpha; otherwise, fail to reject it
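The four steps above can be sketched with a permutation test. This is only one illustrative choice of test: the statistic (absolute difference of sample means), the function name `permutation_two_sample_test`, the number of permutations, the seeds, and the sample sizes are all assumptions made for the example.

```python
import random
import statistics


def permutation_two_sample_test(xs, ys, alpha=0.05, n_permutations=2000, seed=0):
    """Permutation test of H0: P = Q using |mean(xs) - mean(ys)| as the statistic.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Step 2: test statistic on the observed samples.
    observed = abs(statistics.fmean(xs) - statistics.fmean(ys))

    # Step 3: p-value = fraction of random relabelings whose statistic is
    # at least as extreme as the observed one (+1 correction avoids p = 0).
    pooled = list(xs) + list(ys)
    n = len(xs)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        stat = abs(statistics.fmean(pooled[:n]) - statistics.fmean(pooled[n:]))
        if stat >= observed:
            count += 1
    p_value = (count + 1) / (n_permutations + 1)

    # Step 4: reject H0 when the p-value falls below alpha (step 1's input).
    return observed, p_value, p_value < alpha


# Two Gaussians chosen for illustration (assumed sample size: 200 each).
rng = random.Random(1)
p_sample = [rng.gauss(0.0, 0.5) for _ in range(200)]
q_sample = [rng.gauss(0.8, 0.3) for _ in range(200)]

stat, p_value, reject = permutation_two_sample_test(p_sample, q_sample)
print(f"statistic={stat:.3f}, p-value={p_value:.4f}, reject H0: {reject}")
```

A permutation test is used here because it needs no distributional assumptions: under the null hypothesis the sample labels are exchangeable, so reshuffling them simulates the null distribution of the statistic.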
Shuffle the pooled, labeled dataset D (samples from P labeled 0, samples from Q labeled 1) at random and split it into training and test subsets
"Revisiting Classifier Two-Sample Tests" [Lopez-Paz and Oquab, 2016]
Train a binary classifier on the training subset to distinguish samples of P from samples of Q
From here, the p-value is computed as in a standard two-sample test, using the classifier's accuracy on the test subset as the test statistic
Thus: [https://cse.buffalo.edu/~hungngo/classes/2011/Fall-694/lectures/rademacher.pdf]
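A minimal sketch of the p-value step, assuming the normal approximation used by Lopez-Paz and Oquab: if P = Q, each held-out prediction is correct with probability 1/2, so the test accuracy is approximately N(1/2, 1/(4 n_test)). The function names `c2st_p_value` and `c2st_decision`, and the example values of `accuracy` and `n_test`, are hypothetical.

```python
from statistics import NormalDist


def c2st_p_value(accuracy, n_test):
    """One-sided p-value for a classifier two-sample test.

    Under H0 (P = Q) the held-out accuracy is approximately
    Normal(mean=0.5, variance=1 / (4 * n_test)).
    """
    null = NormalDist(mu=0.5, sigma=(0.25 / n_test) ** 0.5)
    # Probability, under H0, of an accuracy at least this high.
    return 1.0 - null.cdf(accuracy)


def c2st_decision(accuracy, n_test, alpha=0.05):
    """Return (p_value, reject_null) for the given significance level."""
    p_value = c2st_p_value(accuracy, n_test)
    return p_value, p_value < alpha


# Hypothetical accuracies on 100 held-out points.
for acc in (0.50, 0.52, 0.80):
    p, reject = c2st_decision(acc, n_test=100)
    print(f"accuracy={acc:.2f}  p-value={p:.4g}  reject H0: {reject}")
```

An accuracy near chance gives a large p-value, while an accuracy well above 1/2 is very unlikely under the null and leads to rejection.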
Implemented Logistic Regression in PyTorch
Tested on synthetic data sampled from 1D distributions
Trained for 200 epochs on 400 generated examples
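The setup above could look roughly like the following sketch. It is a reconstruction under stated assumptions, not the code behind the reported results: 200 draws from each of two Gaussians (400 examples in total), a 50/50 train/test split, full-batch Adam with learning rate 0.05, alpha = 0.05, and the normal approximation to the null distribution of the test accuracy are all assumed.

```python
from statistics import NormalDist

import torch

torch.manual_seed(0)

# Synthetic 1D data (assumed): P = N(0, 0.5^2), Q = N(0.8, 0.3^2), 200 draws each.
n_per_class = 200
p_sample = 0.0 + 0.5 * torch.randn(n_per_class, 1)
q_sample = 0.8 + 0.3 * torch.randn(n_per_class, 1)
x = torch.cat([p_sample, q_sample])
y = torch.cat([torch.zeros(n_per_class, 1), torch.ones(n_per_class, 1)])

# Shuffle D at random and split it into training and test halves (assumed split).
perm = torch.randperm(x.shape[0])
x, y = x[perm], y[perm]
n_train = x.shape[0] // 2
x_tr, y_tr = x[:n_train], y[:n_train]
x_te, y_te = x[n_train:], y[n_train:]

# Logistic regression: one linear layer trained with a sigmoid cross-entropy loss.
model = torch.nn.Linear(1, 1)
loss_fn = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)  # assumed optimizer

for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(x_tr), y_tr)
    loss.backward()
    optimizer.step()

# Test statistic: accuracy on the held-out half (logit > 0 means "predict Q").
with torch.no_grad():
    predictions = (model(x_te) > 0).float()
    accuracy = (predictions == y_te).float().mean().item()

# p-value under H0, where accuracy is approximately N(1/2, 1 / (4 * n_test)).
n_test = x_te.shape[0]
p_value = 1.0 - NormalDist(0.5, (0.25 / n_test) ** 0.5).cdf(accuracy)
reject = p_value < 0.05

print(f"test accuracy={accuracy:.3f}, p-value={p_value:.3g}, reject H0: {reject}")
```

Since logistic regression on a raw 1D input can only learn a single threshold, this classifier mainly detects shifts in location; two distributions that differ only in spread or tail shape would be hard for it to separate.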
Result: Reject the Null Hypothesis, P != Q
Mean: 0 std: 0.5
Mean: 0.8 std: 0.3
Result: Reject the Null Hypothesis, P != Q
Mean: 0 std: 0.5
Student T: DoF: 4 mean: 1.5
Result: Fail to Reject the Null Hypothesis, consistent with P = Q
Mean: 0 std: 0.5
Result: Fail to Reject the Null Hypothesis, consistent with P = Q
Gaussian: Mean: 0 std: 0.5
Student T: DoF: 4 mean: 0.1