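1. Paper by Dror et al. (2018), "The Hitchhiker's Guide to Testing Statistical Significance in Natural Language Processing"
2. Gives a basic introduction to significance testing in NLP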
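1. Parametric: the test statistic is assumed to follow a known distribution (typically the normal distribution)
2. Non-parametric: no assumption is made about the distribution of the test statistic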
1. Paired Student's t-test
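As a rough illustration of the paired t-test, the sketch below computes the t statistic from per-example scores of two systems evaluated on the same test set. The function name `paired_t_statistic` and the toy scores are made up for this example and do not come from the paper. Only the statistic and degrees of freedom are computed; if SciPy is available, `scipy.stats.ttest_rel` returns the statistic together with a p-value.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired Student's t-test.

    scores_a and scores_b are per-example scores of two systems on the
    same examples (hypothetical inputs for this sketch).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same examples")
    # The test works on the per-example differences.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired scores")
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation).
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    if std_err == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_diff / std_err, n - 1


# Toy scores, invented for illustration only.
t, df = paired_t_statistic([3, 5, 7, 9], [2, 3, 4, 5])
print(f"t = {t:.3f}, df = {df}")
```

The test assumes the differences are approximately normally distributed, which is why it belongs to the parametric family.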
1. Non-parametric tests come in two types: sampling-based and sampling-free
2. Sampling-based: repeatedly resample the test data and use the actual evaluation metric values
3. Sampling-free: do not use the actual metric values
1. Sign test
2. McNemar's test
3. Cochran's Q test
4. Wilcoxon signed-rank test
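To make the sampling-free idea concrete, here is a minimal sketch of the first test in the list, the sign test, written as an exact two-sided binomial test. It only looks at which system wins on each example, never at the size of the difference. The function name `sign_test` and the choice to discard ties are assumptions of this sketch, not details taken from the paper.

```python
from math import comb


def sign_test(scores_a, scores_b):
    """Two-sided exact sign test on paired per-example scores.

    Ties are discarded (an assumption of this sketch). Under the null
    hypothesis each remaining example is won by either system with
    probability 0.5.
    """
    wins_a = sum(a > b for a, b in zip(scores_a, scores_b))
    wins_b = sum(a < b for a, b in zip(scores_a, scores_b))
    n = wins_a + wins_b
    if n == 0:
        return 1.0  # every example tied: no evidence of a difference
    k = min(wins_a, wins_b)
    # Probability of a split at least as lopsided as k vs. n - k, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy example: system A wins on all five examples.
p_value = sign_test([1, 1, 1, 1, 1], [0, 0, 0, 0, 0])
print(p_value)
```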
1. Pitman's permutation test
2. Paired bootstrap test
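The two sampling-based tests above can be sketched as follows. Both operate on per-example score differences and use the mean difference as the test statistic. The function names, the default of 10,000 resamples, and the fixed seed are choices made for this sketch. The permutation test is approximated by random sign flips rather than full enumeration, and the bootstrap uses one common one-sided variant that counts resamples whose mean difference is at least twice the observed one; the paper's exact formulation may differ.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def paired_permutation_test(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Approximate two-sided paired permutation test.

    Swapping the two systems' scores on an example flips the sign of
    that example's difference, so each resample flips signs at random.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(_mean(diffs))
    hits = 0
    for _ in range(n_resamples):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(_mean(flipped)) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimated p-value above zero.
    return (hits + 1) / (n_resamples + 1)


def paired_bootstrap_test(scores_a, scores_b, n_resamples=10_000, seed=0):
    """One-sided paired bootstrap test for 'A is better than B'.

    Examples are resampled with replacement. A resample counts against
    A when its mean difference is at least twice the observed one,
    which corresponds to shifting the resamples to the null hypothesis.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = _mean(diffs)
    n = len(diffs)
    hits = 0
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        if _mean(sample) >= 2 * observed:
            hits += 1
    return hits / n_resamples


# Toy scores, invented for illustration only.
system_a = [0.9, 0.8, 0.7, 0.9, 0.8, 0.9, 0.7, 0.8]
system_b = [0.6, 0.7, 0.5, 0.8, 0.6, 0.7, 0.6, 0.5]
print(paired_permutation_test(system_a, system_b))
print(paired_bootstrap_test(system_a, system_b))
```

Both functions return estimates that depend on the number of resamples and the seed, so small p-values should be read as approximate.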
By Anjali Bhavan