Ishanu Chattopadhyay

University of Chicago

Machine Learning & Advanced Analytics for Biomedicine

CCTS 40500 / CCTS 20500 / BIOS 29208
Winter 2023

Lecture 2

Today's Take-Home Message

Performance Metrics

Diagnostic Tests

Bayesian Statistics

Diagnostic Tests for Diseases

Risk Factors
- Past Diagnoses
- Laboratory Tests
- Questionnaire
- Familial Risks
- Life Events

Does the patient have the disorder?

Not Always Obvious

autism

dementia

Diagnostic Tests for Diseases

Risk Factors
- Past Diagnoses
- Laboratory Tests
- Questionnaire
- Familial Risks
- Life Events

Is the patient at risk of the disorder?

Not Always Obvious

autism

dementia

How do we quantify risk?

How do we map risk to severity?

Diagnostic Tests

Sensitivity & Specificity

Confusion Matrix with 2 classes

Performance Metrics

Relationships between Performance Metrics

TPR = \frac{t_p}{P} = \frac{t_p}{t_p+f_n}\\ TNR = \frac{t_n}{N} = \frac{t_n}{t_n+f_p}\\ FPR =1-TNR\\ PPV =\frac{t_p}{t_p+f_p}\\ \rho =\frac{P}{N+P}

t_p : \textrm{ true positives }, t_n: \textrm{ true negatives }

f_p : \textrm{ false positives }, f_n: \textrm{ false negatives }
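
As a minimal sketch, the definitions above can be computed directly from the four confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical, chosen only for illustration.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute the metrics defined above from confusion-matrix counts."""
    P = tp + fn  # all actual positives
    N = tn + fp  # all actual negatives
    return {
        "TPR": tp / P,             # sensitivity
        "TNR": tn / N,             # specificity
        "FPR": fp / N,             # equals 1 - TNR
        "PPV": tp / (tp + fp),     # positive predictive value
        "prevalence": P / (N + P), # rho
    }

# Hypothetical counts, not taken from any real study.
m = binary_metrics(tp=80, fp=30, tn=870, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Note that PPV depends on how many negatives are in the sample, while TPR and TNR do not.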

Relationships between Performance Metrics

PPV = \frac{t_p/P}{t_p/P + (f_p/N)(N/P)} = \frac{TPR}{TPR + ((N-t_n)/N)(N/P)}

t_p : \textrm{ true positives }, t_n: \textrm{ true negatives }

f_p : \textrm{ false positives }, f_n: \textrm{ false negatives }

s : \textrm{ sensitivity }, c: \textrm{ specificity }

NPV = \frac{1}{1+ \frac{1-s}{c \left ( \frac{1}{\rho}-1\right )} }

PPV = \frac{s}{s + (1-c)(\frac{1}{\rho} -1)}

Relationships between Performance Metrics

PPV = \frac{t_p/P}{t_p/P + (f_p/N)(N/P)} = \frac{TPR}{TPR + ((N-t_n)/N)(N/P)}

t_p : \textrm{ true positives }, t_n: \textrm{ true negatives }

f_p : \textrm{ false positives }, f_n: \textrm{ false negatives }

s : \textrm{ sensitivity }, c: \textrm{ specificity }

NPV = \frac{1}{1+ \frac{1-s}{c \left ( \frac{1}{\red \rho}-1\right )} }

PPV = \frac{s}{s + (1-c)(\frac{1}{\red \rho} -1)}

Prevalence is an intrinsic property of the disease

Relationships between Performance Metrics

NPV = \frac{1}{1+ \frac{1-s}{c \left ( \frac{1}{\red \rho}-1\right )} }

PPV = \frac{s}{s + (1-c)(\frac{1}{\red \rho} -1)}

Manic Episode with no Bipolar history

prevalence: ~10%

Relationships between Performance Metrics

NPV = \frac{1}{1+ \frac{1-s}{c \left ( \frac{1}{\red \rho}-1\right )} }

PPV = \frac{s}{s + (1-c)(\frac{1}{\red \rho} -1)}

Idiopathic Pulmonary Fibrosis

prevalence: ~0.5%
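
To see how strongly prevalence drives PPV, the two formulas above can be evaluated at the two prevalences quoted (~10% and ~0.5%). This is a sketch only: the sensitivity and specificity values (both 0.90) are assumed for illustration and do not describe any actual test for these conditions.

```python
def ppv(s, c, rho):
    """PPV from sensitivity s, specificity c, prevalence rho."""
    return s / (s + (1 - c) * (1 / rho - 1))

def npv(s, c, rho):
    """NPV from sensitivity s, specificity c, prevalence rho."""
    return 1 / (1 + (1 - s) / (c * (1 / rho - 1)))

s, c = 0.90, 0.90  # assumed test characteristics (hypothetical)
for condition, rho in [("prevalence ~10%", 0.10), ("prevalence ~0.5%", 0.005)]:
    print(f"{condition}: PPV={ppv(s, c, rho):.3f}, NPV={npv(s, c, rho):.3f}")
```

With the same test, the rarer condition yields a much lower PPV: most positive results are false positives.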

Relationships between Performance Metrics

The decision threshold is up to us to choose

Impacts sensitivity & specificity

Sensitivity Specificity Tradeoff

Each choice of a threshold produces a different test
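
A small sketch of this idea: the same scores, cut at different thresholds, give different (sensitivity, specificity) pairs. The scores and labels are made-up toy data, and `sens_spec` is a hypothetical helper name.

```python
# Toy data: a risk score per patient and the true label (1 = has the disorder).
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

def sens_spec(scores, labels, threshold):
    """Call 'positive' when score >= threshold; return (sensitivity, specificity)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for x, y in pairs if x >= threshold and y == 1)
    fn = sum(1 for x, y in pairs if x < threshold and y == 1)
    tn = sum(1 for x, y in pairs if x < threshold and y == 0)
    fp = sum(1 for x, y in pairs if x >= threshold and y == 0)
    return tp / (tp + fn), tn / (tn + fp)

for t in (0.2, 0.5, 0.75):
    s, c = sens_spec(scores, labels, t)
    print(f"threshold={t}: sensitivity={s:.2f}, specificity={c:.2f}")
```

Raising the threshold trades sensitivity for specificity; plotting sensitivity against 1 - specificity over all thresholds traces the ROC curve.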

Comparing Tests

Why is a "diagonal ROC" useless?

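Let sensitivity be $s$, specificity be $c$, and prevalence P/(N+P) be $\wp$.

On the diagonal of the ROC plot, TPR = FPR, i.e. $s = 1-c$. Then:

s=1-c \\ \Rightarrow \frac{t_p}{P} = \frac{f_p}{N} \\ \Rightarrow \frac{t_p}{f_p} = \frac{P}{N} = \frac{\wp}{1-\wp}

The odds of disease after a positive test equal the odds before testing.

Hence, s=1-c is NO BETTER than a coin toss!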

Comparing Tests

See papers 1-4 in https://github.com/zeroknowledgediscovery/course_notes/tree/master/paper_arxiv

AUC only considers ranks, not actual values
Related to the Mann-Whitney U Test
Shows why AUC is immune to class imbalance

HW.

For a randomly chosen positive sample and a randomly chosen negative sample, AUC is the probability that the positive one is ranked higher than the negative one
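
The pairwise interpretation can be sketched by counting, over all (positive, negative) pairs, how often the positive sample gets the higher score (ties count as half). This count divided by the number of pairs is the Mann-Whitney U statistic normalized by its maximum. The data and the name `auc_pairwise` are illustrative assumptions.

```python
def auc_pairwise(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [x for x, y in zip(scores, labels) if y == 1]
    neg = [x for x, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical).
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 0, 1, 1]
auc = auc_pairwise(scores, labels)

# Only ranks matter: a monotone transform of the scores leaves AUC unchanged.
auc_cubed = auc_pairwise([x ** 3 for x in scores], labels)

# Class imbalance: duplicating every negative 10 times leaves AUC unchanged.
neg_scores = [x for x, y in zip(scores, labels) if y == 0]
auc_imbalanced = auc_pairwise(scores + neg_scores * 9, labels + [0] * (len(neg_scores) * 9))

print(auc, auc_cubed, auc_imbalanced)
```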

Tests are tools to reduce uncertainty

Test Effectiveness

-LR=\frac{f_n}{t_n} \times \frac{1-\rho}{\rho} =\frac{1-s}{c}

+LR=\frac{t_p}{f_p} \times \frac{1-\rho}{\rho} =\frac{s}{(1-c) }

Prove this using Bayes' Theorem
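
A numeric check (not a proof) of how the likelihood ratios act: multiplying the pre-test odds by +LR gives the post-test odds after a positive result, which matches the PPV formula from earlier slides. The values of s, c, and rho below are assumed for illustration, and the function names are hypothetical.

```python
def likelihood_ratios(s, c):
    """Return (+LR, -LR) from sensitivity s and specificity c."""
    return s / (1 - c), (1 - s) / c

def post_test_probability(rho, lr):
    """Convert prevalence to odds, apply the likelihood ratio, convert back."""
    pre_odds = rho / (1 - rho)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)

s, c, rho = 0.9, 0.8, 0.1  # assumed values (hypothetical)
lr_pos, lr_neg = likelihood_ratios(s, c)

p_after_positive = post_test_probability(rho, lr_pos)  # should equal PPV
p_after_negative = post_test_probability(rho, lr_neg)  # should equal 1 - NPV

ppv = s / (s + (1 - c) * (1 / rho - 1))
npv = 1 / (1 + (1 - s) / (c * (1 / rho - 1)))
print(lr_pos, lr_neg, p_after_positive, ppv, p_after_negative, 1 - npv)
```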

Test Effectiveness

Post-test odds (after a positive result): $$t_p/f_p$$

Pre-test odds: $$\frac{\rho}{1-\rho}$$

Test Effectiveness

Choosing Thresholds

Balancing False Positives & False Negatives

Cost	Actual Positive	Actual Negative
Test Positive	$0	$x
Test Negative	$y	$0

Cost Optimization to choose operating point

\textrm{minimize } \zeta = C(f_p)+C(f_n)

Criminal Justice: $$C(f_n) = 0 $$

Healthcare (Covid test?)

$$C(f_p) = 0 $$

naive dichotomy
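
One way to make the cost optimization concrete, as a sketch under stated assumptions: take the total cost to be (cost per false positive) × f_p + (cost per false negative) × f_n, and scan candidate thresholds for the minimum. Here `cost_fp` and `cost_fn` are hypothetical names playing the roles of $x and $y in the cost table, and the scores and labels are toy data.

```python
# Toy data: risk scores and true labels (1 = positive).
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

def total_cost(threshold, cost_fp, cost_fn):
    """zeta = cost_fp * (#false positives) + cost_fn * (#false negatives)."""
    fp = sum(1 for x, y in zip(scores, labels) if x >= threshold and y == 0)
    fn = sum(1 for x, y in zip(scores, labels) if x < threshold and y == 1)
    return cost_fp * fp + cost_fn * fn

def best_threshold(cost_fp, cost_fn):
    """Scan each observed score (plus 'never call positive') as a threshold."""
    candidates = sorted(set(scores)) + [float("inf")]
    return min(candidates, key=lambda t: total_cost(t, cost_fp, cost_fn))

# Missing a case is 10x worse than a false alarm: the threshold moves down.
print(best_threshold(cost_fp=1, cost_fn=10))
# A false alarm is 10x worse than a miss: the threshold moves up.
print(best_threshold(cost_fp=10, cost_fn=1))
```

Setting either cost to exactly zero, as in the two extreme cases above, pushes the operating point to calling everyone positive or everyone negative, which is why the dichotomy is naive.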

Choosing Thresholds

Overlapping features are harder to classify

How do we formalize these trade-offs?

https://asm.org/Articles/2020/November/SARS-CoV-2-Testing-Sensitivity-Is-Not-the-Whole-St

Covid tests are similar

What happens if we test again?

[Figure: probability values shown: 0.045, 1-0.045, 0.69]

But confirmatory tests might not always be feasible
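
Testing again can be sketched as a repeated Bayesian update: the posterior after the first positive result becomes the prior for the second. The prior of 0.045 is the value that appears on the slide; the sensitivity and specificity are assumed for illustration, and the sketch also assumes the two test results are conditionally independent given disease status, which a real repeat test may not satisfy.

```python
def update(prior, s, c, positive=True):
    """Posterior probability of disease after one test result (Bayes' theorem)."""
    if positive:
        num = s * prior
        den = s * prior + (1 - c) * (1 - prior)
    else:
        num = (1 - s) * prior
        den = (1 - s) * prior + c * (1 - prior)
    return num / den

prior = 0.045        # pre-test probability
s, c = 0.95, 0.98    # assumed sensitivity and specificity (hypothetical)

p1 = update(prior, s, c)  # after one positive test
p2 = update(p1, s, c)     # after a second, independent positive test
print(f"after 1 positive: {p1:.3f}, after 2 positives: {p2:.3f}")
```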

Summary of Bayesian Inference


Maximum Likelihood Estimate

vs

Maximum a posteriori probability Estimate

\theta_{MLE} = \argmax_\theta Pr(X \vert \theta)

\theta_{MAP} = \argmax_\theta Pr(\theta \vert X) \\ = \argmax_\theta \bigg ( \log Pr(X \vert \theta) + \log Pr(\theta) \bigg )

HW: Show that the second expression is true

HW: 1. Why do we choose the Beta Distribution?

2. Choose a different prior and compute MAP estimate

Note on beta distribution:

E[X] = \frac{\alpha}{\alpha + \beta}

HW: Why choose conjugate priors?
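
A minimal sketch of MLE versus MAP, assuming a Bernoulli likelihood (k positives out of n trials) with a Beta(alpha, beta) prior on the success probability. The likelihood model and all numbers are assumptions made for illustration. Under this assumption the posterior is Beta(alpha + k, beta + n - k), so its mean follows from the formula for E[X] above.

```python
def mle(k, n):
    """Maximum likelihood estimate of a Bernoulli parameter."""
    return k / n

def map_beta(k, n, alpha, beta):
    """Mode of the Beta(alpha + k, beta + n - k) posterior.

    Valid when both posterior parameters exceed 1.
    """
    return (k + alpha - 1) / (n + alpha + beta - 2)

def posterior_mean(k, n, alpha, beta):
    """Mean of the Beta posterior, using E[X] = a / (a + b)."""
    return (k + alpha) / (n + alpha + beta)

k, n = 7, 10            # hypothetical data: 7 positives in 10 trials
alpha, beta = 2.0, 2.0  # hypothetical prior, mildly favoring 0.5

print(mle(k, n), map_beta(k, n, alpha, beta), posterior_mean(k, n, alpha, beta))
```

The prior pulls the MAP estimate from the MLE toward the prior mean; with a flat Beta(1, 1) prior the MAP estimate coincides with the MLE.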

Example of Computing A Bayes Estimator

Bayes' Error

The Universal Metric

Also not computable

HW will be posted on Canvas.

Extra Credit Problem: Derive the posterior distribution using this approach
