AI in Medicine:
From Test-free Screening of Complex Diseases
to
Understanding Microbiomes, Self-organization and Zoonotic Emergence
Ishanu Chattopadhyay, PhD
Assistant Professor of Medicine
University of Chicago
ishanu@uchicago.edu
first wave
rule-based systems
second wave
Big Data / ML / Deep Learning
recognize patterns, make predictions, might improve over time, but struggle on tasks not trained for
third wave
contextual reasoning, generalizable, towards true intelligence
Rotaru, Victor, Yi Huang, Timmy Li, James Evans, and Ishanu Chattopadhyay. "Event-level prediction of urban crime reveals a signature of enforcement bias in US cities." Nature Human Behaviour 6, no. 8 (2022): 1056-1068.
mathematics
computer science
social science
medicine
AI/ML learning theory and applications
Complex systems
Implications of AI for the Future of Society
University of Chicago Medicine
The Laboratory for Zero Knowledge Discovery
collaborators
Alex Leow, Psychiatry, UIC
Anna Podolanczuk, Pulmonary Care, Weill Cornell
Gary Hunninghake, Pulmonary Critical Care, Harvard
Robert Gibbons, Biostatistics
Daniel Rubins, Anesthesia and Critical Care
Peter Smith, Pediatrics
Michael Msall, Pediatrics
Fernando Martinez, Pulmonary Critical Care, Weill Cornell
James Mastrianni, Neurology
James Evans, Sociology
Erika Claud, Pediatrics
Aaron Esser-Kahn, Molecular Engineering
David Llewellyn, University of Exeter
Kenneth Rockwood, Dalhousie University
Andrew Limper, Mayo Clinic
zed.uchicago.edu
Dr. Shahab Asoodeh
Dr. Yi Huang
Dmytro Onishchenko
Victor Rotaru
Jin Li
Ruolin Zheng
David Yang
Dr. Nicholas Sizemore
Drew Vlasnik
Lucas Mantovani
Jaydeep Dhanoa
Jasmine Mithani
Angela Zhang
Warren Mo
Kevin Wu
Department of Pediatrics, UChicago
Department of Neurology & The Memory Center, UChicago
Department of Psychiatry, UChicago
Pulmonary Critical Care, Weill Cornell
Department of Anesthesia and Critical Care, UChicago
Center for Health Statistics, UChicago
Pulmonary Critical Care, Harvard Medical School
Department of Psychiatry, UIC
Demon Network, Exeter, Alan Turing Institute, UK
Dalhousie University, Canada
Pritzker School of Molecular Engineering
Social Science, UChicago
D3M (I2O)
PAI (DSO)
PREEMPT (BTO)
YFA (DSO)
NIA
ACT I
point-of-care screening for complex diseases
Can we use existing EHR to reliably screen for complex diseases such as pulmonary fibrosis, dementia and rare cancers?
AI
Electronic Healthcare Record
IPF
ASD
ADRD
Onishchenko, Dmytro, Robert J. Marlowe, Che G. Ngufor, Louis J. Faust, Andrew H. Limper, Gary M. Hunninghake, Fernando J. Martinez, and Ishanu Chattopadhyay. "Screening for idiopathic pulmonary fibrosis using comorbidity signatures in electronic health records." Nature Medicine 28, no. 10 (2022): 2107-2116.
Universal screening for complex diseases
Primary Care
EHR
Test-free screening for complex diseases
AI
ACT II
Can We Model Ecosystems As They Evolve?
Can we predict future mutations?
Digital Twins for complex systems
Can we find generative models for microbiome dynamics?
ACT I
Universal Screening?
Is AI/ML adding anything of relevance?
"predicting" autism > 3yrs
"diagnosing" fibrosis from lung imaging
"diagnosing" dementia from brain scan
Rapid Universal Point-of-care Screening for ILD/IPF Using Comorbidity Signatures in Electronic Health Records
Flag patients before they (or doctors) suspect
Primary Care
Pulmonologist
?
Zero-burden Co-morbid Risk Score (ZCoR)
Common Symptoms
shortness of breath
dry cough
doctor can hear velcro crackles
>50 years old
more men than women
IPF
Rare disease: ~5 in 10,000
Post-Dx survival: ~4 years
At least one misdiagnosis: ~55%
Two or more misdiagnoses: 38%
Initially attributed to age-related symptoms: 72%
Cannot always be seen on CXR
Non-specific symptoms
PCP workflow demands
~ 4yrs
current survival ~4yrs
~ 4yrs
current clinical DX
ZCoR screening
Onishchenko, D., Marlowe, R.J., Ngufor, C.G. et al. Screening for idiopathic pulmonary fibrosis using comorbidity signatures in electronic health records. Nat Med 28, 2107–2116 (2022). https://doi.org/10.1038/s41591-022-02010-y
n=~3M
AUC~90%
Likelihood ratio ~30
Conventional AI/ML attempts to model the physician
AI in IPF Research
ICD administrative codes
IPF
ILD
target codes appear
Past medical history
No target codes appear
case
control
2yrs
2yrs
prediction
IPF drugs prescribed
Signature of IPF diagnostic sequence
pirfenidone or nintedanib
ICD Codes can be noisy
"cases" are not always true IPF
Truven MarketScan (IBM) Commercial Claims & Encounters Database 2003-2018
>100M patients visible
>7B individual claims
>87K unique diagnostic codes
>7% Medicare data present
2,053,277 patients included in study
University of Chicago Medical Center 2012-2021
68,658 patients
Random sample from OptumLabs Data Warehouse, courtesy of Mayo Clinic
861,280 patients
Total: 2,983,215 patients
Data: Onishchenko et al., Nature Medicine 2022
Comorbidity Spectra
patient A
patient B
patient C
lesson 1
Beyond "risk factors" to personalized risk patterns
False Positives:
Ethics:
For every 20-30 flags, 1 is positive
minimal
acceptable?
Better outcomes
Collard, Harold R., Alex J. Ward, Stephan Lanes, D. Cortney Hayflinger, Daniel M. Rosenberg, and Elke Hunsche. "Burden of illness in idiopathic pulmonary fibrosis." Journal of Medical Economics 15, no. 5 (2012): 829-835.
Clinical Trial Cohort Selection
Current screen failure rate ~50-60%
ZCoR-boosted screen failure rate ~20%
Off-the-shelf AI does not suffice
lesson 2
Modeling Longitudinal Patterns
Specialized HMM models from code sequences
Model control and case cohorts separately
Given a new test case, compute the likelihood of the sample arising from the case models vs. the control models
sequence likelihood defect
Huang, Yi, Victor Rotaru, and Ishanu Chattopadhyay. "Sequence likelihood divergence for fast time series comparison." Knowledge and Information Systems 65, no. 7 (2023): 3079-3098.
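The case-versus-control comparison can be sketched in a few lines. This is only an illustration under stated assumptions: plain first-order Markov chains with add-one smoothing stand in for the specialized HMM-type models named on the slide, and the code alphabet, toy histories, and the function names `fit_chain`, `log_likelihood`, and `likelihood_score` are all hypothetical.

```python
import math
from collections import defaultdict

def fit_chain(sequences, alphabet, alpha=1.0):
    """Estimate first-order transition probabilities with add-alpha smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1.0
    model = {}
    for prev in alphabet:
        total = sum(counts[prev].values()) + alpha * len(alphabet)
        model[prev] = {nxt: (counts[prev][nxt] + alpha) / total for nxt in alphabet}
    return model

def log_likelihood(model, seq):
    """Log-probability of the observed transitions under one model."""
    return sum(math.log(model[prev][nxt]) for prev, nxt in zip(seq, seq[1:]))

def likelihood_score(case_model, control_model, seq):
    """Positive when the history looks more like the case cohort."""
    return log_likelihood(case_model, seq) - log_likelihood(control_model, seq)

# Toy diagnostic-code categories and histories (hypothetical, for illustration only).
alphabet = ["resp", "cardio", "other"]
case_histories = [
    ["other", "resp", "resp", "cardio", "resp"],
    ["resp", "resp", "other", "resp", "resp"],
]
control_histories = [
    ["other", "other", "cardio", "other", "other"],
    ["cardio", "other", "other", "other", "resp"],
]

# Case and control cohorts are modeled separately, as on the slide.
case_model = fit_chain(case_histories, alphabet)
control_model = fit_chain(control_histories, alphabet)

new_patient = ["other", "resp", "resp", "resp"]
score = likelihood_score(case_model, control_model, new_patient)
print(f"log-likelihood difference (case minus control): {score:.3f}")
```

A score of this kind can then be thresholded or fed to a downstream classifier; the threshold is a design choice that trades flags raised against false positives.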
ZeD Lab: Predictive Screening from Comorbidity Footprints
Nature Medicine
JAHA
Cell Reports
Science Adv.
1 in 59
Autism Spectrum Disorder
36
Autism Co-morbid Risk (ACoR) Score
Data: Onishchenko et al., Science Advances 2021
>5 Million in US. >13 Million in next 10 years
Alzheimer's Disease and Related Dementias
Current Practice: MoCA, Blood Tests
State of the art with EHR: ~67% AUC*
ZCoR: ~87%
Application to Suicide Attempts and Ideation (SISA), PTSD*:
perhaps surprising connection between mood disorders and physiological comorbidities
Gibbons RD, Kupfer D, Frank E, Moore T, Beiser DG, Boudreaux ED. Development of a Computerized Adaptive Test Suicide Scale-The CAT-SS. J Clin Psychiatry. 2017 Nov/Dec;78(9):1376-1382. doi: 10.4088/JCP.16m10922. PMID: 28493655.
* in press
The ZCoR Approach: Rapidly Re-targetable
Condition | ZeD performance | Competition |
---|---|---|
Autism | >80% AUC at 2 yrs | "obvious" |
Alzheimer's Disease | ~90% AUC | 60-70% AUC |
Idiopathic Pulmonary Fibrosis | ~90% AUC | NA |
MACE | ~80% AUC | ~70% AUC |
Bipolar Disorder | ~85% AUC | NA |
CKD | ~85% AUC | NA |
Cancers (Prostate, Bladder, Uterus, Skin) | ~75-80% AUC | Low |
Deploy all/many/most of these!
Application to Malignant Neoplasms
Melanoma
Melanoma has a high survival rate of over 90% when treated early. But if it progresses to later stages, the survival rate drops significantly. Identifying potentially life-threatening melanomas is crucial.
Medicine is poised to enter a transformative era, ushered in by the emergence of sophisticated Artificial Intelligence (AI) models.
Enable more holistic approaches to medicine, where predictive patterns can be rapidly recognized and exploited
Uncovering A Digital Twin of the Maturing Human Microbiome
ACT II
Sizemore, Nicholas, Kaitlyn Oliphant, Ruolin Zheng, Camilia R. Martin, Erika C. Claud, and Ishanu Chattopadhyay. "A digital twin of the infant microbiome to predict neurodevelopmental deficits." Science Advances 10, no. 15 (2024): eadj0400.
Ishanu Chattopadhyay
Nicholas Sizemore
Kaitlyn Oliphant
Erika Claud
THE PROBLEM
Can microbial assays from the gut actionably pre-empt developmental markers?
Assuming a 1000-species ecosystem and 1 successful experiment every day to discern a single two-way relationship, we would need 1,368 years to go through all possibilities. If we look for 3-way interactions, we would need 454,844 years.
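The scale of this estimate follows from counting unordered pairs and triples of species. The sketch below reproduces the arithmetic; the helper name `years_to_test` and the 365.25 days-per-year convention are assumptions, so the results land near the quoted figures rather than exactly on them.

```python
import math

def years_to_test(n_species, order, experiments_per_day=1, days_per_year=365.25):
    """Years needed to test every unordered group of `order` species once."""
    n_experiments = math.comb(n_species, order)
    return n_experiments / (experiments_per_day * days_per_year)

pairs = math.comb(1000, 2)    # 499,500 two-way relationships
triples = math.comb(1000, 3)  # 166,167,000 three-way interactions

print(f"two-way:   {pairs:,} experiments, about {years_to_test(1000, 2):,.0f} years")
print(f"three-way: {triples:,} experiments, about {years_to_test(1000, 3):,.0f} years")
```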
2019
PREEMPT
Can we predict the next pandemic?
Can we predict future mutations? Can we define the "edge of emergence"?
Digital Twins for complex systems
Chattopadhyay, Ishanu, Kevin Wu, Jin Li, and Aaron Esser-Kahn. "Emergenet: Fast Scalable Pandemic Risk Assessment of Influenza A Strains Circulating In Non-human Hosts." (2023). Under Review in Nature
PREEMPT
Q-Net
recursive forest
This is a general method!
Data
\(\downarrow \)
Set of interdependent
predictors
How do we measure "distance" between strains?
q-distance
a biologically informed, adaptive distance between strains
Smaller distances imply a quantifiably higher probability of a spontaneous jump
$$J \textrm{ is the Jensen-Shannon divergence }$$
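To make the role of the Jensen-Shannon divergence concrete, here is a minimal sketch. It assumes each strain has already been mapped to one predicted probability distribution per sequence position (the job the Q-Net performs in the talk). Averaging the square-rooted divergences across positions is an illustrative choice, not a formula confirmed by the slides, and the names `js_divergence` and `profile_distance` are hypothetical.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0.

    Only called below with q equal to the mixture of p and another
    distribution, so q_i > 0 whenever p_i > 0.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence J(p, q), bounded in [0, 1] with log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def profile_distance(dists_a, dists_b):
    """Average of sqrt(J) over positions; max(..., 0.0) guards rounding noise."""
    roots = [math.sqrt(max(js_divergence(p, q), 0.0)) for p, q in zip(dists_a, dists_b)]
    return sum(roots) / len(roots)

# Toy per-position distributions over a 4-letter alphabet (hypothetical values).
strain_a = [[0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25]]
strain_b = [[0.1, 0.7, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25]]

print(f"distance(a, b) = {profile_distance(strain_a, strain_b):.4f}")
print(f"distance(a, a) = {profile_distance(strain_a, strain_a):.4f}")
```

Taking the square root of J is a common choice because the square root of the Jensen-Shannon divergence is a metric, which fits the "Metric Structure" framing of the talk.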
Metric Structure
Tangent Bundle
geometry
dynamics
Sanov's Theorem & Pinsker's Inequality
Theorem
Assume:
stable strain \(x_{h}\), "well-adapted" \(\Rightarrow Pr(x_h\rightarrow x_h) \approx 1 \)
For "new" strain \(x_{a}\), \( \displaystyle \theta(x_{a},x_{h}) \approx 0 \)
Then, we have:
we can tell if new strain will adapt to humans
A Math Solution to a Hard Biological Problem
Influenza Risk Assessment Tool (IRAT) scoring for animal strains
Can we replicate IRAT scores*?
slow (months), quasi-subjective, expensive
*https://www.cdc.gov/flu/pandemic-resources/monitoring/irat-virus-summaries.htm
genomic analysis
receptor binding
animal transmission
antivirals available
population immunity
human infections
animal hosts
global prevalence
antigenic novelty
disease severity
24 scores in 14 years
~10,000 strains collected annually
Emergenet: finding emergence risk of animal strains
Emergenet time: 1 second
BioNorad
Stamping Out the Next Pandemic **Before** The First Human Infection
Let's go back to the Microbiome Problem
<class>_<observation_time>
<actinobacteria>_<30wk>
<clostridia>_<28wk>
construct qnet
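One way to read the naming pattern above is that every (taxonomic class, observation week) pair becomes its own column, so a longitudinal record flattens into a single row per patient. The sketch below illustrates that encoding only. The quantized abundance levels, the patient identifiers, and the helper names `encode_profile` and `build_table` are hypothetical, and the Q-Net construction itself is not shown.

```python
def encode_profile(observations):
    """Map (taxon_class, week, level) records to '<class>_<week>wk' features."""
    return {f"{taxon}_{week}wk": level for taxon, week, level in observations}

def build_table(patients):
    """Align all patients on a shared, sorted column set; missing cells are None."""
    rows = {pid: encode_profile(obs) for pid, obs in patients.items()}
    columns = sorted({col for row in rows.values() for col in row})
    table = {pid: [row.get(col) for col in columns] for pid, row in rows.items()}
    return columns, table

# Toy records with quantized abundance levels (hypothetical values).
patients = {
    "p1": [("actinobacteria", 30, "high"), ("clostridia", 28, "low")],
    "p2": [("clostridia", 28, "high")],
}

columns, table = build_table(patients)
print(columns)
for pid, row in table.items():
    print(pid, row)
```

Leaving unobserved cells as `None` keeps missing measurements explicit, which matters if a model is later asked to fill them in.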
Q-net inferred with typical patients
Q-net inferred with patients with neurodevelopmental deficit
completely uninformative state
observed state
Think of microbiome profiles as states
Risk of Time-stamped Microbial Profile to lead to Developmental Deficit
How different are the typical and deficit models?
Panels (typical vs. deficit): Bacilli 30, Coriobacteria 32, Gammaproteobacteria 32
All Patients
Feeding Variables added
Ability to "fill in" missing data is equivalent to making trajectory forecasts
Our risk measure is highly predictive and actionable
Which entities are most predictive?
Just add those microbes back?
No transplantation is guaranteed to work reliably
Predicted to reduce risk reliably
Supplantation MUST be personalized
Network Interpretations?
Typical
Deficit
Future
Answer the question: "what is a healthy microbiome?"
Explicit supplantation profiles that are tuned to individual ecosystems
Bioreactor experiments
What other problems can it solve?
Q-Nets
Digital Twins for complex systems
Mental health diagnosis
opinion dynamics
algorithmic lie detector
VeRITaAS
Can A Generative AI Tell if you Are Lying?
Vetting Response Integrity from cross-Talk in Adversarial Surveys
Hidden structure of cross-talk between responses to interview items
PTSD diagnostic interview
Q-Net
Number of possible responses
Minimum Performance (n=624)
Average Time: 3.5 min
No. of questions: 20
AUC > 0.95
PPV > 0.86
NPV > 0.92
At least 83.3% sensitivity at 94% specificity
Minimum AUC = \(0.95 \pm 0.005\)
Cannot be coached or memorized
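The sensitivity, specificity, PPV, and NPV figures above are linked through the prevalence of the condition in the tested population. The sketch below shows that relationship using the quoted operating point (83.3% sensitivity at 94% specificity). The prevalence value of 0.30 is a hypothetical input chosen for illustration and is not reported in the slides, and the function names are likewise assumptions.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)

def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from the expected confusion-matrix fractions."""
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)

sensitivity, specificity = 0.833, 0.94   # operating point quoted on the slide
assumed_prevalence = 0.30                # hypothetical, for illustration only

lr_plus = positive_likelihood_ratio(sensitivity, specificity)
ppv, npv = predictive_values(sensitivity, specificity, assumed_prevalence)
print(f"LR+ = {lr_plus:.1f}, PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Rerunning with a lower prevalence shows PPV falling while NPV rises, which is why predictive values from a validation cohort should not be carried over to a population with a different base rate.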
Datasets for training & validation
1. VA (n=294)
2. Prolific (n=300)
3. Psychiatrists (n=30)
Beat the test!
200 participants in US
100 participants in UK
30 forensic psychiatrists
Can-You-Fake-PTSD Challenge Results
successful attempts: 10, 6, 1
Future
Vision
Transform bio-surveillance
Transform modeling of complex systems
Transform early diagnosis
Democratize AI, unleashing its power for social good
Ishanu Chattopadhyay
ishanu@uchicago.edu