Near-zero-knowledge Pattern Discovery for Universal Screening for Complex Disorders
Ishanu Chattopadhyay, PhD
Assistant Professor of Medicine
University of Chicago
ishanu@uchicago.edu
09.12.2023
Ishanu Chattopadhyay, Ph.D., faculty for this educational activity, is the founder of Zero Burden Labs, Inc. and an advisor for Adiona Health. He also receives funding from the National Institute on Aging, the Alzheimer's Association, and DARPA (Defense Sciences Office, Biological Technologies Office). He has indicated that the presentation will not include off-label or unapproved product usage. All of the relevant financial relationships listed for this individual have been mitigated.
Disclosures
Learning Objectives
What is AI/Machine Learning? What are the key applications in the context of medicine? What does it bring to the table in the context of Health Services and Bio-medicine? Are there new questions that we can answer? Does it suffice to draw on off-the-shelf models? What are the new/emerging ideas?
Application of AI in Biomedicine: Why We Need a “Bio”-AI.
Emerging tools for addressing Late and Missed Diagnosis in Primary Care
Why “risk factors” are often not predictive enough, and how to think about more personalized predictors of future risk of serious diseases
CARC presentation
Delving Deeper into Learning Goals
Early screening of complex diseases by leveraging deep pattern discovery in history of medical encounters
Use AI to transform the landscape of early disease diagnosis, prevention, and treatment strategies for complex medical conditions.
Realize universal primary care low-burden screening for disorders for which potentially no recommended screening tools exist currently
Generalize beyond known “risk factors”, uncover personalized predictors of future risk of serious diseases from subtle comorbidity signatures
Universality: the Need for "bio"-AI
Autism
Idiopathic Pulmonary Fibrosis
Alzheimer's Disease and related dementia
Suicidality, PTSD
Perioperative Cardiac Event
Aggressive Melanoma
Uterine Cancer
Pancreatic Cancer
...
Zero-burden EHR Analytics
Diagnostic & Screening for complex disorders
*CoR : * Comorbid Risk Scores
ACoR (Autism)
PCoR (IPF/ILD)
ZCoR (ADRD/AD)
ZCoR-C (cancers with further specialization)
Leverage Vast Patient EHR and Insurance Claims Database(s)
Truven MarketScan (IBM) Commercial Claims & Encounters Database 2003-2018
87M patients visible > 1 year
>7B individual claims
>87K unique diagnostic codes
>7% Medicare data present
Why are ML/AI models complicated, non-transparent in general?
Individual data points are not so important
Tabularia in ancient Rome – storehouses of receipts from individual purchases to monitor state of commerce. (78 B.C.)
Tycho Brahe (1546-1601)
Johannes Kepler (1571-1630)
Newtonian theory of Universal Gravitation (1684)
30,000 experiments
Starting point of modern genetics
Mendel's Laws of Genetics
Johann Gregor Mendel (1822–1884)
Some datasets are large, but simple: easily compressible or representable
Others, are not.
"big data" has irreducible complexity
Hence, "models" must have capacity to accommodate this complexity
Machine Learning and AI allow us to find "theories" that are no longer specifiable as simple equations, but require billions of parameters to specify
aided by AI
The Scientific Method may be updated
More importantly...
Medical history
co-morbidities
lifestyle
genetics
environment
Estimate disease risk
Estimate prognosis
Reduce missed and delayed diagnosis
Find prodromal patients for clinical trials
The Age of Data
Risk
Machine Learning is poised to transform clinical discovery and outcome research
But we are not quite there...
Autism Spectrum Disorder + AI
Idiopathic Pulmonary Fibrosis + AI
Literature Search: AI + Target Disease
Current AI Applications are limited in practice
Are ML predictions pertaining to clinical diagnoses adding anything of relevance?
Risk
The Key Stumbling Block: Features
How to find good features?
Good features
relevant risk factors
Must do pattern discovery
Discover factors that modulate risk, beyond what is already known
Must account for the possibility of non-causal spurious associations
Lesson
The need for Universal Screening
Takes too long,
not supported by insurance,
"gut feeling" / "wait & see" common
IPF diagnosed from lung imaging using CNN
Alzheimer's diagnosed from brain scan
Autism diagnosed by "AI" after 3 years
Good for writing papers, not clinically useful
1 in 59
Autism Spectrum Disorder
ASD: Ineffective screening causes delays and incurs costs
Autistic children experience higher co-morbidities
Can we exploit these patterns to predict diagnosis?
Common Knowledge: Comorbidities Exist
Source: IBM MarketScan data
Autism Co-morbid Risk (ACoR) Score
Data: Onishchenko et al. Science Advances 2021
Autism Co-morbid Risk (ACoR) Score
MCHAT/F
Head to head comparison with current practice
Data: Onishchenko et al. Science Advances 2021
Autism Co-morbid Risk (ACoR) Score
Importance of different comorbidity categories
17 categories chosen:
immune | infections | endocrine | ...
Data: Onishchenko et al. Science Advances 2021
We automatically infer how different patterns depend on and modulate each other to impact overall risk
Joint Operation with MCHAT
CHOP Study allows us to see effectiveness of MCHAT in different sub-populations
Modulate sensitivity/specificity trade-offs
Data: Onishchenko et al. Science Advances 2021
Rapid Universal Point-of-care Screening for ILD/IPF Using Comorbidity Signatures in Electronic Health Records
Application 2:
shortness of breath
dry cough
doctor can hear velcro crackles
Common Symptoms
>50 years old
more men than women
IPF
Rare disease
~5 in 10,000
Post-Dx
Survival
~4 years
At least one misdiagnosis
~55%
Two or more misdiagnoses
38%
Initially attributed to age related symptoms:
72%
Cannot always be seen on CXR
Non-specific symptoms
PCP workflow demands
~ 4yrs
current survival ~4yrs
~ 4yrs
current clinical DX
ZCoR screening
Onishchenko, D., Marlowe, R.J., Ngufor, C.G. et al. Screening for idiopathic pulmonary fibrosis using comorbidity signatures in electronic health records. Nat Med 28, 2107–2116 (2022). https://doi.org/10.1038/s41591-022-02010-y
n=~3M
AUC~90%
Likelihood ratio ~30
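One way to read a likelihood ratio like the one quoted above is through the odds form of Bayes' rule: post-test odds equal pre-test odds multiplied by the positive likelihood ratio. The sketch below is only an illustration of that arithmetic. The function name `post_test_probability` and the 1% pre-test probability are assumptions made for the example, not values from the study.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Update a pre-test probability using the odds form of Bayes' rule."""
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood_ratio must be positive")
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)   # probability -> odds
    post_odds = pre_odds * likelihood_ratio            # apply the likelihood ratio
    return post_odds / (1.0 + post_odds)               # odds -> probability


# Hypothetical pre-test probability of 1%; the ratio of 30 mirrors the slide.
if __name__ == "__main__":
    print(f"{post_test_probability(0.01, 30.0):.3f}")
```

The same function can be rerun with a lower pre-test probability to see how strongly disease rarity limits the post-test probability of any single flag.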
Conventional AI/ML attempts to model the physician
AI in IPF Research
Primary Care
Pulmonologist
ZCoR Flag
ICD administrative codes
IPF
ILD
target codes appear
Past medical history
No target codes appear
case
control
2yrs
2yrs
IPF drugs prescribed
Signature of IPF diagnostic sequence
pirfenidone or nintedanib
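The case/control scheme sketched on this slide can be illustrated in code. This is a minimal sketch under stated assumptions, not the study's actual cohort construction: the codes in `TARGET_CODES` are hypothetical placeholders, the `reference_date` used for controls is an assumption, and the 2-year history window is read off the "2yrs" label on the slide.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Set, Tuple

# Hypothetical placeholders; the real target code lists are defined in the cited study.
TARGET_CODES: Set[str] = {"TARGET_A", "TARGET_B"}
# Assumed reading of the "2yrs" label: two years of past medical history.
HISTORY_WINDOW = timedelta(days=2 * 365)

Encounter = Tuple[date, str]  # (date of claim, diagnostic code)


@dataclass
class LabeledPatient:
    label: str                # "case" or "control"
    history: List[Encounter]  # encounters inside the history window


def label_patient(encounters: List[Encounter], reference_date: date) -> LabeledPatient:
    """Label a patient and keep only the history preceding the anchor date."""
    ordered = sorted(encounters)
    # First appearance of any target code, if there is one.
    first_target = next((day for day, code in ordered if code in TARGET_CODES), None)
    if first_target is not None:
        label, anchor = "case", first_target
    else:
        # Assumption: controls are anchored at a caller-supplied reference date.
        label, anchor = "control", reference_date
    window_start = anchor - HISTORY_WINDOW
    history = [(day, code) for day, code in ordered if window_start <= day < anchor]
    return LabeledPatient(label=label, history=history)
```

Only encounters strictly before the anchor date are kept, so that the target codes themselves never leak into the model input.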
Truven MarketScan (IBM) Commercial Claims & Encounters Database 2003-2018
>100M patients visible
>7B individual claims
>87K unique diagnostic codes
>7% Medicare data present
2,053,277 patients included in study
University of Chicago Medical Center 2012-2021
68,658 patients
Random sample from OptumLabs Data Warehouse, courtesy of Mayo Clinic
861,280 patients
2,983,215 patients
Data: Onishchenko et al. Nat. Medicine 2022
performance tables
MarketScan Out-of-sample Results
specificity~99%
NPV>99.9%
IPF
ILD
performance tables
UCM Out-of-sample Results
specificity~99%
NPV>99.9%
IPF
ILD
False Positives:
Ethics:
For every 20-30 flags,
1 is positive
minimal
acceptable?
Better outcomes
Collard, Harold R., Alex J. Ward, Stephan Lanes, D. Cortney Hayflinger, Daniel M. Rosenberg, and Elke Hunsche. "Burden of illness in idiopathic pulmonary fibrosis." Journal of medical economics 15, no. 5 (2012): 829-835.
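The "1 positive for every 20-30 flags" figure can be restated as a positive predictive value and as expected counts in a flagged group. The sketch below only does that arithmetic. The function names and the panel size of 600 flagged patients are hypothetical and chosen for illustration.

```python
from typing import Tuple


def ppv_from_flags(flags_per_true_positive: float) -> float:
    """Positive predictive value implied by 'one true positive per N flags'."""
    if flags_per_true_positive < 1.0:
        raise ValueError("need at least one flag per true positive")
    return 1.0 / flags_per_true_positive


def expected_outcomes(n_flagged: int, flags_per_true_positive: float) -> Tuple[float, float]:
    """Expected (true positives, false positives) among n_flagged patients."""
    true_pos = n_flagged * ppv_from_flags(flags_per_true_positive)
    return true_pos, n_flagged - true_pos


if __name__ == "__main__":
    # Hypothetical panel of 600 flagged patients, at both ends of the slide's range.
    for flags in (20, 30):
        tp, fp = expected_outcomes(600, flags)
        print(f"1 in {flags}: PPV={ppv_from_flags(flags):.3f}, TP~{tp:.0f}, FP~{fp:.0f}")
```

Seen this way, the ethical question on the slide is whether the follow-up burden on the false-positive group is acceptable given the benefit of earlier diagnosis for the true positives.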
Alzheimer's Disease and Related Dementia*
* in press
>5 Million in US. >13 Million in next 10 years
Alzheimer's Disease and Related Dementia
MOCA, Blood Tests
Current Practice:
state of art with EHR:
~67% AUC*
ZCoR: ~87%
Alzheimer's Disease and Related Dementia
state of art with EHR:
~67% AUC*
ZCoR: ~87%
Preempting ADRD accurately up to a decade into the future
Applicable To Screening for Mild Cognitive Impairment
Clinical Trial Participant Selection
Current screen-failure rate: 80-90%
Estimated rate with ZCoR:
40%
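To see what the screen-failure rates above mean for recruitment, the expected number of candidates to screen per enrolled participant is 1 / (1 - failure rate). The sketch below assumes every screened candidate passes independently with the same probability. The function name and the enrollment target of 100 are hypothetical.

```python
def expected_candidates_to_screen(target_enrollment: int, screen_failure_rate: float) -> float:
    """Expected number of candidates screened to enroll `target_enrollment` people."""
    if not 0.0 <= screen_failure_rate < 1.0:
        raise ValueError("screen_failure_rate must be in [0, 1)")
    pass_rate = 1.0 - screen_failure_rate
    return target_enrollment / pass_rate


if __name__ == "__main__":
    # Failure rates from the slide; the enrollment target of 100 is an assumption.
    for rate in (0.90, 0.80, 0.40):
        needed = expected_candidates_to_screen(100, rate)
        print(f"failure rate {rate:.0%}: screen about {needed:.0f} candidates")
```

Under these assumptions, moving from an 80-90% failure rate to 40% cuts the expected screening workload by roughly a factor of three to six.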
Application to Suicide Attempts and Ideation (SISA), PTSD*:
perhaps surprising connection between mood disorders and physiological comorbidities
Gibbons RD, Kupfer D, Frank E, Moore T, Beiser DG, Boudreaux ED. Development of a Computerized Adaptive Test Suicide Scale-The CAT-SS. J Clin Psychiatry. 2017 Nov/Dec;78(9):1376-1382. doi: 10.4088/JCP.16m10922. PMID: 28493655.
* in press
Application to Malignant Neoplasms*
Melanoma
Melanoma has a high survival rate of over 90% when treated early. But if it progresses to later stages, the survival rate drops significantly. Identifying potentially life-threatening melanomas is crucial.
* in press
Take-Home Message, Conclusions, and Next Steps
Reading (References)
Onishchenko, Dmytro, Yi Huang, James van Horne, Peter J. Smith, Michael E. Msall, and Ishanu Chattopadhyay. “Reduced False Positives in Autism Screening via Digital Biomarkers Inferred from Deep Comorbidity Patterns.” Science Advances 7, no. 41 (October 8, 2021). https://doi.org/10.1126/sciadv.abf0354.
Onishchenko, Dmytro, Daniel S. Rubin, James R. van Horne, R. Parker Ward, and Ishanu Chattopadhyay. “Cardiac Comorbidity Risk Score: Zero‐Burden Machine Learning to Improve Prediction of Postoperative Major Adverse Cardiac Events in Hip and Knee Arthroplasty.” Journal of the American Heart Association 11, no. 15 (August 2, 2022). https://doi.org/10.1161/jaha.121.023745.
Onishchenko, Dmytro, Robert J. Marlowe, Che G. Ngufor, Louis J. Faust, Andrew H. Limper, Gary M. Hunninghake, Fernando J. Martinez, and Ishanu Chattopadhyay. “Screening for Idiopathic Pulmonary Fibrosis Using Comorbidity Signatures in Electronic Health Records.” Nature Medicine 28, no. 10 (September 29, 2022): 2107–16. https://doi.org/10.1038/s41591-022-02010-y.
Huang, Yi, Victor Rotaru, and Ishanu Chattopadhyay. “Sequence Likelihood Divergence for Fast Time Series Comparison.” Knowledge and Information Systems 65, no. 7 (March 16, 2023): 3079–98. https://doi.org/10.1007/s10115-023-01855-0.
Brenner, Lisa A., Lisa M. Betthauser, Molly Penzenik, Anne Germain, Jin Jun Li, Ishanu Chattopadhyay, Ellen Frank, David J. Kupfer, and Robert D. Gibbons. "Development and validation of computerized adaptive assessment tools for the measurement of posttraumatic stress disorder among US military veterans." JAMA Network Open 4, no. 7 (2021): e2115707-e2115707.