Towards a General Theory of Digital Twins In Medicine and Social Modeling
Ishanu Chattopadhyay, PhD
Assistant Professor of Medicine
University of Kentucky
ishanu@uchicago.edu
first wave
rule-based systems
second wave
Big Data / ML / Deep Learning
recognize patterns, make predictions, might improve over time, but struggle on tasks not trained for
third wave
contextual reasoning, generalizable, towards true intelligence
PhD
Postdoc
ZeDLAB
Engineering
Computer Sc
Medicine
Career Trajectory
The Laboratory for Zero Knowledge Discovery
AI/ML learning theory and applications
Implications of AI for the Future of Society
Complex systems
Social interactions & opinion dynamics
Personalized medicine
collaborators
Alex Leow
Psychiatry UIC
Anna Podolanczuk, Pulmonary Care, Weill Cornell
Gary Hunninghake, Pulmonary C, Harvard
Robert Gibbons, Biostatistics
Daniel Rubins, Anesthesia and Critical Care
Peter Smith, Pediatrics
Michael Msall, Pediatrics
Fernando Martinez, Pulmonary Critical Care, Weill Cornell
James Mastrianni, Neurology
James Evans, Sociology
Erika Claud, Pediatrics
Aaron Esser-Kahn, Molecular Engineering
David Llewellyn
University of Exeter
Kenneth Rockwood
Dalhousie University
Andrew Limper, Mayo Clinic
Department of Pediatrics
UChicago
Department of Neurology & The Memory Center
UChicago
Department of Psychiatry
UChicago
Pulmonary Critical Care, Weill Cornell
Department of Anesthesia and Critical Care
UChicago
Center for Health Statistics
UChicago
Pulmonary Critical Care, Harvard Medical School
Department of Psychiatry
UIC
Demon Network, Exeter, Alan Turing Institute, UK
Dalhousie University, Canada
Pritzker School of Molecular Engineering
Social Science
UChicago
Our Collaborations
D3M (I2O)
PAI (DSO)
PREEMPT (BTO)
YFA (DSO)
NIA
$
Predictive Modeling of Complex Systems
~3.5M USD in 5 years
Publications & Impact
Nature Medicine
Nature Human Behaviour
Nature Communications
Science Advances
(3)
PNAS
JAMA
JAHA
JACC
Modeling & predicting complex social interactions
Point-of-care screening for complex diseases
AI
Electronic Healthcare Record
IPF
ASD
ADRD
ZeD Research Thrusts
General framework for inferring digital twins in biology and medicine
Hint: probably not what the classical engineering and design industry meant in the 2000s.
Old Digital Twins:
The first use of the term "digital twin" is generally attributed to Dr. Michael Grieves in a 2002 presentation on product lifecycle management (PLM) at the University of Michigan.
Dr. Grieves discussed the idea of having a virtual representation of a physical product, which would exist throughout the product's lifecycle. This digital model would be used to simulate, predict, and optimize the product's performance, both during design and after it was built. The digital twin would be continuously updated with data from the physical product, enabling real-time analysis and decision-making.
Connected body of models, equations, physics at multiple scales, with observational data to inform states, useful over entire life-cycle of the system
Digital Twin: Generative AI for Complex Systems
"Physics" is unknown/emergent.
Data: multi-modal, disparate data types, disparate scales, noisy, incomplete, often unlabeled
ZCoR Suite:
Disease-specific Digital Twin
~ 4yrs
current survival ~4yrs
~ 4yrs
current clinical DX
ZCoR screening
Onishchenko, D., Marlowe, R.J., Ngufor, C.G. et al. Screening for idiopathic pulmonary fibrosis using comorbidity signatures in electronic health records. Nat Med 28, 2107–2116 (2022). https://doi.org/10.1038/s41591-022-02010-y
n=~3M
AUC~90%
Likelihood ratio ~30
Data: Onishchenko et al., Nat. Med. 2022
patient A
patient B
patient C
Beyond "risk factors" to personalized risk patterns
Up to 4-year "signal" resolution
decreases risk
increases risk
Patient Journey: Tracking Risk over time
>5 million in the US; >13 million in the next 10 years
Alzheimer's Disease and Related Dementias
MOCA, Blood Tests
Current Practice:
State of the art with EHR:
~67% AUC*
ZCoR: ~87%
Preempting ADRD accurately up to a decade into the future
Autism
MCHAT/F
1 in 59
36
ZeD Lab: Predictive Screening from Comorbidity Footprints
CELL Reports
Condition | ZCoR | Competition |
---|---|---|
Autism | >83% | "obvious" |
Alzheimer's Disease | ~90% | 60-70% |
Idiopathic Pulmonary Fibrosis | ~90% | NA |
MACE | ~80% | ~70% |
Bipolar Disorder | ~85% | NA |
CKD | ~85% | NA |
Rare Cancers (Bladder, Uterus) | ~75-80% | Low |
Suicidality (with CAT-SS) | 98% PPV | Low |
Off-the-shelf AI does not suffice
Odds ratios combined via ML
1
Data
cases
control
odds ratios for all ICD codes
ML Model
odds-based risk estimator
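A minimal sketch of the odds-based baseline named above, assuming simple per-code counts in a case cohort and a control cohort. The counts, the 0.5 continuity correction, and the helper names `odds_ratio` and `log_odds_score` are illustrative assumptions; they are not taken from the ZCoR pipeline.

```python
import math


def odds_ratio(case_with, case_total, ctrl_with, ctrl_total):
    """Odds ratio for one ICD code.

    A 0.5 continuity correction is added to every cell (an assumption made
    here so that zero counts do not cause division by zero).
    """
    a = case_with + 0.5                 # cases carrying the code
    b = case_total - case_with + 0.5    # cases without the code
    c = ctrl_with + 0.5                 # controls carrying the code
    d = ctrl_total - ctrl_with + 0.5    # controls without the code
    return (a / b) / (c / d)


def log_odds_score(patient_codes, odds_by_code):
    """Naive risk score: sum of log odds ratios over the codes a patient has."""
    return sum(math.log(odds_by_code[c]) for c in patient_codes if c in odds_by_code)


# Hypothetical counts: code -> (cases with code, controls with code),
# with 100 patients in each cohort.
counts = {"J01.90": (40, 10), "Z38.00": (50, 50)}
odds = {code: odds_ratio(cw, 100, tw, 100) for code, (cw, tw) in counts.items()}
score = log_odds_score(["J01.90", "Z38.00"], odds)
```

A score of this kind treats each code independently and ignores the order and timing of diagnoses, which is the limitation the longitudinal approach is meant to address.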
Probabilistic Finite State
Map health history to trinary streams
Chattopadhyay, Ishanu, and Hod Lipson. "Abductive learning of quantized stochastic processes with probabilistic finite automata." Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 371, no. 1984 (2013): 20110543.
2
Longitudinal stochastic patterns
PFSAs
from code sequences
Model control and case cohorts separately
Given a new test case, compute the likelihood of the sample arising from the case models vs. the control models
sequence likelihood defect
Huang, Yi, Victor Rotaru, and Ishanu Chattopadhyay. "Sequence likelihood divergence for fast time series comparison." Knowledge and Information Systems 65, no. 7 (2023): 3079-3098.
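The case-versus-control comparison can be sketched as a log-likelihood gap between two sequence models. This sketch swaps in a first-order Markov chain with Laplace smoothing as a simplified stand-in for a PFSA, and a plain log-likelihood ratio rather than the sequence likelihood defect of the cited paper; the trinary encoding, the toy sequences, and all function names are assumptions.

```python
import math
from collections import defaultdict

ALPHABET = (0, 1, 2)  # trinary symbols; what each symbol encodes is assumed


def fit_markov(sequences, alpha=1.0):
    """Fit first-order transition probabilities with Laplace smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1.0
    model = {}
    for s in ALPHABET:
        total = sum(counts[s][t] for t in ALPHABET) + alpha * len(ALPHABET)
        model[s] = {t: (counts[s][t] + alpha) / total for t in ALPHABET}
    return model


def log_likelihood(seq, model):
    """Log-probability of the transitions in seq under the model."""
    return sum(math.log(model[p][n]) for p, n in zip(seq, seq[1:]))


def likelihood_gap(seq, case_model, control_model):
    """Positive values mean the sequence looks more like the case cohort."""
    return log_likelihood(seq, case_model) - log_likelihood(seq, control_model)


# Toy cohorts (hypothetical): cases dwell on symbol 1, controls on symbol 0.
case_model = fit_markov([[0, 1, 1, 2, 1, 1], [1, 1, 2, 1, 1, 0]])
control_model = fit_markov([[0, 0, 0, 2, 0, 0], [0, 0, 2, 0, 0, 0]])
gap = likelihood_gap([0, 1, 1, 1, 2, 1], case_model, control_model)
```

In practice one such gap per disease category could serve as a feature for a downstream classifier, though that wiring is not shown here.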
Cloud Deployment
Theoretical formulation
Multi-cohort validation
Launch User-Accessible Platform
3 years
2 years
[
  {
    "patient_id": "P000038",
    "sex": "F",
    "birth_date": "01-01-2006",
    "DX_record": [
      {"date": "07-31-2006", "code": "Z38.00"},
      {"date": "08-07-2006", "code": "P59.9"},
      {"date": "08-29-2016", "code": "J01.90"},
      {"date": "09-10-2016", "code": "J01.90"},
      {"date": "11-14-2016", "code": "J01.91"}
    ],
    "RX_record": [
      {"date": "10-29-2011", "code": "rxLDA017"},
      {"date": "05-16-2015", "code": "rxIDG004"},
      {"date": "08-08-2015", "code": "rxIDG004"},
      {"date": "06-04-2016", "code": "rxIDD013"}
    ],
    "PROC_record": [
      {"date": "02-05-2007", "code": "90723"},
      {"date": "11-05-2007", "code": "J1100"}
    ]
  }
]
{
  "predictions": [
    {
      "error_code": "",
      "patient_id": "P000012",
      "predicted_risk": 0.005794344620009157,
      "probability": 0.8253881317184486
    }
  ],
  "target": "TARGET"
}
Data In
Data Out
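A minimal sketch of how a client might read the input format shown above before submitting it. The field names come from the sample record; the `MM-DD-YYYY` date format is inferred from the sample values, and the helper `age_at_events` is hypothetical, not part of any published API.

```python
import json
from datetime import datetime

DATE_FMT = "%m-%d-%Y"  # assumed from sample values such as "07-31-2006"

SAMPLE = """
[
  {
    "patient_id": "P000038",
    "sex": "F",
    "birth_date": "01-01-2006",
    "DX_record": [
      {"date": "07-31-2006", "code": "Z38.00"},
      {"date": "08-07-2006", "code": "P59.9"}
    ],
    "RX_record": [],
    "PROC_record": []
  }
]
"""


def age_at_events(patient, record_key="DX_record"):
    """Return (code, age in years) pairs for one record type, oldest first."""
    birth = datetime.strptime(patient["birth_date"], DATE_FMT)
    events = []
    for ev in patient.get(record_key, []):
        when = datetime.strptime(ev["date"], DATE_FMT)
        events.append((ev["code"], round((when - birth).days / 365.25, 2)))
    return sorted(events, key=lambda pair: pair[1])


patients = json.loads(SAMPLE)
dx_ages = age_at_events(patients[0])
```

Parsing dates up front catches malformed records early; `datetime.strptime` raises `ValueError` on any date that does not match the assumed format.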
Cohort Selection and Risk Analysis Testbed
Misleading Diagnosis of Idiopathic Pulmonary Fibrosis: A Clinical Concern
Javier Ramos-Rossy, MD, Onix Cantres-Fonseca, MD, Ginger Arzon-Nieves, Yomayra Otero-Dominguez, MD, Stella Baez-Corujo, MD, and William Rodríguez-Cintrón, MD
General Digital Twins
General framework for inferring digital twins in biology and medicine
Stamping Out the Next Pandemic **Before** The First Human Infection
BioNorad
Q-Net
recursive forest
q-distance
a biologically informed, adaptive distance between strains
Smaller distances imply a quantifiably higher probability of a spontaneous jump
$$J \textrm{ is the Jensen-Shannon divergence }$$
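For reference, the Jensen-Shannon divergence between two discrete distributions can be computed as below. This is the standard definition with base-2 logarithms (so values lie in [0, 1]); how the q-distance aggregates such divergences across sequence positions is not reproduced here, and the function names are illustrative.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 vanish."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical distributions over four symbols (e.g. nucleotides at one position).
same = jensen_shannon([0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25])
disjoint = jensen_shannon([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0])
```

Unlike the KL divergence, this quantity is symmetric and always finite, which makes it a natural building block for a distance between strains.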
Metric Structure
Tangent Bundle
geometry
dynamics
Influenza Risk Assessment Tool (IRAT) scoring for animal strains
slow (months), quasi-subjective, expensive
*https://www.cdc.gov/flu/pandemic-resources/monitoring/irat-virus-summaries.htm
24 scores in 14 years
~10,000 strains collected annually
CDC
Emergenet time: 1 second
THE PROBLEM
Assuming a 1000 species ecosystem, and 1 successful experiment every day to discern a single two-way relationship, we would need 1,368 years to go through all possibilities.
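The figure of 1,368 years can be reproduced with a short calculation, assuming that a "two-way relationship" means an unordered pair of species and that a year has 365 days (both are assumptions that happen to match the stated number).

```python
import math

species = 1000
experiments_per_day = 1

# Number of unordered species pairs: C(1000, 2)
pairs = math.comb(species, 2)

# One pairwise relationship resolved per day, 365 days per year
years = pairs / (experiments_per_day * 365)

print(f"{pairs} pairs, about {years:.0f} years")
```

This counts pairwise relationships only; higher-order interactions among three or more species would make the number far larger.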
Digital Twin for the Maturing Human Microbiome
Boston U
U Chicago
Two centers
Ability to "fill in" missing data is equivalent to making trajectory forecasts
predicting neurodevelopmental deficits
forecasting ecosystem trajectories
Which entities are most predictive of neurodevelopmental deficit
entity X timestamp
SHAP value
No transplantation is guaranteed to work reliably
Just add those microbes back to reduce risk?
No!
Bacterial transplantation must be personalized
Future task:
Explicit supplantation profiles that are tuned to individual ecosystems
Problem: Can AI predict how we think and interact?
Can we predict how opinions evolve?
Digital Twins for complex systems
YFA 2020
Can an AI tell if you are lying?
Can an AI tell how you are going to vote?
Yang, David, James Evans, and Ishanu Chattopadhyay. "‘Its the Economy Stupid’: Predictive Theory of Belief Shift Connecting Economic Stress to Societal Polarization." (2023).
Emergent Recursive Forest in GSS
Modeling Responses to PTSD Evaluation
The Cognet Framework
Digital Twin of Opinion dynamics
predict worldviews from incomplete data
Identify malingering in psychiatric diagnoses
GSS variable | Actual (masked) | Reconstructed |
---|---|---|
spkcom | allowed | allowed |
colcom | not fired | not fired |
spkmil | allowed | allowed |
colmil | allowed | not allowed |
libmil | not remove | not remove |
libhomo | not remove | not remove |
reliten | strong | no religion |
pray | once a day | once a day |
bible | inspired word | word of god |
abhlth | yes | yes |
abpoor | no | no |
pillok | agree | agree |
intmil | very interested | very interested |
abpoorw | always wrong | not wrong at all |
godchnge | believe now, always have | believe now, always have |
prayfreq | several times a week | several times a week |
religcon | strongly disagree | disagree |
religint | disagree | disagree |
comfort | strongly agree | neither agree nor disagree |
Reconstruction
Example 1
GSS variable | Actual (masked) | Reconstructed |
---|---|---|
spkcom | allowed | allowed |
colcom | not fired | not fired |
libmil | not remove | not remove |
libhomo | not remove | not remove |
gunlaw | favor | favor |
reliten | no religion | no religion |
prayer | approve | approve |
bible | book of fables | inspired word |
abnomore | yes | yes |
abhlth | yes | yes |
abpoor | yes | yes |
abany | yes | yes |
owngun | no | no |
intmil | moderately interested | moderately interested |
abpoorw | not wrong at all | not wrong at all |
godchnge | believe now, didn't used to | believe now, always have |
prayfreq | several times a week | several times a week |
religcon | strongly agree | agree |
religint | strongly agree | neither agree nor disagree |
Reconstruction
Example 2
GSS variable | Actual (masked) | Reconstructed |
---|---|---|
spkcom | allowed | allowed |
colcom | not fired | not fired |
libcom | not remove | not remove |
libmil | not remove | not remove |
libhomo | not remove | not remove |
libmslm | not remove | not remove |
gunlaw | favor | favor |
reliten | not very strong | strong |
pray | once a week | several times a day |
bible | inspired word | word of god |
abdefect | yes | yes |
abhlth | yes | yes |
abrape | yes | yes |
pillok | strongly agree | agree |
shotgun | no | no |
abpoorw | not wrong at all | not wrong at all |
godchnge | don't believe now, used to | believe now, always have |
religcon | disagree | agree |
comfort | strongly agree | agree |
Reconstruction
Example 3
Digital Twins for complex systems
Darkome
teomims
opinion dynamics
algorithmic lie detector
Mental health diagnosis
viral emergence
microbiome
Phase 1
Phase 2
PREPARE: Pioneering Research for Early Prediction of Alzheimer's and Related Dementias EUREKA Challenge
Algorithm for early diagnosis
Find Data for early prediction
Phase 1
Phase 2
Second Prize 40,000 USD
Let's give them:
licensed patient data
digital twin
(generative AI)
teomims
(open cohort)
Phase 1
Phase 2
Uncorrelated, yet indistinguishable!
VeRITaAS
Can A Generative AI Tell if you Are Lying?
Vetting Response Integrity from cross-Talk in Adversarial Surveys
Q-Net
Hidden structure of cross-talk between responses to interview items
PTSD diagnostic interview
Number of possible responses
Minimum Performance (n=624)
Average Time: 3.5 min
No. of questions: 20
AUC > 0.95
PPV > 0.86
NPV > 0.92
At least 83.3% sensitivity at 94% specificity
Minimum AUC = \(0.95 \pm 0.005\)
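PPV and NPV depend on how common the condition is in the tested population, not only on sensitivity and specificity. The sketch below shows the standard Bayes-rule relationship; the prevalence value is purely hypothetical, since the slide does not state one, and the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Operating point quoted on the slide; the prevalence is an assumed placeholder.
ppv, npv = predictive_values(sensitivity=0.833, specificity=0.94, prevalence=0.3)
```

Re-running the function across a range of prevalence values shows how the same operating point can yield very different predictive values in screening versus forensic settings.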
Cannot be coached, or memorized
Datasets for training & validation
1. VA (n=294)
2. Prolific (n=300)
3. Psychiatrists (n=30)
Beat the test!
200 participants in US
100 participants in UK
30 forensic psychiatrists
10
6
1
Can-You-Fake-PTSD Challenge Results
successful attempts
Large Science Models (LSM) and Conservation of Complexity
Large Science Models (LSMs)
Expanding the Scientific Method:
Nicholas Sizemore et al. A digital twin of the infant microbiome to predict neurodevelopmental deficits. Sci. Adv. 10, eadj0400 (2024). DOI: 10.1126/sciadv.adj0400
\(E = mc^2\)
Complex systems have irreducible complexity.
Generative models of complex systems must have complex structure, which can only be recovered via AI-leveraged methods
Model complexity
data to model uncertainty
A Kolmogorov twin \(S\) for data \(x\) is a model that is 1) typical, 2) optimal, and 3) of maximal complexity.
theorem
corollary
Impact on Popular Discourse on AI
Media Coverage
In
National Pop-culture Discourse
Interviews, Op-eds, and Forum Appearances
Rotaru, Victor, Yi Huang, Timmy Li, James Evans, and Ishanu Chattopadhyay. "Event-level prediction of urban crime reveals a signature of enforcement bias in US cities." Nature Human Behaviour 6, no. 8 (2022): 1056-1068.