Ishanu Chattopadhyay

Assistant Professor of

Data Science

University of Chicago

The Emerging Crystal Ball:

AI in Preemptive Medicine

&

Complex Systems

zed.uchicago.edu

  • Dr. Shahab Asoodeh

  • Dr. Yi Huang

  • Dmytro Onishchenko

  • Victor Rotaru

  • Jin Li

  • Ruolin Zhang

  • David Yang

 
  • Dr. Nicholas Sizemore

  • Drew Vlasnik

  • Lucas Mantovani

  • Jaydeep Dhanoa

  • Jasmine Mithani

  • Angela Zhang

  • Warren Mo

 


Department of Pediatrics

UChicago

Department of Neurology & The Memory Center

UChicago

Department of Psychiatry

UChicago

Pulmonary Critical Care, Weill Cornell

Department of Anesthesia and Critical Care

UChicago

Center for Health Statistics

UChicago

Pulmonary Critical Care, Harvard Medical School

Department of Psychiatry

UIC

Demon Network, Exeter, Alan Turing Institute, UK

Dalhousie University, Canada

Pritzker School of Molecular Engineering

Social Science

UChicago

AI Awakens

Explains better than human students taking an introductory course

End

of

Theory?

Rapid Universal Point-of-care Screening for ILD/IPF Using Comorbidity Signatures in Electronic Health Records

shortness of breath

dry cough

doctor can hear velcro crackles

Common Symptoms

>50 years old

more men than women

IPF

Rare disease

~5 in 10,000

Post-Dx

Survival

~4 years

At least one misdiagnosis

~55%

Two or more misdiagnoses

38%

Initially attributed to age-related symptoms:

72%

Cannot always be seen on CXR

Non-specific symptoms

PCP workflow demands

Initial misdiagnoses

Timeline (figure): ZCoR screening flag, then ~4 yrs to current clinical Dx, then post-Dx survival of ~4 yrs

Onishchenko, D., Marlowe, R.J., Ngufor, C.G. et al. Screening for idiopathic pulmonary fibrosis using comorbidity signatures in electronic health records. Nat Med 28, 2107–2116 (2022). https://doi.org/10.1038/s41591-022-02010-y

n=~3M

AUC~90%

Likelihood ratio ~30

Conventional AI/ML attempts to model the physician

AI in IPF Research

  • Co-morbidity Patterns
  • No data demands
  • Use whatever data is already in patient file
  • Discover and leverage comorbidity patterns

Primary Care

Pulmonologist

ZCoR Flag

  • No blood tests
  • No imaging
  • No pulmonary function tests

ICD administrative codes

IPF

ILD

target codes appear

Past medical history

No target codes appear

case

control

2yrs

2yrs

1YR

IPF drugs prescribed

Signature of IPF diagnostic sequence

pirfenidone or nintedanib

  • age > 50 years
  • at least two IPF target codes identified at least 1 month apart 
  • chest CT procedure (ICD-9-CM 87.41 and Current Procedural Terminology, 4th Edition, codes 71250, 71260 and 71270) before the first diagnostic claim for IPF
  • no claims for alternative ILD codes occurring on or after the first IPF claim


Truven MarketScan (IBM)
Commercial Claims & Encounters Database
2003-2018

>100M patients visible 

>7B individual claims

>87K unique diagnostic codes

>7% Medicare data present

2,053,277 patients included in study

University of Chicago Medical Center 
2012-2021

68,658 patients

Random sample from OptumLabs Data Warehouse, courtesy of Mayo Clinic

861,280 patients 

2,983,215 patients

performance tables

MarketScan Out-of-sample Results

specificity ~99%

NPV >99.9%

IPF

ILD

performance tables

UCM Out-of-sample Results

specificity ~99%

NPV >99.9%

IPF

ILD

False Positives: 

  • Healthcare Capacity

Ethics:

  • Risk from Imaging Tests

For every 20-30 flags,

1 is positive

  • General likelihood ratio 60-80
  • PPV 3.5-5%
  • Notifying patients 4 years early?
  • No cure, why screen
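The flag-to-case ratio and the PPV quoted above are two views of the same number, and a likelihood ratio converts a pre-test prevalence into a post-test probability. A minimal sketch of that arithmetic follows. The function names are illustrative, and the prevalence used in the second loop is the population figure of ~5 in 10,000 from the earlier slide, which is an assumption here because the prevalence in the screened cohort may differ.

```python
def flags_per_true_case(ppv: float) -> float:
    """Average number of flags raised per true positive, given the PPV."""
    return 1.0 / ppv


def ppv_from_likelihood_ratio(prevalence: float, lr_positive: float) -> float:
    """Post-test probability of disease after a positive flag (Bayes, odds form)."""
    pre_odds = prevalence / (1.0 - prevalence)
    post_odds = pre_odds * lr_positive
    return post_odds / (1.0 + post_odds)


if __name__ == "__main__":
    for ppv in (0.035, 0.05):
        print(f"PPV {ppv:.1%}: about {flags_per_true_case(ppv):.0f} flags per true case")
    # Assumed prevalence: the ~5 in 10,000 population figure; the screened cohort may differ.
    for lr in (60, 80):
        print(f"LR+ {lr}: PPV = {ppv_from_likelihood_ratio(5 / 10_000, lr):.1%}")
```

A PPV of 3.5-5% corresponds to roughly 20-29 flags per confirmed case, in line with the 20-30 quoted above.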

minimal

acceptable?

Better outcomes

Collard, Harold R., Alex J. Ward, Stephan Lanes, D. Cortney Hayflinger, Daniel M. Rosenberg, and Elke Hunsche. "Burden of illness in idiopathic pulmonary fibrosis." Journal of medical economics 15, no. 5 (2012): 829-835.

  • Early anti-fibrotic therapy seems increasingly promising
  • Better shot at lung transplant
  • Early Dx reduces hospitalizations by a factor of 1-3

Future

ZCoR 2.0

1

2

3

Deploy as an Epic App

primary care

secondary care

ZCoR

  • Patient outcomes
  • Healthcare utilization

Measure

ZCoR

clinical notes

imaging analytics

The Team

Gary Hunninghake, Pulmonary Care, Harvard Medical School

Fernando Martinez, Pulmonary Critical Care, Weill Cornell

Andrew Limper, Thoracic Research Unit, Mayo Clinic

Dmytro Onishchenko, UChicago

Robert Marlowe,

Medical Comm

Che G. Ngufor

Mayo Clinic

Louis J. Faust

Mayo Clinic

ishanu@uchicago.edu

Method Details

Longitudinal history is important, cannot simply process snapshots

* For IPF screening


Comparison of ZCoR with off-the-shelf AI

Leveraging Longitudinal Patterns

Specialized HMM models from code sequences

Model control and case cohorts separately

Given a new test case, compute the likelihood of the sample arising from case models vs. control models

sequence likelihood defect
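The case-versus-control comparison can be sketched as a log-likelihood difference between two sequence models. The sketch below swaps in first-order Markov chains with add-one smoothing for the specialized HMM models named above, so it illustrates the comparison only, not the ZCoR models. The toy code sequences, the three coarse categories, and all function names are hypothetical, and `likelihood_defect` borrows the slide's term loosely.

```python
import math
from collections import defaultdict


def fit_markov(sequences, alphabet):
    """Fit first-order transition probabilities with add-one smoothing."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    model = {}
    for prev in alphabet:
        total = sum(counts[prev].values()) + len(alphabet)
        model[prev] = {nxt: (counts[prev][nxt] + 1) / total for nxt in alphabet}
    return model


def log_likelihood(seq, model):
    """Log-probability of the transitions in seq under a fitted model."""
    return sum(math.log(model[p][n]) for p, n in zip(seq, seq[1:]))


def likelihood_defect(seq, case_model, control_model):
    """Positive when the sequence is better explained by the case model."""
    return log_likelihood(seq, case_model) - log_likelihood(seq, control_model)


# Hypothetical toy histories over three coarse code categories.
ALPHABET = ["resp", "cardio", "other"]
case_histories = [["other", "resp", "resp", "cardio", "resp"],
                  ["resp", "resp", "other", "resp", "resp"]]
control_histories = [["other", "other", "cardio", "other", "other"],
                     ["cardio", "other", "other", "other", "cardio"]]

case_model = fit_markov(case_histories, ALPHABET)
control_model = fit_markov(control_histories, ALPHABET)

new_patient = ["other", "resp", "resp", "resp"]
score = likelihood_defect(new_patient, case_model, control_model)
print(f"sequence likelihood defect: {score:+.3f}")
```

Modeling the two cohorts separately means the score depends on the order of codes in the history, not only on which codes are present.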

ZeD Lab: Predictive Screening from Comorbidity Footprints

Nature Medicine

JAHA

CELL Reports

Science Adv.


Condition | ZeD performance | Competition
Autism | >80% AUC at 2 yrs | Double false positives
Alzheimer's Disease | ~90% AUC | 60-70% AUC
Idiopathic Pulmonary Fibrosis | ~90% AUC | NA
MACE | ~80% AUC | ~70% AUC
Bipolar Disorder | ~85% AUC | NA
CKD | ~85% AUC | NA
Cancers | ~75% AUC | NA

The ZeD Pipeline prototype for risk estimation from co-morbidity signatures

Primary Care

Risk

No additional tests

Clinically Useful

Advance Science

Bio-AI

Machine Learning

Information Theory

Economics

Healthcare Policy

Ethics

Comorbidities

Unknown Risk factors

Known risks

Knowledge of underlying genetic and epigenetic pathways


Unknown Risk factors

Machine Learning

Machine Learning

Features are known

Does not work for complex systems

rare/extreme events

weather

seismic phenomena

urban crime

Rotaru, V., Huang, Y., Li, T. et al. Event-level prediction of urban crime reveals a signature of enforcement bias in US cities. Nature Human Behaviour 6, 1056–1068 (2022). https://doi.org/10.1038/s41562-022-01372-0

Urban Crime

 

Conflicts & Terrorism

 

Extreme weather phenomena

 

Seismic events

Irreducible

Complexity

Long-range memory

 

Non-trivial stochastic effects

Predictive Policing

Broad patterns are easily predictable

The Problem of Free Will

Hotspots?

actual patterns are more non-trivial

complex urban topology

Complex Urban Topology

2 weeks, Chicago, 2017

property crime

violent crime

Not everything is predictable

Not everything is random

 

Find and use patterns that predict future events

 

Validate such patterns in out-of-sample data

Some historical patterns are predictive

Generating Event Streams

variables: <location,category>

arrests
violent crimes
nonviolent crimes

~3000 location tiles

~9000 variables

~40 million binary interactions

~ 1 billion possible models of binary interaction
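The variable and interaction counts above follow from simple combinatorics. A quick check, assuming three event categories per tile (arrests, violent crimes, nonviolent crimes) and reading "binary interactions" as unordered pairs of variables. The derivation of the ~1 billion model count is not given on the slide and is not reproduced here.

```python
import math

n_tiles = 3000       # approximate number of location tiles
n_categories = 3     # arrests, violent crimes, nonviolent crimes

n_variables = n_tiles * n_categories
n_pairs = math.comb(n_variables, 2)   # unordered pairs of distinct variables

print(n_variables)  # 9000
print(n_pairs)      # 40495500
```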

ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,Beat,District,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
8316800,HT550945,08/11/2011 11:00:00 AM,086XX S MARQUETTE AVE,1120,DECEPTIVE PRACTICE,FORGERY,RESIDENCE,false,false,0423,004,7,46,10,1195654,1848294,2011,02/04/2016 06:33:39 AM,41.738615478,-87.558741896
8316805,HT550781,10/20/2011 05:00:00 AM,056XX S ABERDEEN ST,0890,THEFT,FROM BUILDING,RESIDENCE,false,false,0712,007,16,68,06,1169943,1867457,2011,02/04/2016 06:33:39 AM,41.791797599,-87.652385205
8316806,HT550706,10/20/2011 05:45:00 AM,079XX S LOOMIS BLVD,031A,ROBBERY,ARMED: HANDGUN,STREET,false,false,0612,006,21,71,03,1168370,1852331,2011,02/04/2016 06:33:39 AM,41.750323974,-87.658588247
8316811,HT539324,10/12/2011 12:23:52 PM,003XX E 75TH ST,2027,NARCOTICS,POSS: CRACK,SMALL RETAIL STORE,true,false,0323,003,6,69,18,1179641,1855355,2011,02/04/2016 06:33:39 AM,41.758372192,-87.61719416
8316822,HT551031,10/19/2011 02:00:00 AM,071XX W DICKENS AVE,0910,MOTOR VEHICLE THEFT,AUTOMOBILE,SIDEWALK,false,false,2512,025,36,25,07,1127877,1913161,2011,02/04/2016 06:33:39 AM,41.918027518,-87.805606689
8316824,HT551032,10/20/2011 12:00:00 AM,034XX N NATCHEZ AVE,2825,OTHER OFFENSE,HARASSMENT BY TELEPHONE,RESIDENCE,false,false,1632,016,36,17,26,1132429,1922272,2011,02/04/2016 06:33:39 AM,41.94295109,-87.788669409
8316825,HT549690,10/19/2011 12:51:00 PM,079XX S ADA ST,2820,OTHER OFFENSE,TELEPHONE THREAT,APARTMENT,false,false,0612,006,21,71,26,1168711,1852015,2011,02/04/2016 06:33:39 AM,41.749449482,-87.657347764
8316826,HT549865,10/19/2011 06:00:00 AM,011XX N LEAMINGTON AVE,0810,THEFT,OVER $500,RESIDENTIAL YARD (FRONT/BACK),false,false,1531,015,37,25,06,1141821,1907155,2011,02/04/2016 06:33:39 AM,41.9012995,-87.754523767
8316827,HT550963,09/01/2011 04:00:00 PM,079XX S LOOMIS BLVD,0610,BURGLARY,FORCIBLE ENTRY,RESIDENCE-GARAGE,false,false,0612,006,21,71,05,1168380,1851969,2011,02/04/2016 06:33:39 AM,41.749330381,-87.658562005
8316838,HT548010,10/17/2011 03:20:00 PM,055XX N KEDZIE AVE,0820,THEFT,$500 AND UNDER,"SCHOOL, PUBLIC, GROUNDS",true,false,1712,017,40,13,06,1154047,1936545,2011,02/04/2016 06:33:39 AM,41.981712678,-87.708829703
8316839,HT551049,10/20/2011 08:50:00 AM,102XX S AVENUE N,0430,BATTERY,AGGRAVATED: OTHER DANG WEAPON,STREET,false,false,0432,004,10,52,04B,1201166,1837763,2011,02/04/2016 06:33:39 AM,41.709579698,-87.538903651
8316871,HT549680,10/19/2011 01:03:00 PM,044XX N BROADWAY,0460,BATTERY,SIMPLE,DEPARTMENT STORE,false,false,2313,019,46,3,08B,1168460,1929880,2011,02/04/2016 06:33:39 AM,41.963123126,-87.65601675
8316872,HT551071,10/19/2011 03:10:00 PM,053XX S CALUMET AVE,0810,THEFT,OVER $500,RESIDENCE,false,false,0234,002,3,40,06,1179390,1869685,2011,02/04/2016 06:33:39 AM,41.797700881,-87.617676981
8316873,HT551063,10/20/2011 11:10:00 AM,003XX E 47TH ST,1811,NARCOTICS,POSS: CANNABIS 30GMS OR LESS,SIDEWALK,true,false,0222,002,3,38,18,1178980,1873925,2011,02/04/2016 06:33:39 AM,41.809345175,-87.619051287
8316874,HT550901,10/20/2011 09:11:00 AM,033XX W OGDEN AVE,2022,NARCOTICS,POSS: COCAINE,POLICE FACILITY/VEH PARKING LOT,true,false,1024,010,24,29,18,1154489,1891024,2011,02/04/2016 06:33:39 AM,41.856790413,-87.708424071
8316875,HT549739,10/19/2011 01:30:00 PM,002XX E GARFIELD BLVD,0820,THEFT,$500 AND UNDER,CTA BUS,false,false,0232,002,3,40,06,1178645,1868596,2011,02/04/2016 06:33:39 AM,41.794729551,-87.620442108
8316880,HT549802,10/19/2011 12:00:00 PM,011XX W WILSON AVE,0460,BATTERY,SIMPLE,COLLEGE/UNIVERSITY GROUNDS,false,false,2311,019,46,3,08B,1167612,1930696,2011,02/04/2016 06:33:39 AM,41.96538061,-87.659110921
8316881,HT431449,08/04/2011 11:00:00 AM,027XX W CHICAGO AVE,0820,THEFT,$500 AND UNDER,STREET,false,false,1313,012,26,24,06,1157782,1905211,2011,02/04/2016 06:33:39 AM,41.895654523,-87.69595021
8316882,HT549162,10/19/2011 06:42:00 AM,105XX S WESTERN AVE,0610,BURGLARY,FORCIBLE ENTRY,TAVERN/LIQUOR STORE,false,false,2211,022,19,72,05,1162255,1834721,2011,02/04/2016 06:33:39 AM,41.702128701,-87.681485145
8316884,HT544972,10/16/2011 04:30:00 AM,103XX S HALSTED ST,1310,CRIMINAL DAMAGE,TO PROPERTY,SMALL RETAIL STORE,false,false,2232,022,34,73,14,1172807,1836192,2011,02/04/2016 06:33:39 AM,41.705939683,-87.642803521
8316886,HT549777,10/19/2011 02:10:00 PM,014XX W PRATT BLVD,0850,THEFT,ATTEMPT THEFT,SMALL RETAIL STORE,false,false,2431,024,49,1,06,1165565,1945281,2011,02/04/2016 06:33:39 AM,42.005446228,-87.666219555
8316887,HT551046,10/20/2011 11:10:00 AM,050XX N WINTHROP AVE,2820,OTHER OFFENSE,TELEPHONE THREAT,RESIDENCE,false,false,2033,020,48,3,26,1167955,1933907,2011,02/04/2016 06:33:39 AM,41.974184283,-87.657756697
8316889,HT550997,10/20/2011 09:10:00 AM,041XX N DICKINSON AVE,1121,DECEPTIVE PRACTICE,COUNTERFEITING DOCUMENT,STREET,true,false,1624,016,45,15,10,1142513,1926900,2011,02/04/2016 06:33:39 AM,41.955468935,-87.751489799
8316890,HT532649,10/07/2011 11:46:00 PM,062XX S VERNON AVE,2092,NARCOTICS,SOLICIT NARCOTICS ON PUBLICWAY,SIDEWALK,true,false,0313,003,20,42,26,1180324,1863694,2011,02/04/2016 06:33:39 AM,41.781239632,-87.614435596
8316893,HT551023,10/20/2011 02:00:00 AM,081XX S STEWART AVE,0810,THEFT,OVER $500,STREET,false,false,0622,006,21,44,06,1175042,1850896,2011,02/04/2016 06:33:39 AM,41.746239973,-87.634181801
8316894,HT550772,10/20/2011 07:10:00 AM,048XX N TALMAN AVE,1320,CRIMINAL DAMAGE,TO VEHICLE,STREET,false,false,2031,020,40,4,14,1157831,1931974,2011,02/04/2016 06:33:39 AM,41.969093078,-87.695038453
8316898,HT551055,10/16/2011 09:00:00 AM,018XX S LAFLIN ST,1365,CRIMINAL TRESPASS,TO RESIDENCE,APARTMENT,false,false,1222,012,25,31,26,1166665,1891065,2011,02/04/2016 06:33:39 AM,41.856651049,-87.663730374
8316899,HT550695,10/20/2011 05:30:00 AM,122XX S HALSTED ST,1310,CRIMINAL DAMAGE,TO PROPERTY,RESIDENCE PORCH/HALLWAY,false,false,0524,005,34,53,14,1173210,1823636,2011,02/04/2016 06:33:39 AM,41.671475169,-87.641696924
8316901,HT549052,10/19/2011 12:01:00 AM,064XX S DR MARTIN LUTHER KING JR DR,1811,NARCOTICS,POSS: CANNABIS 30GMS OR LESS,STREET,true,false,0312,003,20,42,18,1180027,1862432,2011,02/04/2016 06:33:39 AM,41.777783389,-87.615563072
8316902,HT550988,10/20/2011 10:25:00 AM,066XX S KENNETH AVE,0486,BATTERY,DOMESTIC BATTERY SIMPLE,RESIDENCE,false,false,0833,008,13,65,08B,1147854,1860166,2011,02/04/2016 06:33:39 AM,41.772241474,-87.733568892
8316909,HT550957,10/20/2011 04:45:00 AM,051XX S MONITOR AVE,2825,OTHER OFFENSE,HARASSMENT BY TELEPHONE,RESIDENCE,false,true,0811,008,23,56,26,1138234,1869952,2011,02/04/2016 06:33:39 AM,41.79927468,-87.768598031

 

Input: Event Log (What happened, When and Where)
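One way to turn such a log into per-variable event streams is to bin each record by spatial tile, category, and day. The sketch below is an assumption about the preprocessing, not the published pipeline: the tile size, the grouping of `Primary Type` values into violent and nonviolent, and the helper names are all hypothetical. It reads only the `Date`, `Primary Type`, `Arrest`, `Latitude`, and `Longitude` columns, and the three sample rows are trimmed copies of records from the log above.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

TILE_DEG = 0.003  # hypothetical tile size in degrees
VIOLENT = {"ROBBERY", "BATTERY", "ASSAULT", "HOMICIDE"}  # illustrative grouping only

SAMPLE = """Date,Primary Type,Arrest,Latitude,Longitude
10/20/2011 05:45:00 AM,ROBBERY,false,41.750323974,-87.658588247
10/19/2011 12:51:00 PM,OTHER OFFENSE,false,41.749449482,-87.657347764
10/12/2011 12:23:52 PM,NARCOTICS,true,41.758372192,-87.61719416
"""


def tile_of(lat, lon, size=TILE_DEG):
    """Map a coordinate to an integer (row, col) tile index."""
    return (int(lat // size), int(lon // size))


def event_streams(rows):
    """Return {(tile, category): set of dates on which an event occurred}."""
    streams = defaultdict(set)
    for row in rows:
        day = datetime.strptime(row["Date"], "%m/%d/%Y %I:%M:%S %p").date()
        tile = tile_of(float(row["Latitude"]), float(row["Longitude"]))
        category = "violent" if row["Primary Type"] in VIOLENT else "nonviolent"
        streams[(tile, category)].add(day)
        if row["Arrest"] == "true":
            streams[(tile, "arrest")].add(day)
    return streams


streams = event_streams(csv.DictReader(io.StringIO(SAMPLE)))
for key, days in sorted(streams.items()):
    print(key, sorted(days))
```

Each key is one <location, category> variable, and the set of dates is its binary event stream: a day is either in the set (an event occurred) or not.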

No manual selection of features!

  • No manual selection of factors
  • No creation of "lists"
  • Uses only de-identified data

The Problem:

  • Predicting crime sufficiently ahead of time to be actionable
  • Prediction precise enough in time and space to be actionable
  • Use ONLY data that is realistically and cheaply available

1 Week in advance

Within ~2 city blocks

ONLY past event log as input

Prediction Performance

"crime forecast"

93% accuracy

87% AUC

~70% specificity at ~80% sensitivity

10 actual crimes:

11 predicted:

8 correct:

2 missed:

3 false alarms

Chicago Predictive Performance

The Underlying Math

Not based on standard "Deep Learning"

  • Forecasting rare events in multi-variable stochastic evolution requires a new modeling architecture
  • Learn local "activation functions" as symbolic probabilistic transducers
  • Assemble these local predictors into a "fractal net"

Applies to any rare/extreme event phenomena

Ishanu Chattopadhyay, Yi Huang, James Evans et al. Deep Learning Without Neural Networks: Fractal-nets for Rare Event Modeling, 26 October 2020, PREPRINT (Version 1). https://doi.org/10.21203/rs.3.rs-86045/v1

Digital Twin

Not just a predictor

Focusing on dynamics of observables

\[ \mathcal{E}(x,y,t) \in \{ \mathcal{C}_{NV},\mathcal{C}_V,\mathcal{A},\varnothing \} \]
\[ \mathcal{E}: \mathcal{S}\times T \times \mathcal{F} \rightarrow \mathbb{E} \]

unmodeled factors

\[ \mathcal{E}(x,y,t) = f(\mathcal{E}(x_1,y_1,t_1),\cdots,\mathcal{E}(x_n,y_n,t_n)), \quad \textrm{where } t_i \leq t \]

Observable future is a function of the observable past

Why no "features" ? 

Philadelphia

  • Predicting crime sufficiently ahead of time to be actionable
  • Prediction precise enough in time and space to be actionable
  • Use ONLY data that is realistically and cheaply available

>3 days in advance

Within ~2 city blocks

ONLY past event log as input

Mean AUC

Property crime: 81%

Violent crime:    84%

Spatial tiles:

0.003 deg latitude, 0.003 deg longitude

0.25 miles across

Time-period:

Training:                      Jan 1 2016 - Dec 31 2018

Out-of-sample test:  Jan 1 2019 - April 1 2019

Prediction Performance (Philadelphia)

sensitivity   0.90
ppv           0.87

100 crimes

Raise 103 flags

90 correct flags

13 false positives

10 missed
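The sensitivity and PPV quoted above follow directly from the flag counts. A minimal check of that arithmetic (the helper name is illustrative):

```python
def sensitivity_and_ppv(true_positives, false_positives, false_negatives):
    """Return (sensitivity, positive predictive value) from raw counts."""
    sensitivity = true_positives / (true_positives + false_negatives)
    ppv = true_positives / (true_positives + false_positives)
    return sensitivity, ppv


# Counts from the slide: 100 crimes, 103 flags, 90 correct, 13 false positives, 10 missed.
sens, ppv = sensitivity_and_ppv(true_positives=90, false_positives=13, false_negatives=10)
print(f"sensitivity = {sens:.2f}, ppv = {ppv:.2f}")  # sensitivity = 0.90, ppv = 0.87
```

Sensitivity is the share of actual crimes that were flagged (90 of 100); PPV is the share of flags that were correct (90 of 103).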

3 day ahead prediction

Jan 1 2019

to

April 1 2019

Play Movie

Triangles: actual events

 

heatmap: predicted risk 3 days ahead

Could we have predicted this?

 

Double homicide

Jan 7 2019

Triple homicide incident

Jan 7 2019

https://www.inquirer.com/crime/kensington-triple-shooting-homicide-philadelphia-police-20190107.html

Triangles: actual events

 

heatmap: predicted risk 3 days ahead

Wauwatosa, WI

2019, Jan - 2022, Mar: Training

2022, Apr - 2022 Sep: Testing

 

Property and Violent Crimes

Every 10 events, about 8 flags are raised, with almost no false alarms

Rates of different crimes

 Spatial Resolution

(~1000 yards)

Average sensitivity: 0.85

 

Average positive predictive value: 1.0

 

Horizon: 7 days +/- 1 Day

 

142 spatial tiles (~1000 yards tile size)

Chihuahua, Mexico

2020 - 2022, May: Training

2022, June - 2022 Dec: Testing

 

Property and Violent Crimes

Rates of different crimes

Out-of-sample Prediction

Average sensitivity: 0.83

 

Average positive predictive value: 0.98

 

Horizon: 7 days +/- 1 Day

 

220 spatial tiles

Boston, MA

 

Training:                      Jan 1 2021 - March 31 2022

Out-of-sample test:  Apr 1 2022 - July 22 2022

Spatial tiles:

0.0028 deg latitude, 0.0019 deg longitude

0.2 miles across

(100 x 300 yds)

Boston Districts: B2 B3 C1

Jan 1 2021 -  July 22 2022

Total # of events: 7419

Boston MA, USA

Metric | Property | Violent
fp | 1.840708 | 0.106195
tp | 21.530973 | 3.982301
fn | 2.946903 | 0.522124
sens | 0.878993 | 0.881521
ppv | 0.909565 | 0.960446

Property:

Other_Larceny-Larceny_from_MV-Auto_Theft-Residential_Burglary-Robbery-Commercial_Burglary    

Violent:

Aggravated_Assault-Rape_&_Attempted-Homicide                                                 

Mean AUC

Property crime: 81%

Violent crime:    84%

Predicting

extreme weather


Outperforms pure physics-based models at longer horizons

No point learning individual sample paths

Can we learn stochastic phenomena non-parametrically?


?

\sigma_0:0
\sigma_1:1

Can be more complicated...

A lot more complicated...

Algorithm genESeSS

Chattopadhyay, Ishanu, and Hod Lipson. "Abductive learning of quantized stochastic processes with probabilistic finite automata." Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 371, no. 1984 (2013): 20110543.
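A probabilistic finite-state automaton (PFSA), the model class named in the citation above, emits a symbol at each step with state-dependent probabilities and moves to a next state determined by the current state and the emitted symbol. The sketch below only simulates a hand-written two-state PFSA over the binary alphabet. It does not implement genESeSS, and the states and probabilities are made up for illustration.

```python
import random

# Hypothetical two-state PFSA over the alphabet {0, 1}.
# For each state: symbol -> (emission probability, next state).
PFSA = {
    "q0": {0: (0.8, "q0"), 1: (0.2, "q1")},
    "q1": {0: (0.3, "q0"), 1: (0.7, "q1")},
}


def generate(pfsa, start, length, seed=0):
    """Sample a symbol stream of the given length from the automaton."""
    rng = random.Random(seed)
    state, stream = start, []
    for _ in range(length):
        symbols = list(pfsa[state])
        weights = [pfsa[state][s][0] for s in symbols]
        symbol = rng.choices(symbols, weights=weights)[0]
        stream.append(symbol)
        state = pfsa[state][symbol][1]
    return stream


stream = generate(PFSA, start="q0", length=20)
print("".join(map(str, stream)))
```

Learning runs in the opposite direction: given only a stream like the one printed, infer the states and their emission probabilities.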

What do the states mean?

State structure :

self-similarity in dynamical systems

Deep Learning without Neural Networks

Fractal Net: Information gradient does not decay from the computation

\[ \gamma^A_{A_\delta} = 1 - \frac{\mathbb{E}_{x \in \Sigma^\star}h(\phi_x^{\mathcal{H}_A,\mathcal{H}_{A_\delta}})}{h(\phi_\lambda^{\mathcal{H}_A,\mathcal{H}_{A_\delta}})} \]
\[ a^{l}_j = \sigma\left( \sum_k w^{l}_{jk} a^{l-1}_k + b^l_j \right) \]

Fractal Net

Neural Net

fixed non-linear activation
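For comparison, the neural-net update above applies one fixed non-linearity to a weighted sum at every unit. A minimal sketch of a single layer in plain Python, assuming the logistic sigmoid for \sigma (the slide does not specify the activation) and made-up weights:

```python
import math


def sigmoid(z):
    """Logistic activation, assumed here for sigma."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(weights, biases, prev_activations):
    """Compute a^l_j = sigma(sum_k w^l_jk * a^{l-1}_k + b^l_j) for every unit j."""
    return [
        sigmoid(sum(w_jk * a_k for w_jk, a_k in zip(row, prev_activations)) + b_j)
        for row, b_j in zip(weights, biases)
    ]


# Made-up example: 3 inputs feeding 2 units.
a_prev = [1.0, 0.0, -1.0]
W = [[0.5, -0.2, 0.1],
     [0.0, 0.3, 0.3]]
b = [0.0, 0.3]

print(layer_forward(W, b, a_prev))
```

Only the weights and biases are learned here; the shape of `sigmoid` stays the same for every unit, which is the "fixed non-linear activation" the slide refers to.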

The Fractal Net architecture

No back-propagation

Deep Learning without Neural Networks

Physics introduced as structural and other constraints

Nearly

Hands-free

No feature-engineering

Next:

Model and predict world events

Predicting World Events

Temporal resolution: 1 day

 

Spatial Resolution: \(1^\circ \times 2^\circ\)

Global Terrorism Database

Where do we go next?

How can this model be extended?

Extreme Event Prediction

Simulate

the world

The Risk of Unchecked State Control and Abrogation of Individual Liberties

Perhaps it is really not a question about AI, but about clarity on the principles by which we wish to govern ourselves

AI

Seminar_IIT

By Ishanu Chattopadhyay

Predictive modeling of crime and rare phenomena using fractal nets