Looking Beyond Risk Factors:

Generative Bio-AI for Proactive Point-of-Care Early Diagnosis and Reduced Screen Failures in ILD and PF

Ishanu Chattopadhyay, PhD

Assistant Professor of Medicine

University of Chicago

ishanu@uchicago.edu

ishanu@paraknowledge.ai

University of Chicago Medicine

The Laboratory for Zero Knowledge Discovery

mathematics

computer science

social science

medicine

D3M (I2O)

PAI (DSO)

PREEMPT (BTO)

YFA (DSO)

FUNDING

Prognosis at Point-of-Diagnosis 

  • Optimizing Management

Patient Journey 

  • Continuous Risk Monitoring

Early Diagnosis

  • Universal Screening
  • Cohort Selection

Reduce screen failure rates

Holistic health surveillance

Predict antifibrotics continuation

improve outcomes

1

2

3

Interstitial Lung Disease / Pulmonary Fibrosis

Rapid Universal Point-of-care Screening for ILD/IPF Using Comorbidity Signatures in Electronic Health Records

Flag patients before they (or their doctors) suspect the disease

Primary Care

Pulmonologist

Zero-burden Co-morbid Risk Score (ZCoR)

Referral

shortness of breath

dry cough

doctor can hear velcro crackles

Non-specific Symptoms

>50 years old

more men than women

IPF

Rare disease

~5 in 10,000

Post-Dx

Survival

~4 years

Cannot always be seen on CXR

At least one misdiagnosis

~55%

Two or more misdiagnoses

38%

Initially attributed to age-related symptoms:

72%

PCP workflow demands

Known Co-morbidities of PF

Are there more? Subtle footprints in the medical history that are more heterogeneous? 

[Timeline figure. Labels: ZCoR screening; ~4 yrs; current clinical Dx; current survival ~4 yrs]

Onishchenko, D., Marlowe, R.J., Ngufor, C.G. et al. Screening for idiopathic pulmonary fibrosis using comorbidity signatures in electronic health records. Nat Med 28, 2107–2116 (2022). https://doi.org/10.1038/s41591-022-02010-y

n=~3M

AUC~90%

Likelihood ratio ~30

Conventional AI/ML attempts to model the physician

AI in IPF Research

  • Co-morbidity patterns
  • No data demands
  • Use whatever data is already on patient file

ICD administrative codes

IPF

ILD

target codes appear

Past medical history

No target codes appear

case

control

2yrs

2yrs

prediction

IPF drugs prescribed

Signature of IPF diagnostic sequence

pirfenidone or nintedanib

  • age > 50 years
  • at least two IPF target codes identified at least 1 month apart 
  • chest CT procedure (ICD-9-CM 87.41 and Current Procedural Terminology, 4th Edition, codes 71250, 71260 and 71270) before the first diagnostic claim for IPF
  • no claims for alternative ILD codes occurring on or after the first IPF claim
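The four inclusion criteria above can be sketched as a single predicate over a patient's claims. This is a minimal illustration, not the study's actual cohort code. The chest CT codes come from the slide, but the IPF and alternative-ILD code sets are small illustrative placeholders, "1 month" is assumed to mean 30 days, and age is assumed to be measured at the first IPF claim.

```python
from datetime import date, timedelta

# Illustrative placeholders: the slide does not list the target code sets.
IPF_CODES = {"516.31", "J84.112"}
ALT_ILD_CODES = {"J84.10"}  # hypothetical stand-in for "alternative ILD codes"
# Chest CT codes as listed on the slide (ICD-9-CM procedure and CPT-4).
CHEST_CT_CODES = {"87.41", "71250", "71260", "71270"}


def is_ipf_case(birth_date, dx_claims, proc_claims, min_gap_days=30):
    """Return True if the claims satisfy all four case criteria.

    dx_claims and proc_claims are lists of (date, code) tuples.
    """
    ipf_dates = sorted(d for d, code in dx_claims if code in IPF_CODES)
    if not ipf_dates:
        return False
    first_ipf = ipf_dates[0]

    # Criterion 1: age > 50 years (assumed: at the first IPF claim).
    if (first_ipf - birth_date).days / 365.25 <= 50:
        return False

    # Criterion 2: at least two IPF codes at least one month apart.
    if ipf_dates[-1] - first_ipf < timedelta(days=min_gap_days):
        return False

    # Criterion 3: a chest CT procedure before the first IPF claim.
    if not any(code in CHEST_CT_CODES and d < first_ipf
               for d, code in proc_claims):
        return False

    # Criterion 4: no alternative ILD code on or after the first IPF claim.
    if any(code in ALT_ILD_CODES and d >= first_ipf
           for d, code in dx_claims):
        return False

    return True
```

Because ICD codes can be noisy, a stricter definition like this one trades cohort size for a cleaner case label.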

ICD Codes can be noisy

"cases" are not always true IPF

Truven MarketScan (IBM)
Commercial Claims & Encounters Database
2003-2018

>100M patients visible 

>7B individual claims

>87K unique diagnostic codes

>7% Medicare data present

2,053,277 patients included in study

University of Chicago Medical Center 
2012-2021

68,658 patients

Random sample from OptumLabs Data Warehouse, courtesy of Mayo Clinic

861,280 patients 

2,983,215 patients

Data: Onishchenko et al., Nat Med 2022

very high likelihood ratios achieved irrespective of subgroup

performance tables

Out-of-sample Results

specificity ~99%

NPV >99.9%

IPF

ILD

Comorbidity Spectra

patient A

patient B

patient C

Beyond "risk factors" to personalized risk patterns

False Positives: 

  • Healthcare Capacity

Ethics:

  • Risk from Imaging Tests

For every 20-30 flags,

1 is positive

  • General likelihood ratio 60-80
  • PPV 3.5-5%
  • Notifying patients 4 years early?
  • No cure, so why screen?
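The link between likelihood ratio, PPV and "flags per true positive" can be checked with a few lines of arithmetic. This sketch assumes the quoted figure is a positive likelihood ratio and uses the general-population prevalence of about 5 in 10,000 quoted earlier in the deck. The prevalence in the population actually screened is an assumption, and a higher value (for example in patients over 50) raises the PPV.

```python
def ppv_from_likelihood_ratio(prevalence, likelihood_ratio):
    """Convert a pre-test prevalence and a positive LR into a PPV."""
    pre_test_odds = prevalence / (1.0 - prevalence)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


def flags_per_true_positive(ppv):
    """Number of flagged patients needed, on average, to find one true case."""
    return 1.0 / ppv


if __name__ == "__main__":
    assumed_prevalence = 5 / 10_000  # assumption: general-population figure
    for lr in (60, 80):
        ppv = ppv_from_likelihood_ratio(assumed_prevalence, lr)
        print(f"LR={lr}: PPV={ppv:.2%}, "
              f"about 1 true positive per {flags_per_true_positive(ppv):.0f} flags")
```

Under these assumptions the PPV comes out at roughly 3 to 4 percent. Read the other way, a PPV of 5% corresponds to 1 true positive in 20 flags, and a PPV of about 3.3% to 1 in 30.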

minimal

acceptable?

Better outcomes

  • early anti-fibrotic therapy seems increasingly promising
  • better shot at lung transplant
  • early Dx reduces hospitalizations by a factor of 1-3

Collard, Harold R., Alex J. Ward, Stephan Lanes, D. Cortney Hayflinger, Daniel M. Rosenberg, and Elke Hunsche. "Burden of illness in idiopathic pulmonary fibrosis." Journal of medical economics 15, no. 5 (2012): 829-835.

Clinical Trial Cohort Selection

Current screen failure rate ~50-60%

Screen failure rate with ZCoR: ~20%

cohort size: 2000

initial cohort size: 10000

initial cohort size with ZCoR: 3000

Cost per patient for confirmatory tests: ~7k USD

Savings: 30-50M USD
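The savings estimate can be reproduced from the cohort sizes and per-patient cost above. This is a back-of-the-envelope sketch that assumes every patient in the initial cohort undergoes the confirmatory tests and that nothing else changes. The function name is illustrative.

```python
def confirmatory_test_savings(initial_without_zcor, initial_with_zcor,
                              cost_per_patient_usd):
    """Cost avoided by pre-filtering the initial cohort before confirmatory tests."""
    avoided_patients = initial_without_zcor - initial_with_zcor
    return avoided_patients * cost_per_patient_usd


if __name__ == "__main__":
    savings = confirmatory_test_savings(10_000, 3_000, 7_000)
    print(f"Estimated savings: {savings / 1e6:.0f}M USD")  # prints "Estimated savings: 49M USD"
```

With 7,000 fewer patients tested at about 7k USD each, the estimate lands at the upper end of the quoted 30-50M USD range.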

Cloud Deployment

Theoretical formulation

Multi-cohort validation

Launch User-Accessible Platform

3 years

2 years

Data In

[
    {
        "patient_id": "P000038",
        "sex": "F",
        "birth_date": "01-01-2006",
        "DX_record": [
            {"date": "07-31-2006", "code": "Z38.00"},
            {"date": "08-07-2006", "code": "P59.9"},
            {"date": "08-29-2016", "code": "J01.90"},
            {"date": "09-10-2016", "code": "J01.90"},
            {"date": "11-14-2016", "code": "J01.91"}
        ],
        "RX_record": [
            {"date": "10-29-2011", "code": "rxLDA017"},
            {"date": "05-16-2015", "code": "rxIDG004"},
            {"date": "08-08-2015", "code": "rxIDG004"},
            {"date": "06-04-2016", "code": "rxIDD013"}
        ],
        "PROC_record": [
            {"date": "02-05-2007", "code": "90723"},
            {"date": "11-05-2007", "code": "J1100"}
        ]
    }
]

Data Out

{
  "predictions": [
    {
      "error_code": "",
      "patient_id": "P000012",
      "predicted_risk": 0.005794344620009157,
      "probability": 0.8253881317184486
    }
  ],
  "target": "TARGET"
}

The Paraknowledge API

curl -X POST -H "Content-Type: application/json" -d '[{"patient_id": "P28109965201", "sex": "M", "age": 89, "fips": "35644", "DX_record": [{"date": "12-16-2011", "code": "R09.02"}, {"date": "12-30-2011", "code": "H04.129"}, {"date": "12-30-2011", "code": "H02.109"}], "RX_record": [], "PROC_record": [{"date": "09-28-2012", "code": "71100"}]}]' "https://us-central1-pkcsaas-01.cloudfunctions.net/zcor_predict?target=IPF&api_key=7eea9f70d79c408f2b69847d911303c"
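The same call can be prepared from Python using only the standard library. This is a hedged sketch: the endpoint URL, query parameters and record fields are copied from the curl example, while the helper names, the field checks and the MM-DD-YYYY date check are assumptions inferred from the sample payloads, not documented API behavior. The request is only built here, not sent.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime

# Endpoint taken from the curl example on the slide.
ENDPOINT = "https://us-central1-pkcsaas-01.cloudfunctions.net/zcor_predict"

RECORD_SECTIONS = ("DX_record", "RX_record", "PROC_record")


def validate_patient(patient):
    """Check the fields that appear in both sample payloads (assumed required)."""
    for key in ("patient_id", "sex") + RECORD_SECTIONS:
        if key not in patient:
            raise ValueError(f"missing field: {key}")
    for section in RECORD_SECTIONS:
        for entry in patient[section]:
            # Dates in the samples look like MM-DD-YYYY; strptime raises
            # ValueError if an entry does not match that format.
            datetime.strptime(entry["date"], "%m-%d-%Y")
            if not entry.get("code"):
                raise ValueError(f"empty code in {section}")


def build_request(patients, target, api_key):
    """Build (but do not send) the POST request for a list of patients."""
    for patient in patients:
        validate_patient(patient)
    query = urllib.parse.urlencode({"target": target, "api_key": api_key})
    return urllib.request.Request(
        f"{ENDPOINT}?{query}",
        data=json.dumps(patients).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To send it, one would do something like:
#   with urllib.request.urlopen(build_request(patients, "IPF", key)) as resp:
#       result = json.load(resp)
```

Validating locally before posting keeps malformed records from reaching the service. How the service itself reports such errors (for example through the "error_code" field) is not specified on the slide.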

Current Targets

IPF
ILD
ADRD
CKD
CKD_SEVERE
MELANOMA
CANCER_PANCREAS
CANCER_UTERUS
SISA

Cohort Selection and Risk Analysis Testbed

Misleading Diagnosis of Idiopathic Pulmonary Fibrosis: A Clinical Concern
Javier Ramos-Rossy, MD, Onix Cantres-Fonseca, MD, Ginger Arzon-Nieves, Yomayra Otero-Dominguez, MD, Stella Baez-Corujo, MD, and William Rodríguez-Cintrón, MD

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6248220/

Up to 4-year "signal" resolution

decreases risk

increases risk

Patient Journey: Tracking Risk over time

Risk decreases sometimes

new codes change trajectory as they are revealed

Off-the-shelf AI does not suffice

Modeling Longitudinal Patterns

Specialized HMM models from code sequences

Model control and case cohorts separately

given a new test case, compute likelihood of sample arising from case models vs control models

sequence likelihood defect

Huang, Yi, Victor Rotaru, and Ishanu Chattopadhyay. "Sequence likelihood divergence for fast time series comparison." Knowledge and Information Systems 65, no. 7 (2023): 3079-3098.
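The idea of fitting separate models to case and control cohorts and then comparing likelihoods for a new patient can be illustrated with a much simpler stand-in. The sketch below swaps the specialized HMMs for a first-order Markov chain with add-one smoothing, and it scores a sequence by a plain log-likelihood difference. It is not the sequence likelihood defect as defined by the authors, and the category names and toy sequences are invented for illustration.

```python
import math


def fit_markov(sequences, alphabet, alpha=1.0):
    """Fit first-order transition probabilities with additive smoothing."""
    counts = {a: {b: alpha for b in alphabet} for a in alphabet}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    model = {}
    for a in alphabet:
        total = sum(counts[a].values())
        model[a] = {b: counts[a][b] / total for b in alphabet}
    return model


def log_likelihood(seq, model):
    """Log-probability of the observed transitions under the model."""
    return sum(math.log(model[prev][nxt]) for prev, nxt in zip(seq, seq[1:]))


def likelihood_gap(seq, case_model, control_model):
    """Positive values mean the sequence looks more like the case cohort."""
    return log_likelihood(seq, case_model) - log_likelihood(seq, control_model)


if __name__ == "__main__":
    # Toy disease-category sequences (hypothetical, for illustration only).
    alphabet = ["resp", "cardio", "other"]
    case_seqs = [["other", "resp", "resp", "resp"],
                 ["resp", "resp", "cardio", "resp"]]
    control_seqs = [["other", "other", "cardio", "other"],
                    ["other", "cardio", "other", "other"]]
    case_model = fit_markov(case_seqs, alphabet)
    control_model = fit_markov(control_seqs, alphabet)
    print(likelihood_gap(["resp", "resp", "resp"], case_model, control_model))
```

Recomputing the gap each time a new code arrives yields a risk trajectory that can rise or fall as the record grows.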

ZeD Lab: Predictive Screening from Comorbidity Footprints

Nature Medicine

JAHA

CELL Reports

Science Adv.

The ZCoR Approach: Rapidly Re-targetable

Condition                                    ZED performance      Competition
Autism                                       >80% AUC at 2 yrs    "obvious"
Alzheimer's Disease                          ~90% AUC             60-70% AUC
Idiopathic Pulmonary Fibrosis                ~90% AUC             NA
MACE                                         ~80% AUC             ~70% AUC
Bipolar Disorder                             ~85% AUC             NA
CKD                                          ~85% AUC             NA
Cancers (Prostate, Bladder, Uterus, Skin)    ~75-80% AUC          Low

Deploy all/many/most of these!

Predictions at the Point-of-Diagnosis

Can my patient continue taking anti-fibrotics over long term?

Digital Twins for Health trajectories

[Figure labels: \rho_1, \rho_2, ..., \rho_i, ..., \rho_m; 1M parameters]

Predicts disorders across the disease spectrum

Pre-empting Effectiveness of Antifibrotics at the point of diagnosis

~78% AUC

26-32 out of 100 discontinued 

4-5 out of 100 discontinued

Prognosis at Point-of-Diagnosis 

  • Optimizing Management

Patient Journey 

  • Continuous Risk Monitoring

Early Diagnosis

  • Universal Screening
  • Cohort Selection

Reduce screen failure rates

Holistic health surveillance

Predict antifibrotics continuation

improve outcomes

Summary

3

2

1

ishanu@uchicago.edu

@ishanu_ch

Amgen

By Ishanu Chattopadhyay

Novel AI and Digital Twins to Predict, Screen and Accelerate Understanding of ILD and PF
