Predicting Surgical Site Infections (SSI) using Electronic Medical Records

 

 

Rebecca Barter, Prabhu Shankar, Parul Dayal, Karl Kumbier, Hien Nguyen, Bin Yu

 

Zimlichman et al. 2013, Merkow et al. 2015, Klevens et al. 2002, Magill et al. 2014

Surgical Site Infections

~160,000

Cases per year

>8000

Deaths associated with SSI per year

11%

of ICU deaths are associated with SSI

$3.2 billion

Attributable cost per year to hospitals

11 days

Additional hospitalization for the average SSI patient

NHSN surveillance data + EHR data

Current approaches

Surgery

Patient

SSI

Vitals

Labs

Medications

... Our idea

The data is large and messy

Data was split across

  • one file per year (2014-2017)
  • multiple sheets within each Excel file

for multiple types of data

  • Labs
  • Medications
  • Previous diagnoses
  • Problem list
  • Vitals
  • NHSN file (patient + surgery info)

Total: 26 Excel files, each with multiple sheets
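A minimal sketch of how per-year, per-sheet tables like these could be stacked into one table per data type. It assumes each sheet has already been read into a list of row dictionaries (for example, `pandas.read_excel(path, sheet_name=None)` returns one table per sheet). The function name `stack_sheets` and the column names `patient_id` and `value` are hypothetical; the slides do not give the real layout.

```python
from collections import defaultdict


def stack_sheets(workbooks):
    """Combine {year: {sheet_name: [row dicts]}} into {sheet_name: [rows tagged with 'year']}."""
    combined = defaultdict(list)
    for year, sheets in sorted(workbooks.items()):
        for sheet_name, rows in sheets.items():
            for row in rows:
                # Keep the source year so rows stay traceable after stacking.
                combined[sheet_name].append({**row, "year": year})
    return dict(combined)


# Hypothetical toy input standing in for two yearly workbooks.
workbooks = {
    2014: {"Labs": [{"patient_id": 1, "value": 7.2}]},
    2015: {
        "Labs": [{"patient_id": 2, "value": 6.8}],
        "Vitals": [{"patient_id": 2, "value": 98.6}],
    },
}
tables = stack_sheets(workbooks)
```

Tagging each row with its year keeps the original file traceable once the yearly tables are merged.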

Number of surgeries: 37,881 

Number of SSI cases: 790 (~2%)

Number of variables: 263

SSI is a rare event
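To make the imbalance concrete, the short calculation below uses only the counts reported above. It shows the SSI rate and the accuracy that a trivial rule ("never predict SSI") would reach, which is one reason plain accuracy is a poor yardstick here. The variable names are illustrative.

```python
n_surgeries = 37_881  # number of surgeries reported above
n_ssi = 790           # number of SSI cases reported above

prevalence = n_ssi / n_surgeries
# A rule that always predicts "no SSI" is right for every non-SSI surgery.
baseline_accuracy = 1 - prevalence

print(f"SSI rate: {prevalence:.1%}")
print(f"Accuracy of always predicting 'no SSI': {baseline_accuracy:.1%}")
```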

Aggregated RF model
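The slides do not spell out how the aggregated model is built. One common reading of "aggregating many balanced models" is sketched below, under stated assumptions: each model is trained on all SSI cases plus an equally sized random draw of non-SSI cases, and the models' scores are averaged. To stay self-contained, the base learner `fit_threshold` is a deliberately simple stand-in, not a random forest; all function names here are hypothetical.

```python
import random
from statistics import mean


def balanced_subsamples(labels, n_models, seed=0):
    """Yield index lists: every positive plus an equal-size random draw of negatives."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    for _ in range(n_models):
        yield pos + rng.sample(neg, len(pos))


def fit_aggregate(X, y, fit, n_models=10, seed=0):
    """Fit one model per balanced subsample; return a scorer that averages them."""
    models = [
        fit([X[i] for i in idx], [y[i] for i in idx])
        for idx in balanced_subsamples(y, n_models, seed)
    ]
    return lambda row: mean(m(row) for m in models)


def fit_threshold(X, y):
    """Stand-in learner (NOT a random forest): threshold feature 0 at the class-mean midpoint."""
    mu1 = mean(r[0] for r, t in zip(X, y) if t == 1)
    mu0 = mean(r[0] for r, t in zip(X, y) if t == 0)
    mid = (mu0 + mu1) / 2
    return lambda row: 1.0 if (row[0] > mid) == (mu1 > mu0) else 0.0


# Toy data: 5 positives and 95 negatives, loosely mimicking a rare outcome.
X = [[10.0]] * 5 + [[0.0]] * 95
y = [1] * 5 + [0] * 95
score = fit_aggregate(X, y, fit_threshold, n_models=10)
```

In this sketch each model sees a 50/50 class mix, while averaging over many draws lets most of the majority class contribute to at least one model.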

Predicting SSI

(top 15)


Performance on test set

AUC 0.79
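For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen SSI case receives a higher score than a randomly chosen non-SSI case, with ties counted as one half. The sketch below computes it directly from that definition on made-up toy scores. It is illustrative only and is not the evaluation code behind the 0.79 figure.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Because it depends only on the ranking of scores, AUC is not inflated by the roughly 98% of surgeries without SSI in the way raw accuracy is.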

Conclusion

Takeaways

SSI is really hard to predict.

 

We incorporated EHR data

 

Our model has an AUC of 0.79 on the test set

 

Aggregating many balanced models seems to work very well (at least for this problem!)

 

...

 

We will soon be testing on a cohort from the Davis VA hospital

... and would like to develop a GUI for the Davis surgeons 

Lightning talk - Yu Group presentation: Predicting SSI

By Rebecca Barter
