Predicting Surgical Site Infections (SSI) using Electronic Medical Records
Rebecca Barter, Prabhu Shankar, Parul Dayal,
Karl Kumbier, Hien Nguyen, Bin Yu



Zimlichman et al. 2013, Merkow et al. 2015, Klevens et al. 2002, Magill et al. 2014
Surgical Site Infections
~160,000 cases per year
>8,000 deaths associated with SSI per year
11% of ICU deaths are associated with SSI
$3.2 billion attributable cost per year to hospitals
11 days of additional hospitalization for the average SSI patient
Current approaches
NHSN surveillance data: surgery, patient, SSI
Our idea
NHSN surveillance data + EHR data: vitals, labs, medications, ...
The data is large and messy




Data was split across
- one file per year (2014-2017)
- multiple sheets within each Excel file
for multiple types of data
- Labs
- Medications
- Previous diagnoses
- Problem list
- Vitals
- NHSN file (patient + surgery info)
Total: 26 Excel files, each with multiple sheets
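
To illustrate the kind of consolidation this layout calls for, here is a minimal sketch that stacks same-named sheets from yearly workbooks into one table per data type. The function name `stack_sheets`, the column names, and the toy rows are all hypothetical. The sketch assumes each workbook has already been read into a dict of sheets (for example with `pandas.read_excel(path, sheet_name=None)`, which returns one table per sheet); it is not the pipeline used in the talk.

```python
from collections import defaultdict


def stack_sheets(workbooks):
    """Combine same-named sheets across yearly workbooks.

    workbooks: {year: {sheet_name: [row_dict, ...]}}
    Returns {sheet_name: [row_dict with an added 'year' key, ...]},
    with rows ordered by year.
    """
    combined = defaultdict(list)
    for year in sorted(workbooks):
        for sheet_name, rows in workbooks[year].items():
            for row in rows:
                # Record which yearly file each row came from.
                combined[sheet_name].append({**row, "year": year})
    return dict(combined)


# Made-up toy data standing in for two yearly workbooks.
workbooks = {
    2015: {"Labs": [{"patient_id": 2, "wbc": 11.3}]},
    2014: {
        "Labs": [{"patient_id": 1, "wbc": 7.2}],
        "Vitals": [{"patient_id": 1, "temp_c": 37.0}],
    },
}
tables = stack_sheets(workbooks)
print(sorted(tables))
```

Keeping the source year on every row makes it possible to trace a suspicious value back to the file it came from.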

Number of surgeries: 37,881
Number of SSI cases: 790 (~2%)
Number of variables: 263
SSI is a rare event
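
A short worked calculation shows why this rarity matters. Using only the counts above, a model that always predicts "no SSI" would be right about 98% of the time while detecting no infections at all, so plain accuracy is a poor yardstick here.

```python
n_surgeries = 37_881
n_ssi = 790

# Fraction of surgeries followed by an SSI.
prevalence = n_ssi / n_surgeries

# Accuracy of a trivial model that never predicts an SSI.
baseline_accuracy = 1 - prevalence

print(f"SSI prevalence: {prevalence:.1%}")
print(f"Accuracy of always predicting 'no SSI': {baseline_accuracy:.1%}")
```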
Aggregated RF model
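
The slides do not spell out how the aggregated model is built. One common way to aggregate balanced models, sketched here as an assumption, is to pair all SSI cases with an equal-sized random sample of non-SSI cases, fit one model per balanced subset, and average the predictions. The names `balanced_subsets`, `fit_aggregated`, and `fit_midpoint` are hypothetical, and the toy one-feature learner only stands in for a random forest.

```python
import random


def balanced_subsets(labels, n_models, seed=0):
    """Yield index lists: every positive plus an equal-sized sample of negatives.

    Assumes positives (label 1) are the minority class.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    for _ in range(n_models):
        yield pos + rng.sample(neg, len(pos))


def fit_aggregated(X, y, fit_base, n_models=10, seed=0):
    """Fit one base model per balanced subset and average their scores."""
    models = [
        fit_base([X[i] for i in idx], [y[i] for i in idx])
        for idx in balanced_subsets(y, n_models, seed)
    ]

    def predict(x):
        return sum(m(x) for m in models) / len(models)

    return predict


def fit_midpoint(X, y):
    """Toy stand-in for a random forest: threshold halfway between class means."""
    n_pos = sum(y)
    pos_mean = sum(x for x, t in zip(X, y) if t == 1) / n_pos
    neg_mean = sum(x for x, t in zip(X, y) if t == 0) / (len(y) - n_pos)
    cut = (pos_mean + neg_mean) / 2
    return lambda x: 1.0 if x > cut else 0.0


# Made-up data: 2 positives among 100 cases, a single numeric feature.
y = [1, 1] + [0] * 98
X = [10.0, 12.0] + [i / 100 for i in range(98)]
predict = fit_aggregated(X, y, fit_midpoint, n_models=5)
print(predict(11.0), predict(0.5))
```

Each base model sees a 50/50 class mix, so none of them can do well by ignoring the rare class; averaging over many subsets lets most of the majority-class cases contribute to the final score.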





Predicting SSI

(top 15)
Performance on test set


AUC 0.79
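
For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen SSI case receives a higher predicted score than a randomly chosen non-SSI case. The sketch below computes it directly from that definition on made-up scores; the function name `auc` and the data are illustrative and are not the evaluation code behind the 0.79 figure.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up predicted risks and true outcomes (1 = SSI).
scores = [0.9, 0.4, 0.35, 0.8, 0.1]
labels = [1, 0, 1, 0, 0]
print(round(auc(scores, labels), 3))  # 0.667
```

Because AUC depends only on how cases are ranked, it is not inflated by the 98% of surgeries without an SSI the way accuracy is.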

Conclusion
Takeaways
SSI is really hard to predict.
We incorporated EHR data
Our model has an AUC of 0.79 on the test set
Aggregating many balanced models seems to work very well (at least for this problem!)
...
We will soon be testing on a cohort from the Davis VA hospital
... and would like to develop a GUI for the Davis surgeons
Lightning talk - Yu Group presentation: Predicting SSI
By Rebecca Barter