hello, world!
tushaar gangavarapu
hello, health!
language and healthcare?

80% of medical data is unstructured!
rich patient information
Mr. Tushaar is a ?-year-old male with a history of migraines who underwent Toric Collamer surgery in 2020.
Post-surgery, he developed halo rings in his vision (two overlapping rings around lights) and dry eyes. The patient was given hyaluronic acid eye drops.
clinical forecasting
is the healthcare setting different?
cardiac arrest vs. heart attack?
myocardial infarction, MI?
hospital in Michigan?
pains vs. aches?
hello, MIMIC-III!
nursing notes
radiology reports
rehab services
echo and ECG
discharge summaries
nursing notes
acronyms (consistency)?

duplicate notes with additions

176.49 nursing notes per patient, on average
(4,183 patients with more than 100 nursing notes each, composed of over 17,890 words)
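One way to approach the acronym-consistency question is a lookup that expands whole-word abbreviations before any modeling. The sketch below is illustrative only: `ABBREVIATIONS` and `normalize_acronyms` are hypothetical names, the three entries are common clinical shorthand chosen for the example, and nothing here is taken from the talk's actual pipeline. A plain lookup also cannot tell ambiguous abbreviations apart, which is exactly the difficulty raised by "MI".

```python
import re

# Hypothetical abbreviation map, for illustration only; a real one would be
# curated from the corpus or from a clinical vocabulary.
ABBREVIATIONS = {
    "pt": "patient",
    "hx": "history",
    "sob": "shortness of breath",
}


def normalize_acronyms(text: str, mapping: dict[str, str] = ABBREVIATIONS) -> str:
    """Lowercase the text and expand whole-word abbreviations found in mapping."""

    def expand(match: re.Match) -> str:
        word = match.group(0)
        # Words that are not in the map are returned unchanged.
        return mapping.get(word, word)

    # Each run of letters/digits is treated as one word; punctuation is kept.
    return re.sub(r"[a-z0-9]+", expand, text.lower())


print(normalize_acronyms("Pt has hx of migraines; c/o SOB."))
# patient has history of migraines; c/o shortness of breath.
```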
"aggregate" nursing note
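A minimal sketch of how an "aggregate" note could be built when later notes repeat earlier ones with additions: merge a patient's notes in order and keep only the first occurrence of each line. The function name `aggregate_notes` and the line-level matching rule are assumptions made for illustration; the slides do not say how the aggregation was actually done.

```python
def aggregate_notes(notes: list[str]) -> str:
    """Merge notes in order, keeping only the first occurrence of each line."""
    seen: set[str] = set()
    kept: list[str] = []
    for note in notes:
        for line in note.splitlines():
            # Compare lines ignoring case and repeated whitespace.
            key = " ".join(line.lower().split())
            if key and key not in seen:
                seen.add(key)
                kept.append(line.strip())
    return "\n".join(kept)


notes = [
    "pt resting comfortably.\nvitals stable.",
    "pt resting comfortably.\nvitals stable.\ncomplains of dry eyes.",
]
print(aggregate_notes(notes))
# pt resting comfortably.
# vitals stable.
# complains of dry eyes.
```

Matching whole lines is a deliberately coarse choice; sentence-level or fuzzy matching would catch more near-duplicates at the cost of more moving parts.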
the "essence" of the note?
capturing rare terms
bag-of-words → word2vec (skip-gram) → sentence2vec → doc2vec

topic modeling (Dirichlet, multinomial)
+ Poisson
attention, transformer encodings, ...
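To make the first two representations concrete, here is a small standard-library sketch: a bag-of-words count vector per note, and the (center, context) pairs a skip-gram model would be trained on. The helper names (`tokenize`, `bag_of_words`, `skipgram_pairs`), the whitespace tokenizer, and the window size are assumptions for illustration; no embedding is actually trained here.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(docs: list[str]) -> tuple[list[str], list[list[int]]]:
    """Return a sorted vocabulary and one term-count vector per document."""
    tokenized = [tokenize(doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Counter returns 0 for terms that do not occur in this document.
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def skipgram_pairs(tokens: list[str], window: int = 2) -> list[tuple[str, str]]:
    """List (center, context) pairs within `window` positions of each token."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


docs = ["dry eyes and halo rings", "dry eyes"]
vocab, vectors = bag_of_words(docs)
print(vocab)    # ['and', 'dry', 'eyes', 'halo', 'rings']
print(vectors)  # [[1, 1, 1, 1, 1], [0, 1, 1, 0, 0]]
print(skipgram_pairs(tokenize("dry eyes and"), window=1))
# [('dry', 'eyes'), ('eyes', 'dry'), ('eyes', 'and'), ('and', 'eyes')]
```

A rare term such as "halo" gets its own column in the count vector, but word order and context are lost, which is the gap the skip-gram pairs start to address.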
ICD-9 coding
multi-label classification
code range | diagnosis
001-139    | parasitic and infectious diseases
140-239    | neoplasms
240-279    | endocrine, immunity, metabolic, and nutritional
280-289    | blood-forming organs and blood