Research Proposal

Anmol Goel

Deep Learning in the Biomedical Domain

Using Natural Language Processing

Track 1

Training word embeddings on a medical corpus and using pretrained BioBERT embeddings for custom NER, sentence similarity, and question answering

Track 2

Visual Question Answering on medical images.

Track 1

Embeddings

Custom embeddings trained with fastText
BioBERT and BioELMo, pretrained on PubMed articles
Recent advances: BioSentVec and BioWordVec
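One reason fastText suits medical text is that it builds word vectors from character n-grams, so rare or unseen terms still share subword units with known ones. The sketch below only illustrates that subword idea in plain Python; it is not fastText's implementation (the library also hashes n-grams into buckets and learns a vector for each), and the function name `char_ngrams` and the two example terms are chosen here for illustration.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of `word`, using < and > as boundary markers."""
    marked = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


# Trigrams only, to keep the output small.
nephritis = set(char_ngrams("nephritis", 3, 3))
hepatitis = set(char_ngrams("hepatitis", 3, 3))

# Subword units shared by the two terms (they share the "-itis" suffix).
shared = nephritis & hepatitis
print(sorted(shared))
```

Terms with a common suffix such as "-itis" overlap in their n-gram sets, which is how a subword model can place a previously unseen disease name near related ones.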

Named Entity Recognition

We can create a custom annotated dataset from a text corpus using Doccano.
Using BioBERT and our custom embeddings to train a BiLSTM-CNN-CRF model.
We can also try attention-based models.
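Annotation tools typically export entities as character spans, while a BiLSTM-CNN-CRF tagger is trained on one tag per token. A minimal sketch of that conversion into BIO tags follows. The `(start, end, label)` span layout, the whitespace tokenisation, the example sentence, and the labels `SYMPTOM` and `DRUG` are all assumptions made for illustration; the actual export format depends on the Doccano version and project settings.

```python
import re


def spans_to_bio(text, spans):
    """Convert character spans [(start, end, label), ...] to token-level BIO tags.

    Tokens are split on whitespace. A token is tagged only if it lies fully
    inside a span, so punctuation attached to a token can push it outside.
    """
    tokens, tags = [], []
    for match in re.finditer(r"\S+", text):
        tok_start, tok_end = match.span()
        tag = "O"
        for start, end, label in spans:
            if tok_start >= start and tok_end <= end:
                # The first token of a span gets B-, later tokens get I-.
                tag = ("B-" if tok_start == start else "I-") + label
                break
        tokens.append(match.group())
        tags.append(tag)
    return tokens, tags


# Hypothetical annotated sentence.
text = "Patient denies chest pain after taking aspirin"
spans = [(15, 25, "SYMPTOM"), (39, 46, "DRUG")]

tokens, tags = spans_to_bio(text, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A real pipeline would likely use the same tokeniser as the embedding model, so that tags line up with the tokens the network actually sees.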

Sentence Similarity

Using word and sentence embeddings for the semantic sentence similarity task
Recent Work - BioSentVec by NCBI-NLP group
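A simple baseline for sentence similarity is to average the word vectors of each sentence and compare the averages with cosine similarity. The sketch below shows that baseline in plain Python. The three-dimensional vectors are invented toy values, not real BioWordVec or fastText embeddings, and the helper names are hypothetical.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
VECTORS = {
    "high": [0.1, 0.9, 0.0],
    "fever": [0.9, 0.1, 0.0],
    "elevated": [0.2, 0.8, 0.1],
    "pyrexia": [0.8, 0.2, 0.0],
    "bone": [0.1, 0.0, 0.8],
    "fracture": [0.0, 0.1, 0.9],
}


def mean_vector(tokens, vectors):
    """Average the vectors of known tokens; return None if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sentence_similarity(s1, s2, vectors=VECTORS):
    """Cosine similarity between the averaged word vectors of two sentences."""
    u = mean_vector(s1.lower().split(), vectors)
    v = mean_vector(s2.lower().split(), vectors)
    if u is None or v is None:
        return 0.0
    return cosine(u, v)


related = sentence_similarity("high fever", "elevated pyrexia")
unrelated = sentence_similarity("high fever", "bone fracture")
print(f"related: {related:.3f}, unrelated: {unrelated:.3f}")
```

Dedicated sentence encoders such as BioSentVec replace the averaging step with a learned sentence vector, while the cosine comparison stays the same.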

Track 2

Visual Question Answering

Using embeddings for the NLP part
Using domain-specific pretrained CNN weights from NiftyNet
Using Bayesian Networks for prediction

Misc

Medical expert system and knowledge graph construction
Storing it in a graph database such as Neo4j
Helpful for modelling comorbidity and symptom-disease relationships
Finding co-occurrences of medical terms in PubMed articles
Utilising Bayesian learning for uncertainty in predictions
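The co-occurrence step can be sketched as counting how often two terms appear in the same abstract; each counted pair could then become a weighted edge in the knowledge graph. This is a minimal sketch under stated assumptions: the abstracts below are made-up sentences rather than real PubMed text, the term list is hypothetical, and plain substring matching stands in for a proper entity recogniser.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, terms):
    """Count unordered term pairs that appear together in the same document."""
    counts = Counter()
    for doc in documents:
        lowered = doc.lower()
        # Crude substring match; a real pipeline would use NER output instead.
        present = sorted(t for t in terms if t in lowered)
        counts.update(combinations(present, 2))
    return counts


# Made-up example abstracts and a hypothetical term list.
abstracts = [
    "Diabetes and hypertension frequently occur together in older adults.",
    "Hypertension is a risk factor for stroke.",
    "Patients with diabetes, hypertension and obesity were enrolled.",
]
terms = {"diabetes", "hypertension", "stroke", "obesity"}

counts = cooccurrence_counts(abstracts, terms)
for (a, b), n in counts.most_common():
    print(f"{a} - {b}: {n}")
```

Sorting the terms before pairing keeps each pair in a single canonical order, so `("diabetes", "hypertension")` and its reverse are never counted separately.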

Proposed models and architectures

BiLSTM-CNN-CRF for NER
Bayesian networks for CNN
Attention based models for sentence similarity
Ensemble methods for higher accuracy
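One simple form of ensembling for a tagging task is a per-token majority vote over the outputs of several models. The sketch below assumes every model returns one tag per token for the same tokenisation. The three prediction lists are invented, and majority voting is only one of several possible ensemble schemes, not necessarily the one this proposal would adopt.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine tag sequences from several models by per-token majority vote.

    `predictions` is a list of tag sequences, one per model, all the same length.
    """
    return [Counter(tags).most_common(1)[0][0] for tags in zip(*predictions)]


# Invented outputs of three hypothetical NER models on the same 4 tokens.
model_a = ["O", "B-DRUG", "O", "B-SYMPTOM"]
model_b = ["O", "B-DRUG", "O", "O"]
model_c = ["O", "O", "O", "B-SYMPTOM"]

combined = majority_vote([model_a, model_b, model_c])
print(combined)
```

Voting token by token can produce sequences that break the BIO scheme (for example an I- tag without a preceding B- tag), so a repair or re-decoding step may be needed afterwards.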

Datasets

https://n2c2.dbmi.hms.harvard.edu/track1 - Sentence Similarity
https://www.imageclef.org/2019/medical/vqa - VQA
MedQuAD - Text QA
https://github.com/durakkerem/Medical-Question-Answer-Datasets - Text-based Question Answering
