Research Proposal

Anmol Goel

Deep Learning in the Biomedical Domain

Using Natural Language Processing

Track 1

Training word embeddings on a medical corpus and using pretrained BioBERT embeddings for custom NER, sentence similarity, and question answering

Track 2

Visual Question Answering on medical images.

 

Track 1

Embeddings

  • Custom embeddings trained with fastText
  • BioBERT and BioELMo, pretrained on PubMed articles
  • Recent advances: BioSentVec and BioWordVec
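
One reason fastText suits a medical corpus is that it represents each word through its character n-grams, so rare or unseen terms still share subword units with known ones. The sketch below only illustrates that n-gram extraction in plain Python; it is not the fastText library itself. The function name char_ngrams and the example words are assumptions made for illustration, and the 3 to 6 n-gram range mirrors what we understand to be the library's default minn/maxn settings.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, using < and > as boundary markers."""
    marked = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


# Two illustrative disease names that share the suffix "-itis".
a = set(char_ngrams("nephritis"))
b = set(char_ngrams("hepatitis"))
shared = sorted(a & b)
print(shared)
```

The shared n-grams around the "-itis" suffix are what would let a subword model place the two terms near each other even if one of them is rare in the training corpus.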

Named Entity Recognition

  • We can create a custom annotated dataset from a text corpus using Doccano
  • Use BioBERT and our custom embeddings to train a BiLSTM-CNN-CRF model
  • We can also try attention-based models
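
Sequence taggers such as BiLSTM-CNN-CRF are usually trained on token-level BIO tags, whereas annotation tools typically store character-offset spans. The sketch below converts one into the other. It assumes a record with a "text" field and a "label" list of [start, end, label] triples, which is how we understand Doccano's JSONL export for sequence labelling; the field names may differ between versions. The function spans_to_bio, the sample sentence, and the SYMPTOM/DRUG labels are hypothetical.

```python
import re


def spans_to_bio(text, spans):
    """Convert character spans [(start, end, label), ...] to (token, BIO tag) pairs."""
    pairs = []
    # Whitespace tokenisation keeps the example short; a real pipeline would
    # also split punctuation so that tokens align with span boundaries.
    for match in re.finditer(r"\S+", text):
        tag = "O"
        for start, end, label in spans:
            if match.start() >= start and match.end() <= end:
                prefix = "B-" if match.start() == start else "I-"
                tag = prefix + label
                break
        pairs.append((match.group(), tag))
    return pairs


# Hypothetical annotated record (assumed export layout, made-up sentence).
record = {
    "text": "Patient reports chest pain after taking aspirin",
    "label": [[16, 26, "SYMPTOM"], [40, 47, "DRUG"]],
}

bio = spans_to_bio(record["text"], record["label"])
for token, tag in bio:
    print(f"{token}\t{tag}")
```

The resulting token/tag pairs are in the column layout that most CoNLL-style NER training scripts expect.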

Sentence Similarity

  • Use word and sentence embeddings for the semantic sentence similarity task
  • Recent work: BioSentVec by the NCBI-NLP group
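
A simple baseline for this task is to average the word vectors of each sentence and compare the averages with cosine similarity. The sketch below shows that baseline in plain Python. The three-dimensional vectors are made-up toy values, not real BioWordVec or fastText embeddings, and the helper names mean_vector and cosine are assumptions for illustration.

```python
import math


def mean_vector(tokens, vectors):
    """Average the vectors of the tokens that have an embedding."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token in the sentence has an embedding")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy embeddings with invented values, for illustration only.
toy_vectors = {
    "high": [0.5, 0.5, 0.0],
    "fever": [0.9, 0.1, 0.0],
    "pyrexia": [0.8, 0.2, 0.0],
    "bone": [0.1, 0.0, 0.8],
    "fracture": [0.0, 0.1, 0.9],
}

s1 = mean_vector(["high", "fever"], toy_vectors)
s2 = mean_vector(["pyrexia"], toy_vectors)
s3 = mean_vector(["bone", "fracture"], toy_vectors)

sim_related = cosine(s1, s2)
sim_unrelated = cosine(s1, s3)
print(sim_related, sim_unrelated)
```

Pretrained sentence encoders such as BioSentVec would replace mean_vector here, while the cosine comparison stays the same.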

Track 2

Visual Question Answering

  • Using embeddings for the NLP (question) part
  • Using domain-specific pretrained CNN weights from NiftyNet
  • Using Bayesian networks for prediction

Misc

  • Medical expert system and knowledge graph construction
  • Storing the graph in a graph database such as Neo4j
  • Helpful for modelling comorbidity and symptom-disease relationships
  • Finding co-occurrences of medical terms in PubMed articles
  • Using Bayesian learning to estimate uncertainty in predictions
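
Co-occurrence counting can be sketched as follows: for each document, find which terms from a vocabulary appear, then count every pair that appears together. The function name cooccurrence_counts, the term list, and the three sentences are made up for illustration and are not PubMed text. The sketch handles single-word terms only; multi-word terms would need phrase matching.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, terms):
    """Count how many documents mention each pair of terms together."""
    vocabulary = {t.lower() for t in terms}
    counts = Counter()
    for doc in documents:
        tokens = set(re.findall(r"[a-z]+", doc.lower()))
        present = sorted(vocabulary & tokens)  # sorted so each pair has one canonical order
        counts.update(combinations(present, 2))
    return counts


# Invented example sentences standing in for article abstracts.
docs = [
    "Diabetes is frequently associated with hypertension and obesity.",
    "Obesity increases the risk of diabetes.",
    "Hypertension was observed in patients with fever.",
]
terms = ["diabetes", "hypertension", "obesity", "fever"]

pair_counts = cooccurrence_counts(docs, terms)
for (a, b), n in pair_counts.most_common():
    print(a, b, n)
```

Each counted pair could then be stored as a weighted edge between two term nodes in the graph database.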

Proposed models and architectures

  • BiLSTM-CNN-CRF for NER
  • Bayesian networks for CNN
  • Attention-based models for sentence similarity
  • Ensemble methods for higher accuracy
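
One common way to ensemble classifiers is soft voting: average the class probabilities predicted by each model and pick the class with the highest mean. The proposal does not fix a particular ensemble method, so this is only one possible sketch. The function soft_vote and the probability values are hypothetical.

```python
def soft_vote(prob_lists):
    """Average per-model class probabilities; return (winning class index, averaged probabilities)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    averaged = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    winner = max(range(n_classes), key=averaged.__getitem__)
    return winner, averaged


# Made-up outputs of three models for one example with two classes.
model_probs = [
    [0.6, 0.4],
    [0.2, 0.8],
    [0.3, 0.7],
]

label, averaged = soft_vote(model_probs)
print(label, averaged)
```

Averaging probabilities rather than counting hard votes lets a confident model outweigh two uncertain ones, which may or may not be desirable depending on how well calibrated the individual models are.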

Datasets

  • https://n2c2.dbmi.hms.harvard.edu/track1 - Sentence Similarity
  • https://www.imageclef.org/2019/medical/vqa - VQA
  • MedQuAD - Text QA
  • https://github.com/durakkerem/Medical-Question-Answer-Datasets - Text-based Question Answering
