Practical NLP Applications of Deep Learning

What is Deep Learning?

Turtles all the way down...

So what's the big deal?

MASSIVE improvements in Computer Vision

Speech Recognition

  • Baidu (with Andrew Ng as their chief scientist) has built a state-of-the-art speech recognition system with Deep Learning
  • Their dataset: 7000 hours of conversation coupled with background noise synthesis for a total of 100,000 hours
  • They processed this through a massive GPU cluster

Cross Domain Representations

  • What if you wanted to take an image and generate a description of it?
  • The beauty of representation learning is its ability to be shared across tasks
  • This is the real power of Neural Networks

But Samiur, what about NLP?

Deep Learning NLP

  • Distributed word representations
  • Dependency Parsing
  • Sentiment Analysis
  • And many others ...


  • Bag of Words
    • A one-hot encoding
    • 20k to 50k dimensions
    • Can be improved by factoring in document frequency
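
A minimal sketch of the bag-of-words idea above, using a toy three-document corpus (the documents and function names are illustrative assumptions, not from the talk). Each document becomes a count vector with one dimension per vocabulary word, and an inverse document frequency weight then down-weights words that appear in many documents.

```python
import math
from collections import Counter

def build_vocab(docs):
    """Map each distinct token to a column index, in sorted order."""
    return {tok: i for i, tok in enumerate(sorted({t for d in docs for t in d}))}

def bag_of_words(doc, vocab):
    """Sum of one-hot vectors: one count per vocabulary entry."""
    vec = [0] * len(vocab)
    for tok, n in Counter(doc).items():
        if tok in vocab:
            vec[vocab[tok]] = n
    return vec

def idf(docs, vocab):
    """Inverse document frequency, log(N / df), in vocabulary index order."""
    n_docs = len(docs)
    df = Counter(tok for d in docs for tok in set(d))
    return [math.log(n_docs / df[tok]) for tok in vocab]

# Toy corpus (hypothetical); a real vocabulary would have 20k to 50k entries.
docs = [
    "mattermark raised a round".split(),
    "the hotel raised prices".split(),
    "the motel is near the hotel".split(),
]
vocab = build_vocab(docs)
counts = bag_of_words(docs[2], vocab)
weights = idf(docs, vocab)
tfidf = [c * w for c, w in zip(counts, weights)]
```

Here "motel" appears in only one document, so it ends up with a larger weight than "hotel", which appears in two.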


Word embedding

  • Neural Word embeddings
    • Learns a vector space by training a model to predict a word given its context window
    • 200-400 dimensions

motel [0.06, -0.01, 0.13, 0.07, -0.06, -0.04, 0, -0.04]

hotel [0.07, -0.03, 0.07, 0.06, -0.06, -0.03, 0.01, -0.05]
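
One common way to compare two embeddings like these is cosine similarity. The sketch below applies it to the two example vectors from the slide; the choice of cosine similarity as the measure is an assumption, since the slide does not name one.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Example vectors copied from the slide.
motel = [0.06, -0.01, 0.13, 0.07, -0.06, -0.04, 0, -0.04]
hotel = [0.07, -0.03, 0.07, 0.06, -0.06, -0.03, 0.01, -0.05]

similarity = cosine_similarity(motel, hotel)
```

For these two vectors the similarity comes out at about 0.94. Two distinct one-hot vectors, by contrast, always have a similarity of exactly 0.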

Word Representations

Word embeddings make it possible to measure semantic similarity and find synonyms

Word embeddings have cool properties:

Dependency Parsing

Converting sentences to a dependency-based grammar

Simplifying this to the verbs and their agents is called Semantic Role Labeling

Sentiment Analysis

  • Recursive Neural Networks
    • Can model tree structures very well
    • This makes them well suited to other NLP tasks too (such as parsing)

Get to the applications part already!


Problem: Funding Sentence Classifier

Build a binary classifier that is able to take any sentence from a news article and tell if it's about funding or not.


e.g. "Mattermark is today announcing that it has raised a round of $6.5 million"

Word Vectors

  • Used Gensim's Word2Vec implementation to train unsupervised word vectors on the UMBC Webbase Corpus (~100M documents, ~48GB of text)
  • Then, iterated 20 times on text in news articles in the tech news domain (~3M documents, ~900MB of text)

Sentence Vectors

  • How can you compose word vectors to make sentence vectors?
    • Use the paragraph vector model proposed by Quoc Le
    • Feed into a recursive neural network built from the dependency tree of the sentence, as proposed by Richard Socher
    • Use a Convolutional Neural Network, as proposed by Yoon Kim
    • Use a heuristic function to combine the sequence of word vectors

What did we try?

  • TF-IDF + Naive Bayes
  • Word2Vec + Composition Methods
  • Word2Vec + TF-IDF + Composition Methods
  • Word2Vec + TF-IDF + Semantic Role Labeling (SRL) + Composition Methods

Composition Methods

Where w_i represents the i-th word vector, w_v the word vector for the verb, and a_0 and a_1 are the agents
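
The composition formulas themselves are not preserved in this text. As a rough illustration of one composition operator named in the talk, circular convolution, the sketch below combines equal-length vectors into a single vector of the same dimension. How the verb and agent vectors were actually ordered or weighted is not stated, so the `compose` helper and its left-to-right fold are assumptions.

```python
def circular_convolution(a, b):
    """Circular convolution: c[k] = sum over j of a[j] * b[(k - j) mod n]."""
    n = len(a)
    if len(b) != n:
        raise ValueError("vectors must have the same length")
    return [sum(a[j] * b[(k - j) % n] for j in range(n)) for k in range(n)]

def compose(word_vectors):
    """Fold a non-empty sequence of vectors into one vector, left to right."""
    result = word_vectors[0]
    for vec in word_vectors[1:]:
        result = circular_convolution(result, vec)
    return result

# Toy 3-dimensional vectors (hypothetical values).
shifted = circular_convolution([1, 2, 3], [0, 1, 0])
composed = compose([[1, 2, 3], [0, 1, 0], [0, 1, 0]])
```

Unlike concatenation, the result keeps the dimensionality of the inputs, so sentences of any length map to vectors of a fixed size.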



What worked?

  • Word2Vec + TF-IDF + SRL + Circular Convolution
    • The first method with simple TF-IDF/Naive Bayes performed extremely poorly because of its large dimensionality
    • Combining TF-IDF with Word2Vec provided a small, but noticeable improvement
    • Adding SRL and a more sophisticated composition method increased performance by almost 5%

What else is possible?

  • Can we apply this method to generate general purpose document vectors?
    • We are currently using LDA (a topic analysis method) or simple TF-IDF to create document vectors
    • How will this method compare to the already proposed paragraph vector method by Quoc Le?

Document Vectors

Document Search

  • Can we associate these document vectors with much smaller query strings?
    • e.g. Search for "artificial intelligence" against our companies and get better results than keyword search
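
A minimal sketch of what such a search could look like, assuming the query and each document have already been mapped into the same vector space. The company names and vectors below are hypothetical, and ranking by cosine similarity is an assumption rather than a described implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def search(query_vec, doc_vecs, top_k=3):
    """Return up to top_k (name, score) pairs, most similar first."""
    scored = [(cosine(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:top_k]]

# Hypothetical document vectors for three made-up companies.
company_vectors = {
    "acme_ai": [0.9, 0.1, 0.0],
    "bolt_logistics": [0.1, 0.9, 0.2],
    "cortex_ml": [0.8, 0.3, 0.1],
}
query_vector = [1.0, 0.2, 0.0]  # stand-in for the vector of a short query

results = search(query_vector, company_vectors)
ranked_names = [name for name, _ in results]
```

The open question on the slide still applies: a short query yields far fewer words to build a vector from than a full document does.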

Mattermark is Hiring!

Contact me at:

  • @samiur1204

Practical NLP Applications of Deep Learning

By Samiur Rahman

Deep Learning is the hot "new" technique in the world of Machine Learning, but most of the published benefits of Deep Learning have been tied to audio and visual data. There are, however, significant benefits that NLP users can draw from Deep Learning, particularly in the area of unsupervised representation learning. This talk focuses on the practical applications of these techniques, particularly neural network word embeddings. I also explore how Mattermark uses these techniques to perform many ML and NLP tasks.
