Deep Learning in NLP
- Powered by Python

About me
- Technical Lead and Chief Deep Learning Engineer at Neuron
- Google Summer of Code Intern '14
What is...
- Machine Learning?
- Natural Language Processing?
- Neural Networks?
- Deep Learning?
Common NLP Tasks
- Language Modelling
- Sentiment Analysis
- Named Entity Recognition
- Topic Modelling
- Semantic Proximity
- Text Summarization
- Machine Translation
- Speech Recognition
The Old Way of doing things...
- Bag of words (see the sketch after this list)
- "This movie wasn't particularly funny or entertaining" -> [movie, funny, entertaining]
- n-grams (next slide)
- Regex patterns, etc.
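A minimal bag-of-words sketch in plain Python (the tokenizer and stop-word list here are illustrative assumptions, not from the original deck):

from collections import Counter

STOP_WORDS = {"this", "wasn't", "particularly", "or"}  # toy stop-word list

def bag_of_words(text):
    """Lowercase, split on whitespace, drop stop words, count the rest."""
    tokens = text.lower().split()
    return Counter(t for t in tokens if t not in STOP_WORDS)

print(bag_of_words("This movie wasn't particularly funny or entertaining"))
# Counter({'movie': 1, 'funny': 1, 'entertaining': 1})

Note how dropping "wasn't" throws away the negation; this is exactly the weakness discussed two slides below.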
n-grams example
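The example figure isn't reproduced here; a minimal n-gram extraction sketch instead (the sentence is hypothetical):

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token sequence as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the movie was not funny".split()
print(ngrams(tokens, 2))
# [('the', 'movie'), ('movie', 'was'), ('was', 'not'), ('not', 'funny')]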

Suggested Reading:
n-grams/naive bayes
Pros and Cons of Bag-of-words and n-gram Models
Pros:
- easy to build (hypothesize and code)
- require relatively little data to train
- faster to train than neural models
Cons:
- pay no heed to the order of words
- ignore the structure of natural language
- usually assume a word is conditioned on just the last few words
- can't handle long-term dependencies in text
Neural Networks

...Neural Networks
A beautiful, highly flexible, and generic biologically inspired programming paradigm that enables a computer to learn from observational data
Backpropagation
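A minimal sketch of backpropagation for a one-hidden-layer network in NumPy (the architecture, data, and learning rate are assumptions for illustration):

import numpy as np

np.random.seed(0)
X = np.random.randn(4, 3)                 # 4 samples, 3 features
y = np.array([[0.], [1.], [1.], [0.]])    # toy targets
W1 = np.random.randn(3, 5)                # input -> hidden weights
W2 = np.random.randn(5, 1)                # hidden -> output weights
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for step in range(1000):
    # forward pass
    h = sigmoid(X @ W1)
    y_hat = sigmoid(h @ W2)
    # backward pass: apply the chain rule layer by layer
    d_out = (y_hat - y) * y_hat * (1 - y_hat)  # error at the output pre-activation
    d_hid = (d_out @ W2.T) * h * (1 - h)       # error propagated to the hidden layer
    # gradient-descent updates
    W2 -= 0.5 * (h.T @ d_out)
    W1 -= 0.5 * (X.T @ d_hid)

print(np.round(y_hat.ravel(), 2))  # predictions move towards the targets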

Suggested Reading:
Neural Networks
Word Vector Embeddings

Vector Space Models
- Create a d-dimensional space in which each word is represented by a point
- Words with high co-occurrence end up clustered together
- Capture semantic relations between words
- Directions in the space encode linguistic regularities such as pluralization, past tense, or gender (e.g. king - man + woman ≈ queen)
Introduction to Gensim
- A Python-based library that provides modules for word2vec, LDA, doc2vec, etc.
- Options for training word vectors using skip-gram and CBOW
- Built-in scripts to train word vectors on Wikipedia dumps
...Gensim
from gensim.models import Word2Vec

# sentences: any iterable of tokenized sentences (lists of words)
model = Word2Vec(sentences, size=100, window=5, min_count=5, workers=4)
model.save(fname)
model = Word2Vec.load(fname)  # you can continue training with the loaded model!
model.most_similar(positive=['woman', 'king'], negative=['man'])
# ==> [('queen', 0.50882536), ...]
model.doesnt_match("breakfast cereal dinner lunch".split())
# ==> 'cereal'
model.similarity('woman', 'man')
# ==> 0.73723527
model['computer']  # raw numpy vector of a word
# ==> array([-0.00449447, -0.00310097, 0.02421786, ...], dtype=float32)
...Gensim

Suggested Reading:
Word Vectors
Introduction to Theano
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.
- Built on top of NumPy
- Symbolic Expressions
- Automatic Differentiation
- Built-in support for GPU computation
- Python Interface
- Shared Variables
Theano Syntax
import numpy
import theano
import theano.tensor as T
from theano import pp

x = T.dscalar('x')            # symbolic double-precision scalar
y = x ** 2                    # symbolic expression in x
gy = T.grad(y, x)             # symbolic gradient dy/dx
pp(gy)                        # pretty-print the gradient expression
f = theano.function([x], gy)  # compile into a callable
f(4)
# array(8.0)
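The "Shared Variables" bullet above refers to state that persists across compiled function calls; a minimal accumulator, following the standard Theano tutorial pattern:

import theano
import theano.tensor as T

state = theano.shared(0)    # persistent state, kept on the GPU when available
inc = T.iscalar('inc')
# each call returns the old state and then applies the update
accumulator = theano.function([inc], state, updates=[(state, state + inc)])

accumulator(1)      # returns array(0); state is now 1
accumulator(10)     # returns array(1); state is now 11
state.get_value()   # 11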
Suggested Reading:
Theano
Deep Neural Networks

Suggested Reading:
Deep Learning Models
Recurrent Neural Networks


...RNNs
- particularly useful for sequential data (text, audio, etc.)
- "recurrent" because they perform the same operation on each element of the input (see the sketch below)
- consider dependencies between input elements (words), unlike bag-of-words models
- unrolled over an input sequence, an RNN is effectively as deep as the sequence is long, often tens to hundreds of steps
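A minimal vanilla-RNN forward pass in NumPy, to make "same operation on each element" concrete (all dimensions and weights here are illustrative):

import numpy as np

np.random.seed(0)
D, H = 8, 16                          # word-vector size, hidden-state size
W_xh = np.random.randn(D, H) * 0.1    # input -> hidden weights
W_hh = np.random.randn(H, H) * 0.1    # hidden -> hidden (recurrent) weights

def rnn_forward(word_vectors):
    """Apply the same recurrence to every element of the sequence."""
    h = np.zeros(H)
    for x_t in word_vectors:          # one step per word
        h = np.tanh(x_t @ W_xh + h @ W_hh)  # the state carries all previous words
    return h

sentence = [np.random.randn(D) for _ in range(5)]  # five "word vectors"
print(rnn_forward(sentence).shape)  # (16,)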
Language Modelling

Probability of a sentence of m words:
P(w_1, \ldots, w_m) = \prod_{i=1}^{m} P(w_i \mid w_1, \ldots, w_{i-1})
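Under a Markov assumption each factor conditions on only the previous word; a toy bigram model over a hypothetical two-sentence corpus:

from collections import Counter

corpus = ["the movie was funny", "the movie was long"]
unigrams, bigrams = Counter(), Counter()
for line in corpus:
    tokens = ["<s>"] + line.split()
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def sentence_prob(sentence):
    """P(w_1..w_m) ~ product over i of P(w_i | w_{i-1}), from counts."""
    tokens = ["<s>"] + sentence.split()
    p = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        p *= bigrams[(prev, cur)] / unigrams[prev]
    return p

print(sentence_prob("the movie was funny"))  # 0.5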
LM with RNN
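A minimal sketch of how an RNN can parameterize those conditional probabilities: run the recurrence over the prefix, then take a softmax over the vocabulary (all sizes and weights here are illustrative, and the model is untrained):

import numpy as np

np.random.seed(0)
V, D, H = 10, 8, 16                   # vocabulary, embedding, hidden sizes
E = np.random.randn(V, D) * 0.1       # word embeddings
W_xh = np.random.randn(D, H) * 0.1
W_hh = np.random.randn(H, H) * 0.1
W_hy = np.random.randn(H, V) * 0.1

def next_word_probs(word_ids):
    """Return P(next word | prefix) under the RNN language model."""
    h = np.zeros(H)
    for w in word_ids:
        h = np.tanh(E[w] @ W_xh + h @ W_hh)
    logits = h @ W_hy
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()            # softmax over the vocabulary

print(next_word_probs([3, 1, 4]).sum())  # 1.0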

Backpropagation Through Time (BPTT)


Vanishing Gradient Problem
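A quick numerical illustration: backpropagating through many tanh steps multiplies the gradient by a Jacobian with norm below one at every step, so it decays exponentially (the weights and states here are arbitrary stand-ins):

import numpy as np

np.random.seed(0)
H = 16
W = np.random.randn(H, H) * 0.1       # small recurrent weights
grad = np.ones(H)
for t in range(50):                   # 50 steps back through time
    h = np.tanh(np.random.randn(H))   # stand-in hidden state at step t
    grad = (W.T @ grad) * (1 - h ** 2)  # chain rule through tanh
    if t % 10 == 0:
        print(t, np.linalg.norm(grad))  # the norm shrinks towards zero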


Long Short-Term Memory (LSTM)

LSTM Equations
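The slide showed these as an image; in the standard formulation the gates and state updates are:

f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)          (forget gate)
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)          (input gate)
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)          (output gate)
\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)   (candidate cell state)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t    (cell state)
h_t = o_t \odot \tanh(c_t)                         (hidden state)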
Suggested Reading:
RNN
Email: rishy.s13@gmail.com
Github: https://github.com/rishy
Linkedin: https://www.linkedin.com/in/rishabhshukla1
This is It.