Deep Learning for Natural Language Processing
Mitesh M. Khapra
Assistant Professor, Department of Computer Science & Engineering
Indian Institute of Technology Madras
ASP Interview (01-Mar-2021)
Professional Background
B.E.
M.Tech
Researcher


PhD Thesis: Reusing Resources for Multilingual Computation






Machine Translation
Debater
Multimodal Embeddings
Knowledge Base QA

Multimodal Chatbots




Ph.D.

2002
2008
2012

Assistant Professor
2012 - 2016

2016 - 2021
E
ગુ
हि
ಕ
म
ਪੰ
த
తె











Ph.D.
M.S.
Current




Make India AI ready










[Figure: publication counts by venue, A* (conference), A (conference), A* (journal): 16, 21, 3. h-index, Sep'16 to Feb'21: 13, 24.]

NLP Research@RBC (Three Main Themes/Contributions)
Indic NLP


E
हि
Code-mixed Chatbots
Indic NLU Benchmark
NLP: Natural Language Processing
NLU: Natural Language Understanding
NLG: Natural Language Generation
Multilingual Embeddings

Tools for under-represented languages

বা
ગુ
हि
ಕ
म
ਪੰ
த
తె
ଓ
മ
অ
வீடு
घर
Suman Banerjee, Mitesh M. Khapra. Graph Convolutional Network with Sequential Attention for Goal-oriented Dialogue Systems. Transactions of the Association for Computational Linguistics (TACL), 2019


Interpretable NLP

?
input
output
मैं
यहां
हूं
Interpretable Attention Networks
Post-hoc Statistical Inference of Attention

Reject H0
I
am
here
+
I
am
here
मैं
यहां
हूं
Wow!?
Preksha Nema, Mitesh M. Khapra, Anirban Laha, B. Ravindran, Diversity Driven Attention Model for Query Based Abstractive Summarisation, The 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), 2017

[Figure: attention alignments between "I am here" and "मैं यहां हूं"]
Evaluating NLG

Taxonomy of evaluation metrics
How are you?

I am solid
I am liquid


Robust evaluation metrics for dialogs
#metrics: 3 ('02), 14 ('14), 57 ('21)
Ref: the boy went home
Predicted: the boy went to his house
?
Task-aware evaluation metrics
director of Titanic?
Who is the actor of Leonardo?
Ananya B. Sai, M Akash Kumar, Siddhartha Arora, Mitesh M. Khapra. Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining. Transactions of the Association for Computational Linguistics (TACL), 2020.
Teaching@IITM
Fundamentals of Deep Learning
Topics in Deep Learning
Introduction to Programming
Introduction to Machine Learning


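void fun()
{
    int i = 0;   /* a fresh local i is created on every call */
    i++;
    fun();       /* no base case: the recursion never terminates */
}
int main(void)
{
    fun();
    return 0;
}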
Linear Algebra & Random Processes
[Figure: matrix with rank = m < n, columns grouped into Pivots and Free]

Object Oriented Algorithms Implementation and Analysis



Course TCF
Instructor TCF
0.94
0.91
0.91
0.92
0.95
0.97
0.86
0.89
0.91
0.96
0.81
0.75
Projects@IITM

Knowledge Graph Driven Multimodal Conversation Systems

In which years was the per capita income in Delhi greater than that in Chennai?
Impact: Largest publicly available dataset for reasoning over scientific plots, state of the art model for extracting visual objects in scientific plots
USD 50K

USD 10K

AI for All
AAAI 2021, WACV 2020
USD 35K
RL for NLG


USD 15K
Indic QA
हि


Awards
Young Faculty Recognition Award (2019): instituted by an alumnus, Dr. P. Balasubramanian (1971/BT/AE, 1973/MT/IM), to recognize young faculty who have done well in research and have been good teachers in their courses.
Prof. B. Yegnanarayana Award for Excellence in Research and Teaching (2020): recognizes regular CSE faculty members of IIT Madras who excel in research and teaching.
Google Faculty Research Award (2018): instituted by Google to recognize and support world-class faculty pursuing cutting-edge research in areas of mutual interest.
Technical Talk
Preksha Nema, Mitesh M. Khapra, Anirban Laha, B. Ravindran, Diversity Driven Attention Model for Query Based Abstractive Summarisation, The 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), 2017
Task: Query Based Abstractive Summarisation
Document: Roger Federer wins a record eighth men’s singles title at Wimbledon on Sunday. He defeated Marin Cilic in straight sets with 6-3, 6-1, 6-4. Cilic appeared to struggle with a foot injury but the Swiss was in imperious form on Centre Court, winning the final in one hour and 41 minutes. It is Federer’s 19th grand slam title and his second of 2017 following victory at the Australian Open in January.
Query: What happened in the finals at Wimbledon?
Summary: Federer won record eighth Wimbledon title beating Marin Cilic in straight sets.
Existing Models
[Figure: encode-attend-decode model. The Query Encoder reads "What happened ... Wimbledon", the Document Encoder reads "Roger Federer ... January", and the Attention Network feeds the Decoder, which starts from <Start> and generates "Federer won ... straight sets".]
The problem
[Figure: the same Query Encoder, Document Encoder, Attention Network and Decoder, with the Decoder now generating "Federer won in straight sets won sets".]
Repeating phrases in the output
Our hypothesis
[Figure: the Decoder emits "Federer" at both t=1 and t=2, using context vectors $c_1$ and $c_2$ from the Attention Network]
Maybe the context vectors at two consecutive time steps ($c_t$, $c_{t+1}$) are very similar
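One way to probe this hypothesis is to measure the cosine similarity between context vectors at consecutive decoder steps. The sketch below is illustrative only: the vectors `c1` and `c2` are made-up values, not outputs of the model discussed in the talk.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical context vectors at decoder steps t=1 and t=2 (made-up values).
c1 = [0.9, 0.1, 0.3]
c2 = [0.8, 0.2, 0.3]

similarity = cosine_similarity(c1, c2)
print(f"cosine(c1, c2) = {similarity:.3f}")
```

A value close to 1 would mean the decoder sees almost the same context at both steps, which is consistent with it emitting the same word twice.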
Our solution(s)

D1: Orthogonalize the context vectors
[Figure: vectors $c_{t-1}$, $c_t$ and the orthogonalized $c_t'$]
Issue: What about the past history?
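D1 can be read as a single Gram-Schmidt step: subtract from $c_t$ its projection onto the previous context vector. The sketch below follows the slide's labels ($c_{t-1}$, $c_t$, $c_t'$); the function name `orthogonalize` and the example vectors are assumptions for illustration, and the published model may differ in detail.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonalize(c_t, c_prev):
    """Remove from c_t its component along c_prev (one Gram-Schmidt step)."""
    denom = dot(c_prev, c_prev)
    if denom == 0.0:
        # Nothing to project onto: leave c_t unchanged.
        return list(c_t)
    scale = dot(c_t, c_prev) / denom
    return [a - scale * b for a, b in zip(c_t, c_prev)]


# Made-up two-dimensional example.
c_prev = [1.0, 0.0]
c_t = [3.0, 4.0]
c_t_prime = orthogonalize(c_t, c_prev)
print(c_t_prime, dot(c_t_prime, c_prev))
```

After this step the new vector has no component along the previous one, so the decoder is pushed to attend to different content. It only looks one step back, which is the issue raised above.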
D2: Maintain History
[Figure: vectors $c_t$, $d_{t-1}$ and $d_t'$]
Issue: What if we want phrases to repeat?
He kept talking and talking and talking!
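One plausible reading of D2 is to keep every earlier diversified vector and remove from $c_t$ its component along each of them. The sketch below assumes exactly that; `diversify_with_history` and the example vectors are hypothetical and are not taken from the paper's code.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def diversify_with_history(c_t, history):
    """Subtract from c_t its projection onto every vector in history."""
    d_t = list(c_t)
    for d_prev in history:
        denom = dot(d_prev, d_prev)
        if denom == 0.0:
            continue  # a zero vector contributes no direction
        scale = dot(c_t, d_prev) / denom
        d_t = [a - scale * b for a, b in zip(d_t, d_prev)]
    return d_t


# Made-up context vectors for three decoder steps.
contexts = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
history = []
for c_t in contexts:
    history.append(diversify_with_history(c_t, history))
print(history)
```

Because every new vector is forced to be orthogonal to all earlier ones, this hard constraint also blocks legitimate repetition such as "talking and talking and talking".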
c1
c2
c3
ct−1
dt−1
dt
dt
Federer won in straight sets in sets
Our solution(s)

D3: Soft orthogonalisation
[Figure: vectors $d_{t-1}$, $d_t$ and $d_t'$]
Introduce a parameter γ that controls what fraction of the component of $c_t$ along $c_{t-1}$ is subtracted from $c_t$
He kept talking and talking and talking!
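The soft variant can be sketched by scaling the subtracted projection with γ. In the sketch below γ is a fixed scalar chosen by hand, which is an assumption made for illustration; in a trained model it would presumably be learned. The function name and vectors are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def soft_orthogonalize(c_t, c_prev, gamma):
    """Subtract a fraction gamma of c_t's component along c_prev."""
    denom = dot(c_prev, c_prev)
    if denom == 0.0:
        return list(c_t)
    scale = gamma * dot(c_t, c_prev) / denom
    return [a - scale * b for a, b in zip(c_t, c_prev)]


# Made-up two-dimensional example.
c_prev = [1.0, 0.0]
c_t = [3.0, 4.0]

unchanged = soft_orthogonalize(c_t, c_prev, gamma=0.0)  # no diversity pressure
halfway = soft_orthogonalize(c_t, c_prev, gamma=0.5)    # half the overlap removed
hard = soft_orthogonalize(c_t, c_prev, gamma=1.0)       # full orthogonalization
print(unchanged, halfway, hard)
```

With γ near 0 the model is free to revisit the same content, which allows sentences like the one above; with γ near 1 it behaves like the hard orthogonalization.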
Results: (Quantitative)
What percentage of the reference summary is included in the predicted summary
Group | Models | Rouge-1 | Rouge-2 | Rouge-L |
---|---|---|---|---|
Baselines | Encode-Attend-Decode | 13.73 | 2.06 | 12.84 |
Baselines | SOTA [Chen et al. 2016] | 33.06 | 13.25 | 32.17 |
Proposed Solutions | D1 (orthogonalize) | 33.85 | 13.65 | 32.99 |
Proposed Solutions | D2 (history) | 38.12 | 16.76 | 37.31 |
Proposed Solutions | D3 (soft) | 41.26 | 18.75 | 40.43 |
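The recall idea behind these scores can be sketched as the fraction of reference n-grams that also occur in the prediction. This is a simplified illustration with lowercased whitespace tokenization and clipped counts, not the official ROUGE toolkit, so it should not be expected to reproduce the numbers in the table. The two sentences are taken from the qualitative example in this deck.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n_recall(reference, predicted, n=1):
    """Fraction of reference n-grams that also appear in the prediction."""
    ref_counts = Counter(ngrams(reference.lower().split(), n))
    pred_counts = Counter(ngrams(predicted.lower().split(), n))
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    # Clip each match by how often the n-gram occurs in the prediction.
    overlap = sum(min(count, pred_counts[gram])
                  for gram, count in ref_counts.items())
    return overlap / total


reference = "Hydrogen in cars is less dangerous than gasoline"
predicted = "Hydrogen in cars is reduce risk than fuel"

rouge1 = rouge_n_recall(reference, predicted, n=1)
rouge2 = rouge_n_recall(reference, predicted, n=2)
print(f"unigram recall = {rouge1:.3f}, bigram recall = {rouge2:.3f}")
```

Five of the eight reference words and three of the seven reference bigrams are recovered here, so the bigram score is the stricter of the two.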
Results: (Qualitative)
Document: Fuel cell critics point out that hydrogen is flammable, but so is gasoline. Unlike gasoline, which can pool up and burn for a long time, hydrogen dissipates rapidly. Gas tanks tend to be easily punctured, thin-walled containers, while the latest hydrogen tanks are made from Kevlar. Also, gaseous hydrogen isn’t the only method of storage under consideration – BMW is looking at liquid storage while other researchers are looking at chemical compound storage, such as boron pellets.
Query: Are hydrogen fuel cell vehicles safe?
Reference: Hydrogen in cars is less dangerous than gasoline
Baseline: Hydrogen is hydrogen hydrogen hydrogen fuel energy
D1: Hydrogen in cars is reduce risk than fuel
D3: Hydrogen in cars is less dangerous than gasoline
Results: (Qualitative)
Document: The basis of all animal rights should be the Golden Rule: we should treat them as we would wish them to treat us, were any other species in our dominant position.
Query: Do animals have rights that makes eating them inappropriate?
Reference: Animals should be treated as we would want to be treated
Baseline: Animals should be treated as we would protect to be treated
D1: Animals should be treated as we most individual to be treated
D2: Animals should be treated as those want to be treated
Summary
Encode
Attend
Decode
...
...
Refine
Multiple follow-up works
Genesis of a Ph.D. thesis


Future Research Plan
Indic NLP

Interpretable NLP

?
input
output
Evaluating NLG

Full NLP stack for Indian languages







Input tools

Basic Building Blocks
Generation

Reasoning

Interpreting Multilingual Multimodal Models


வீடு
घर
Adversarial Evaluation of Evaluation models
How are you?
I am liquid





Thank You!
Why this Work?
Preksha Nema, Mitesh M. Khapra, Anirban Laha, B. Ravindran, Diversity Driven Attention Model for Query Based Abstractive Summarisation, The 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), 2017

Emotional: First paper at IITM with my first PhD student
Impactful: IJCAI 2018, NAACL 2018, ACL 2020
Core
Co-taught
en-fr
en-de
4M
# sentences
existing sources
MR || sources
non-MR || sources
monolingual sources
en-hi
en-bn
10M
en-ta
MR || sources
PIBAnuvaad - PIBIIITH
Wikipedia
non-MR || sources
Constitution
TN Assembly
AP Assembly
Monolingual sources
IndicCorp
4 new crawls