Deep Learning for Natural Language Processing
Mitesh M. Khapra
Assistant Professor, Department of Computer Science & Engineering
Indian Institute of Technology Madras
ASP Interview (01-Mar-2021)
Professional Background
B.E.
M.Tech
Researcher
PhD Thesis: Reusing Resources for Multilingual Computation
Machine Translation
Debater
Multimodal Embeddings
Knowledge Base QA
Multimodal Chatbots
Ph.D.
2002
2008
2012
Assistant Professor
2012 - 2016
2016 - 2021
Languages: E (English), ગુ (Gujarati), हि (Hindi), ಕ (Kannada), म (Marathi), ਪੰ (Punjabi), த (Tamil), తె (Telugu)
Ph.D.
M.S.
Current
Make India AI ready
A* (conference): 16
A (conference): 21
A* (journal): 3
h-index: 13 (Sep'16), 24 (Feb'21)
NLP Research@RBC (Three Main Themes/Contributions)
Indic NLP
E
हि
Code-mixed Chatbots
Indic NLU Benchmark
NLP: Natural Language Processing
NLU: Natural Language Understanding
NLG: Natural Language Generation
Multilingual Embeddings
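Tools for under-represented languages
Languages: বা (Bengali), ગુ (Gujarati), हि (Hindi), ಕ (Kannada), म (Marathi), ਪੰ (Punjabi), த (Tamil), తె (Telugu), ଓ (Odia), മ (Malayalam), অ (Assamese)
வீடு (Tamil) and घर (Hindi): "house"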
Suman Banerjee, Mitesh M. Khapra. Graph Convolutional Network with Sequential Attention for Goal-oriented Dialogue Systems. Transactions of the Association for Computational Linguistics (TACL), 2019
Interpretable NLP
?
input
output
मैं यहां हूं ("I am here")
Interpretable Attention Networks
Post-hoc Statistical Inference of Attention
Reject \(H_0\)
[Figure: "I am here" aligned with "मैं यहां हूं"; annotation: "Wow!?"]
Preksha Nema, Mitesh M. Khapra, Anirban Laha, B. Ravindran. Diversity Driven Attention Model for Query Based Abstractive Summarisation. The 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), 2017
Evaluating NLG
Taxonomy of evaluation metrics
How are you?
I am solid
I am liquid
Robust evaluation metrics for dialogs
#metrics: 3 ('02), 14 ('14), 57 ('21)
Ref: the boy went home
Predicted: the boy went to his house
?
Task-aware evaluation metrics
director of Titanic?
Who is the actor of Leonardo?
Ananya B. Sai, M Akash Kumar, Siddhartha Arora, Mitesh M. Khapra. Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining. Transactions of the Association for Computational Linguistics (TACL), 2020.
Teaching@IITM
Fundamentals of Deep Learning
Topics in Deep Learning
Introduction to Programming
Introduction to Machine Learning
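/* fun() has no base case: every call gets a fresh local i
   and then calls fun() again. */
void fun(void)
{
    int i = 0;
    i++;
    fun();
}

int main(void)
{
    fun();
    return 0;
}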
Linear Algebra & Random Processes
\(\text{rank} = m < n\): \(m\) pivot columns (Pivots) and \(n - m\) free columns (Free)
Object Oriented Algorithms Implementation and Analysis
Course TCF / Instructor TCF: 0.94, 0.91, 0.91, 0.92, 0.95, 0.97, 0.86, 0.89, 0.91, 0.96, 0.81, 0.75
Projects@IITM
Knowledge Graph Driven Multimodal Conversation Systems
In which years was the per capita income in Delhi greater than that in Chennai?
Impact: Largest publicly available dataset for reasoning over scientific plots; state-of-the-art model for extracting visual objects in scientific plots
USD 50K
USD 10K
AI for All
AAAI 2021, WACV 2020
USD 35K
RL for NLG
USD 15K
Indic QA
हि
Awards
Young Faculty Recognition Award (2019): Instituted by an alumnus, Dr. P. Balasubramanian (1971/ BT/ AE, 1973/ MT/ IM), to recognize young faculty who have done well in research and have been good teachers in their courses.
Prof. B. Yegnanarayana Award for Excellence in Research and Teaching (2020): Recognizes regular CSE faculty members of IIT Madras who excel in research and teaching.
Google Faculty Research Award (2018): Instituted by Google to recognize and support world-class faculty pursuing cutting-edge research in areas of mutual interest.
Technical Talk
Preksha Nema, Mitesh M. Khapra, Anirban Laha, B. Ravindran. Diversity Driven Attention Model for Query Based Abstractive Summarisation. The 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), 2017
Task: Query Based Abstractive Summarisation
Document: Roger Federer wins a record eighth men’s singles title at Wimbledon on Sunday. He defeated Marin Cilic in straight sets with 6-3, 6-1, 6-4. Cilic appeared to struggle with a foot injury but the Swiss was in imperious form on Centre Court, winning the final in one hour and 41 minutes. It is Federer’s 19th grand slam title and his second of 2017 following victory at the Australian Open in January.
Query: What happened in the finals at Wimbledon?
Summary: Federer won record eighth Wimbledon title beating Marin Cilic in straight sets.
Existing Models
Query Encoder: What happened ... Wimbledon
Document Encoder: Roger Federer ... January
Attention Network
Decoder: <Start> Federer won ... straight sets
The problem
Query Encoder: What happened ... Wimbledon
Document Encoder: Roger Federer ... January
Attention Network
Decoder: <Start> Federer won in straight sets won sets
Repeating phrases in the output
Our hypothesis
Maybe the context vectors produced by the Attention Network at two consecutive time steps (\(c_t, c_{t+1}\); e.g. \(c_1\) at t=1 and \(c_2\) at t=2) are very similar
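One way to probe this hypothesis is to measure how close two consecutive context vectors are. The sketch below uses cosine similarity for that purpose; the vectors `c_1` and `c_2` are made-up illustrative values, not outputs of the model discussed in the talk, and the choice of cosine similarity is an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical context vectors at decoder steps t=1 and t=2.
c_1 = [0.9, 0.1, 0.4]
c_2 = [0.8, 0.2, 0.5]

# A value close to 1 means the decoder is looking at almost the
# same part of the document at both steps.
similarity = cosine_similarity(c_1, c_2)
print(round(similarity, 3))
```

If the similarity stays near 1 across many steps, the decoder receives nearly the same context repeatedly, which would be consistent with repeated phrases in the output.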
Our solution(s)
D1: Orthogonalize the context vectors: replace \(c_t\) with \(c'_t\), the component of \(c_t\) orthogonal to \(c_{t-1}\)
Issue: What about the past history?
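As a minimal sketch of D1, assuming the orthogonalisation is a plain vector projection (one Gram-Schmidt step), the component of the current context vector along the previous one can be subtracted as follows. The function name `orthogonalize` and the example vectors are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonalize(c_t, c_prev):
    """Remove from c_t its component along c_prev (assumed projection form)."""
    scale = dot(c_t, c_prev) / dot(c_prev, c_prev)
    return [a - scale * b for a, b in zip(c_t, c_prev)]


# Hypothetical 2-d context vectors.
c_prev = [1.0, 0.0]
c_t = [3.0, 4.0]

c_t_new = orthogonalize(c_t, c_prev)
print(c_t_new)               # [0.0, 4.0]
print(dot(c_t_new, c_prev))  # 0.0, i.e. orthogonal to c_prev
```

Note that this only looks one step back: the new vector is orthogonal to the previous context vector but not necessarily to earlier ones, which is the issue raised above.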
D2: Maintain History: use \(c_t\) together with a history vector \(d_{t-1}\) (built from \(c_1, c_2, c_3, \ldots, c_{t-1}\)) to obtain \(d'_t\)
Issue: What if we want phrases to repeat?
He kept talking and talking and talking!
Federer won in straight sets in sets
Our solution(s)
D3: Soft orthogonalisation
Introduce a parameter \(\gamma\) that controls what fraction of the component along \(c_{t-1}\) should be subtracted from \(c_t\) (figure: \(d_{t-1}\), \(d_t\), \(d'_t\))
He kept talking and talking and talking!
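The role of \(\gamma\) can be sketched as a scaled projection. This is an illustration under stated assumptions: here `gamma` is a fixed number between 0 and 1 passed in by hand, whereas how it is actually obtained in the model is not described on the slide. The function name `soft_orthogonalize` and the vectors are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def soft_orthogonalize(c_t, c_prev, gamma):
    """Subtract a fraction gamma of the component of c_t along c_prev."""
    scale = gamma * dot(c_t, c_prev) / dot(c_prev, c_prev)
    return [a - scale * b for a, b in zip(c_t, c_prev)]


# Hypothetical 2-d context vectors.
c_prev = [1.0, 0.0]
c_t = [3.0, 4.0]

unchanged = soft_orthogonalize(c_t, c_prev, 0.0)  # repetition allowed
halfway = soft_orthogonalize(c_t, c_prev, 0.5)    # partial removal
hard = soft_orthogonalize(c_t, c_prev, 1.0)       # full orthogonalisation
print(unchanged, halfway, hard)
```

With `gamma` at 0 the context vector is left untouched, so a phrase such as "talking and talking" can still repeat; with `gamma` at 1 the sketch reduces to full orthogonalisation.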
Results: (Quantitative)
What percentage of the reference summary is included in the predicted summary
Group | Models | Rouge-1 | Rouge-2 | Rouge-L |
---|---|---|---|---|
Baselines | Encode-Attend-Decode | 13.73 | 2.06 | 12.84 |
Baselines | SOTA [Chen et al. 2016] | 33.06 | 13.25 | 32.17 |
Proposed Solutions | D1 (orthogonalize) | 33.85 | 13.65 | 32.99 |
Proposed Solutions | D2 (history) | 38.12 | 16.76 | 37.31 |
Proposed Solutions | D3 (soft) | 41.26 | 18.75 | 40.43 |
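The Rouge columns measure how much of the reference summary is recovered by the prediction. As a rough illustration of the unigram case only, the sketch below computes a simplified word-level recall with whitespace tokenisation and lowercasing. It is not the official ROUGE toolkit and will not reproduce the scores in the table; the function name `unigram_recall` is hypothetical, and the sentences are taken from the hydrogen example in this talk.

```python
from collections import Counter


def unigram_recall(reference, predicted):
    """Fraction of reference words (with multiplicity) found in the prediction."""
    ref_counts = Counter(reference.lower().split())
    pred_counts = Counter(predicted.lower().split())
    # Clip each word's credit at the number of times it occurs in the prediction.
    overlap = sum(min(count, pred_counts[word]) for word, count in ref_counts.items())
    return overlap / sum(ref_counts.values())


reference = "Hydrogen in cars is less dangerous than gasoline"
baseline = "Hydrogen is hydrogen hydrogen hydrogen fuel energy"
d1_output = "Hydrogen in cars is reduce risk than fuel"

baseline_score = unigram_recall(reference, baseline)
d1_score = unigram_recall(reference, d1_output)
print(baseline_score, d1_score)
```

Because counts are clipped, repeating "hydrogen" four times earns credit only once, so repetitive output does not inflate the score.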
Results: (Qualitative)
Document: Fuel cell critics point out that hydrogen is flammable, but so is gasoline. Unlike gasoline, which can pool up and burn for a long time, hydrogen dissipates rapidly. Gas tanks tend to be easily punctured, thin-walled containers, while the latest hydrogen tanks are made from Kevlar. Also, gaseous hydrogen isn’t the only method of storage under consideration – BMW is looking at liquid storage while other researchers are looking at chemical compound storage, such as boron pellets.
Query: Are hydrogen fuel cell vehicles safe?
Reference: Hydrogen in cars is less dangerous than gasoline
Baseline: Hydrogen is hydrogen hydrogen hydrogen fuel energy
D1: Hydrogen in cars is reduce risk than fuel
D3: Hydrogen in cars is less dangerous than gasoline
Results: (Qualitative)
Document: The basis of all animal rights should be the Golden Rule: we should treat them as we would wish them to treat us, were any other species in our dominant position.
Query: Do animals have rights that makes eating them inappropriate?
Reference: Animals should be treated as we would want to be treated
Baseline: Animals should be treated as we would protect to be treated
D1: Animals should be treated as we most individual to be treated
D2: Animals should be treated as those want to be treated
Summary
Encode
Attend
Decode
...
...
Refine
Multiple follow-up works
Genesis of a Ph.D. thesis
Future Research Plan
Indic NLP
Interpretable NLP
?
input
output
Evaluating NLG
Full NLP stack for Indian languages
Input tools
Basic Building Blocks
Generation
Reasoning
Interpreting Multilingual Multimodal Models
வீடு (Tamil) and घर (Hindi): "house"
Adversarial Evaluation of Evaluation models
How are you?
I am liquid
Thank You!
Why this Work?
Preksha Nema, Mitesh M. Khapra, Anirban Laha, B. Ravindran. Diversity Driven Attention Model for Query Based Abstractive Summarisation. The 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), 2017
Emotional: First paper at IITM with my first PhD student
Impactful: IJCAI 2018, NAACL 2018, ACL 2020
Teaching@IITM
Fundamentals of Deep Learning
Topics in Deep Learning
Introduction to Programming
Introduction to Machine Learning
Linear Algebra & Random Processes
Object Oriented Algorithms Implementation and Analysis
Young Faculty Recognition Award
Prof. B. Yegnanarayana Award for Excellence in Research and Teaching
Core
Co-taught
en-fr
en-de
4M
# sentences
existing sources
MR || sources
non-MR || sources
monolingual sources
en-hi
en-bn
10M
en-ta
MR || sources: PIB\(_{Anuvaad}\) - PIB\(_{IIITH}\), Wikipedia
non-MR || sources: Constitution, TN Assembly, AP Assembly
Monolingual sources: IndicCorp, 4 new crawls