Zhang et al., ACL 2018: "Learning to Control the Specificity in Neural Response Generation"
Paper reading fest 20180819
Two major streams of research in NLP conversation modeling:
=> this paper: the generative conversational model
Approach
A conversation is a continuous sequence of utterance-response pairs, where the model tries to "translate" each input utterance into a response.
=> best case: a one-to-one match between utterance and response; in practice, models collapse to generic "safe" responses:
H: What's your name?
B: My name is B.
H: What's the weather like today?
B: I don't know.
H: Do you like her?
B: I don't know...
H: What do you know?
B: I don't kno....
Two major ways to go:
Retrieval-based: find the best-fit response from a pre-built collection
=> weak point: relies entirely on pre-existing responses
Generation-based: generate a new response word by word with a Seq2Seq model (this paper)
Response Specificity
Idea: introduce an explicit specificity control variable s into a Seq2Seq model to handle different utterance-response relationships in terms of specificity.
Using: GRU encoder-decoder.
The generation probability of each word combines two parts, renormalized over the vocabulary:

p(y_t = w | y_<t, X, s) ∝ p_se(w | y_<t, X) * p_sp(w | s)

- p_se denotes the semantic-based generation probability
- p_sp denotes the specificity-based generation probability
Each word w in the dataset has two representations:
- e_w: semantic representation (the regular word embedding)
- u_w: usage representation, mapped by a usage embedding matrix U

Decoder state update, with f(.) a GRU unit and e_{y_{t-1}} the semantic representation of the (t-1)-th generated word:

h_t = f(h_{t-1}, e_{y_{t-1}})

Semantic-based generation probability, with e_w the vector of the word w:

p_se(y_t = w | y_<t, X) ∝ exp(e_w . h_t)
Specificity-based generation probability, using a Gaussian kernel:

p_sp(y_t = w | s) ∝ exp( -(u_w - s)^2 / (2 * sigma^2) )

- u_w: usage score of word w, squashed into [0, 1] with a sigmoid function
- s: the specificity control variable, value in [0, 1]
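A minimal NumPy sketch of one decoding step under the reconstruction above; the projection w_u, the kernel width sigma, and the sigmoid mapping of usage embeddings to a scalar are illustrative assumptions, not the paper's exact parametrization:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def word_distribution(h_t, E, U, w_u, s, sigma=0.3):
    # h_t : (d,)   decoder GRU state at step t
    # E   : (V, d) semantic embeddings e_w, one row per vocabulary word
    # U   : (V, k) usage embeddings u_w from the usage embedding matrix
    # w_u : (k,)   hypothetical projection of usage embeddings to a scalar
    # s   : specificity control variable in [0, 1]; sigma: assumed kernel width
    sem_score = E @ h_t                                 # semantic score e_w . h_t
    usage = sigmoid(U @ w_u)                            # per-word usage score in [0, 1]
    spec_score = -(usage - s) ** 2 / (2 * sigma ** 2)   # log of the Gaussian kernel
    logits = sem_score + spec_score                     # product of the two parts in prob space
    logits -= logits.max()                              # numerical stability
    p = np.exp(logits)
    return p / p.sum()                                  # renormalize over the vocabulary

rng = np.random.default_rng(0)                          # toy check: V=5, d=4, k=3
p = word_distribution(rng.normal(size=4), rng.normal(size=(5, 4)),
                      rng.normal(size=(5, 3)), rng.normal(size=3), s=0.9)
print(p.sum())                                          # 1.0: a valid distribution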
Training: maximize the log-likelihood over the training set D:

L(θ) = Σ_{(X,Y) in D} log p(Y | X, s; θ)

- θ denotes all the model parameters
- X, Y denote an utterance-response pair from the training set D
- s denotes the specificity control variable => needs to be computed for each (X, Y) pair; two corpus-based estimates below
Estimate 1 - Normalized Inverse Response Frequency (NIRF). Inverse response frequency of a response Y:

IRF_Y = log( |R| / f_Y )

where:
- |R| denotes the size of the response collection R
- f_Y denotes the corpus frequency of response Y in R
- Y is a response in the response collection R
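A small Python sketch of the NIRF estimate, including the min-max normalization noted below (hypothetical helper; exact-string matching assumed for response frequency):

import math
from collections import Counter

def nirf_scores(responses):
    freq = Counter(responses)                    # f_Y: corpus frequency of Y in R
    n = len(responses)                           # |R|: size of the collection
    irf = {y: math.log(n / f) for y, f in freq.items()}
    lo, hi = min(irf.values()), max(irf.values())
    span = (hi - lo) or 1.0                      # guard against a constant corpus
    return {y: (v - lo) / span for y, v in irf.items()}  # min-max into [0, 1]

R = ["i don't know", "i don't know", "i don't know",
     "my name is b", "the weather is sunny today"]
print(nirf_scores(R))  # generic responses get low s, specific ones high s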
Normalization: both estimates are min-max scaled into [0, 1] over the whole collection.

Estimate 2 - Normalized Inverse Word Frequency (NIWF). Inverse word frequency of a word y, with y a word in a response Y in collection R:

IWF_y = log(1 + |R|) / f_y

where:
- f_y denotes the number of responses in R containing the word y

So the IWF of a response Y is the maximum over its words:

IWF_Y = max_{y in Y} IWF_y
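And the matching sketch for the NIWF estimate (hypothetical helper; whitespace tokenization assumed):

import math
from collections import Counter

def niwf_scores(responses):
    tokenized = [y.split() for y in responses]
    n = len(responses)                                        # |R|
    df = Counter(w for toks in tokenized for w in set(toks))  # f_y: responses containing y
    raw = [max(math.log(1 + n) / df[w] for w in toks)         # max IWF over words in Y
           for toks in tokenized]
    lo, hi = min(raw), max(raw)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in raw]                     # min-max into [0, 1]

R = ["i don't know", "i don't know", "my name is b",
     "the weather is sunny today"]
print(list(zip(R, niwf_scores(R))))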
Evaluation points:
- distinct-1 & distinct-2: number of distinct unigrams and bigrams in the generated responses (higher = more diverse, fewer generic replies)
- BLEU score: n-gram overlap between generated and reference responses
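The notes only say "count"; the common distinct-n formulation divides the distinct count by the total number of generated n-grams, as in this sketch (hypothetical helper, not the paper's evaluation script):

def distinct_n(responses, n):
    ngrams = []
    for r in responses:
        toks = r.split()
        ngrams += [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    return len(set(ngrams)) / max(len(ngrams), 1)   # distinct / total n-grams

gen = ["i don't know", "i don't know", "the weather is sunny today"]
print(distinct_n(gen, 1), distinct_n(gen, 2))       # distinct-1, distinct-2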