Learning to Control the Specificity in Neural Response Generation

Zhang et al., ACL 2018

Paper reading fest 20180819


  1. Problem
  2. Ideas
  3. Architecture & Training method
  4. Experiment & Result

The Problem

Two major streams of research in NLP:

  • task-oriented dialog
  • general-purpose dialog (e.g. chit-chat)


=> generative conversational model 

Generative conversational model


  • Statistical machine translation (SMT)

A conversation is a continuous sequence of utterance-response pairs, where the model tries to "translate" each input utterance into a response.

=> best case: a 1-to-1 match between utterance and response


H: What's your name?

B: My name is B.

H: What's the weather like today?

B: I don't know.

H: Do you like her?

B: I don't know...

H: What do you know?

B: I don't kno....


Two major ways to go:

Retrieval-based: Find the best-fit response

  • Li et al. 2016a: A diversity-promoting objective function for neural conversation models.
  • Zhou et al. 2017: Mechanism-aware neural machine for dialogue response generation.
  • Xing et al. 2017: Topic aware neural response generation.


=> Common point: Seq2Seq models that rely on preexisting responses

Generation-based: Generate a new response



  • Serban et al. 2016: Building end-to-end dialogue systems using generative hierarchical neural network models.
  • Cho et al. 2014: Learning phrase representations using RNN encoder-decoder for statistical machine translation.

Paper's idea

Response Specificity

Introduce an explicit specificity control variable into a Seq2Seq model, so that it can handle different utterance-response relationships in terms of specificity.

Model's architecture


Using: GRU


denotes the semantic-based generation probability

denotes the specificity-based generation probability
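
The formula combining the two probabilities is not reproduced in these notes. As a minimal sketch, assuming they are combined as a weighted sum over the vocabulary, the idea could look like this. The function name `mix_probabilities`, the weights `beta` and `gamma`, and the final renormalization are all assumptions for illustration, not taken from the slides.

```python
def mix_probabilities(p_semantic, p_specificity, beta=0.5, gamma=0.5):
    """Combine two per-word scores into one distribution over the vocabulary.

    beta and gamma are hypothetical mixing weights (assumed, not from the slides).
    """
    if len(p_semantic) != len(p_specificity):
        raise ValueError("both inputs must cover the same vocabulary")
    mixed = [beta * pm + gamma * ps for pm, ps in zip(p_semantic, p_specificity)]
    total = sum(mixed)
    # Renormalize so the result sums to 1 (an assumption of this sketch).
    return [m / total for m in mixed]


# Toy vocabulary of three words.
p_m = [0.7, 0.2, 0.1]  # semantic-based probabilities
p_s = [0.1, 0.3, 0.6]  # specificity-based probabilities
p = mix_probabilities(p_m, p_s)
```

With equal weights, a word that is semantically likely but generic loses some mass to words that better match the requested specificity.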

Each word in the dataset has:

e: semantic representation

u: usage representation, mapped by usage embedding matrix U

Semantic-based part

semantic representation of the (t-1)-th generated word

w is the vector of the word w

where f() is a GRU unit

Specificity-based part

Using a Gaussian kernel

u: usage representation of the word, passed through a sigmoid function

s: the specificity control variable, with a value in [0,1]
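
The kernel formula itself is not reproduced in these notes. The sketch below shows one way a Gaussian kernel could score a word against the target specificity s. It assumes the usage representation has already been projected to a single number (the hypothetical `usage_logit`), and the bandwidth `sigma` is an assumed hyperparameter.

```python
import math


def sigmoid(x):
    """Squash a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def specificity_score(usage_logit, s, sigma=0.2):
    """Gaussian kernel between a word's usage value and the target s.

    usage_logit: hypothetical scalar projection of the usage representation u.
    s: specificity control variable in [0, 1].
    sigma: assumed kernel bandwidth (not given in the slides).
    """
    u = sigmoid(usage_logit)
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((u - s) ** 2) / (2.0 * sigma ** 2))


# A high s favors words whose usage value is close to 1.
generic = specificity_score(-2.0, s=0.9)
specific = specificity_score(2.0, s=0.9)
```

The score peaks when the squashed usage value equals s, so changing s at inference time shifts which words are favored.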

Training method

θ denotes all the model parameters

(X, Y) denotes an utterance-response pair from the training set D

s denotes the specificity control variable => it must be computed for each pair
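
The objective is not reproduced in these notes. Assuming standard maximum-likelihood training over the symbols defined above, it would plausibly take this form (the symbol \mathcal{L} is introduced here for illustration):

```latex
\mathcal{L}(\theta) = \sum_{(X, Y) \in D} \log P(Y \mid X, s; \theta)
```

Under this reading, each training pair contributes the log-probability of its response given both the utterance and that pair's own s value.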

Calculating the value of s

  • Normalized Inverse Response Frequency (NIRF)
  • Normalized Inverse Word Frequency (NIWF)



|R| denotes the size of the response collection

f_Y denotes the corpus frequency of response Y in R

where Y is a response in the response collection R
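
The NIRF formula is not reproduced in these notes. A minimal sketch, assuming the inverse response frequency is log(1 + |R|) / f_Y followed by min-max normalization over the collection. Both the exact formula and the helper name `nirf` are assumptions for illustration.

```python
import math
from collections import Counter


def nirf(responses):
    """Return {response: normalized inverse response frequency}."""
    freq = Counter(responses)  # f_Y: how often response Y occurs in R
    size = len(responses)      # |R|
    irf = {y: math.log(1 + size) / f for y, f in freq.items()}
    lo, hi = min(irf.values()), max(irf.values())
    if hi == lo:               # all responses equally frequent
        return {y: 0.0 for y in irf}
    # Min-max normalization maps the scores into [0, 1].
    return {y: (v - lo) / (hi - lo) for y, v in irf.items()}


R = ["i don't know", "i don't know", "i don't know", "my name is b"]
scores = nirf(R)
```

In this toy collection the frequent response "i don't know" gets the lowest score and the rare one the highest, which matches the intuition that generic responses recur often.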



where y is a word in response Y in the collection R


f_y denotes the number of responses in R containing the word y

So, to calculate the IWF of a response Y, the word-level values are aggregated over the words in Y.
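
The NIWF formulas are not reproduced in these notes. A minimal sketch, assuming the word-level score is log(1 + |R|) / f_y, the response-level score is the maximum over the words of the response, and the result is min-max normalized. The aggregation choice, the whitespace tokenization, and the helper name `niwf` are assumptions for illustration.

```python
import math


def niwf(responses):
    """Return one normalized inverse word frequency score per response.

    Responses are assumed to be non-empty, whitespace-tokenized strings.
    """
    tokenized = [r.split() for r in responses]
    size = len(tokenized)  # |R|
    doc_freq = {}          # f_y: number of responses containing word y
    for words in tokenized:
        for w in set(words):
            doc_freq[w] = doc_freq.get(w, 0) + 1
    iwf = {w: math.log(1 + size) / f for w, f in doc_freq.items()}
    # Response-level IWF: maximum over its words (an assumption of this sketch).
    raw = [max(iwf[w] for w in words) for words in tokenized]
    lo, hi = min(raw), max(raw)
    if hi == lo:
        return [0.0 for _ in raw]
    return [(v - lo) / (hi - lo) for v in raw]


R = ["i do not know", "i do not know her", "my name is b"]
scores = niwf(R)
```

A single rare word is enough to lift a response's score here, because the maximum is used rather than the mean.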

Experiment & Result

Evaluation metrics:

- distinct-1 & distinct-2: count the number of distinct unigrams and bigrams in the generated responses
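
As a rough illustration of the distinct-n metric, the sketch below counts distinct n-grams over a set of generated responses. It also returns the ratio to the total number of n-grams, a normalization that is commonly reported but is an assumption here, as are the whitespace tokenization and the helper name `distinct_n`.

```python
def distinct_n(responses, n):
    """Return (number of distinct n-grams, ratio to total n-grams)."""
    total = 0
    unique = set()
    for response in responses:
        tokens = response.split()
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    ratio = len(unique) / total if total else 0.0
    return len(unique), ratio


generated = ["i don't know", "i don't know", "my name is b"]
d1 = distinct_n(generated, 1)  # distinct-1
d2 = distinct_n(generated, 2)  # distinct-2
```

Repeated generic responses add to the total but not to the distinct count, so a model that keeps answering "i don't know" scores low on both metrics.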


- BLEU score




By Khanh Tran
