Paper Report

陳家陞 Oct. 8th

Papers presented today

  • Taylor's Law for Human Linguistic Sequences (easy)
  • Hierarchical Neural Story Generation (slightly harder)

Taylor's Law for Human Linguistic Sequences

Main Idea

To capture lexical fluctuation: how unevenly a word's frequency varies across segments of a text.

Evaluation Method

  1. Fix a set of words \( W \).

  2. Take a corpus \( X = X_1, X_2, X_3,\ldots,X_N \), where \(X_i \in W\).

  3. Given a segment length \( \Delta t \), segment \( X \) into \(\lceil N/\Delta t \rceil\) segments.

For every word \(w_i \in W\), count its occurrences in every segment, then calculate the mean \(\mu_{w_i}\) and standard deviation \(\sigma_{w_i}\) of these counts.

Example

 

\(X\) = "I love you. We love you. She loves you."

\(\Delta t\) = 3

Word    Segment 1   Segment 2   Segment 3   μ     σ
I       1           0           0           1/3   0.58
love    1           1           0           2/3   0.58
you     1           1           1           1     0
we      0           1           0           1/3   0.58
she     0           0           1           1/3   0.58
loves   0           0           1           1/3   0.58
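The table above can be reproduced with a short script. This is a minimal sketch (the helper name `taylor_stats` is mine, and punctuation is dropped for simplicity); the sample standard deviation (dividing by \(n-1\)) matches the 0.58 values in the table.

```python
import math
from collections import Counter

def taylor_stats(tokens, dt):
    """Segment `tokens` into ceil(N/dt) chunks and return, per word,
    the mean and sample standard deviation of its segment counts."""
    segments = [tokens[i:i + dt] for i in range(0, len(tokens), dt)]
    counts = [Counter(seg) for seg in segments]
    n = len(segments)
    stats = {}
    for w in set(tokens):
        xs = [c[w] for c in counts]          # count of w in each segment
        mu = sum(xs) / n
        var = sum((x - mu) ** 2 for x in xs) / (n - 1)  # sample variance
        stats[w] = (mu, math.sqrt(var))
    return stats

# the example corpus, with punctuation stripped
tokens = "I love you We love you She loves you".split()
stats = taylor_stats(tokens, 3)
# e.g. stats["you"] == (1.0, 0.0), stats["love"] ≈ (2/3, 0.577)
```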

Taylor's Law

\( \sigma \propto \mu^{\alpha} \)

Estimated form: \( \sigma = \hat{c}\mu^{\hat{\alpha}} \)

Fact: for an i.i.d. process, \(\alpha = 0.5\).

Finding the Parameters

\( \sigma = \hat{c}\mu^{\hat{\alpha}} \)

Fit \(\hat{c}\) and \(\hat{\alpha}\) by least-squares linear regression in log-log coordinates: \(\log\sigma = \hat{\alpha}\log\mu + \log\hat{c}\).
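The log-log fit can be sketched with `numpy.polyfit`. A minimal sketch (`fit_taylor` is a hypothetical helper): words with zero mean or zero deviation are dropped, since their logarithm is undefined.

```python
import numpy as np

def fit_taylor(mus, sigmas):
    """Least-squares fit of log(sigma) = alpha*log(mu) + log(c).
    Returns (c_hat, alpha_hat); points with mu or sigma <= 0 are skipped."""
    mus, sigmas = np.asarray(mus, float), np.asarray(sigmas, float)
    mask = (mus > 0) & (sigmas > 0)
    alpha_hat, log_c = np.polyfit(np.log(mus[mask]), np.log(sigmas[mask]), 1)
    return np.exp(log_c), alpha_hat

# sanity check on noiseless synthetic data generated with c=2, alpha=0.6
mu = np.linspace(1, 100, 50)
c_hat, alpha_hat = fit_taylor(mu, 2 * mu ** 0.6)
```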

Discovery

  1. The words with the greatest fluctuation were keywords, e.g., whale, captain, and sailor in Moby Dick.
  2. The Taylor exponent \(\alpha\) differs across kinds of data, e.g., written text vs. child-directed speech.
  3. \(\hat{\alpha}\) increases as \(\Delta t\) increases.

Discovery (Cont'd)

Taylor Exponent:

\(\alpha\)(training data) > \(\alpha\)(text generated by a SoTA word-level LM) > 0.50

i.e., LM-generated text is closer to the i.i.d. baseline (\(\alpha = 0.5\)) than real text is.

Hierarchical Neural Story Generation

Story-telling

  1. model long-range dependencies

  2. follow a high-level plot

Big Structure (Hierarchical)

  1. generate a premise (also called a prompt) → Gated Conv. Net

  2. seq2seq from premise to story → ConvS2S
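The two stages above compose as a simple pipeline. A toy sketch with stand-in callables (`prompt_lm` and `story_s2s` are hypothetical placeholders for the paper's Gated Conv. LM and ConvS2S models):

```python
def generate_story(prompt_lm, story_s2s):
    """Hierarchical generation: sample a premise first, then generate
    the story conditioned on that premise."""
    premise = prompt_lm()           # stage 1: premise / prompt generation
    story = story_s2s(premise)      # stage 2: seq2seq premise -> story
    return premise, story

# stand-in models for illustration only
premise, story = generate_story(
    lambda: "A dragon guards the last library.",
    lambda p: "STORY conditioned on: " + p,
)
```

Splitting generation this way lets the story model attend to a fixed high-level plan instead of having to keep the plot coherent token by token.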

ConvS2S

Dataset

Reddit /r/WritingPrompts

 

One person writes a prompt; others respond with a story.

A prompt can have multiple responses.

Model fusion

The model has access to a pretrained seq2seq model (which is weak at following the premise).

 

To improve on this, the fused ConvS2S has to capture the relationship between premise and story that the pretrained model misses (similar in spirit to boosting).
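A minimal numpy sketch of a gated-fusion layer in this spirit. The names `W_g`, `W_o` and the exact gating are illustrative assumptions, not the paper's exact equations: a sigmoid gate computed from both hidden states rescales the pretrained model's state before the two are mixed.

```python
import numpy as np

def fused_hidden(h_pretrained, h_trained, W_g, W_o):
    """Gated fusion (sketch): gate the pretrained hidden state, then
    mix it with the trained model's state through a linear layer."""
    h_cat = np.concatenate([h_pretrained, h_trained])
    gate = 1.0 / (1.0 + np.exp(-(W_g @ h_cat)))       # sigmoid gate
    fused = np.concatenate([gate * h_pretrained, h_trained])
    return W_o @ fused

# shapes only; weights would be learned jointly with the second model
rng = np.random.default_rng(0)
d = 4
h_out = fused_hidden(rng.normal(size=d), rng.normal(size=d),
                     rng.normal(size=(d, 2 * d)), rng.normal(size=(d, 2 * d)))
```

Because the pretrained model's contribution is gated, the newly trained model is pushed to model what the pretrained one already fails at, namely the premise-story dependence.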

Speech Lab Paper Report 10/08

By qitar888
