Paper Report
陳家陞 Oct. 8th
Papers presented today
- Taylor's Law for Human Linguistic Sequences (easy)
- Hierarchical Neural Story Generation (somewhat harder)
Taylor's Law for Human Linguistic Sequences
Main Idea
To capture lexical fluctuation
Evaluation Method
- set of words \( W \).
- corpus \( X = X_1, X_2, X_3,\ldots,X_N \), where \(X_i \in W\).
- segment length \( \Delta t \); split \( X \) into \(\lceil N/\Delta t \rceil\) segments.
For every word \(w_i \in W\), count its occurrences in every segment. Then calculate the mean \(\mu_{w_i}\) and standard deviation \(\sigma_{w_i}\) of those counts.
Example
\(X\) = "I love you. We love you. She loves you."
\(\Delta t\) = 3
| Word | Segment 1 | Segment 2 | Segment 3 | \(\mu\) | \(\sigma\) |
|---|---|---|---|---|---|
| I | 1 | 0 | 0 | 1/3 | 0.58 |
| love | 1 | 1 | 0 | 2/3 | 0.58 |
| you | 1 | 1 | 1 | 1 | 0 |
| we | 0 | 1 | 0 | 1/3 | 0.58 |
| she | 0 | 0 | 1 | 1/3 | 0.58 |
| loves | 0 | 0 | 1 | 1/3 | 0.58 |
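The counts in the table above can be reproduced with a short script. This is only a sketch: `segment_stats` is a hypothetical helper name, the tokenization (lowercase, periods stripped, whitespace split) is an assumption, and the sample standard deviation (n - 1 in the denominator) is used because it matches the 0.58 in the table.

```python
import statistics
from collections import Counter

def segment_stats(tokens, dt):
    """Per-word mean and sample std of counts over segments of length dt."""
    # Split the token sequence into consecutive segments of dt tokens.
    segments = [tokens[i:i + dt] for i in range(0, len(tokens), dt)]
    counts = [Counter(seg) for seg in segments]
    stats = {}
    for w in sorted(set(tokens)):
        per_segment = [c[w] for c in counts]
        mu = statistics.mean(per_segment)
        # Sample standard deviation (n - 1); needs at least two segments.
        sigma = statistics.stdev(per_segment)
        stats[w] = (mu, sigma)
    return stats

# Assumed tokenization: lowercase, strip periods, split on whitespace.
text = "I love you. We love you. She loves you."
tokens = text.lower().replace(".", "").split()
stats = segment_stats(tokens, 3)
for word, (mu, sigma) in stats.items():
    print(f"{word:6s} mu={mu:.2f} sigma={sigma:.2f}")
```

Because of the lowercasing, "I" appears as "i" in the output; the numbers are otherwise the same as in the table.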
Taylor's Law
Fact: for an i.i.d. process, \(\alpha = 0.5\).
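For reference, Taylor's law in its usual form is a power law between the standard deviation and the mean, with the constant \(c\) and exponent \(\alpha\) as its two parameters:

\[
\sigma \propto \mu^{\alpha}, \qquad \text{i.e.}\quad \sigma_{w} = c\,\mu_{w}^{\alpha}
\]

A sketch of why the i.i.d. case gives 0.5, assuming each word \(w\) occurs independently with probability \(p_w\) at every position (\(p_w\) is notation introduced here), so that its count in a segment of length \(\Delta t\) is binomial:

\[
\mu_w = \Delta t\, p_w, \qquad
\sigma_w^2 = \Delta t\, p_w (1 - p_w) \approx \mu_w \quad (p_w \ll 1)
\quad\Longrightarrow\quad \sigma_w \approx \mu_w^{0.5}
\]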
Find parameters
Fit \(\hat{c}\) and \(\hat{\alpha}\) with a linear function in log-log coordinates by the least-squares method.
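The fit can be sketched as ordinary least squares on \(\log \sigma = \log c + \alpha \log \mu\). This is a minimal illustration, not necessarily the paper's exact procedure: `fit_taylor` is a hypothetical helper name, base-10 logs are an arbitrary choice, and words with \(\mu = 0\) or \(\sigma = 0\) are dropped because their logarithm is undefined.

```python
import math

def fit_taylor(mu, sigma):
    """Least-squares fit of log10(sigma) = log10(c) + alpha * log10(mu)."""
    # Keep only points where both logs are defined.
    pts = [(math.log10(m), math.log10(s))
           for m, s in zip(mu, sigma) if m > 0 and s > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    alpha_hat = sxy / sxx                          # slope of the line
    c_hat = 10 ** (mean_y - alpha_hat * mean_x)    # intercept, mapped back from log space
    return c_hat, alpha_hat

# Synthetic check: points that follow sigma = 2 * mu**0.6 exactly.
mu = [0.1, 1.0, 10.0, 100.0]
sigma = [2.0 * m ** 0.6 for m in mu]
c_hat, alpha_hat = fit_taylor(mu, sigma)
print(c_hat, alpha_hat)
```

On real data, `mu` and `sigma` would hold one entry per word, as computed from the segment counts.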


Discovery
- Words with the greatest fluctuation were keywords, e.g., whale, captain, and sailor in Moby Dick.
- The Taylor exponent \(\alpha\) differs across kinds of data, e.g., written text vs. child-directed speech.
- \(\hat{\alpha}\) increases as \(\Delta t\) increases.


Discovery (Cont'd)
Taylor exponent:
training data > text generated by a SoTA word-level LM > 0.50
Hierarchical Neural Story Generation
Story-telling
- model long-range dependencies
- follow a high-level plot
Big Structure (Hierarchical)
- generate a premise (also called a prompt) → Gated Conv. Net
- Seq2seq from premise to story → ConvS2S

ConvS2S
Dataset
Reddit /r/WritingPrompts
One user writes a prompt; other users respond with a story.
A prompt can have multiple responses.
Model fusion
The ConvS2S model has access to a pretrained S2S model (which is weak at following the premise).
To improve on it, the ConvS2S model has to capture the relationship between premise and story (somewhat like boosting).
Speech Lab Paper Report 10/08
By qitar888