Gistr
A web experiment for cultural evolution models
MINT, MPI für Menschheitsgeschichte
Sébastien Lerique / 29th May 2018
IN VIVO
Claidière et al. (2014)
Moussaïd et al. (2015)
IN VITRO
Adamic et al. (2016)
Online data
Transmission chains
Short-term cultural evolution
Mesoudi and Whiten (2004)
Leskovec et al. (2009)
Empirical bind
Realistic content
Control over data-generation
Computational analysis
Already coded
Do-it-by-hand
Simple setting
Danescu-Niculescu-Mizil et al. (2012)
Moussaïd et al. (2015)
Lauf et al. (2013)
Cornish et al. (2013)
Claidière et al. (2014)
Web experiments
Sequence alignments
Lerique & Roth (2017)
Requirements fulfilled
Control over experimental setting
Fast iterations
Scale
Exp. 1
MemeTracker, WikiSource, 12 Angry Men, Tales, News stories
Exp. 2
Memorable/non-memorable quote pairs
(Danescu-Niculescu-Mizil et al., 2012)
Exp. 3
Nouvelles en trois lignes
(Fénéon, 1906)
Web-based experiments
Experiment setup
reading and writing time \(\propto\) number of words
Transformation model
At Dover, the finale of the bailiffs' convention. Their duties, said a speaker, are "delicate, dangerous, and insufficiently compensated."
depth in branch
At Dover, the finale of the bailiffs convention,their duty said a speaker are delicate, dangerous and detailed
At Dover, at a Bailiffs convention. a speaker said that their duty was to patience, and determination
In Dover, at a Bailiffs convention, the speaker said that their duty was to patience.
In Dover, at a Bailiffs Convention, the speak said their duty was to patience
At Dover, the finale of the bailiffs' convention. Their duties, said a speaker, are "delicate, dangerous, and insufficiently compensated."
At Dover, the finale of the bailiffs convention,their duty said a speaker are delicate, dangerous and detailed
Sequence alignments
Needleman and Wunsch (1970)
AGAACT-
 | ||
-G-AC-G
AGAACT
GACG
Finding her son, Alvin, 69, hanged, Mrs Hunt, of Brighton, was so depressed she could not cut him down
Finding her son Arthur 69 hanged Mrs Brown from Brighton was so upset she could not cut him down
Finding her son Alvin 69 hanged Mrs Hunt of - - Brighton, was so depressed she could not cut him down
Finding her son Arthur 69 hanged Mrs - - Brown from Brighton was so upset she could not cut him down
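A minimal sketch of the Needleman and Wunsch (1970) global alignment, applied to letters and then to words. The function name, the `words` helper, and the costs (match +1, mismatch -1, linear gap -1) are illustrative assumptions, not the settings used in the experiments. With other costs the aligner may prefer gaps to mismatches, as in the Hunt/Brown example above.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1, pad="-"):
    """Globally align sequences a and b; return (row_a, row_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align two items
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    row_a, row_b, i, j = [], [], n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            row_a.append(a[i - 1]); row_b.append(pad); i -= 1
        else:
            row_a.append(pad); row_b.append(b[j - 1]); j -= 1
    return row_a[::-1], row_b[::-1], score[n][m]


def words(utterance):
    """Hypothetical tokenizer: split on whitespace, drop commas and periods."""
    return [w.strip(",.") for w in utterance.split()]


# Letter-level example from the slide
top, bottom, letter_score = needleman_wunsch("AGAACT", "GACG")
print("".join(top))
print("".join(bottom))

# Word-level example from the slide
parent = words("Finding her son, Alvin, 69, hanged, Mrs Hunt, of Brighton, "
               "was so depressed she could not cut him down")
child = words("Finding her son Arthur 69 hanged Mrs Brown from Brighton "
              "was so upset she could not cut him down")
row_parent, row_child, _ = needleman_wunsch(parent, child)
print(" ".join(row_parent))
print(" ".join(row_child))
```

The same function works on any pair of sequences whose items can be compared with `==`, which is what makes the step from nucleotides to words direct.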
Apply to utterances using NLP
At Dover, the finale of the bailiffs convention, their duty said a speaker are delicate, dangerous and detailed
At Dover, at a Bailiffs convention. a speaker said that their duty was to patience, and determination
At Dover the finale of the - - bailiffs convention - - - - their duty
At Dover - - - - at a Bailiffs convention a speaker said that their duty
said a speaker are delicate dangerous - - - and detailed -
- - - - - - was to patience and - determination
At Dover the finale of the - - bailiffs convention |-Exchange-1------| their duty
At Dover - - - - at a Bailiffs convention a speaker said that their duty
said a speaker are delicate dangerous - - - and detailed -
|-Exchange-1------------------------| was to patience and - determination
said a speaker are delicate dangerous |-E2----|
|E2| a speaker - - - said that
said -
said that
\(\hookrightarrow E_1\)
\(\hookrightarrow E_2\)
Extend to build recursive deep alignments
What does this afford us?
Detailed behaviours
[Figure: panels Deletion, Insertion, Replacement; axes: Frequency, \(|chunk|\), position in \(u\), \(|u|_w\)]
Number of operations vs. utterance length
Susceptibility vs. position in utterance
Deletions tend to gate other operations
Insertions relate to preceding deletions
Stubbersfield et al. (2015)
Bebbington et al. (2017)
Links the low-level with contrasted outcomes
Lexical evolution (1)
Step-wise
Susceptibility
Feature variation
Lexical evolution (2)
Along the branches
Realistic content
Control over data-generation
Computational analysis
Already coded
Do-it-by-hand
Simple setting
Empirical (un)bind
Quantitative analysis of changes
Structural changes from exchanges
Relating insertion and deletion chunks
Inner structure of transformations
Sequence alignments of semantic parses
Further work
In vivo applications to more complete data sets (social networks)
Sentence processing \(\leftrightarrow\) Higher level evolution
Feedback loops: utterance distribution \(\leftrightarrow\) detailed transformations
Long-lived chains with recurring changes
Semantic parsing and NLP methods on the inner structure
Connect to the constitution of meaning in interaction and context
Openings
Semantics
Thank you
Supervision
Organising
Jean-Pierre Nadal & Camille Roth
Olivier Morin
Questions
You!
Challenges with meaning
Can you think of anything else, Barbara, they might have told me about that party?
I've spoken to the other children who were there that day.
S
B
Abuser
The Devil's Advocate (1997)
?
Strong pragmatics (Scott-Phillips, 2017)
Access to context
Theory of the constitution of meaning
Challenges
Live experiment
Data quality
 | Exp. 1 | Exp. 2 | Exp. 3
#participants | 53 | 49 | 2 x 70
#root utterances | 54 | 50 | 25/batch
tree size | 48 | 49 | 70
Duration | 64 min | 43 min | 37 min/batch
Spam rate | 22.4% + 3.5% | 0.8% + 0.6% | 1% + 0.1%
Usable reformulations | 1980 | 2411 | 3506
- First large-scale launch
  - Bugs and customer service
  - Mistaken UI affordances
- Extensive rewrite
  - Automated tests
  - Pilots evaluating the UI
  - Pilots sampling root utterances
– “There is no hope for peace, it is a lost cause”
→ “There is no lose hope that rara ra to op”
– “My Government's overriding priority is to ensure the stability of the British economy”
→ “My governments overall liability is to sort out the... not sure.”
Alignment optimisation
\(\theta_{open}\)
\(\theta_{extend}\)
\(\theta_{mismatch}\)
\(\theta_{exchange}\) by hand
All transformations
Hand-coded training set size?
Train the \(\theta_*\) on hand-coded alignments
Simulate the training process: imagine we know the optimal \(\theta\)
1. Sample \(\theta^0 \in [-1, 0]^3\) to generate artificial alignments for all transformations
2. From those, sample \(n\) training alignments
3. Brute-force \(\hat{\theta}_1, \ldots, \hat{\theta}_m\) estimators of \(\theta^0\)
4. Evaluate the number of errors per transformation on the test set
Test set
10x
10x
\(\Longrightarrow\) 100-200 hand-coded alignments yield \(\leq\) 1 error/transformation
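The simulation steps above can be sketched as follows. This is a deliberately simplified illustration, not the procedure as run: it uses a linear gap cost (so \(\theta_{open}\) and \(\theta_{extend}\) collapse into one `gap` parameter), exact word matching with an assumed match score of +1, random artificial sequences, and a coarse grid. All names (`align`, `n_errors`, `simulate`) are hypothetical.

```python
import itertools
import random


def align(a, b, gap, mismatch, match=1.0):
    """Needleman-Wunsch with linear gaps; returns the list of aligned pairs."""
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Traceback with a fixed tie-breaking order: diagonal, then up, then left.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            pairs.append((a[i - 1], b[j - 1])); i -= 1; j -= 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            pairs.append((a[i - 1], None)); i -= 1
        else:
            pairs.append((None, b[j - 1])); j -= 1
    return pairs[::-1]


def n_errors(theta, pairs, reference):
    """Count transformations whose alignment under theta differs from the reference."""
    return sum(align(a, b, *theta) != ref for (a, b), ref in zip(pairs, reference))


def simulate(n_train, n_total=60, grid_steps=5, seed=0):
    rng = random.Random(seed)
    vocab = list("abcdef")
    # Artificial "transformations": random pairs of short token sequences.
    pairs = [([rng.choice(vocab) for _ in range(rng.randint(3, 8))],
              [rng.choice(vocab) for _ in range(rng.randint(3, 8))])
             for _ in range(n_total)]
    grid = [-k / (grid_steps - 1) for k in range(grid_steps)]  # values in [-1, 0]
    # Step 1: sample theta^0 and generate the reference alignments with it.
    theta0 = (rng.choice(grid), rng.choice(grid))
    reference = [align(a, b, *theta0) for a, b in pairs]
    # Step 2: the first n_train (randomly generated) pairs form the training set.
    train = (pairs[:n_train], reference[:n_train])
    test = (pairs[n_train:], reference[n_train:])
    # Step 3: brute force over the grid; keep every theta with minimal training error.
    scored = {t: n_errors(t, *train) for t in itertools.product(grid, repeat=2)}
    best = min(scored.values())
    estimators = [t for t, e in scored.items() if e == best]
    # Step 4: errors per transformation on the held-out test set.
    test_errors = [n_errors(t, *test) / len(test[0]) for t in estimators]
    return theta0, estimators, test_errors


theta0, estimators, test_errors = simulate(n_train=20)
print(theta0, len(estimators), max(test_errors))
```

Repeating `simulate` over several seeds and values of `n_train` gives the kind of curve from which a sufficient training set size can be read off.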
Gap open cost \(\rightarrow \theta_{open}\)
Gap extend cost \(\rightarrow \theta_{extend}\)
Item match-mismatch \(\rightarrow \theta_{mismatch}\)
Example data
Immediately after I become president I will confront this economic challenge head-on by taking all necessary steps
immediately after I become a president I will confront this economic challenge
Immediately after I become president, I will tackle this economic challenge head-on by taking all the necessary steps
This crisis did not develop overnight and it will not be solved overnight
the crisis did not developed overnight, and it will be not solved overnight
original: This crisis did not develop overnight and it will not be solved overnight
tokenize: This, crisis, did, not, develop, overnight, and, it, will, not, be, solved, overnight
lowercase & length > 2: this, crisis, did, not, develop, overnight, and, will, not, solved, overnight
stopwords: crisis, develop, overnight, solved, overnight
stem: crisi, develop, overnight, solv, overnight
The crisis didn't happen today won't be solved by midnight.
crisi, happen, today, solv, midnight
d = 0.6
Utterance-to-utterance distance
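The slide does not spell out the distance formula. One reading that reproduces the value shown above is a word-level Levenshtein distance between the two stem lists, normalised by the length of the longer list. The sketch below assumes that reading; `levenshtein` and `distance` are illustrative names, and the stem lists are copied from the slide rather than recomputed with a stemmer.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def distance(stems_a, stems_b):
    """Assumed normalisation: edit distance divided by the longer list's length."""
    longest = max(len(stems_a), len(stems_b))
    return levenshtein(stems_a, stems_b) / longest if longest else 0.0


stems_original = ["crisi", "develop", "overnight", "solv", "overnight"]
stems_other = ["crisi", "happen", "today", "solv", "midnight"]
print(distance(stems_original, stems_other))
```

Three of the five stems differ by substitution, so the normalised distance is 3/5.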
Aggregate trends
Size reduction
Transmissibility
Variability
Lexical evolution - POS
Step-wise
Susceptibility