Stylometry
or how we learned that J.K. Rowling was also Robert Galbraith
Daniil Skorinkin, German Palchikov
Yerevan, August 2023
![](https://s3.amazonaws.com/media-p.slid.es/uploads/641147/images/10674727/photo_2023-08-13_08.24.32.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/641147/images/10254257/transparent_up.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/641147/images/6673938/blok.png)
Let's be upfront
- Stylometry is not magic and not a silver bullet
- There are many cases where no amount of statistics will identify the author
- But there is a method that works reliably under certain conditions: not just on one particular set of authors or books, but consistently and in any language
- It has many uses beyond authorship questions: translation, author collaboration, genre style, stylochronology
What you'll learn
![](https://s3.amazonaws.com/media-p.slid.es/uploads/641147/images/10253532/2dim.png)
Measure stylometric distances
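As a rough illustration of what "measuring a stylometric distance" can mean, here is a minimal sketch of Burrows's Classic Delta, the measure named in the filename of the consensus-tree image in this deck ("Classic Delta", "100-300 MFWs"). It is not taken from the slides: the function names, the simple tokenizer, and the use of the population standard deviation are all assumptions made for the example.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def tokenize(text):
    # Lowercase and keep runs of letters only (works for any alphabetic script).
    return re.findall(r"[^\W\d_]+", text.lower())


def burrows_delta(texts, n_mfw=100):
    """texts: dict mapping a text name to its raw string.
    Returns a dict mapping (name_a, name_b) to the Delta distance."""
    names = list(texts)
    tokens = {name: tokenize(texts[name]) for name in names}

    # 1. Pick the n most frequent words (MFW) in the pooled corpus.
    pooled = Counter()
    for toks in tokens.values():
        pooled.update(toks)
    vocabulary = [word for word, _ in pooled.most_common(n_mfw)]

    # 2. Relative frequency of each MFW in each text.
    freqs = {}
    for name in names:
        counts = Counter(tokens[name])
        total = len(tokens[name])
        freqs[name] = [counts[word] / total for word in vocabulary]

    # 3. z-score each word across texts (assumption: population std. dev.).
    z = {name: [] for name in names}
    for i in range(len(vocabulary)):
        column = [freqs[name][i] for name in names]
        mu, sigma = mean(column), pstdev(column)
        for name in names:
            z[name].append((freqs[name][i] - mu) / sigma if sigma > 0 else 0.0)

    # 4. Delta = mean absolute difference between two texts' z-score vectors.
    return {
        (a, b): mean(abs(x - y) for x, y in zip(z[a], z[b]))
        for a in names
        for b in names
    }


# Toy corpus, far too small for real stylometry; it only shows the mechanics.
toy = {
    "a": "the cat and the dog and the bird " * 10,
    "b": "the cat and the dog and the fish " * 10,
    "c": "of war of peace of war in of love " * 10,
}
distances = burrows_delta(toy, n_mfw=20)
```

A smaller value means the two texts use the most frequent words in more similar proportions. Real experiments need texts of at least several thousand words each.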
Visualize stylometric distances
![](https://s3.amazonaws.com/media-p.slid.es/uploads/641147/images/4862881/galbraith_mds.png)
Dimensionality reduction methods (PCA, MDS, t-SNE, etc.)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/641147/images/10254282/daniilskorinkin_Consensus_100-300_MFWs_Culled_0__Classic_Delta_C_0.5__001.png)
Hierarchical dendrograms in the style of phylogenetic trees
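To show where such a tree comes from, here is a minimal sketch of agglomerative clustering over a table of pairwise distances. It uses average linkage, chosen only because it is short to write; the deck does not say which linkage its dendrograms use. The function name and the `dist` dictionary format are hypothetical.

```python
def average_linkage(names, dist):
    """names: list of text labels.
    dist: dict mapping (a, b) to a distance, given in both orders.
    Returns (tree, merges): a nested tuple and the list of (subtree, height)."""
    # Each active cluster is (subtree, list of member labels).
    active = [(name, [name]) for name in names]
    merges = []
    while len(active) > 1:
        best = None
        for i in range(len(active)):
            for j in range(i + 1, len(active)):
                left, right = active[i][1], active[j][1]
                # Average distance over all cross-cluster pairs.
                d = sum(dist[(a, b)] for a in left for b in right)
                d /= len(left) * len(right)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        subtree = (active[i][0], active[j][0])
        members = active[i][1] + active[j][1]
        merges.append((subtree, d))
        # Replace the two merged clusters with their union.
        active = [c for k, c in enumerate(active) if k not in (i, j)]
        active.append((subtree, members))
    return active[0][0], merges


def symmetric(pairs):
    # Helper: expand {(a, b): d} so that (b, a) is present too.
    full = dict(pairs)
    full.update({(b, a): d for (a, b), d in pairs.items()})
    return full


toy_dist = symmetric({("A", "B"): 1, ("A", "C"): 5, ("B", "C"): 4})
tree, merges = average_linkage(["A", "B", "C"], toy_dist)
```

Each entry in `merges` records which branches were joined and at what height, which is the information a dendrogram plot draws.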
Weighted graphs
(Weighted networks)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/641147/images/10254015/Screenshot_2023-02-27_at_07.16.33.png)
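One simple way to turn a distance table into a weighted network is to link every text to its few nearest neighbours and give closer neighbours heavier edges. The sketch below illustrates that idea only. It is not necessarily how the network in the image was built, and the function name, the `k` parameter, and the weighting scheme are assumptions.

```python
def nearest_neighbour_edges(names, dist, k=3):
    """names: list of text labels.
    dist: dict mapping (a, b) to a distance, given in both orders.
    Returns a dict mapping an undirected edge (a, b), a < b, to its weight."""
    edges = {}
    for a in names:
        # Rank all other texts by their distance from a.
        others = sorted((b for b in names if b != a), key=lambda b: dist[(a, b)])
        for rank, b in enumerate(others[:k]):
            edge = tuple(sorted((a, b)))
            # Nearest neighbour adds k, the next one k - 1, and so on.
            edges[edge] = edges.get(edge, 0) + (k - rank)
    return edges


# Toy distances between four texts (hypothetical values).
pairs = {
    ("A", "B"): 1.0, ("A", "C"): 2.0, ("A", "D"): 3.0,
    ("B", "C"): 1.5, ("B", "D"): 4.0, ("C", "D"): 5.0,
}
toy_dist = dict(pairs)
toy_dist.update({(b, a): d for (a, b), d in pairs.items()})
toy_edges = nearest_neighbour_edges(["A", "B", "C", "D"], toy_dist, k=2)
```

The resulting edge list can be loaded into a graph tool, where texts that repeatedly pick each other as neighbours end up connected by the heaviest edges.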
Apply it to different languages
![](https://s3.amazonaws.com/media-p.slid.es/uploads/641147/images/6769390/sholokhov_small_300.png)
Go beyond authorship questions
![](https://s3.amazonaws.com/media-p.slid.es/uploads/641147/images/4241191/Screen_Shot_2017-10-19_at_09.25.15.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/641147/images/7916537/rolling-svm_haggard_100.png)
Stylochronology
Collaboration
![](https://s3.amazonaws.com/media-p.slid.es/uploads/641147/images/4270231/Screen_Shot_2017-10-25_at_23.12.39.png)
Translation
Try it on generated texts: let's run stylometry on ChatGPT
![](https://s3.amazonaws.com/media-p.slid.es/uploads/641147/images/10674742/Screenshot_2023-08-13_at_08.51.42.png)
Learn how stylometry came to be
- 1851: A. De Morgan suggests mean word length as an authorship feature
- 1873: New Shakespeare Society (Furnivall, Fleay et al.)
- 1887: T. Mendenhall, "The Characteristic Curves of Composition", the first known work on quantitative authorship attribution
![](https://s3.amazonaws.com/media-p.slid.es/uploads/641147/images/3447460/De_falso_credita_et_ementita_Constantini_Donatione_declamatio__1_.jpeg)
Thank you for your attention
Stylometry Yerevan
By danilsko