Investigating the Evolution of Early Modern European Drama with Digital Methods
Luca Giovannini
(Potsdam)
Ringvorlesung Digital Humanities im Fokus: Methoden, Anwendungen und Perspektiven
Rostock, 28.04.2025
Today
- Theory
- Corpus
- Methods
- Experiments
- Conclusions
Theory
Research question
How did the different European dramatic literatures develop their own peculiar features during the early modern era?

Corneille, Cinna (1639)

Shirley, The Gentleman of Venice (1639)
A possible approach
Franco Moretti
"Modern European literature: A geographical sketch"
New Left Review
206 (1994), 86-86

Core argument
According to Moretti, the evolutionary mechanism of literary history is similar to the biological one:
- New animal and plant species arise as a consequence of their movement into new spaces 🦜
- New literary forms are born thanks to the new political-geographical spaces emerging throughout European history 📚

Speciationist model (Darwin's finches)
[Figure: timeline, 1400 to 1800]
According to Moretti, the same process of speciation happened in European tragedy.
[Figure: 🇫🇷 🇩🇪 🇮🇹 🇬🇧 🇪🇸; common model of European tragedy (based on Senecan and medieval heritage)]

Discontinuous, fractured, the European space functions as a sort of archipelago of (national) sub-spaces, each of them specializing in one formal variation.
(Moretti 2013 [1994]: 12)
Competing explanations
- Other scholars emphasise unity over diversity:
- Küpper (2018): early modern drama as a cultural net in which textual elements such as plots, characters, and motifs circulate and are periodically extracted, reworked and reused in individual plays
- Clubb (1990): diffusion of theatergrams
'Irregular' theatres 🇪🇸 🇬🇧
- influenced by medieval theatre practices
- emphasis on performance
- looser structure
'Regular' theatres 🇮🇹 🇫🇷
- associated with humanist theatre
- meant mostly to be read
- codified structure after Aristotle
Early modern drama aesthetics
According to Lotman's typology of culture:
irregular theatres → example-based culture
regular theatres → rule-based culture

How to empirically test the evolution of early modern drama?
Corpus
Corpus parameters
- 150 plays in five languages (🇮🇹 🇫🇷 🇪🇸 🇩🇪 🇬🇧)
- time span: 1561-1710 (150 years)
- purposefully non-canonical approach
- balance between representativeness and practicability

Birth locations for the corpus authors (via Wikidata, incomplete)
- DraCor (Drama Corpora) is an open access platform for research on dramatic literature.
- Currently: 21 “programmable corpora” in 16 different languages; more than 4000 texts in XML-TEI format.
- Applications and tools for CLS + easy API access (e.g. computing network metrics and speaker distribution, SPARQL searches on Linked Open Data, etc.)
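As an illustration of the API access mentioned above, here is a minimal sketch in Python. The base URL and the `/corpora/{corpus}/plays/{play}/metrics` route are assumptions based on the public DraCor API and should be checked against its current documentation; the corpus and play identifiers are hypothetical placeholders.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the public DraCor API (verify before use).
API_ROOT = "https://dracor.org/api/v1"


def metrics_url(corpus: str, play: str) -> str:
    """Build the (assumed) URL of the metrics endpoint for one play."""
    return f"{API_ROOT}/corpora/{quote(corpus)}/plays/{quote(play)}/metrics"


def fetch_json(url: str) -> dict:
    """Download and decode a JSON document (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def pick_features(payload: dict, keys: list) -> dict:
    """Keep only the requested keys; keys missing from the payload become None."""
    return {key: payload.get(key) for key in keys}


# Hypothetical identifiers, used for illustration only.
example_url = metrics_url("fre", "corneille-cinna")
# data = fetch_json(example_url)  # uncomment to query the live service
```

Pointing `API_ROOT` at a local Docker instance instead of the public server would let the same helper functions work on a custom corpus.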

A model for corpus construction: DraCor
DraCor Summit
Berlin, 01-05.09.2025
Tutorials, barcamp, round tables & keynotes, corpus conference, workshop on computational drama analysis, and much more!
⚠️ CfPs open until 06.05 ⚠️
👉 summit.dracor.org 👈

Text onboarding pipeline
Integrating the corpus within the DraCor environment
- Creating offline custom DraCor corpora from local TEI/XML files with Docker
- Enables use of all DraCor tools for CLS analysis
- Ensures reproducibility of results

Early Modern Drama Corpus (EmDraCor)


Homepage

Methods
Operationalising drama
- Identifying key components of drama:
- dialogue
- characters
- plot
- Surveying popular options for computational drama analysis (after Willand et al. 2017):
- topic modelling of dramatic genres
- exploration of network structures
- analysis of character speeches (stylometry).
- Here: a mix of quantitative formalist approaches
- content- and language-agnostic
- form-oriented
- Boris Yarkho
- Member of the Moscow Formalist Circle
- Methodology for a Precise Science of Literature (*2006)
- Solomon Marcus
- Mathematical Poetics (1970)
- Hartmut Ilsemann
- “Computerized Drama Analysis” (1995)
- DRAMANALYS.exe

Models and forerunners
"Measurements can be taken of any quantifiable aspect of a text, but figuring out the significance of that metric to an understanding of the text, or better, mapping that metric onto a preexisting critical concept (such as style, plot, or theme), is crucial to making sense of what is being measured" (Algee-Hewitt 2017: 759)
Individuating suitable metrics to be extracted from EmDraCor texts
Drama metrics
collected via the DraCor API or re-computed independently
(following Algee-Hewitt 2017, Szemes and Vida 2024, Trilcke et al. 2017)
Type | Features |
---|---|
Network | Size; Average clustering coefficient; Density; Average path length; Average degree; Diameter; Maximum degree; Number of edges; Number of connected components; Ratio, average degree to maximum degree; Ratio, maximum degree to number of characters; Weighted degree distribution; Protagonism; Mediateness |
Cast and speech | Average characters per scene; Average length of character speech; Speech intensity; Gendered speakers; Collective speakers |
Size | Number of acts; Number of segments; Word count, whole text; Word count, spoken text; Word count, stage directions; Number of prose lines; Number of verse lines |
Plot | All-in index; Final scene size; Drama change rate |
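To make some of the network features in the table concrete, the following sketch derives size, density, average degree and maximum degree from a co-presence network, in which two characters are linked if they speak in the same segment. The function names and the toy data are illustrative assumptions; the values actually used in the study come from the DraCor API or from independent recomputation.

```python
from itertools import combinations


def copresence_edges(segments):
    """Return undirected edges: two speakers are linked if they
    both speak in at least one segment."""
    edges = set()
    for speakers in segments:
        for a, b in combinations(sorted(set(speakers)), 2):
            edges.add((a, b))
    return edges


def network_metrics(segments):
    """Compute size, edge count, density, average and maximum degree."""
    nodes = {s for speakers in segments for s in speakers}
    edges = copresence_edges(segments)
    degree = {node: 0 for node in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n = len(nodes)
    possible = n * (n - 1) / 2  # number of edges in a complete graph
    return {
        "size": n,
        "num_edges": len(edges),
        "density": len(edges) / possible if possible else 0.0,
        "avg_degree": sum(degree.values()) / n if n else 0.0,
        "max_degree": max(degree.values(), default=0),
    }


# Toy play with three segments (invented data).
toy_segments = [["A", "B"], ["B", "C", "D"], ["A", "D"]]
toy_metrics = network_metrics(toy_segments)
```

In the toy play, five of the six possible character pairs share a segment, so the density is 5/6.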
Desiderata
- go beyond the comparison of single measures
- capture different structural dimensions of drama in a holistic fashion
Solution
- collect metrics in feature vectors or play embeddings, i.e., multidimensional vectors containing a variety of measures embodying different textual aspects of the dramatic text
example_play_1 = {10, 2, 0.5714, 3, 16792, ...}
example_play_2 = {26, 6, 0.3422, 2, 40098, ...}
Play | num_speakers | num_speakers_groups | density (SNA) | avg_degree (SNA) | word_count_sp | ... |
---|---|---|---|---|---|---|
Example_play_1 | 10 | 2 | 0.5714 | 3 | 16792 | ... |
Example_play_2 | 26 | 6 | 0.3422 | 2 | 40098 | ... |
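The two example plays can be turned into vectors with a few lines of Python. The sketch below uses only the five values shown on the slide. The z-score standardisation step is an assumption added here so that large-scale features such as word counts do not dominate later distance computations; the slides do not say whether or how features were scaled.

```python
from statistics import mean, pstdev

FEATURES = ["num_speakers", "num_speakers_groups", "density", "avg_degree", "word_count_sp"]

PLAYS = {
    "example_play_1": {"num_speakers": 10, "num_speakers_groups": 2,
                       "density": 0.5714, "avg_degree": 3, "word_count_sp": 16792},
    "example_play_2": {"num_speakers": 26, "num_speakers_groups": 6,
                       "density": 0.3422, "avg_degree": 2, "word_count_sp": 40098},
}


def to_vector(metrics, features=FEATURES):
    """Order the metrics of one play into a fixed-length list of floats."""
    return [float(metrics[name]) for name in features]


def standardise(vectors):
    """Z-score every column (assumed preprocessing, not taken from the slides).
    Columns with zero spread are divided by 1.0 to avoid a division by zero."""
    columns = list(zip(*vectors))
    means = [mean(col) for col in columns]
    spreads = [pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, spreads)] for row in vectors]


vectors = [to_vector(PLAYS[name]) for name in sorted(PLAYS)]
scaled = standardise(vectors)
```

With only two plays every standardised value is +1 or -1, which is why a real analysis needs the whole corpus to estimate meaningful means and spreads.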
How vectors work
Methodological reflection
- drama vectorisation as an expression of the ’cultural technique of flattening’ (Krämer 2023) typical of DH
- not as a reductionist attempt to oversimplify the complexity of theatrical texts
- rather as a continuation of formalist morphological thinking with modern computational methods
- interplay between operationalisation and flattening:

Experiments
Why vectorisation?
- offers a pathway to an empirical falsification of Moretti’s account of the progressive diversification of European early modern drama
- allows an investigation of “how dramatic forms change and are preserved within a framework of cultural evolution” (Szemes and Vida 2024: 18)
Ground hypothesis
- distances between vectors can be assumed to express the degree of (dis)similarity between them
- two plays whose vectors are far apart will also differ in terms of form
- the branching of dramatic traditions can be seen through the ’progressive distancing’ of the plays’ vectors
1. Distances
Options for measuring distances between vectors

- pairwise
- centroid-based
First implementation: pairs
1. Divide the 150 play embeddings into groups according to their normalised year (time spans of 30 and 15 years)
2. Within each group, calculate the Euclidean distance between each pair of plays
3. Take the average of these distances as a measure of 'formal difference' within each time frame
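The three steps above can be sketched as follows. The window boundaries (starting in 1561, 30 years wide) follow the corpus time span, but the helper names and the toy vectors are assumptions made for illustration, not the code used in the study.

```python
from itertools import combinations
from math import dist
from statistics import mean


def window_index(year, start=1561, width=30):
    """Map a year to the 0-based index of its time window."""
    return (year - start) // width


def mean_pairwise_distance(vectors):
    """Average Euclidean distance over all pairs; None for fewer than two vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return None
    return mean(dist(a, b) for a, b in pairs)


def distances_by_window(plays, start=1561, width=30):
    """plays: iterable of (year, vector). Returns {window index: mean distance}."""
    groups = {}
    for year, vector in plays:
        groups.setdefault(window_index(year, start, width), []).append(vector)
    return {w: mean_pairwise_distance(vs) for w, vs in sorted(groups.items())}


# Invented two-dimensional vectors, for illustration only.
toy_plays = [
    (1565, [0.0, 0.0]), (1580, [3.0, 4.0]),
    (1600, [0.0, 0.0]), (1610, [6.0, 8.0]), (1615, [0.0, 8.0]),
]
toy_result = distances_by_window(toy_plays)
```

Calling `distances_by_window(toy_plays, width=15)` would switch to the finer 15-year grouping; a rising mean distance across windows would then be read as growing formal difference.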

the lighter the colours, the more different groups of plays are (in terms of formal differences)
Second implementation: centroids
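The slide does not spell out the centroid-based variant. One possible reading, sketched here under that assumption, is to average the vectors of each sub-corpus into a centroid and then measure the distances between centroids. The group labels and the toy vectors are hypothetical.

```python
from itertools import combinations
from math import dist


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def centroid_distances(groups):
    """groups: {label: [vector, ...]}.
    Returns {(label_a, label_b): Euclidean distance between the two centroids}."""
    centres = {label: centroid(vs) for label, vs in groups.items()}
    return {
        (a, b): dist(centres[a], centres[b])
        for a, b in combinations(sorted(centres), 2)
    }


# Invented vectors for two hypothetical sub-corpora.
toy_groups = {
    "eng": [[0.0, 0.0], [2.0, 0.0]],
    "fre": [[4.0, 4.0], [4.0, 4.0]],
}
toy_centroid_distances = centroid_distances(toy_groups)
```

Repeating this per time window would show whether the centroids of the national traditions drift apart over time.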
2. Clusters
1. Representing play vectors as points on a Cartesian plane via dimensionality reduction methods (here: PCA)
2. Visually identifying clusters based on formal/structural similarities
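For readers unfamiliar with PCA, here is a dependency-free sketch of a two-dimensional projection using power iteration on the covariance matrix. It is meant only to show the idea: the implementation used for the slides is not stated, and in practice an established library would normally be preferred. All function names are assumptions.

```python
from math import sqrt


def centre(rows):
    """Subtract the column means from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of already centred rows (needs at least two rows)."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def mat_vec(matrix, vec):
    """Multiply a square matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def leading_eigenvector(matrix, iterations=500):
    """Power iteration: approximates the eigenvector with the largest eigenvalue.
    The uneven start vector lowers the (small) risk of starting orthogonal to it."""
    d = len(matrix)
    norm = sqrt(sum((i + 1) ** 2 for i in range(d)))
    vec = [(i + 1) / norm for i in range(d)]
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        length = sqrt(sum(v * v for v in nxt))
        if length == 0.0:
            break  # zero matrix: no direction carries any variance
        vec = [v / length for v in nxt]
    return vec


def pca_2d(rows):
    """Project rows onto their first two principal components."""
    data = centre(rows)
    cov = covariance(data)
    d = len(cov)
    pc1 = leading_eigenvector(cov)
    eigenvalue = sum(a * b for a, b in zip(pc1, mat_vec(cov, pc1)))
    # Deflation: remove the variance explained by pc1, then iterate again.
    deflated = [[cov[i][j] - eigenvalue * pc1[i] * pc1[j] for j in range(d)]
                for i in range(d)]
    pc2 = leading_eigenvector(deflated)
    return [[sum(a * b for a, b in zip(row, pc)) for pc in (pc1, pc2)]
            for row in data]


# Invented three-dimensional points whose variance lies along the first two axes.
toy_points = [[-2.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 1.0, 0.0]]
toy_coords = pca_2d(toy_points)
```

The resulting pairs can be plotted as x/y coordinates; the signs of the axes are arbitrary, so only relative positions carry meaning.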

Some clustering seems to emerge towards the end of the period, but the clusters are not yet sharply defined


Plays are plotted according to successive 30-year time frames
3. Patterns
Approach
- Focus on each individual metric and follow its evolution across the different sub-corpora
- Compute shifts in absolute value for each metric across the time span
- Verify whether the variation of a given feature has become highly distinctive of a national tradition against the others
- Caveat: not all metric variations are significant!
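A minimal sketch of the per-metric comparison described above. Splitting the time span at its midpoint (1636) and comparing mean values before and after it is an assumption made for illustration; the slides do not specify how shifts were computed. The field names `lang` and `year` and the toy records are likewise hypothetical.

```python
from statistics import mean


def metric_shift(plays, metric, split_year):
    """Mean of `metric` from `split_year` onwards minus its mean before that year.
    Returns None if either period contains no plays."""
    early = [p[metric] for p in plays if p["year"] < split_year]
    late = [p[metric] for p in plays if p["year"] >= split_year]
    if not early or not late:
        return None
    return mean(late) - mean(early)


def shifts_by_tradition(plays, metric, split_year=1636):
    """Group plays by their 'lang' field and compute the shift for each group."""
    groups = {}
    for play in plays:
        groups.setdefault(play["lang"], []).append(play)
    return {lang: metric_shift(ps, metric, split_year)
            for lang, ps in sorted(groups.items())}


# Invented records, for illustration only.
toy_records = [
    {"lang": "eng", "year": 1590, "density": 0.5},
    {"lang": "eng", "year": 1690, "density": 0.75},
    {"lang": "fre", "year": 1600, "density": 0.5},
    {"lang": "fre", "year": 1680, "density": 0.25},
]
toy_shifts = shifts_by_tradition(toy_records, "density")
```

The sign gives the direction of the trend and `abs()` its magnitude. Whether a shift is significant, as the caveat above notes, would need a separate statistical test that this sketch does not include.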

First steps towards a quantitative profiling of dramatic traditions
- 🇬🇧 : enhanced network connectedness, dispersion of the protagonist role (cf. Algee-Hewitt 2017), increase in female cast, progressive implementation of French liaison des scènes
- 🇩🇪 : shift towards sparser network, increase in segmentation (Neoclassical influence)
- 🇪🇸 : expanding and more intricately connected dramatic networks, absence of central mediating figures, increase in non-gendered speakers, decreasing drama change rate
- 🇮🇹 : progressive concentration of the protagonist role in one or a few characters, increase in stage directions (CdA)
- 🇫🇷 : streamlined network models, more stability in stage configurations
Evolutionary trends within the subcorpora
A reproduction experiment
- Concerns about reliability of results → attempt at reproduction (same methods, similar but not identical data, cf. Schöch 2023)
- Data: EngDraCor (source: Early Print Project) and FreDraCor (source: Théâtre Classique)
- Method: vectorisation and distance experiments
- Goal: compare results to those obtained on the corresponding EmDraCor subsets (EmDraCor-eng and EmDraCor-fre)
Results
- Trends correspondence: 🇬🇧: 62%, 🇫🇷: 30%
- due to small-sample bias, different markup quality
- divergence is however limited (±5%)
- Generally, trends recognised in the smaller EmDraCor sub-corpora are amplified versions of patterns found in the broader ones (EngDraCor/FreDraCor)
- Results from EmDraCor are not fully replicated, but its trends can still highlight tendencies in the formal development of European theatre
- Just like in EmDraCor, compressing plays into meaningful vectors of formal features shows relevant differences between the larger EngDraCor and FreDraCor.
- Even a simple PCA underscores the ’regularity’ of French theatre as against the ’irregularity’ of the English one.
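How the trend correspondence percentages reported above were calculated is not stated on the slides. One plausible reading, sketched here as an assumption, is the share of metrics whose direction of change (increase, decrease, no change) agrees between the small and the large corpus. The metric names and values are invented.

```python
def trend_sign(shift, tolerance=0.0):
    """Return +1, -1 or 0 for an increasing, decreasing or flat metric."""
    if shift > tolerance:
        return 1
    if shift < -tolerance:
        return -1
    return 0


def trend_correspondence(shifts_a, shifts_b):
    """Share of shared metrics whose trend direction agrees in both corpora.
    Returns None if the two corpora have no metric in common."""
    shared = sorted(set(shifts_a) & set(shifts_b))
    if not shared:
        return None
    matches = sum(trend_sign(shifts_a[m]) == trend_sign(shifts_b[m]) for m in shared)
    return matches / len(shared)


# Invented shifts for a small and a large corpus.
toy_small = {"density": 0.2, "avg_degree": -1.0, "num_acts": 0.5, "size": 3.0}
toy_large = {"density": 0.1, "avg_degree": 0.4, "num_acts": 0.1, "size": -2.0}
toy_share = trend_correspondence(toy_small, toy_large)
```

In the toy data two of the four metrics move in the same direction, giving a correspondence of 0.5.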

Conclusions
- Findings partially support Moretti’s thesis of increasing formal diversification in early modern drama.
- Vector distance analysis is inconclusive, but PCA projections suggest a branching of dramatic traditions, though clusters are not sharply defined.
- Metric distribution analysis provided the most valuable insights into dramatic evolution: trends in plot structures, character features, and network arrangements indicate the emergence of distinct dramatic traditions.
- The reproduction experiment on English and French DraCor supports the validity of EmDraCor results.
Results
- theoretical framework
- other models of cultural evolution beyond speciation might be more suitable (cf. Sobchuk 2023)
- corpus specifics (size, languages, etc.)
- operationalisation of the concept of drama
- choice of metrics employed
Limitations
- construction of a multilingual, open-access, machine-actionable corpus of 150 TEI/XML-encoded plays
- empirical reassessment of a previous theory via quantitative methods ('triangulation')
- further development of a key methodology (vectorisation of text based on formal features)
Contributions
Thanks!
giovannini@uni-potsdam.de
@lucagiovannini.bsky.social

Bibliography
- Algee-Hewitt, M. (2017). "Distributed character: Quantitative models of the English stage, 1550–1900". New Literary History, 48(4), 751-782.
- Allison, S.; Heuser, R.; Jockers, M.; Moretti, F.; Witmore, M. (2011). "Quantitative Formalism: An Experiment". Stanford Literary Lab Pamphlet 1.
- Clubb, L. G. (1990). Italian Drama in Shakespeare’s Time. New Haven: Yale UP.
- Fischer, F.; Börner, I.; Göbel, M.; Hechtl, A.; Kittel, C.; Milling, C.; Trilcke, P. (2019). "Programmable Corpora: Introducing DraCor, an Infrastructure for the Research on European Drama". In: DH2019 Book of Abstracts. University of Utrecht, 2019.
- Herrmann, J. B.; Bories, A.-S.; Frontini, F.; Jacquot, C.; Pielström, S.; Rebora, S.; Rockwell, G.; Sinclair, S. (2023). "Tool Criticism in Practice: On Methods, Tools, and Aims of Computational Literary Studies". Digital Humanities Quarterly 17, 2.
- Jannidis, F. (2022). "Digitale Literaturwissenschaft. Zur Einführung". In: Jannidis, F. (ed.), Digitale Literaturwissenschaft. Stuttgart: J. B. Metzler, pp. 1-16.
- Krämer, S. (2023). "The Cultural Technique of Flattening". Metode 1.
- Küpper, J. (2018). The Cultural Net: Early Modern Drama as a Paradigm. Berlin; Boston: De Gruyter.
- Lvoff, B. (2021). "Distant Reading in Russian Formalism and Russian Formalism in Distant Reading". Russian Literature, 122, 29-65.
- Moretti, F. (2013). Distant reading. London: Verso.
- Schöch, C. (2023). "Repetitive research: a conceptual space and terminology of replication, reproduction, revision, reanalysis, reinvestigation and reuse in digital humanities". International Journal of Digital Humanities, 5 (2): 373–403.
- Sobchuk, O. (2023). "Evolution of Modern Literature and Film". In: J. Tehrani, J. Kendal, and R. Kendal, eds., The Oxford Handbook of Cultural Evolution (online).
- Szemes, B.; Vida, B. (2024). "Tragic and Comical Networks: Clustering Dramatic Genres According to Structural Properties". In: M. Andresen and N. Reiter, eds., Computational Drama Analysis: Reflecting on Methods and Interpretations. Berlin; Boston: De Gruyter.
- Willand, M.; Trilcke, P.; Schöch, C.; Rißler-Pipka, N.; Reiter, N.; Fischer, F. (2017). "Aktuelle Herausforderungen der Digitalen Dramenanalyse". In: DHd 2017 Book of Abstracts.