the #ww1 and #covid19fr projects
data science for the humanities
frédéric clavert / university of luxembourg / firstname.lastname@example.org / @inactinique
who am I?
// from monetary history
// to european integration history and digitized archive
// to digital memory studies
data science: nothing new?
history and quantitative methods
The Annales school (France)
and its second generation of historians
- Furet & Daumard, 1959
- Garelli & Gardin, 1961
- Prost, 1974 (lexicography)
then, what's new?
the nature of the data we are dealing with
- artefacts of big data platforms
the centenary of the first world war on twitter
- 1st april 2014 - 1st december 2019
- 9 million+ tweets collected
- +/- 1.5 million users
- 2/3 of retweets / 1/3 of original tweets
seems to be a lot
it is not
historian facing an unmanageable ocean of data
Der Wanderer über dem Nebelmeer (C. D. Friedrich)
how to read
9 million tweets?
the machine reads it for you
[...] whereas what we really need is a little pact with the devil: we know how to read texts, now let's learn how not to read them. Distant reading: where distance, let me repeat it, is a condition of knowledge: it allows you to focus on units that are much smaller or much larger than the text: devices, themes, tropes -- or genres and systems.
an unbroken link to individual tweets
distant reading in practice
number of tweets per day in the #ww1 corpus (01.04.2014-01.12.2019)
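A daily count of this kind can be sketched in a few lines. This is only an illustration, not the pipeline used for the #ww1 corpus: the record layout (a `created_at` field holding an ISO 8601 timestamp) and the function name `tweets_per_day` are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical sample records; a real corpus would be read from the harvested files.
tweets = [
    {"id": 1, "created_at": "2014-04-01T08:15:00+00:00"},
    {"id": 2, "created_at": "2014-04-01T21:40:00+00:00"},
    {"id": 3, "created_at": "2014-04-02T10:05:00+00:00"},
]


def tweets_per_day(records):
    """Count tweets per calendar day (UTC), keyed by ISO date string."""
    counts = Counter()
    for record in records:
        # Assumed field name and format: ISO 8601 timestamp in "created_at".
        timestamp = datetime.fromisoformat(record["created_at"])
        day = timestamp.astimezone(timezone.utc).date().isoformat()
        counts[day] += 1
    # Sort by date so the result can be plotted as a time series.
    return dict(sorted(counts.items()))


daily = tweets_per_day(tweets)
print(daily)
```

Keeping the tweet identifiers alongside the counts would preserve the link from each daily figure back to the individual tweets.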
estimated language distribution (french / english)
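One simple way to obtain such an estimate is to count a per-tweet language code. The sketch below assumes each record carries a `lang` field (as tweet metadata usually does); the helper name `language_shares` and the sample data are hypothetical, and this is not necessarily how the estimate shown here was produced.

```python
from collections import Counter

# Hypothetical sample; "lang" is assumed to hold a language code per tweet.
tweets = [{"lang": "fr"}, {"lang": "en"}, {"lang": "fr"}, {"lang": "de"}]


def language_shares(records, languages=("fr", "en")):
    """Return the share of tweets per selected language, plus an 'other' bucket."""
    # Tweets without a language code are counted as "und" (undetermined).
    counts = Counter(record.get("lang", "und") for record in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    shares = {lang: counts[lang] / total for lang in languages}
    # Everything not in the selected languages goes into "other".
    selected = sum(counts[lang] for lang in languages)
    shares["other"] = (total - selected) / total
    return shares


shares = language_shares(tweets)
print(shares)
```

Such metadata is itself an automatic guess, which is why the figure is only an estimate.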
hierarchical descending classification
(Reinert 1983 & 1993: théorie des mondes lexicaux)
historical research and social media in times of pandemic
« You are the primary source: COVID-19 Story-Collecting Initiatives »
We are living in extraordinary times. Every South Australian is experiencing a truly global, history-making event, with both shared and unique perspectives. The History Trust of South Australia aims to document and collect objects that are connected to the experiences of people in our state during the pandemic—preserving the present for the future.
History Trust of South Australia (May 2020)
harvesting primary sources in a world of data (in crisis)
Harvesting tweets about the pandemic
- Only French hashtags (because of the 1% limit of the Twitter streaming API)
- Re-use of the #ww1 savoir-faire => fast answer
- Re-use of the #ww1 server
- collecting tweets as long as possible
- observing the memorialization of the crisis
- memory in the making
- with d. paci (ca' foscari)
the question of the memory of the crisis was discussed as soon as the lockdown started
temporality of crisis /
temporality of history
- is my research an example of presentism? Une vitrine de saison ou la mise en mémoire du Covid-19 (Philippe Mesnard, AOC)
- Didier Fassin : « Avec le coronavirus, notre vision du monde s’est rétrécie comme jamais »
as a conclusion
digital history as bricolage
La pensée sauvage (1962), Claude Lévi-Strauss
- intellectual bricolage: concrete thinking that allows social organisation and collective rebalancing, whereas scientific thinking can lead to the destabilization of a social order
- digital bricolage is hence here understood as an (academic) answer to technological disruption
how to carry out your own research while tools, methods, and even primary sources (their form and their volume) are changing fast, and you cannot read and understand all the literature you should?
digital history pitfalls
why twitter? the digitization / born digital shadows
risks of dealing with twitter data / born digital sources
what is a balanced corpus?
the illusionary order
François Furet and Adeline Daumard, « Méthodes de l’Histoire sociale: les Archives notariales et la Mécanographie », Annales ESC 14 (4), 1959, pp. 676‑693.
Antoine Prost, Vocabulaire des proclamations électorales de 1881, 1885 et 1889, Paris, Presses universitaires de France, 1974.
Paul Garelli and Jean-Claude Gardin, « Étude par ordinateurs des établissements assyriens en Cappadoce », Annales ESC 16 (5), 1961, pp. 837‑876. Online: <https://www.persee.fr/doc/ahess_0395-2649_1961_num_16_5_420758>.
Dominique Boullier, « Big data challenges for the social sciences: from society and opinion to replications », arXiv:1607.05034 [cs], July 2016. Online: <http://arxiv.org/abs/1607.05034>.