data science for the humanities
frédéric clavert / university of luxembourg / frederic.clavert@uni.lu / @inactinique
// from monetary history
// to european integration history and digitized archive
// to digital memory studies
The school of the Annales (France)
and its second generation of historians
the nature of the data we are dealing with
Boullier, 2016
seems to be a lot
it is not
historian facing an unmanageable ocean of data
Der Wanderer über dem Nebelmeer (C. D. Friedrich)
[...] whereas what we really need is a little pact with the devil: we know how to read texts, now let's learn how not to read them. Distant reading: where distance, let me repeat it, is a condition of knowledge: it allows you to focus on units that are much smaller or much larger than the text: devices, themes, tropes -- or genres and systems.
distant reading
with
an unbroken link to individual tweets
number of tweets per day in the #ww1 corpus (01.04.2014-01.12.2019)
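A minimal sketch of how such a daily count could be computed, assuming the corpus is available as a list of records that each carry an ISO 8601 `created_at` timestamp. The field names, the sample records, and the `tweets_per_day` helper are hypothetical and are not taken from the actual #ww1 pipeline.

```python
from collections import Counter
from datetime import datetime


def tweets_per_day(tweets):
    """Count tweets per calendar day.

    Assumes each record has an ISO 8601 `created_at` string
    (hypothetical field name, for illustration only).
    """
    counts = Counter()
    for tweet in tweets:
        day = datetime.fromisoformat(tweet["created_at"]).date()
        counts[day.isoformat()] += 1
    # Sort by date so the result can be plotted as a time series.
    return dict(sorted(counts.items()))


# Invented sample records, not real tweets from the corpus.
sample = [
    {"id": 1, "created_at": "2014-04-01T09:15:00"},
    {"id": 2, "created_at": "2014-04-01T18:40:00"},
    {"id": 3, "created_at": "2014-04-02T07:05:00"},
]

daily = tweets_per_day(sample)
print(daily)
```

Keeping the `id` on every record is what would preserve the link from an aggregated count back to the individual tweets.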
estimated language distribution (French / English)
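One way such an estimate could be produced, sketched under the assumption that every record already carries a language code in a `lang` field (from platform metadata or from a language detector). The field name, the sample data, and the `language_shares` helper are hypothetical; how the estimate on the slide was actually obtained is not stated.

```python
from collections import Counter


def language_shares(tweets, languages=("fr", "en")):
    """Return the share of each selected language among the tweets
    written in one of those languages.

    Assumes a `lang` field on each record (hypothetical, for illustration).
    """
    counts = Counter(tweet.get("lang") for tweet in tweets)
    total = sum(counts[code] for code in languages)
    if total == 0:
        return {code: 0.0 for code in languages}
    return {code: counts[code] / total for code in languages}


# Invented sample records, not real tweets from the corpus.
sample_langs = [
    {"id": 1, "lang": "en"},
    {"id": 2, "lang": "en"},
    {"id": 3, "lang": "fr"},
    {"id": 4, "lang": "en"},
    {"id": 5, "lang": "de"},  # ignored: neither French nor English
]

shares = language_shares(sample_langs)
print(shares)
```

Tweets in other languages are left out of the denominator here; that is a design choice for this sketch, and counting them would lower both shares.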
hierarchical descending classification
(Reinert 1983 & 1993: theory of lexical worlds)
15.03.2020
Source: IFPH.
« You are the primary source: COVID-19 Story-Collecting Initiatives »
https://www.google.com/maps/d/viewer?mid=1FMGFrGeIoxVNCxESEVkII9sPP5ZIC3Pb&usp=sharing
We are living in extraordinary times. Every South Australian is experiencing a truly global, history-making event, with both shared and unique perspectives. The History Trust of South Australia aims to document and collect objects that are connected to the experiences of people in our state during the pandemic—preserving the present for the future.
History Trust of South Australia (May 2020)
the question of the memory of the crisis was discussed as soon as the lockdown started
DNA, 17.07.2020
as a conclusion
La pensée sauvage (1962), Claude Lévi-Strauss
how to carry out your own research while tools, methods, and even primary sources (their form and their volume) are changing fast, and you cannot read and understand all the literature you should.
why twitter? the digitization / born-digital shadows
risks of dealing with twitter data / born-digital sources
what is a balanced corpus?
the illusionary order
Furet, François, and Adeline Daumard. 1959. « Méthodes de l’Histoire sociale: les Archives notariales et la Mécanographie ». Annales ESC 14 (4): 676‑693.
Prost, Antoine. 1974. Vocabulaire des proclamations électorales de 1881, 1885 et 1889. Paris: Presses universitaires de France.
Garelli, Paul, and Jean-Claude Gardin. 1961. « Étude par ordinateurs des établissements assyriens en Cappadoce ». Annales ESC 16 (5): 837‑876. Online: https://www.persee.fr/doc/ahess_0395-2649_1961_num_16_5_420758
Boullier, Dominique. 2016. « Big data challenges for the social sciences: from society and opinion to replications ». arXiv:1607.05034 [cs], July 2016. http://arxiv.org/abs/1607.05034