The Digital Humanities and the History of European Integration

Frédéric Clavert (PhD / Paris-Sorbonne / LabEx EHNE)

http://histnum.hypotheses.org/ - @inactinique - frederic@clavert.net

Introduction

Why bother with the Digital Humanities?

  • Discreet digitisation of our practices...
  • ...that is introducing black boxes into our research...
  • ...which may undermine the results of our research.

The perfect black box...

If two researchers run the same query, they will obtain different results depending on several criteria, including their location, language, etc.

 

  • The Digital Humanities in context: the datafication of the World
  • What does "digital" mean for historians?
  • The example of the History of European integration

The Digital Humanities in context

The datafication of the World

Datafication?

«To datafy a phenomenon is to put it in a quantified format so it can be tabulated and analysed»

 

Mayer-Schönberger, Viktor, and Kenneth Cukier.
Big Data: A Revolution That Will Transform How We Live, Work, and Think.
Boston: Houghton Mifflin Harcourt, 2013, p. 72.

Datafication?

  • A long-term trend, prior to the digital era
    • The US Navy in the 19th century
  • Quantification of all kinds of elements, including those that do not appear useful at first sight (but can be re-used for other purposes)
  • Big Data vs Samples
  • Currently speeding up with digitisation (quantification) / computing (analysis) / networks (big data)

Datafication

  • Google Books as a digitisation project / Ngram as datafication of books (i.e. it transforms them into quantifiable data)
  • GPS: datafication of places
  • On-line social networks as datafication of social relations
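
A minimal sketch of what "datafying" a text can mean in practice, assuming a very simple tokenisation (lowercased word characters). It only illustrates the principle of turning running text into countable n-grams; it is not how Google's Ngram tool is implemented, and the function name `ngram_counts` and the sample sentence are hypothetical.

```python
import re
from collections import Counter


def ngram_counts(text: str, n: int = 2) -> Counter:
    """Count word n-grams in a text (hypothetical helper for illustration)."""
    # Naive tokenisation: lowercase, keep runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    # Slide n shifted copies of the token list against each other
    # to obtain every run of n consecutive tokens.
    return Counter(zip(*(tokens[i:] for i in range(n))))


# Placeholder sentence, not taken from any real corpus.
sample = "the treaty of Rome and the treaty of Paris"
counts = ngram_counts(sample, n=2)
print(counts.most_common(2))
```

Once a book is reduced to such counts it can be tabulated and compared with others, but the original reading order and context are no longer visible in the table.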

Examples

Datafication of primary sources

  • Everything can be datafied
  • Everything that is datafied is potentially a primary source
    • Social networks as primary sources about social relations, daily life, social phenomena...
  • Digitisation of "analog"/already existing primary sources.

Is the datafication of Humanities new?

  • The case of French historical sciences:
    • Furet / Daumard: dealing with large volumes of information (Annales, 1959)
    • Garelli: comparing different kinds of data to obtain new information (Annales, 1961)

Researcher facing a sea of data

Caspar David Friedrich,
Der Wanderer über dem Nebelmeer, 1817

Inflation of information
and of sources

Example taken from a paper by Dan Cohen
(Roy Rosenzweig CHNM / Digital Public Library of America)

  • Johnson Administration: several tens of thousands of archival documents. These can still be handled by a human reader.
  • Clinton Administration: several million e-mails. Writing the history of the Clinton administration will require the use of computing.

Digital Humanities

Digital humanities is an area of research and teaching at the intersection of computing and the disciplines of the humanities. Developing from the fields of humanities computing, humanistic computing, and digital humanities praxis, digital humanities embraces a variety of topics, from curating online collections to data mining large cultural data sets. Digital humanities (...) currently incorporates both digitized and born-digital materials and combines the methodologies from traditional humanities disciplines (...) and social sciences with tools provided by computing (such as data visualisation, information retrieval, data mining, statistics, text mining) and digital publishing.

 

Source: Wikipedia EN

The rise of "Big Data"

We define Big Data as a cultural, technological, and scholarly phenomenon that rests on the interplay of:
(1) Technology: maximizing computation power and algorithmic accuracy to gather, analyze, link, and compare large data sets.
(2) Analysis: drawing on large data sets to identify patterns in order to make economic, social, technical, and legal claims.
(3) Mythology: the widespread belief that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy.

danah boyd and K. Crawford, “CRITICAL QUESTIONS FOR BIG DATA: Provocations for a cultural, technological, and scholarly phenomenon,” Information, Communication & Society, vol. 15, no. 5, pp. 662–679, Jun. 2012.

What does "digital" mean for historians?

Does the archive
still taste the same?

Arlette Farge, Le goût de l’archive, Paris, Le Seuil, 1989

Arlette Farge describes the link between historians and their archives: something intimate, which involves the long work of copying documents.

 

In the archive itself, this link is no longer the same now that digital cameras, databases, and software such as Zotero are in use.

Finding primary sources

Criticizing the primary sources

The risk of an Illusionary Order

Milligan, Ian, "Illusionary Order: Online Databases, Optical Character Recognition, and Canadian History, 1997–2010", Canadian Historical Review, 1 December 2013, vol. 94, no. 4, pp. 540‑569.

  • Illusionary authority conferred by sheer volume
  • Decline of non-digitised primary sources
  • Decline of topics / places that are poor in data

 

The key problem of the digitisation process

The example of the Werner corpus and how bad text recognition leads to bad research.
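
A minimal sketch of how text recognition quality can be measured and why it matters for keyword search, assuming a manually transcribed reference is available for comparison. The character error rate (CER) below is a standard edit-distance measure; the strings are hypothetical and are not taken from the Werner corpus.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Edit distance divided by the length of the reference transcription."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, ocr_output) / len(reference)


# Hypothetical example: the letters "rn" misread as "m".
reference = "Werner report"
ocr_output = "Wemer report"
print(character_error_rate(reference, ocr_output))
print("Werner" in ocr_output)  # a full-text search for "Werner" misses this page
```

Even a low error rate can make a document invisible to a full-text search when the error falls on the very word being searched for.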

Reading Primary Sources

  • Franco Moretti, Graphs, Maps, Trees (Verso, 2007): how not to read primary sources
  • Distant reading / close reading
    • Example
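
One possible illustration of distant reading, as a minimal sketch: instead of reading each document closely, the relative frequency of a term is computed per year across a whole corpus. The function name `term_frequency_by_year`, the (year, text) input format and the tiny corpus are assumptions made for this example only.

```python
import re
from collections import defaultdict


def term_frequency_by_year(documents, term):
    """Relative frequency of a term per year.

    documents: iterable of (year, text) pairs (assumed input format).
    Returns {year: occurrences of term / total tokens in that year}.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    needle = term.lower()
    for year, text in documents:
        tokens = re.findall(r"\w+", text.lower())
        totals[year] += len(tokens)
        hits[year] += tokens.count(needle)
    return {year: hits[year] / totals[year]
            for year in sorted(totals) if totals[year]}


# Placeholder corpus, invented for the example.
corpus = [
    (1970, "monetary union plan"),
    (1970, "union"),
    (1992, "treaty on european union"),
]
print(term_frequency_by_year(corpus, "union"))
```

Such a curve shows where a close reading might be worth starting; it does not replace that reading, and it inherits every error of the underlying text recognition.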

Visualizing primary sources

The narration of history

  • The diversification of the narration of history
  • The new relationship between an academic publication and its primary sources
  • Example: http://www.cvce.eu/

The distant reading of historiography

One example: The notion of "Europeanisation".

The example of the history of European integration

Available digitised primary sources

  • CVCE - http://www.cvce.eu/
  • AEI - http://aei.pitt.edu/
  • And many more, including the Historical Archives of the European Union in Florence, the media library of the European Commission, etc.

The illusionary order of European integration history websites?

  • Based on Google Scholar
    • cvce.eu (formerly ena.lu) and AEI were taken up quickly. AEI went online in 2003 and ena.lu in 2005; the first citations in Scholar appeared in those same years.
    • Citations of ena.lu decline from 2011 onwards, because it was replaced by cvce.eu. Cvce.eu is not yet used as much as ena.lu was, but its use is increasing quickly.

The illusionary order of European integration history websites?

Some remarks on this use:

  • These websites give no information about the accuracy of their text recognition, nor about the software used. Details are occasionally given for individual documents.
  • The selection criteria for the documents to be digitised and published are not clear.
    • CVCE: the criteria are specified at the project level, not institution-wide, which is a problem if your research does not fit one of the CVCE's projects
    • AEI: somewhat clearer: official archives of European institutions and some grey literature. The collection of digitised documents is crowdsourced.

Quick Conclusion

  • Using those databases can only be a first approach, unless your research perfectly fits one of the institution's research projects
  • There was no time to compare the use of those digitised sources with the use of "analog" sources, so it is not possible to conclude whether using those two websites leads to biased research.

The illusionary order of European integration history websites?

Long term conclusion

 

  • The aim is not to discourage the use of those databases, but to encourage a proper use, methodologically integrated into our research.
  • The aim is to open a debate, in order to define some methodological aspects:
    • Which documents should be digitised in priority?
    • How to use them properly?

The illusionary order of European integration history websites?

Back to the Werner Corpus as a concrete example.
