-
Gilgamesh as Data
-
ACH 2024
-
NewToolsOldDocs-Oct24
-
DiScho
-
Bento
-
AIBraries
-
Building a Fichero: New Tools, Old Documents, and Machine Learning Workflows with an Endangered Afro-Colombian Archive
This paper describes outcomes and challenges in human-scale document processing. We discuss a workflow that begins with document preservation, moves through text recognition, and ends with a catalogue that demonstrates the capabilities of LLMs for research and archival work, while remaining attuned to the vision of research partners. Until 2022, the Istmina Circuit Court archive, with documents from the 1870s to the 1930s, was rotting, disorganized, and stored in garbage bags. Yet this archive is a crucial source of Afro-Colombian history in an often-marginalized region of the Chocó in Colombia. In 2023, seven young people from Istmina and Quibdó worked with the Muntú Bantú Foundation, a community center focused on Afro-diasporic memory. Together with researchers from various universities, they digitized the archive, which is now available online at the British Library. The project was successful, and the digitization has in turn enabled new workflows to catalogue and interpret the archive. This paper explores these workflows. Throughout, we are interested in a key problem of equity in knowledge production: how can new tools be used to the benefit of local knowledge-producers? Our paper focuses on the work of cataloguing archival materials, a first step in enabling local researchers (and others) to make meaning. Project interns catalogued 330 case files and wrote a book of micro-history. Yet, t
-
Open Street Map + Accessibility
-
Dangers of Parrots
-
Introduction to Large Language Models
-
eScriptorium Helpers
-
Time Blended Embeddings
-
NewToolsOldDocs-Nov2
-
Projecktoj-futures
-
Documents
-
FMB
-
Collections as Data
-
Minimal Digital Editions
-
Mastodon?
This talk provides context on current events at Twitter, along with reasons to stay and reasons to leave, so that you and others can make informed decisions and know your options.
-
Finding Places in Text
-
Intro to NLP
-
GAM
-
Computer Vision and DH
-
deck
-
ANWR
-
deck
-
Annotation and Index Work
-
deck
-
Research Image Collections
-
Sawchen Lecture
-
deck
-
New Language for NLP
-
Digital Whiteboards
-
Cadet
Disrupting Digital Monolingualism workshop, June 16, 2020
-
deck
-
deck
-
deck
-
deck
-
deck
-
From Collections to Data: Turning Raw Text into Structured Research Data
This section discusses how the raw text and metadata transcribed by Prozhito are transformed into datasets that can be used in specific experiments.
-
deck
-
spaCy Universe
-
Survey of SlavicDH Projects
-
Exhibits
-
Tropy
-
Computer Vision
-
tensorflow
-
History ATS Presentation