Building a Fichero: New Tools, Old Documents, and Machine Learning Workflows with an Endangered Afro-Colombian Archive

— Andrew Paul Janco, Kelly López Roldán, Daniel Tubb

Key Ideas

artificial intelligence + social intelligence

Machine learning must support and expand on investments in people.

 

scale, don't replace

Machine learning should augment and scale human archival and research work.

situated knowledge

Build situated knowledge and empower people to avoid extractive practices.

2022

 photograph and digitize the archive

Semillero

 cataloging and interpretation phase

2023

members of the project team visited the Muntu Bantu Center in Quibdó

Fichero

a complementary approach to cataloging that uses a machine learning workflow

Semillero

Semillero

as data

Weasel: A small and easy workflow system

https://github.com/explosion/weasel

Fichero

prepare

transcribe

process

publish
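The four steps named above (prepare, transcribe, process, publish) can be pictured as an ordered chain in which each step hands its result to the next. Weasel describes such steps as named commands chained into a workflow in a project.yml file; the Python below is only a minimal sketch of that idea, not Fichero's actual code. The `Page` class, the step bodies, and the file name `page_001.jpg` are hypothetical placeholders, and no real text recognition model is called.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Page:
    """One photographed page moving through the workflow (hypothetical structure)."""
    image_path: str
    text: str = ""
    metadata: Dict[str, str] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


def prepare(page: Page) -> Page:
    # Placeholder: a real step might crop, deskew, or rename the image.
    page.log.append("prepare")
    return page


def transcribe(page: Page) -> Page:
    # Placeholder: a real step would call a text recognition model here.
    page.text = f"[transcription of {page.image_path}]"
    page.log.append("transcribe")
    return page


def process(page: Page) -> Page:
    # Placeholder: derive simple catalog metadata from the transcription.
    page.metadata["word_count"] = str(len(page.text.split()))
    page.log.append("process")
    return page


def publish(page: Page) -> Page:
    # Placeholder: a real step might write a catalog record or a web page.
    page.metadata["status"] = "published"
    page.log.append("publish")
    return page


# The order of this list is the workflow: each step receives the previous result.
WORKFLOW: List[Callable[[Page], Page]] = [prepare, transcribe, process, publish]


def run_workflow(page: Page, steps: List[Callable[[Page], Page]] = WORKFLOW) -> Page:
    for step in steps:
        page = step(page)
    return page


page = run_workflow(Page("page_001.jpg"))
print(page.log)
```

Keeping each step as a separate function means a person can rerun, inspect, or correct one stage (for example, the transcription) without repeating the others.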

Future Directions: LLMs for HTR post-correction

Future Directions: Ensemble of Kraken Models

Hand-Type Classifier

Text

Thank you!