MARCing the Boundary: Reusing Special Collections Records in the Early Novels Dataset

Nabil Kashyap / @_nabilk

Lindsay Van Tine / @mlvantine

who is END?

From catalogers to senior faculty, undergraduates to national agencies, END represents a range of interests.

-Swarthmore English Department

-McCabe Library

-Penn Libraries, Kislak Center for Special Collections

-Council on Library and Information Resources

-Fales Library, New York University

-Price Lab for Digital Humanities

-Undergraduate researchers from Penn, Swarthmore, Haverford, Bryn Mawr, and Williams

what does END do?

-creates enhanced bibliographic metadata for early novels held at the University of Pennsylvania and other area repositories

-uses a custom MARC schema to build on existing library catalog records

-trains undergraduate catalogers to produce richly detailed metadata, both controlled and discursive

-makes a ~2000-record dataset available to researchers, in MARCXML as well as JSON and tabular formats
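One way to picture the move from MARCXML to JSON and tabular formats is the minimal sketch below. It is not END's actual conversion pipeline: the sample record is invented, the function names (record_to_dict, record_to_rows, to_csv) are hypothetical, and the only thing taken as given is the standard MARCXML namespace published by the Library of Congress.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Standard MARCXML ("MARC21 slim") namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented sample record for illustration; not taken from the END dataset.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">example-0001</controlfield>
  <datafield tag="245" ind1="1" ind2="4">
    <subfield code="a">The example novel :</subfield>
    <subfield code="b">a tale.</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="a">London :</subfield>
    <subfield code="c">1790.</subfield>
  </datafield>
</record>"""


def record_to_dict(xml_text):
    """Parse one MARCXML <record> into a plain dict that json.dumps can handle."""
    root = ET.fromstring(xml_text)
    record = {
        "leader": root.findtext("marc:leader", default="", namespaces=MARC_NS),
        "fields": [],
    }
    for cf in root.findall("marc:controlfield", MARC_NS):
        record["fields"].append({"tag": cf.get("tag"), "value": cf.text or ""})
    for df in root.findall("marc:datafield", MARC_NS):
        record["fields"].append({
            "tag": df.get("tag"),
            "ind1": df.get("ind1"),
            "ind2": df.get("ind2"),
            "subfields": [
                {"code": sf.get("code"), "value": sf.text or ""}
                for sf in df.findall("marc:subfield", MARC_NS)
            ],
        })
    return record


def record_to_rows(record):
    """Flatten a record dict into (tag, subfield code, value) tuples."""
    rows = []
    for field in record["fields"]:
        if "subfields" in field:
            for sf in field["subfields"]:
                rows.append((field["tag"], sf["code"], sf["value"]))
        else:
            # Control fields have no subfields, so the code column stays empty.
            rows.append((field["tag"], "", field["value"]))
    return rows


def to_csv(rows):
    """Serialize flattened rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["tag", "subfield", "value"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    rec = record_to_dict(SAMPLE)
    print(json.dumps(rec, indent=2))
    print(to_csv(record_to_rows(rec)))
```

The nested dict keeps the field/subfield hierarchy for JSON, while the flattened rows trade that hierarchy for something a spreadsheet can open; each format serves a different audience.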

what is END data?

-MARCXML base records from library OPACs

-supplemental fields, with edition- and copy-specific features of the physical books in Penn's collection

-~2000-record dataset in MARCXML, JSON, and tabular formats, and focused subsets

-digitized page scans & OCR'd full text

https://github.com/earlynovels
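Layering supplemental fields on top of a base record might look roughly like the sketch below. The tag 989, its subfield codes, and the helper add_local_field are hypothetical and do not reproduce END's custom schema; the sketch assumes only that MARC 21 sets aside 9XX tags for local use and that records use the standard MARCXML namespace.

```python
import xml.etree.ElementTree as ET

MARC_URI = "http://www.loc.gov/MARC21/slim"
# Serialize with MARCXML as the default namespace rather than an ns0: prefix.
ET.register_namespace("", MARC_URI)

# Invented base record standing in for one exported from a library OPAC.
BASE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">example-0001</controlfield>
  <datafield tag="245" ind1="1" ind2="4">
    <subfield code="a">The example novel :</subfield>
  </datafield>
</record>"""


def add_local_field(record, tag, subfields):
    """Append a local-use datafield to a parsed MARCXML record element.

    `subfields` is a list of (code, value) pairs. This sketch accepts only
    9XX tags, an assumption made here to keep additions clearly separate
    from the base catalog record.
    """
    if not (len(tag) == 3 and tag.isdigit() and tag.startswith("9")):
        raise ValueError(f"expected a 9XX local-use tag, got {tag!r}")
    field = ET.SubElement(
        record,
        f"{{{MARC_URI}}}datafield",
        {"tag": tag, "ind1": " ", "ind2": " "},
    )
    for code, value in subfields:
        sub = ET.SubElement(field, f"{{{MARC_URI}}}subfield", {"code": code})
        sub.text = value
    return field


if __name__ == "__main__":
    record = ET.fromstring(BASE)
    # Hypothetical copy-specific note; subfield codes are illustrative only.
    add_local_field(record, "989", [("a", "Bound in marbled boards"), ("5", "example copy")])
    print(ET.tostring(record, encoding="unicode"))
```

Because the base fields are left untouched, the enhanced record remains recognizable to any system that already reads the original catalog record.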

no such thing as a data dump

By Agência de Notícias do Acre [CC BY 2.0], via Wikimedia Commons

Boundary objects are objects which are both plastic enough to adapt to local needs ... yet robust enough to maintain a common identity ... They have different meanings in different social worlds but their structure is common enough to more than one world to make them recognizable, a means of translation.

Star and Griesemer (1989)

obligatory XML slide

how do heterogeneity and cooperation coexist, and with what consequences for managing information?

Star and Griesemer (1989)

Thank You!

Nabil Kashyap / @_nabilk

Lindsay Van Tine / @mlvantine

dh-2017

By Swarthmore Reference