Humanities Research Part II: Secondary Sources

Digital Abundance?

Digital abundance brings opportunity (imagine being able to search for keywords across millions of documents, radically cutting search times) but also challenge, as the number of electronic documents increases exponentially.

Identify the Problem

Topic Modeling

Optical Character Recognition (OCR)

https://pro.europeana.eu/page/issue-13-ocr

Keyword Searching

OCR/Keyword Research

What do we gain?

  • Greater access to (some) resources
  • New ways of searching across documents
  • Can find information more quickly
  • Can ask new questions that you couldn't explore before

What are some challenges?

  • False positives / false negatives
  • Accuracy of content
  • Often takes longer to browse/skim
  • Lose context that comes from browsing sources/shelves of books
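
To make the false positive / false negative challenge concrete, here is a minimal sketch in Python. The three text snippets are invented for illustration, with one deliberate OCR-style misreading ("Linco1n"), and `exact_search` is a hypothetical helper, not the search function of any real database.

```python
# Invented sample "pages"; the second one simulates an OCR error
# in which the letter "l" was misread as the digit "1".
pages = [
    "President Lincoln spoke at Gettysburg.",
    "President Linco1n signed the proclamation.",
    "The Lincolnshire farmers met in March.",
]

def exact_search(keyword, documents):
    """Return the indexes of documents that contain the keyword as a substring."""
    keyword = keyword.lower()
    return [i for i, text in enumerate(documents) if keyword in text.lower()]

hits = exact_search("Lincoln", pages)
print(hits)
```

In this toy example the search returns pages 0 and 2. Page 1 is a false negative (the OCR error hides a relevant document), and page 2 is a false positive ("Lincolnshire" matches but is not about the president). Real collections will behave differently, but the two failure modes are the same.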

Best Practices/Solutions

  1. Always skim/browse documents for event-based history (convenient and comprehensive)
     
  2. Do multiple searches for important information (singular/plural, abbreviated forms, alternate spellings, etc.)
     
  3. Be upfront about searches and how you found/didn't find information
     
  4. Have humanities scholars work with database managers and/or crowdsource OCR transcriptions
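
Practices 2 and 3 above can be sketched in a few lines of Python: run the same search for several variants of a term, and keep the per-variant counts as a record of what was and was not found. The sample documents, the variant list, and the `search_variants` helper are all assumptions made for illustration; a real project would search its own database or corpus.

```python
import re

# Invented sample documents, standing in for a searchable collection.
docs = [
    "The labour union called a strike in 1919.",
    "Labor unions organized across the state.",
    "The AFL endorsed the walkout.",
]

# Singular/plural, alternate spelling, and an abbreviated form of the same topic.
variants = ["labor union", "labor unions", "labour union", "labour unions", "AFL"]

def search_variants(documents, terms):
    """Count, for each term, how many documents contain it as a whole word or phrase."""
    counts = {}
    for term in terms:
        # The lookarounds stop "labor union" from matching inside "labor unions".
        pattern = re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE)
        counts[term] = sum(1 for text in documents if pattern.search(text))
    return counts

search_log = search_variants(docs, variants)
for term, count in search_log.items():
    print(f"{term!r}: {count} document(s)")
```

Here a search for "labor union" alone finds nothing, while the variants together reach all three documents. Keeping `search_log`, including the zero-hit terms, is one simple way to be upfront about how information was or was not found.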

Extra Credit
