From collection search to collections as data

Tim Sherratt・@wragge・#HOTA2019

seeing differently

search for 'radio' in Trove newspapers

comparing 'radio', 'telegraph', & 'wireless'

search for 'radio' showing place of publication

words from newspapers describing origins of immigrants

objects in the NMA, by year of production

number of items described in NAA by top-level function

number of items digitised in NAA by top-level function

Australian aviators in Trove newspapers

White Australia policy records in the NAA

@TroveNewsBot

what is GLAM data?

  • metadata (not content)
  • structured text / data
  • unstructured text
  • images
  • derived data
  • user-generated data
  • activity data
  • born digital data

some varieties of GLAM data

metadata

(data about collections)

seeing what we're not allowed to see

structured data

(think data with rows and columns)

GLAM CSV Explorer
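
A first look at a file of rows and columns often means asking, for each column, how many cells are filled in and how varied the values are. The sketch below is one minimal way to do that with Python's standard library. It is an illustration only, not how the GLAM CSV Explorer itself works; the function name `summarise_csv` and the sample data are made up for this example.

```python
import csv
import io
from collections import Counter


def summarise_csv(text):
    """Summarise each column of a CSV: filled cells, unique values, top value."""
    rows = list(csv.DictReader(io.StringIO(text)))
    summary = {}
    for column in (rows[0].keys() if rows else []):
        # Ignore empty cells so sparse columns stand out
        values = [row[column] for row in rows if row[column]]
        summary[column] = {
            "filled": len(values),
            "unique": len(set(values)),
            "most_common": Counter(values).most_common(1)[0][0] if values else None,
        }
    return summary


# Made-up sample data, purely for illustration
sample = "title,year,place\nA,1901,Sydney\nB,1901,\nC,1902,Sydney\n"
for column, stats in summarise_csv(sample).items():
    print(column, stats)
```

A column where every value is unique is probably an identifier or a title; a column with only a handful of distinct values is a good candidate for grouping and charting.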

unstructured text

(lots of words)

text from Trove journals

images

(of images and text)

3,471 Bulletin editorial cartoons

derived data

(data you extract from data)

redactions from ASIO files

user-generated data

(data added by the public)

where do you get GLAM data?

but if it's not already packaged for download, you can...

  • harvest data from APIs...
  • extract data from web pages by screen scraping...
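
The two routes mentioned here, harvesting from an API and screen scraping, might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the endpoint `https://example.org/api/records`, the `q`, `page`, and `size` parameters, and the `records` field are hypothetical, and a real GLAM API will have its own URL scheme, paging rules, and usually an API key.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration
API_URL = "https://example.org/api/records"


def build_query_url(query, page=1, page_size=100):
    """Build the URL for one page of search results."""
    params = {"q": query, "page": page, "size": page_size}
    return f"{API_URL}?{urlencode(params)}"


def harvest(query, fetch, page_size=100):
    """Collect records page by page until the API returns an empty page.

    `fetch` is any function that takes a URL and returns the response body
    as a JSON string, e.g. a thin wrapper around urllib.request.urlopen.
    """
    records = []
    page = 1
    while True:
        body = json.loads(fetch(build_query_url(query, page, page_size)))
        batch = body.get("records", [])  # "records" is an assumed field name
        if not batch:
            break
        records.extend(batch)
        page += 1
    return records


class LinkScraper(HTMLParser):
    """Screen scraping: pull (href, link text) pairs out of an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside a link that has an href
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape_links(html):
    """Return every (href, text) pair found in an HTML string."""
    scraper = LinkScraper()
    scraper.feed(html)
    return scraper.links
```

Passing the `fetch` function in as an argument keeps the paging logic separate from the network call, so the harvester can be tried out against canned responses before it is pointed at a live service. Scraping is the more fragile of the two routes, because it breaks whenever the page layout changes.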

🤯

it all seems too hard!

the GLAM Workbench is here to help!

Jupyter notebooks?

notebooks can be tools or hacks...

...or even simple apps

asking questions of data

what happened to radio in 1955?

exploring data

what can I do with the NMA API?

playing with data

what happens when you view interjections as tweets?

what becomes possible when I...?

one possible pathway...

  1. visualise searches in Trove newspapers
  2. find patterns, ask questions
  3. zoom in on points of interest
  4. harvest article text and metadata
  5. explore harvested data in detail
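
The first three steps of this pathway can be sketched in a few lines: turn raw search counts into a share of each year's articles, draw a rough chart, and flag the year with the biggest shift as a point of interest. This is a minimal illustration, not the GLAM Workbench's own code; the function names are hypothetical and the counts are invented, not real Trove figures.

```python
def proportions(matches_by_year, totals_by_year):
    """Share of each year's articles that match the search term."""
    return {
        year: matches_by_year[year] / totals_by_year[year]
        for year in sorted(matches_by_year)
        if totals_by_year.get(year)  # skip years with no total
    }


def biggest_change(series):
    """Return (year, change) for the largest year-on-year shift in the series."""
    years = sorted(series)
    changes = {
        later: series[later] - series[earlier]
        for earlier, later in zip(years, years[1:])
    }
    year = max(changes, key=lambda y: abs(changes[y]))
    return year, changes[year]


def text_chart(series, width=40):
    """Draw a rough horizontal bar chart so patterns are visible at a glance."""
    peak = max(series.values()) or 1  # avoid dividing by zero
    return "\n".join(
        f"{year} {'#' * round(width * value / peak)}"
        for year, value in sorted(series.items())
    )


# Invented counts, purely for illustration; not real Trove figures
matches = {1901: 120, 1902: 130, 1903: 60, 1904: 65}
totals = {1901: 1000, 1902: 1000, 1903: 1000, 1904: 1000}

shares = proportions(matches, totals)
print(text_chart(shares))
print(biggest_change(shares))
```

Working with proportions rather than raw counts matters because the number of digitised articles varies from year to year; a spike in matches may only reflect a spike in what has been digitised. The flagged year is where to zoom in and start harvesting article text and metadata.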

more resources...
