
Collections & Contexts

Tim Sherratt · @wragge

Play along!

We're all digital historians now...

NAA WWI service records

WEST Ernest Robert : Service Number - 5091 : Place of Birth - Benambra VIC : Place of Enlistment - Sale VIC : Next of Kin - (Mother) WEST Mrs F M

Name: WEST Ernest Robert
Service Number: 5091
Place of Birth: Benambra VIC
Place of Enlistment: Sale VIC
Next of Kin: (Mother) WEST Mrs F M

structured data!
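The item title above packs several fields into one delimited string. As a minimal sketch, assuming every title follows the same ' : ' and ' - ' pattern as this example (real titles may not), it can be split into structured data like this. The function name `parse_item_title` is made up for illustration.

```python
def parse_item_title(title: str) -> dict[str, str]:
    """Split a ' : '-delimited item title into a name plus labelled fields."""
    parts = [part.strip() for part in title.split(" : ")]
    # Assumption: the first segment is always the person's name.
    record = {"Name": parts[0]}
    for part in parts[1:]:
        # Each remaining segment is assumed to look like 'Label - value'.
        label, _, value = part.partition(" - ")
        record[label.strip()] = value.strip()
    return record


title = (
    "WEST Ernest Robert : Service Number - 5091 : "
    "Place of Birth - Benambra VIC : Place of Enlistment - Sale VIC : "
    "Next of Kin - (Mother) WEST Mrs F M"
)
record = parse_item_title(title)
for label, value in record.items():
    print(f"{label}: {value}")
```

Titles that break the pattern (a missing field, or a ' - ' inside a value) would need extra handling.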

The online interface

The big picture

But is this the whole picture?

235,466,138 results!



What are we searching?

New perspectives with data



Putting the history into context using digital collections.


Putting digital collections into context by exploring their history.

digital skills modules


Putting the history into contexts using digital collections.


Putting digital collections into contexts by exploring their history.

recontextualising collections

See: Sherratt & Bagnall, 'The people inside' in Seeing the Past with Computers

The Real Face of White Australia

Data repository

What's missing?

  • Search engines lie

  • Relevance is overrated

  • Access is never open

  • Websites are never published

critical approaches to collections

Search engines lie



Trove has a history

Simple or Advanced?

Query            Results      Explanation
divorce          1,148,424    Searches full text
title:divorce    214,800      Searches headlines only

Fields or fulltext

Query                        Results       Explanation
white OR australia           47,406,097
white australia              5,617,345     Same as white AND australia
"white australia"            155,908       Search for phrase (with stemming)
text:"white australia"       151,707       Search for phrase (no stemming)
"white australia"~0          149,762       Search for phrase (no extra words)
text:"white australia"~0     146,302       Search for phrase (no extra words and no stemming)

De-fuzzify phrases
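To try the same comparison with another phrase, the six query forms can be generated from the phrase itself. This is a rough sketch: the query syntax comes from the examples on these slides, while the function name and the dictionary labels are invented for illustration.

```python
def phrase_query_variants(phrase: str) -> dict[str, str]:
    """Build progressively stricter search queries for a phrase."""
    words = phrase.split()
    quoted = f'"{phrase}"'
    return {
        "any word": " OR ".join(words),
        "all words": " ".join(words),
        "phrase, with stemming": quoted,
        "phrase, no stemming": f"text:{quoted}",
        "phrase, no extra words": f"{quoted}~0",
        "phrase, no extra words, no stemming": f"text:{quoted}~0",
    }


variants = phrase_query_variants("white australia")
for label, query in variants.items():
    print(f"{label}: {query}")
```

Pasting each generated query into the newspaper search box and noting the result counts should show how much each restriction narrows the results.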

Query                  Results       Explanation
naturalisation         239,057       Stemmed to 'naturalis'
naturalization         14,853,488    Stemmed to 'natur'
text:naturalisation    126,539       No stemming
text:naturalization    23,677        No stemming

Fun with stemming
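One way to read the stemming numbers is to ask what share of the default (stemmed) results actually contain the spelling that was typed. The short calculation below uses only the counts from the slide; the variable and function names are illustrative.

```python
# Result counts copied from the slide: default (stemmed) search
# versus the text: index (no stemming).
counts = {
    "naturalisation": {"stemmed": 239_057, "exact": 126_539},
    "naturalization": {"stemmed": 14_853_488, "exact": 23_677},
}


def exact_share(stemmed: int, exact: int) -> float:
    """Fraction of stemmed results that match the exact spelling."""
    return exact / stemmed


shares = {term: exact_share(**c) for term, c in counts.items()}
for term, share in shares.items():
    print(f"{term}: {share:.2%} of default results match the exact spelling")
```

About half of the results for 'naturalisation' contain that exact spelling, but only a tiny fraction of those for 'naturalization' do, because the shorter stem 'natur' matches far more words.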


OCR quality

5.9% of articles

> 70% of family notices have corrections

What gets corrected?

Non-English language content

52 newspapers

340km of records

What's in RecordSearch?

47% of series have no item descriptions

that's 50km of records

the file titles of the 37% of records that have been described at item level

Relevance is overrated

Time for a game...


search Trove from Twitter

Alternatives to search

3,471 Bulletin editorial cartoons

compiled into handy PDFs

One page per day

Hansard interjections

Access is never open

What can I do?

  • Open?
  • Digitised?
  • Copyright restrictions?
  • Downloadable?
  • Full text?
  • Image resolution?
  • Machine-readable data?

Seeing what's closed

number of items described in NAA by top-level function

number of items digitised in NAA by top-level function


Websites are never published

How many newspaper articles are in Trove?

You can change things...

  • Improve discovery
  • Add knowledge
  • Build collaborations

containing 1,975,150 items

103,207 Trove lists

Make lists

more than 200 lists about lawnmowers

using 2,201,090 unique tags

9,370,614 tagged items

Add tags


Use Zotero to capture data

  • Trove newspapers
  • National Archives
  • Libraries Tasmania

Links break...

so save them now!
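The slide does not name a tool for saving links. One option, assumed here, is the Internet Archive's Wayback Machine, which accepts capture requests at web.archive.org/save/ followed by the page address. This sketch only builds those capture addresses for a list of links; it does not send any requests, and the function name is hypothetical.

```python
# Assumption: the Wayback Machine's "Save Page Now" address pattern.
SAVE_ENDPOINT = "https://web.archive.org/save/"


def save_page_now_url(url: str) -> str:
    """Return the address that asks the Wayback Machine to capture `url`."""
    return SAVE_ENDPOINT + url


links = ["https://trove.nla.gov.au/"]
save_urls = [save_page_now_url(link) for link in links]
for save_url in save_urls:
    print(save_url)
```

Opening one of the printed addresses in a browser should trigger a capture. Saving many links at once is better done through the archive's own bulk tools.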

404 error

Hack interfaces


Need help?

See also...

Collections & Contexts

By Tim Sherratt
