Data: Big & Noisy!

Tweet Sentiment Visualization Project:


Reviewers thought he said...

  • Big data can solve everything
  • There is no need to collect research data, just use what’s out there
  • The future is predictable, if we have enough data

What he really said...

  • Consider the limitations of Big Data
  • Do not expect to predict everything using Big Data
  • Combine insights from humans and machines

2012: HBR calls data scientist the "sexiest job of the 21st century"

2013: "Big Data" gets an entry in the OED

Taylor, R. S. (1962), The process of asking questions. Amer. Doc., 13: 391–396. doi: 10.1002/asi.5090130405


or size matters

  • Books in the Library of Congress represent 235 terabytes of data
  • A 1994 conference presentation made the first mention of the petabyte and said "the problems which confront the meteorologist today, will worry the arts and humanities in 10 years time."
  • 1 petabyte ≈ 4 Libraries of Congress
  • The Large Hadron Collider generates 1 petabyte of data every second
  • The Square Kilometre Array will generate 4 petabytes of data every second
  • 2013 = zettabyte (1 zettabyte = 250 billion DVDs)
  • By 2020, the digital universe will equal 35 zettabytes (or 44 times as big as 2009)
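
The figures above can be cross-checked with a little arithmetic. This is a rough sketch only: it assumes decimal (SI) units, takes every input number from the bullets above, and the variable names are made up for illustration.

```python
# Decimal (SI) byte units; an assumption, the slides do not say which convention they use.
TB = 10**12
PB = 10**15
ZB = 10**21

# Books in the Library of Congress, per the slide.
loc_bytes = 235 * TB

# How many Libraries of Congress fit in one petabyte?
loc_per_pb = PB / loc_bytes

# Implied capacity of one DVD if 1 zettabyte = 250 billion DVDs.
dvd_bytes = ZB / 250e9

# Implied size of the 2009 digital universe if 35 ZB is 44 times as big.
universe_2009_zb = 35 / 44

# Time for a source producing 1 PB per second to reach one zettabyte.
seconds_per_zb = ZB / PB
days_per_zb = seconds_per_zb / 86_400

print(f"Libraries of Congress per petabyte: {loc_per_pb:.2f}")
print(f"Implied DVD size: {dvd_bytes / 1e9:.1f} GB")
print(f"Implied 2009 digital universe: {universe_2009_zb:.2f} ZB")
print(f"Days to reach 1 ZB at 1 PB/s: {days_per_zb:.1f}")
```

Under these assumptions a petabyte holds a little over four Libraries of Congress, and the DVD comparison implies discs of about 4 GB each, so the slide's figures are rounded but consistent with one another.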








Andrew Prescott: Big Data in the Arts and Humanities

Fast forward to 2015

Big Data

Humanities, Sciences, Librarians

Digital Humanities

DH Projects

  • Textual Comparisons
  • Textual Visualizations
  • TEI
  • Media

Ref: Toward Automated Discovery of Artistic Influence


Librarians add...



Data Management

Data and Librarians

Research Data Services

  • Data Management and DMPs
  • Teaching data best practices
  • Institutional Repositories
  • Open data

Introduction to data management

"Steal this idea: A library instructors' guide to educating students in data management skills," by Lisa Johnston and Jon Jeffryes, published in C&RL News, Sept 2014: 431

Managing your data from the University of Minnesota

Create data that you and others can understand

DCC 2009

Research Data Management

  • Data Management
  • Data Curation
  • Data Information Literacy
  • Data Visualization

Data Visualization

Sources for learning

Duke Library LibGuide




Data Science - University of Washington