Unit 2 / Day 2

DH 102: Data in the Humanities
October 18, 2016

Prof. Mackenzie Brooks

Cool data stuff:

Agenda:

  • Unit 1 feedback
  • Project/methodology review
  • Intro to structured data
  • Intro to Unit 2 data

Unit 1 feedback:

  • Survey says: don't stress - dive in
  • But also, experiment and read up on tool/method

For next time:

  • Documentation should be more about the steps in the process, less of a first-person narrative
  • Explore website as medium
  • Think more about the relevance/usefulness of your visualizations
  • AND the gaps in your data
  • More tools, different data

Methodology review:

Introducing structured data!

Unit 1 was about the "bag of words."

Unit 2 is about relationships.

Name    Year  State
Andrew  2019  Georgia
Abby    2020  New Jersey
Sam     2017  North Carolina
Chris   2020  Georgia

Tabular data

https://en.wikipedia.org/wiki/Relational_database

https://www.dlsweb.rmit.edu.au/toolbox/knowmang/content/models/relational_model.htm

http://legacy.alexandria.ucsb.edu/gazetteer/ContentStandard/version3.2/GCS3.2-guide.htm

SQL = Structured Query Language

SELECT column_name, column_name
FROM table_name
WHERE column_name operator value;

SELECT * FROM students
WHERE year='2020';

SELECT * FROM students
WHERE year='2020'
OR year='2019';
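
To try these queries without installing a database server, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the example table is stored as a table named students with columns name, year, and state; the column types (TEXT, INTEGER) are an assumption, not something the slides specify.

```python
import sqlite3

# Build a throwaway in-memory database holding the example table.
conn = sqlite3.connect(":memory:")
# Column names and types are assumptions made for this sketch.
conn.execute("CREATE TABLE students (name TEXT, year INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        ("Andrew", 2019, "Georgia"),
        ("Abby", 2020, "New Jersey"),
        ("Sam", 2017, "North Carolina"),
        ("Chris", 2020, "Georgia"),
    ],
)

# Query 1: every column for students whose year is 2020.
class_of_2020 = conn.execute(
    "SELECT * FROM students WHERE year = 2020"
).fetchall()

# Query 2: students whose year is 2020 or 2019.
class_of_2020_or_2019 = conn.execute(
    "SELECT * FROM students WHERE year = 2020 OR year = 2019"
).fetchall()

print(class_of_2020)
print(class_of_2020_or_2019)
conn.close()
```

Because year is stored as an integer in this sketch, the queries compare against 2020 rather than the quoted '2020'. Many databases accept either form, but matching the column's data type is the safer habit.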

Data types:

  • String/characters
  • Integer
  • Decimal
  • Boolean (T/F)
  • Date
  • Time
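
As a rough illustration of why data types matter, the sketch below converts one record read as plain text into typed values in Python. The field names, the score value, and the lecture time are hypothetical and exist only for this example; float stands in for "decimal" here.

```python
from datetime import date, time

# Values read from a text file such as a CSV all start out as strings.
# The fields "score", "enrolled", and "lecture_time" are hypothetical.
raw = {
    "name": "Abby",
    "year": "2020",
    "score": "3.5",
    "enrolled": "T",
    "lecture_date": "2016-10-18",
    "lecture_time": "13:25",
}

# Convert each field to the data type that matches its meaning.
typed = {
    "name": raw["name"],                                      # string
    "year": int(raw["year"]),                                 # integer
    "score": float(raw["score"]),                             # decimal
    "enrolled": raw["enrolled"] == "T",                       # Boolean
    "lecture_date": date.fromisoformat(raw["lecture_date"]),  # date
    "lecture_time": time.fromisoformat(raw["lecture_time"]),  # time
}

for field, value in typed.items():
    print(field, type(value).__name__, value)
```

Once the values are typed, operations such as sorting by year or computing the gap between two dates behave as expected instead of comparing text character by character.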

File formats:

.csv = comma-separated values

.xlsx = Excel workbook

.tsv = tab-separated values

.json = JavaScript Object Notation

.xml = Extensible Markup Language

RDF = Resource Description Framework
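
To see how the same tabular data looks in different formats, here is a small sketch that writes the example student table as CSV, TSV, and JSON using only Python's standard library. The helper name to_delimited is made up for this example.

```python
import csv
import io
import json

# The example student table, one dictionary per row.
rows = [
    {"Name": "Andrew", "Year": 2019, "State": "Georgia"},
    {"Name": "Abby", "Year": 2020, "State": "New Jersey"},
    {"Name": "Sam", "Year": 2017, "State": "North Carolina"},
    {"Name": "Chris", "Year": 2020, "State": "Georgia"},
]


def to_delimited(records, delimiter):
    """Hypothetical helper: serialize records as delimiter-separated text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["Name", "Year", "State"],
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_delimited(rows, ",")    # .csv: commas between values
tsv_text = to_delimited(rows, "\t")   # .tsv: tabs between values
json_text = json.dumps(rows, indent=2)  # .json: a list of objects

print(csv_text)
print(tsv_text)
print(json_text)
```

Note that CSV and TSV store every value as text, while JSON keeps the year as a number, so the choice of format affects how much type information survives.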

Let's meet the data!

For Thursday:

  • Watch the linked data video
  • Review the Programming Historian lessons
  • Install OpenRefine
  • Review the new data & accompanying materials