Where, why, & how
Ryan Clement | Data Services Librarian | Reed College Library
Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The ethics of sharing & re-using data
Some major data repositories
Cleaning & preparing secondary data
How do you properly cite data?
image source: http://www.nature.com/news/specials/datasharing/images/datasharing.jpg
"Who would care about this?"
And who would care about keeping it?
What type of organization are they?
Educational institutions, government organizations, private companies, etc.
If not government, how valuable is the data?
And who would pay for it?
Are there privacy/confidentiality issues?
And at what level of observation do you need the data?
Part of the Institute for Social Research at the University of Michigan
First attempt at openly sharing data amongst researchers (started with election studies data)
Curated, digitized, diverse historical data sets
IPUMS Project Goals
Collect and preserve data and documentation
Harmonize data
Disseminate the data absolutely free!
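As a rough illustration of what harmonizing can involve, the sketch below recodes one variable from two sources into a shared coding scheme. The variable, the source codes, and the mappings are all invented for this example and are not taken from IPUMS.

```python
# Hypothetical source-specific codes for a marital-status variable.
# These mappings are invented for illustration only.
SURVEY_A_TO_COMMON = {1: "married", 2: "single", 3: "widowed"}
SURVEY_B_TO_COMMON = {"M": "married", "S": "single", "W": "widowed", "D": "divorced"}


def harmonize(values, mapping, unknown="unknown"):
    """Translate source-specific codes into the shared coding scheme."""
    return [mapping.get(value, unknown) for value in values]


# Code 9 is not in survey A's mapping, so it falls back to "unknown".
survey_a = harmonize([1, 2, 2, 9], SURVEY_A_TO_COMMON)
survey_b = harmonize(["S", "D", "M"], SURVEY_B_TO_COMMON)

# Once both sources share one scheme they can be analyzed together.
combined = survey_a + survey_b
```

Keeping each mapping as explicit data, rather than burying it in conditionals, makes the recoding decisions easy to document and review.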
Column locations and widths for each variable (if necessary)
Definitions of different record types
Response codes for each variable
Codes used to indicate nonresponse and missing data
Exact questions and skip patterns used in a survey
Other indications of the content and characteristics of each variable
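The codebook elements above are what you need to read a raw fixed-width file. A minimal sketch, assuming a made-up three-variable layout: the variable names, column positions, missing-data codes, and response labels below are hypothetical and would come from the real codebook.

```python
# Hypothetical codebook: variable name -> (start column, width, missing codes).
# Column positions are 1-based, as codebooks usually print them.
CODEBOOK = {
    "age": (1, 3, {"999"}),
    "sex": (4, 1, {"9"}),
    "income": (5, 6, {"999998", "999999"}),
}

# Invented response codes for the "sex" variable.
SEX_LABELS = {"1": "male", "2": "female"}


def parse_record(line, codebook=CODEBOOK):
    """Slice one fixed-width record and turn missing-data codes into None."""
    record = {}
    for name, (start, width, missing) in codebook.items():
        raw = line[start - 1 : start - 1 + width].strip()
        record[name] = None if raw == "" or raw in missing else raw
    return record


# Two sample records; the second uses missing-data codes for age and income.
rows = [parse_record(line) for line in ["0342045000", "9991999999"]]

# Apply the response-code labels after the missing codes are handled.
labels = [SEX_LABELS.get(row["sex"]) for row in rows]
```

Converting missing-data codes before any arithmetic matters: left alone, a code such as 999 would be treated as a real age.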
Image by Monica Duke (http://blogs.ukoln.ac.uk/sagecite/2011/05/16/data-citation-principles-harvard/)
From International Studies Quarterly, King and Zeng, 2007, p. 209:
Gary King; Langche Zeng, 2006, "Replication Data Set for 'When Can History be Our Guide? The Pitfalls of Counterfactual Inference'" hdl:1902.1/DXRXCFAWPK UNF:3:DaYlT6QSX9r0D50ye+tXpA== Murray Research Archive [distributor]
Attribution
Verifiability
Findability
Author
Publication Date
Title
Publisher/Distributor or Location
Persistent Identifier (DOI, hdl, ark, URI)
Location (URL)
Version
Access Date
Feature or Subset Name
UNF (or other validation key)
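One way to keep data citations consistent is to store the elements above as fields and assemble the string in one place. The sketch below is an assumption-laden illustration: the class name, field names, and element order are hypothetical, chosen to reproduce the King and Zeng citation shown earlier, and the placement of the optional version is a guess rather than a rule from any style guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Hypothetical container for the elements of a data citation."""

    authors: list
    year: int
    title: str
    identifier: str            # persistent identifier, e.g. an hdl or DOI
    distributor: str
    unf: Optional[str] = None  # validation key, if the archive supplies one
    version: Optional[str] = None

    def format(self):
        """Assemble the elements in the order used by the example citation."""
        head = ", ".join(
            ["; ".join(self.authors), str(self.year), f'"{self.title}"']
        )
        tail = [self.identifier]
        if self.unf:
            tail.append(self.unf)
        if self.version:
            tail.append(f"Version {self.version}")
        tail.append(f"{self.distributor} [distributor]")
        return head + " " + " ".join(tail)


citation = DataCitation(
    authors=["Gary King", "Langche Zeng"],
    year=2006,
    title=(
        "Replication Data Set for 'When Can History be Our Guide? "
        "The Pitfalls of Counterfactual Inference'"
    ),
    identifier="hdl:1902.1/DXRXCFAWPK",
    distributor="Murray Research Archive",
    unf="UNF:3:DaYlT6QSX9r0D50ye+tXpA==",
)
formatted = citation.format()
```

Optional elements such as the UNF and version are simply skipped when absent, so the same function works for archives that do not provide them.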
Zotero
No dataset type
Use Document, Report, etc.; just be consistent
A Dataset item type is planned
EndNote
Dataset type
Lots of useful fields (unit of observation, data type, separate producer/distributor)