Data Carpentry
Social Sciences and Humanities Using R
OpenRefine
Annika Rockenberger
Why OpenRefine?
Identify and amend messy data
Capture all actions applied to your data
Reverse any action
No modification of raw data
Apply tidy cleaning actions to other data sets
Local application, not cloud service
Graphical user interface
Features
Open Source with a BSD-3 license
(redistribution and use in source and binary forms, with or without modification, are permitted)
Works with large files (100,000 rows)
Keeps data private unless you want to share it!
Today's Session
Facets
Data formats and transforming data formats
Filtering and sorting
Transforming data with GREL (General Refine Expression Language)
Using scripts to apply operations on other files
Exporting project and data
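To give a feel for two of the session topics, here is a minimal Python sketch of what a text facet and a simple cell transformation do conceptually. It is not OpenRefine code: the function names `text_facet` and `clean` and the sample values are hypothetical, chosen only for illustration. In OpenRefine itself, a roughly comparable transformation could be written in GREL as `value.trim().toTitlecase()`.

```python
from collections import Counter


def text_facet(values):
    """Count how often each distinct value occurs, most frequent first.

    This mimics the idea of a text facet: it groups identical cell
    values so that inconsistent spellings become visible.
    """
    return Counter(values).most_common()


def clean(value):
    """Strip surrounding whitespace and convert the value to title case."""
    return value.strip().title()


# Hypothetical messy column: the same two cities, entered inconsistently.
raw = ["Oslo", "oslo ", " Bergen", "Bergen", "OSLO"]

# Before cleaning, every spelling variant shows up as its own facet entry.
facet_before = text_facet(raw)

# The raw list is left untouched; cleaning produces a new list.
cleaned = [clean(v) for v in raw]
facet_after = text_facet(cleaned)

print(facet_before)
print(facet_after)
```

The sketch keeps `raw` unchanged and writes the cleaned values to a new list, which mirrors the point made earlier that the raw data is not modified.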