Organizing Your Research Data
Andrea Baruzzi (Sciences)
Sarah Elichko (Social Sciences)
Keeping your files, folders, and research life in order
Decisions: creating, editing, saving, sharing
Today's focus: Files & Folders
Approaches that will generally work with tools you already use (Mac or PC, Google Drive, Dropbox, etc.)
What do we mean by organized?
• Human-friendly
• Machine-friendly
• Reliable
Why bother?
• Reduce time and energy spent on logistics
• Improve collaboration
• Opportunity to think through your work
• Can facilitate using digital tools for analysis
Folders
Files
Storage
Folders provide structure
(because without them, all you'd have is this...)
Folder Structure: Common Patterns
/[Project]/[Type of file]/[Data collector name]/[YYYYMMDD]
/butterfly/tabular/mcneill/20160117 ← Jan 17, 2016
/butterfly/images/mcneill/20160117
/[Academic Year]/[Semester]/[Course]/[Assignment]
/2016-2017/fall/hist091/paper1
/2016-2017/fall/soan098/proposal
/[Program]/[Year]/[Aspect of program]
/RIAs/2016-2017/applications
/RIAs/2016-2017/lesson_plans
Folder Structure: Best Practices
Machine-friendly:
• Format dates, numbers, and names so they sort well (sortable)
Human-friendly:
• Choose names that make sense to you (skimmable)
• Use consistent abbreviations + terms (predictable)
• Consider adding a readme file (understandable)
Folder Structure: Best Practices
Format dates so they sort well
// Machine-friendly //
To sort correctly, use Year-Month-Day: YYYY-MM-DD (or YYYYMMDD)
2011-10-13 or 20111013
2015-10-12 or 20151012
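A minimal sketch of the date advice above, in Python. It assumes your existing folder names use US-style MM/DD/YYYY dates; the function name `to_sortable` and the sample dates are illustrative, not part of the original slides.

```python
from datetime import datetime


def to_sortable(date_text: str, input_format: str = "%m/%d/%Y") -> str:
    """Convert a date string to the sortable YYYY-MM-DD form."""
    return datetime.strptime(date_text, input_format).strftime("%Y-%m-%d")


# Hypothetical folder names that use month/day/year dates.
folders = ["10/13/2011", "01/05/2012", "10/12/2015"]

# Sorting the original text puts 2012 first, which is not chronological.
print(sorted(folders))

# After conversion, alphabetical order and chronological order agree.
renamed = sorted(to_sortable(name) for name in folders)
print(renamed)
```

If your dates are written differently (for example "Jan 17, 2016"), pass a matching `input_format` such as `"%b %d, %Y"`.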
Folder Structure: Best Practices
Use leading zeroes so numbers sort well
// Machine-friendly //
Usual practice:
- List as entered: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
- When sorted: 1, 10, 11, 12, 2, 3, 4, 5, 6, 7, 8, 9
Better practice:
- List as entered: 001, 002 ... 010, 011, 012
- When sorted: 001, 002 ... 010, 011, 012
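The sorting behavior described above can be reproduced in a few lines of Python. This is only an illustration, assuming names are compared as text (which is how file browsers typically treat them); the variable names are made up for the example.

```python
# Folder names 1 through 12, stored as text the way a file system sees them.
plain = [str(n) for n in range(1, 13)]
print(sorted(plain))

# The same numbers padded to three digits with leading zeroes.
padded = [f"{n:03d}" for n in range(1, 13)]
print(sorted(padded))
```

The first list sorts as 1, 10, 11, 12, 2, and so on, while the padded list stays in numeric order.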
Folder Structure: Best Practices
Format names so they sort well
// Machine-friendly //
Typical sort order*:
1. Underscores
2. Numbers
3. Letters
* on most systems, including Google Drive and Mac
Folders sorted alphabetically:
_2017-07-14
2017-07-14
2017-08-01
ASAP
SOON
TODAY
Folder Structure: Best Practices
Try it out
• Adjust dates so they're sortable: YYYY-MM-DD (e.g. 2017-06-23)
• Add leading zeroes to numbers in folder names: 001, 002 ... 010, 011, 012
Folder Structure: Best Practices
// Human Friendly //
Choose names that will make sense at a glance (to you and to colleagues)
Research-Group-A / Project-1
Research-Group-A / Project-2
2016-2017 / Fall / ANTH-043E
2016-2017 / Fall / SOCI-055C
Folder Structure: Best Practices
// Human Friendly //
Use consistent abbreviations + terms (predictable)
• If there's a standard out there, use it
(e.g. department abbreviations)
• If not, document your decision (see below)
Add a readme file to folders (understandable)
• Short & simple - what do you mean by xyz
• Gift to your future self (/colleagues)
Creating Your Folders
1.) What are the key structures that organize your work?
Semesters
Courses
Projects
Groups / Collaborations
Meetings
Assignments
Do any repeat? These might make good top-level folders.
Research-Group-A / Project-1
Research-Group-A / Project-2
2016-2017 / Fall / ANTH-043E
2016-2017 / Fall / SOCI-055C
Creating Your Folders
2.) Do any structures "nest" inside others?
Semesters
Courses
Projects
Groups / Collaborations
Meetings
Assignments
These might make good mid-level folders.
Research-Group-A / Project-1
Research-Group-A / Project-2
2016-2017 / Fall / ANTH-043E / paper1
2016-2017 / Fall / SOCI-055C / paper2
Creating Your Folders
3.) What kinds of documents do you generate while working?
Drafts
Procedures
Images
Numerical data
Meeting notes / minutes
Class notes
Notes (ideas, etc.)
Lists (to-do, to-read, etc.)
Do any repeat? If so, these might make good sub-folders.
Research-Group-A / Project-2 / MeetingNotes
2016-2017 / Fall / ANTH-043E / paper1 / notes
Putting it all together: Folder Structure Examples
/butterfly/tabular/mcneill/20160117
/butterfly/images/mcneill/20160117
/Research-Group-A/Project-2/MeetingNotes
/2016-2017/fall/anth-043e/paper1/notes
/2016-2017/fall/hist091/paper1
/2016-2017/fall/soan098/proposal
/RIAs/2016-2017/applications
/RIAs/2016-2017/lesson_plans
/Research-Group-A/Project-1
/Research-Group-A/Project-2
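If you would rather create a folder skeleton in one step than click through a file browser, a short script can do it. The sketch below uses Python's standard `pathlib` module and builds the /[Academic Year]/[Semester]/[Course]/[Assignment] pattern inside a temporary scratch directory; the helper name `create_course_folders` is hypothetical, and you would substitute your own root folder and course list.

```python
import tempfile
from pathlib import Path


def create_course_folders(root, academic_year, semester, courses):
    """Create root/academic_year/semester/course/assignment folders.

    `courses` maps each course name to a list of assignment names.
    Returns the list of folders that now exist.
    """
    created = []
    for course, assignments in courses.items():
        for assignment in assignments:
            folder = Path(root) / academic_year / semester / course / assignment
            # parents=True builds the whole chain; exist_ok=True makes reruns safe.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created


# Demonstration in a throwaway directory so nothing on your disk changes.
with tempfile.TemporaryDirectory() as scratch:
    folders = create_course_folders(
        scratch,
        "2016-2017",
        "fall",
        {"hist091": ["paper1"], "soan098": ["proposal"]},
    )
    relative = [f.relative_to(scratch).as_posix() for f in folders]

print(relative)
```

Because the script can be rerun safely, it also works as a record of the structure you decided on.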
Folders
Files
Storage
Files
Which would you prefer to find on your laptop... six months from now?
File Naming Best Practices
Human-friendly:
• Include some descriptive info (skimmable)
• Be consistent (predictable)
Machine-friendly:
• Use versioning - avoid "final final FINAL.pdf" (sortable)
• Keep it simple - no special characters (compatible)
Reliable:
• Choose a future-friendly file format (when possible)
File Naming Best Practices
/ Research-Group-A / Project-2 / MeetingNotes
Include some descriptive information, even if files are inside well-labeled folders (skimmable)
File Naming Best Practices
/ 2016-2017 / Fall / ANTH-043E / paper1
Use versioning - avoid "final final FINAL.pdf"
• Sequential numbers indicate change over time (sortable)
• Text labels explain versions (skimmable, understandable)
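One way to combine both ideas is a name made of a stem, a zero-padded version number, and a short text label. The sketch below is an assumed naming scheme for illustration only (the slides do not prescribe an exact pattern), and `versioned_name` is a hypothetical helper.

```python
def versioned_name(stem: str, version: int, label: str, extension: str) -> str:
    """Build a name like paper1_v02_first-draft.docx."""
    # Two-digit padding keeps v10 after v02 when names are sorted as text.
    return f"{stem}_v{version:02d}_{label}.{extension}"


# Hypothetical drafts of one paper, listed out of order on purpose.
drafts = [
    versioned_name("paper1", 10, "final", "docx"),
    versioned_name("paper1", 1, "outline", "docx"),
    versioned_name("paper1", 2, "first-draft", "docx"),
]

for name in sorted(drafts):
    print(name)
```

The number shows the order of the versions, and the label tells you what each version is without opening it.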
File Naming Best Practices
Keep it simple - no special characters
• Good: letters, numbers, underscores (_), dashes (-)
even better: all lower-case
• Avoid: other punctuation marks & special characters, spaces
Good: readings_class-04.pdf
Avoid: Readings!!/class 4.pdf
Good: data-survey04.xls
Avoid: data.survey.4.xls
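The cleanup rule above can be sketched as a small Python function. This is one possible approach, not a standard tool: it lower-cases the name, keeps the final extension, and replaces every run of characters other than letters, numbers, underscores, and dashes with a single dash. The function name `simplify_name` is hypothetical.

```python
import re


def simplify_name(name: str) -> str:
    """Return a lower-case file name using only letters, numbers, _ and -."""
    stem, dot, extension = name.rpartition(".")
    if not dot:  # no extension present
        stem, extension = name, ""
    stem = stem.lower()
    # Replace each run of disallowed characters (spaces, dots, !, /, ...) with one dash.
    stem = re.sub(r"[^a-z0-9_-]+", "-", stem)
    stem = stem.strip("-_")
    return f"{stem}.{extension.lower()}" if extension else stem


print(simplify_name("Readings!!/class 4.pdf"))
print(simplify_name("data.survey.4.xls"))
```

The function does not add leading zeroes or reorder words, so a name such as `readings-class-4.pdf` may still deserve a manual touch-up.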
File Naming Best Practices
Try it out
Using sample data or your data, improve a few file names.
Examples:
File Naming Best Practices
Use a future-friendly file format (when possible)
Aim for file formats that are:
- Non-proprietary
- Open, documented standard
- Commonly used by research community
- Recommended by the Library of Congress
File Naming Best Practices
Use a future-friendly file format (when possible)
Spreadsheets / tabular data: CSV (not Excel/.xlsx)
Text: TXT, PDF/A, or ODF (not Word/.docx)
Still images: TIFF, JPEG 2000
Moving images: MPEG-4, AVI, MXF
Sounds: WAVE, AIFF, MP3, MXF
Statistics: ASCII, DTA, POR, SAS, SAV
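As an illustration of the CSV recommendation, the sketch below writes a small table as plain CSV text using Python's standard `csv` module. The rows are invented sample data (the counts are not from any real survey), and `to_csv_text` is a hypothetical helper; to save a file, you would write the returned text to a path ending in `.csv`.

```python
import csv
import io

# Made-up observations, for illustration only.
rows = [
    {"date": "2016-01-17", "collector": "mcneill", "count": 12},
    {"date": "2016-01-18", "collector": "mcneill", "count": 9},
]


def to_csv_text(records):
    """Serialize a list of dictionaries as CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv_text(rows)
print(csv_text)
```

The result is ordinary text that any spreadsheet program, statistics package, or text editor can open.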
Storage
Where is your work?
Storage Best Practices
Ask: When might you need this again?
• Later this semester
• Applying for internships (next spring?)
• Taking a related course (next year?)
• Considering graduate school
• Applying for a job or graduate school
Storage Best Practices
Already on faculty/staff computers
Personal folders
Group folders (AODocs)
Larger, more complex storage needs
Putting it into practice
Something > nothing
• Start small
• Build gradually
• Don't psych yourself out
Questions
Andrea Baruzzi
Sarah Elichko
ITS Help Desk
What do we mean by organized?
• Human-friendly
• Machine-friendly
• Reliable
Skimmable
• Easily read at a glance
Understandable
• To your colleagues
• To your future self
Predictable
• Easy to keep up
What do we mean by organized?
• Human-friendly
• Machine-friendly
• Reliable
Sortable
• Can be arranged correctly
Compatible
• Names that won't confuse systems
Organizing Your Research
By Swarthmore Reference