Transition your workflow, keep your sanity
Danielle Navarro
Sydney Research Bazaar, September 2019
slides.com/djnavarro/workflow
djnavarro
compcogscisydney.org
The year is 2009:
Danielle does not think about her project directory structure...
White space in a filename is a bad idea (partly for historical reasons)
Mistake #1
These are unnecessary copies of existing files
Mistake #2
Unnecessary files generated by MATLAB
Mistake #3
Mistake #4
Irrelevant files generated by the operating system
Mistake #5
Data stored in a format that is hard to work with unless you have a MATLAB licence
My data analysis scripts were in MATLAB so I can't run them anymore
Mistake #6
I honestly don't know why I used .eps graphics and nothing else
Mistake #7
If you don't want "future you" to be embarrassed, maybe think about your file names?
Mistake #8
Everything is dumped in one folder
Mistake #9
There is no README file!!!
Mistake #10
Inconsistent separator characters in file names
Mistake #11
Inconsistent naming scheme
Mistake #12
Names that are not very helpful to the human reader
Mistake #13
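Several of these mistakes can be caught automatically. The following is a minimal sketch in Python, not something from the original project. The function name `audit_filename`, the `OS_JUNK` list and the example file names are all hypothetical. It flags white space, operating system clutter and mixed separator characters in a file name.

```python
import re

# Assumed examples of operating system clutter; extend for your own machines.
OS_JUNK = {".DS_Store", "Thumbs.db", "desktop.ini"}


def audit_filename(name):
    """Return a list of naming problems found in a single file name."""
    if name in OS_JUNK:
        return ["operating system clutter"]
    problems = []
    if re.search(r"\s", name):
        problems.append("contains white space")
    # Ignore the extension so that its dot does not count as a separator.
    stem = name.rsplit(".", 1)[0] if "." in name else name
    separators = {ch for ch in stem if ch in "-_. "}
    if len(separators) > 1:
        problems.append("mixes separator characters")
    return problems


if __name__ == "__main__":
    # Hypothetical file names, loosely in the spirit of the 2009 folder.
    for name in ["final analysis_v2.m", "exp1_analysis_bayes.R", ".DS_Store"]:
        print(name, audit_filename(name))
```

A check like this cannot judge whether a name is helpful to a human reader (Mistake #13); that part still needs a person.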
The year is 2019:
Danielle has become paranoid about her project directory structure...
Project templates
- A standard folder structure
- Documentation describing it
- Agreed conventions for usage
- Miscellaneous "nudges" to suggest good practice
Template
Project
A good template should include instructions for how to use it
e.g., https://github.com/djnavarro/newproject
If you use GitHub you can create new projects directly from your template (or from someone else's!)
Okay... so why did I organise my template to produce file structures like this?
My folder structure is organised to mirror the structure of a typical research project
Files and folders use a consistent naming scheme that is easy for humans to read and easy for machines to read
Every folder has a README
(I cannot stress this enough... future you will love you for this)
README.md
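The "every folder has a README" rule is easy to automate. The sketch below is one possible approach in Python and is not taken from the newproject template itself. The folder names in `FOLDERS` and the function `create_project` are hypothetical placeholders, so swap in whatever structure mirrors your own research projects.

```python
from pathlib import Path

# Hypothetical folder names mirroring the stages of a research project.
FOLDERS = ["data", "analysis", "experiment", "writeup"]


def create_project(root, folders=FOLDERS):
    """Create the folder structure and put a README.md in every folder."""
    root = Path(root)
    readmes = []
    for folder in [root] + [root / name for name in folders]:
        folder.mkdir(parents=True, exist_ok=True)
        readme = folder / "README.md"
        if not readme.exists():  # never overwrite notes someone already wrote
            readme.write_text(
                f"# {folder.name}\n\nDescribe what belongs in this folder.\n"
            )
        readmes.append(readme)
    return readmes
```

Calling `create_project("myproject")` would create the project folder, its subfolders, and a placeholder README.md in each one, ready for "future you" to fill in.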
Relying on free and open source tools so that other people can use it
Best practice version control using a git repository
Cloud storage for the project using GitHub
How did I get here?
I had to learn a lot of new technical tools...
github.com
rstudio.com
r-project.org
osf.io
git-scm.com
psyarxiv.com
I've been helped by some wonderful teachers
I've relied on communities for support
twitter.com/rladiessydney
ozunconf19.ropensci.org
chdsummerschool.com
But most of all...
Transitions are hard
the gatekeepers' attempts to suppress the number of trans people allowed to transition occurred at virtually every step
indefinite periods of psychotherapy designed to evaluate whether or not they met the psychiatrists' criteria
those who were allowed to begin the real-life test often faced additional obstacles as some gender identity clinics required trans people to begin their tests prior to starting hormone replacement therapy
exposed the transsexual to all sorts of discrimination, harassment and potential violence
little more than a hazing period
How you imagine it goes
How it actually goes...
Why is it hard?
Fear and anxiety
Exhaustion
Gatekeeping
The in-between problem
All models and code available as documented repositories on GitHub
I am a believer in open science. You should have used OSF
Gatekeeping via review?
We used a Bayesian approach for exploratory model building, evaluated by checking for robust performance on theoretically meaningful desiderata... blah blah blah
Give me a p-value
Gatekeeping via review?
We like music?
Your favourite music sucks
Gatekeeping via review?
Transitioning requires trade-offs
"Unlike my cisgender counterparts I was never subject to sexual harassment in maths class, but on the other hand I didn’t attend very many of my classes either because I was too busy [self-harming] ... Looking back, my unwillingness to face up to how I felt about gender looks a lot like a trade-off: purchasing a form of male privilege at the expense of my mental health"
git / GitHub
Everybody laughs because learning git is one of the hardest steps. We all go through this stage... and that's OKAY!
git / GitHub
- Hard to learn
- Manual commits
- Can be finicky
- Preserves your history
- Issues threads
- Integration w. Travis etc
- Fork and modify
Dropbox
- Easy to learn
- Automatic backup
- Usually "just works"
- History is limited
- No real analog of issues
- No CI service at all
- No analog of forks
https://happygitwithr.com/
Transitioning from Dropbox to git + GitHub was a good exchange for me because it suits the work I do.
Neither system is better in general. Your trade-off will look different to mine!
Some positive tips!
- Offload cognition
- Plan projects
- Use side projects
- Invest in your tools
- Find your community
Offloading cognition
GitHub issues threads are excellent for this: your commits can link the code changes to the issue thread
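The link between a commit and an issue thread is made by writing the issue number in the commit message: GitHub turns "#12" into a link, and a closing keyword such as "fixes #12" also closes the issue when the commit reaches the default branch. As a rough illustration, the Python sketch below pulls those references out of a commit message. The function name `issue_links` and the example message are hypothetical, and the pattern only covers same-repository references.

```python
import re

# Closing keywords recognised by GitHub: close, fix, resolve and their variants.
CLOSING = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)", re.IGNORECASE
)
MENTION = re.compile(r"#(\d+)")


def issue_links(message):
    """Split the issue numbers in a commit message into closed and mentioned."""
    closes = {int(n) for n in CLOSING.findall(message)}
    mentions = {int(n) for n in MENTION.findall(message)} - closes
    return {"closes": sorted(closes), "mentions": sorted(mentions)}


if __name__ == "__main__":
    # Hypothetical commit message for a data analysis project.
    print(issue_links("Recode exclusion criteria, fixes #12 (see also #7)"))
```

Writing the issue number into every commit means the issue thread, rather than your memory, records why each change was made.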
Offloading cognition
You can do this for regular scientific projects as well as software development
Offloading cognition
Planning your project
Planning your time
Email tends to prioritise other people's needs, especially the loudest people
Managing todo lists and time explicitly (e.g., via todoist.com) gives you better control
Use your side projects...
asciify
rainbowr
... to learn skills for real ones
Invest in your tools
Surviving the long climb
Cast of characters
A cognitive psychologist with some experience with R
A grumpy psychometrician with a limited time budget
A social psychologist who wants to believe
Hey, I've been learning new things!
Awesome! What have you learned?
I can add cat gifs to a plot?
I can draw pretty maps?
I can model associative learning as Bayesian inference over inhomogeneous Markov random fields?
Hello?
Hello? Was it something I said?
Photo by Sven Scheuermeier on Unsplash
A complicating factor:
Hey, how would you feel about posting research code to the web?
Oh hell no. People would call me stupid
What? You literally wrote a textbook on latent variable modelling and your Ph.D. research won a best paper award at Multivariate Behavioral Research. You are not stupid
Yeah but my code has to be written in 20 minute consults and it's ugly. People are unkind about code
- Our code works (mostly), but it's ugly
- We have limited time & money
- It's difficult for us to make nice code, but at least we have some training
- You want me to put myself in the firing line too? Pfft
- Why should I do that????
- Byeeee....
- "Be kind: Everyone you meet is fighting a hard battle"
- Academic communities should not become Thunderdome
- Public shaming masquerading as "scientific critique" has no value. Don't do it
So... it's a long way to the top, someone might push me off a cliff, and my peers might not respect the effort if I make it?
(seriously, you are terrible at sales)
Well, maybe I exaggerate.
Still, it helps to climb with a team...
tidyverse
lme
git
BayesFactor
lavaan
car
Shiny
R markdown
learning all the things
It can be hard to find a team within a single applied field...
#rstats
But there are people who will support you
Yay!
Find your community!
#rstats
...But look, it's a long climb. Help me believe it's worthwhile
Okay, I'm back
Structural equation models? With Bayes?
Flexible experiments with Shiny apps?
Social network analysis? With beautiful pictures?
Okay, I'm listening, but...
Better tools for reproducible research?
Which packages should I prioritise?
How will journal editors & reviewers respond?
How do I move my whole lab across?
base R
tidyverse
jsPsych
MATLAB
JASP
BayesFactor
.Rmd
SQL
lme
JAGS
Stan
Which packages?
Migration?