R and Git[Hub]
Anton Antonov (@tonytonov)
So, science
Science isn't about why -- it's about why not
Keep calm
&
adopt a cat
technologies
Q: Is it going to hurt?
A: Yes
Dropbox, Google Docs
Are designed to synchronize files
- Try editing the same place simultaneously
- Try working offline and then sync with others
- Good luck figuring out who made a particular change
Git
Git is a VCS (version control system)
created by Linus Torvalds, 2005
Git stores the full project history
as a series of filesystem snapshots
Git is open source, reliable, secure and fast,
making it a de facto standard for developers
Git is a sufficiently advanced technology,
therefore it's magic*
*Arthur C. Clarke
Git basics
repository: project folder
commit: save a snapshot
- A commit is a "checkpoint"
- Records who made the change and when
- Human-readable commit messages
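In a terminal, a repository and two checkpoints can be sketched roughly as follows. The scratch directory, the file contents, the email address and the first commit message are placeholders invented for illustration.

```shell
set -e
# Hypothetical scratch project; path and file contents are made up.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Identify the author for this repository only (the email is a placeholder).
git config user.name "John Doe"
git config user.email "john.doe@example.com"

# First checkpoint: stage the file, then commit with a readable message.
printf '%s\n' 'x <- c(1, NA, 3)' 'mean(x)' > script.R
git add script.R
git commit -q -m "Added script.R"

# Second checkpoint: same file name, new snapshot.
printf '%s\n' 'x <- c(1, NA, 3)' 'mean(x, na.rm = TRUE)' > script.R
git commit -q -am "Changed NA handling"

# Who, when and why: one line per commit, newest first.
git log --format='%an, %ad: "%s"' --date=format:'%H:%M %d/%m/%y'
```

The folder still holds a single script.R; the earlier version lives in the history rather than in a renamed copy.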
script.R
script_rev2.R
script_rev6_comments.R
script_rev22_ver5_latest.R
[Figure: a single script.R evolving through a series of commits, one snapshot per commit]
John Doe, 13:37 04/10/17
"Changed NA handling"
John Doe, 20:17 04/10/17
"Fixed incorrect input
(now using character, not factor)"
John Doe, 00:42 05/10/17
"ggplot style tweaked via theme()"
Git basics
remote: a repository that lives somewhere else
pull: grab commits from the remote
clone: initial pull from the remote
push: send commits to the remote
- Remote serves as a backup and provides sync
- Commit often, push when ready to share
[Figure: a remote and two local copies, with steps 1. clone, 2. commit, 3. push, 4. pull]
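The clone, commit, push, pull cycle can be tried without any network access. In this sketch a bare repository on local disk stands in for the remote, and the collaborator names, emails and file contents are hypothetical. The second collaborator clones after the first push, which is one possible ordering of the steps.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A bare repository on local disk plays the role of the remote
# (assumption: in practice this would be a URL on GitHub or similar).
git init -q --bare remote.git

# 1. clone: the first collaborator gets a working copy.
git clone -q remote.git alice 2>/dev/null
cd alice
git config user.name "Alice"
git config user.email "alice@example.com"

# 2. commit: work is saved locally first.
echo 'print("hello")' > script.R
git add script.R
git commit -q -m "Added script.R"

# 3. push: send the commit to the remote.
git push -q -u origin HEAD

# 1. clone: the second collaborator gets a copy, including that commit.
cd "$work"
git clone -q remote.git bob

# The first collaborator commits and pushes again.
cd "$work/alice"
echo 'print("hello, world")' > script.R
git commit -q -am "Tweaked greeting"
git push -q

# 4. pull: the second collaborator grabs the new commit.
cd "$work/bob"
git pull -q --ff-only
git log --oneline
```

Commits stay local until pushed, so committing often costs nothing and pushing is the moment of sharing.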
GitHub
- Very popular web-based remote
- Yep, it's free
- Active (R) community
- Visibility
- Ease of collaboration
- Great web interface
- Git flow adapted for various purposes
- Did I mention it's totally free?
RStudio integration
Alternatives: Git clients (SourceTree, etc.), terminal
diff
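For those who prefer the terminal, the inspect-and-commit loop might look like the sketch below. It assumes a scratch repository; plot.R, its contents and the email address are hypothetical.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "john.doe@example.com"

printf '%s\n' 'library(ggplot2)' \
  'ggplot(mtcars, aes(wt, mpg)) + geom_point()' > plot.R
git add plot.R
git commit -q -m "Added plot.R"

# Edit the file without committing yet.
printf '%s\n' 'library(ggplot2)' \
  'ggplot(mtcars, aes(wt, mpg)) + geom_point() + theme(legend.position = "none")' > plot.R

# Which files changed since the last commit?
git status --short

# What changed, line by line?
git --no-pager diff

# Stage the change and save a new checkpoint.
git add plot.R
git commit -q -m "ggplot style tweaked via theme()"
```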
Your turn
- Go to GitHub and create a repo
- Use RStudio to clone it
- Make some changes and commit (repeat 2-5 times)
- Upload changes via push
GitHub
fork: make your own copy of a GitHub repo
pull request: send changes from a fork to the original repo
- Fork any public repository to tailor the project to your specific needs
- Great way to contribute to open source, FTW
- Convenient to maintain your project
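The local side of the fork workflow can be sketched as follows. Forking and opening the pull request happen on the GitHub website, so here a bare clone on disk stands in for the fork. All repository names, the branch name, the emails and the file contents are hypothetical.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the original project.
git init -q original
cd original
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
echo "# Demo project" > README.md
git add README.md
git commit -q -m "Initial commit"
cd "$work"

# "Fork": on GitHub this is a button; locally a bare clone plays that role.
git clone -q --bare original fork.git

# Clone your fork and keep a pointer to the original as "upstream".
git clone -q fork.git my-copy
cd my-copy
git config user.name "Contributor"
git config user.email "contributor@example.com"
git remote add upstream "$work/original"

# Work on a separate branch so the proposed change is easy to review.
git checkout -q -b fix-title
echo "# Demo project for R users" > README.md
git commit -q -am "Clarified README title"

# Push the branch to the fork; the pull request is then opened on GitHub.
git push -q -u origin fix-title

# Two remotes: origin (your fork) and upstream (the original).
git remote -v
```

The original repository is untouched until its maintainer accepts the pull request.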
Your turn
- Go to https://github.com/tonytonov/tcts-git/issues/2
- Take the repository from the comment above yours and fork it
- Use RStudio to clone your fork
- Make some changes and commit
- Upload changes via push
- Send a pull request with proposed changes
- Review someone else's pull request for your repo
- Check out your GitHub profile, be proud
GitHub issues
R packages
A great way to share R code
Learn by observing
- R code
- Help files
- Package description
- Imports/exports
- Tell Git not to track these files
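Telling Git not to track files is usually done with a .gitignore file. The sketch below assumes a scratch repository. The three entries are files that R and RStudio commonly leave behind; they are typical examples, not taken from the slides.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Typical entries for an R project (an assumption, adjust to taste).
cat > .gitignore <<'EOF'
.Rproj.user
.Rhistory
.RData
EOF

# One script worth tracking, two session leftovers.
touch analysis.R .Rhistory .RData

# Git reports only .gitignore and analysis.R as untracked.
git status --short

# Ask Git which of the given paths are ignored.
git check-ignore .Rhistory .RData
```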
Bottom line
Pros
- no revision hell
- transparent collaboration
- backup on steroids
- it's hard to screw things up*
- visibility, exposure
- same instrument, many targets
Cons
- takes time & effort to learn & maintain
- the same applies to everyone in your group
- has (known) limitations
Extra stuff
- More GitHub features (visualisation, analytics)
- devtools::install_github()
- Similar services: GitLab, Bitbucket
- GitHub pages
- RMarkdown, LaTeX
- Bookdown
- Integration with other services
Links
Thank you!