Introduction to Git and GitHub
For R software development
Christopher Gandrud
Session 1
9:30-10:30
Git/GitHub for R Software Development
Session 2
10:30-12:00
Developing Statistical Software using IQSS Best Practices
Best Practices for Building Social Science R Packages (Room S050)
Learning Goals
- Understand importance of version control and open software development
- Create a new git repository for an R software package, commit changes, and push updates to GitHub
- Use GitHub to collaborate, including addressing merge conflicts
What is Git?
A version control system
What is Version Control?
a system that records changes to a file or set of files over time so that you can recall specific versions later
Version Control
https://git-scm.com/book/en/v2/Getting-Started-About-Version-Control
Why is Version control important for software development?
Oh No!
[Diagram: commits A, B, C]
Review and revert back to previous versions
[Diagram: commits A, B, C, then back to A]
No Problem
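A minimal sketch of "review and revert", assuming a throwaway repo; the directory `demo`, the file `notes.txt`, and the user identity are placeholders invented for illustration.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo && cd demo
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Record three versions of the same file: A, B, C
for version in A B C; do
  echo "$version" > notes.txt
  git add notes.txt
  git commit -q -m "version $version"
done

# Review the history, oldest first
git log --reverse --format='%h %s'

# Bring back the file as it was in the very first commit (version A)
first_commit="$(git rev-list --max-parents=0 HEAD)"
git checkout "$first_commit" -- notes.txt
git commit -q -m "restore version A"

restored="$(cat notes.txt)"
commit_count="$(git rev-list --count HEAD)"
echo "notes.txt contains: $restored (history has $commit_count commits)"
```

Restoring the old file as a new commit keeps versions B and C in the history, so nothing is lost.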
Oh No!
Try out features/bug fixes without disturbing the main codebase...
...until you're ready
Reproducible software development
Clear attribution of work
What is GitHub?
a web service for hosting your Git Repositories and managing software projects (e.g. bug tracking, team organisation)
GitHub
Public GitHub repositories allow you to develop your software in the open
- Transparency
- Benefit from community collaboration
- Distribute code
# in R
devtools::install_github("username/reponame")
(some) Terminology!
Repository
A directory that has been "initialized" so that git can record changes made to files within it.
Often shortened to "repo".
Repository
An initialized repository has a .git directory in it.
add (stage)
Adding files to the git index allows you to commit them. . .
. . . What is commit?
Commit
Record changes to a repository
[Diagram: a sequence of commits A, B, C]
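How add (stage) and commit fit together can be sketched with `git status --short`; the repo `demo` and the file `analysis.R` are made-up names for the example.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo && cd demo
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "x <- 1" > analysis.R

before_add="$(git status --short)"         # untracked: "?? analysis.R"
git add analysis.R
after_add="$(git status --short)"          # staged:    "A  analysis.R"
git commit -q -m "add analysis script"
after_commit="$(git status --short)"       # clean: no output

printf 'before add:   %s\nafter add:    %s\nafter commit: %s\n' \
  "$before_add" "$after_add" "${after_commit:-(clean)}"
```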
diff
Examine changes between commits
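As a rough illustration, `git diff` can compare the two most recent commits; the repo and file contents below are invented for the example.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo && cd demo
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "first line" > README.md
git add README.md
git commit -q -m "first commit"

echo "second line" >> README.md
git commit -q -am "second commit"

# Compare the previous commit (HEAD~1) with the latest one (HEAD)
changes="$(git diff HEAD~1 HEAD)"
echo "$changes"
```

Lines starting with `+` were added and lines starting with `-` were removed.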
Branch
A lineage of commits
Master Branch
The "main" history of commits, usually thought of as "stable"
Multiple Branches
You can work on multiple branches at the same time.
Checkout
Switch to a different branch
Note: make sure you have committed all changes before checking out a branch
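One possible way to follow this note from the terminal is to check `git status --porcelain` before switching; the branch and file names here are placeholders.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo && cd demo
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "# demo" > README.md
git add README.md
git commit -q -m "first commit"
git branch -M master                       # make sure the branch is called master

git checkout -q -b NewBranch
echo "draft text" >> README.md             # an edit that is not committed yet

# `git status --porcelain` prints nothing when there is nothing to commit
if [ -n "$(git status --porcelain)" ]; then
  git commit -q -am "work in progress on NewBranch"
fi

git checkout -q master
current_branch="$(git rev-parse --abbrev-ref HEAD)"
master_readme="$(cat README.md)"
echo "on $current_branch, README.md contains: $master_readme"
```

Because the edit was committed on NewBranch first, master still shows the original README.md after the switch.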
Merge
Join two or more branches (histories)
Merge Conflict
When merged branches have different changes on the same line of code, a conflict is created
Merge Conflict
You must determine how to resolve merge conflicts and then commit the changes
Local Repo
A git repository stored on a local machine
Remote Repo
A git repository stored remotely
(e.g. on GitHub)
Pull
Updating a local repository with changes in a remote repo
Push
Updating a remote repository with changes in a local repo
Clone
Copy a repo into a new directory
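A small sketch of cloning, assuming a local repo stands in for one hosted on GitHub; `original` and `copy` are hypothetical directory names.

```shell
set -e
workdir="$(mktemp -d)"                     # throwaway directory
cd "$workdir"

# A repo to copy (in practice this would usually be a GitHub URL)
git init -q original
cd original
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "# original" > README.md
git add README.md
git commit -q -m "first commit"

# Copy the repo, with its full history, into a new directory
cd "$workdir"
git clone -q original copy

cloned_log="$(git -C copy log --format=%s)"
echo "history in the clone: $cloned_log"
git -C copy remote -v                      # the clone remembers where it came from
```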
Fork
Copy a repository (without the ability to commit changes to the original)
Pull request
Request to update a remote repo with changes from a fork
Set up
Many ways to interact with Git
There are lots of ways to interact with Git:
- Terminal
- GitHub Desktop
- GitKraken
- RStudio
Today we will use GitHub Desktop, RStudio (& terminal)
But feel free to work with others. You can switch between them easily.
Create a GitHub User account
Install GitHub Desktop
- Install GitHub Desktop: https://desktop.github.com/
- Note: macOS and Linux users usually already have git.
- GitHub Desktop makes setup easier.
Open GitHub Desktop and sign in with your GitHub credentials
Make sure you have RStudio installed
Create a new Repo & Commit
Terminal
# change to the directory you want to store the repo in
cd [WHERE YOU WANT THE REPO]
# make a new directory
mkdir NewRepo
# initialize git
cd NewRepo
git init
RStudio
File > New Project
Your new repo is just a folder on your computer
Add files to it and make changes to them.
It's always good to have a README.md file explaining the who, what, why, and how of the repo
# NewRepo
[YOUR NAME]
## Motivation
This repository is a test.
It should be written in Markdown with a text editor (like RStudio)
Add and Commit Changes
Terminal
git add .
git commit -am "new README"
RStudio
Stage and click commit
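The terminal steps so far can be sketched end to end as below. A temporary directory stands in for [WHERE YOU WANT THE REPO], and the user identity is a placeholder.

```shell
set -e
cd "$(mktemp -d)"                          # stand-in for [WHERE YOU WANT THE REPO]
mkdir NewRepo
cd NewRepo
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# README.md with the structure shown on the slide
printf '# NewRepo\n\nYOUR NAME\n\n## Motivation\n\nThis repository is a test.\n' > README.md

git add .
git commit -q -m "new README"

tracked="$(git ls-files)"                  # files git is now recording
last_message="$(git log -1 --format=%s)"   # message of the latest commit
echo "tracked: $tracked, last commit: $last_message"
```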
Create a Remote on GitHub & Push
On GitHub, create a new remote repo
Connect Remote and Local Repos
Terminal
git remote add origin https://github.com/USERNAME/NewRepo.git
git push -u origin master
RStudio
git remote add origin https://github.com/USERNAME/NewRepo.git
git push -u origin master
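Enter your GitHub Username and Password
(GitHub now expects a personal access token in place of the account password when pushing over HTTPS)
You completed your first Push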
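To try the connect-and-push steps without a GitHub account, a local bare repo can stand in for the remote. This is an assumption made for the sketch; with GitHub, the path would be the https://github.com/USERNAME/NewRepo.git URL.

```shell
set -e
workdir="$(mktemp -d)"                     # throwaway directory
cd "$workdir"
git init -q --bare remote.git              # stand-in for the empty repo on GitHub

git init -q NewRepo
cd NewRepo
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "# NewRepo" > README.md
git add .
git commit -q -m "new README"
git branch -M master                       # make sure the branch is called master

# Connect the local repo to the remote and push
git remote add origin "$workdir/remote.git"
git push -q -u origin master

git remote -v
local_head="$(git rev-parse HEAD)"
remote_head="$(git ls-remote origin refs/heads/master | cut -f1)"
upstream="$(git rev-parse --abbrev-ref 'master@{upstream}')"
echo "master tracks $upstream"
```

The `-u` flag records origin/master as the upstream of the local branch.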
Commit and Push Again
Terminal
After saving changes to a file...
git add .
git commit -am "second commit"
git push origin master
RStudio
Stage and click commit
Commit
Push
Create and Merge a Branch
Find out what branch you are on
Terminal
git branch
RStudio
GitHub
Create and Checkout New Branch
Terminal
git checkout -b NewBranch
RStudio
[Daily Build Version: https://dailies.rstudio.com/]
Make changes in the branch and commit as before
You can also sync the branch with GitHub by pushing as before. Just change the name from master to the branch name, e.g. git push origin NewBranch.
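A sketch of pushing a branch, again assuming a local bare repo as a stand-in for GitHub; `feature.R` is a made-up file.

```shell
set -e
workdir="$(mktemp -d)"                     # throwaway directory
cd "$workdir"
git init -q --bare remote.git              # stand-in for the GitHub remote

git init -q NewRepo
cd NewRepo
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "# NewRepo" > README.md
git add .
git commit -q -m "new README"
git branch -M master
git remote add origin "$workdir/remote.git"
git push -q -u origin master

# Work on a branch, then push it: same command, branch name instead of master
git checkout -q -b NewBranch
echo "f <- function(x) x + 1" > feature.R
git add feature.R
git commit -q -m "add feature"
git push -q -u origin NewBranch

remote_branches="$(git ls-remote --heads origin | cut -f2)"
echo "$remote_branches"
```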
Merge branch into master
Make sure you have committed all changes
Terminal
# switch back to master
git checkout master
# merge NewBranch
git merge NewBranch
RStudio
# switch back to master
git checkout master
# merge NewBranch
git merge NewBranch
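The whole branch-and-merge cycle, sketched in a throwaway repo; `greet.R` is a hypothetical file added on the branch.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo && cd demo
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "# demo" > README.md
git add README.md
git commit -q -m "first commit"
git branch -M master                       # make sure the branch is called master

# Do some work on a new branch
git checkout -q -b NewBranch
echo 'greet <- function() "hello"' > greet.R
git add greet.R
git commit -q -m "add greet function"

# Switch back to master and merge NewBranch
git checkout -q master
git merge -q NewBranch

merged="$(git branch --merged master)"     # branches fully contained in master
echo "$merged"
```

After the merge, greet.R is present on master and NewBranch appears in the list of merged branches.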
Collaborate
Add the person next to you as a Collaborator on your GitHub Repo
Add their Username
Both of you make changes to the repo, commit them, and push the changes.
The collaborator may want to make changes on GitHub or clone the repo locally.
Before you can push, you may need to pull your collaborator's changes.
Terminal
git pull origin master
RStudio
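Why a pull may be needed before a push can be sketched with two local clones and a bare repo standing in for GitHub. All directory and file names are invented, and `--no-rebase --no-edit` is added only so the sketch runs unattended.

```shell
set -e
workdir="$(mktemp -d)"                     # throwaway directory
cd "$workdir"
git init -q --bare remote.git              # stand-in for the GitHub remote

# You: create the repo and push the first commit
git init -q you
cd you
git config user.name "You"
git config user.email "you@example.com"
echo "# NewRepo" > README.md
git add . && git commit -q -m "new README"
git branch -M master
git remote add origin "$workdir/remote.git"
git push -q -u origin master

# Collaborator: clone, add a file, push
cd "$workdir"
git clone -q -b master remote.git collaborator
cd collaborator
git config user.name "Collaborator"
git config user.email "collaborator@example.com"
echo "x <- 1" > script.R
git add . && git commit -q -m "add script"
git push -q origin master

# You: commit locally, then try to push
cd "$workdir/you"
echo "Some notes" > NOTES.md
git add . && git commit -q -m "add notes"
if git push -q origin master 2>/dev/null; then
  first_push="accepted"
else
  first_push="rejected"                    # the remote has commits you lack
fi

# Pull the collaborator's changes, then push again
git pull -q --no-rebase --no-edit origin master
git push -q origin master

local_head="$(git rev-parse HEAD)"
remote_head="$(git ls-remote origin refs/heads/master | cut -f1)"
echo "first push was $first_push; local and remote now match"
```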
Resolve Merge Conflicts
Create a conflict:
Both of you make changes to the same line of the same file and try to sync with the remote.
Conflict!
Merge conflict in README.md, so . . .
Conflict!
Open README.md
Delete what you don't want to keep.
Save, commit and push.
Conflict!
The local version sits between <<<<<<< HEAD and =======
The remote version sits between ======= and >>>>>>> followed by a hash
This hash uniquely identifies a commit
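The conflict and its resolution can be reproduced roughly as follows. A local bare repo stands in for GitHub, and the README sentences are invented for the example.

```shell
set -e
workdir="$(mktemp -d)"                     # throwaway directory
cd "$workdir"
git init -q --bare remote.git              # stand-in for the GitHub remote

git init -q you
cd you
git config user.name "You"
git config user.email "you@example.com"
echo "This repository is a test." > README.md
git add . && git commit -q -m "new README"
git branch -M master
git remote add origin "$workdir/remote.git"
git push -q -u origin master

# Collaborator changes line 1 and pushes first
cd "$workdir"
git clone -q -b master remote.git collaborator
cd collaborator
git config user.name "Collaborator"
git config user.email "collaborator@example.com"
echo "This repository is a demo." > README.md
git commit -q -am "collaborator edit"
git push -q origin master

# You change the same line, commit, and pull: conflict
cd "$workdir/you"
echo "This repository is an experiment." > README.md
git commit -q -am "my edit"
git pull -q --no-rebase origin master || true   # pull reports the conflict
conflicted="$(cat README.md)"
echo "$conflicted"                         # shows both versions between markers

# Resolve: keep the text you want, delete the markers, then commit and push
echo "This repository is a demo and an experiment." > README.md
git add README.md
git commit -q -m "resolve merge conflict in README.md"
git push -q origin master

local_head="$(git rev-parse HEAD)"
remote_head="$(git ls-remote origin refs/heads/master | cut -f1)"
```

Here the resolution keeps a blend of both edits; keeping only one side works the same way.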
Take a break
R package development best practices at 10:30