Introduction to Git and GitHub

For R software development

Christopher Gandrud

Session 1

9:30-10:30

Git/GitHub for R Software Development

Session 2

10:30-12:00

Developing Statistical Software using IQSS Best Practices

Best Practices for Building Social Science R Packages (Room S050)

Learning Goals

  • Understand importance of version control and open software development
  • Create a new git repository for an R software package, commit changes, and push updates to GitHub 
  • Use GitHub to collaborate, including addressing merge conflicts

What is Git?

A version control system

What is Version Control?

a system that records changes to a file or set of files over time so that you can recall specific versions later

Version Control

https://git-scm.com/book/en/v2/Getting-Started-About-Version-Control

Why is version control important for software development?

Oh No!

[Diagram: commits A, B, C]

Review and revert back to previous versions

[Diagram: commits A, B, C, then back to A]

No Problem

Oh No!

Try out features/bug fixes without disturbing the main codebase...

...until you're ready

Reproducible software development

Clear attribution of work

What is GitHub?

a web service for hosting your Git repositories and managing software projects (e.g. bug tracking, team organisation)

GitHub

Public GitHub repositories allow you to develop your software in the open

  • Transparency
  • Benefit from community collaboration
  • Distribute code
# in R
devtools::install_github("username/reponame")

(some) Terminology!

Repository

A directory that has been "initialized" so that git can record changes made to files within it.

Often shortened to "repo".

Repository

An initialized repository has a hidden .git directory in it.
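As a minimal sketch of what "initialized" means, the commands below create a throwaway directory, run git init in it, and list the hidden .git directory. The temporary location and the variable names are assumptions for illustration only.

```shell
# Create a throwaway directory (hypothetical location) and initialize it
repo="$(mktemp -d)/NewRepo"
mkdir -p "$repo"
cd "$repo"
git init -q

# The hidden .git directory is where git stores the repository's history
ls -a

# Ask git whether the current directory is inside a repository
inside="$(git rev-parse --is-inside-work-tree)"
echo "$inside"
```

Deleting the .git directory turns the repo back into an ordinary folder, so treat it with care.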

add (stage)

Adding files to the git index allows you to commit them. . . 

. . . What is commit?

Commit

Record changes to a repository
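The add and commit terms can be sketched together in the terminal. The example below assumes a fresh throwaway repo and a placeholder name and email (both hypothetical), which git needs before it will record a commit.

```shell
# Throwaway repository for the example
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Placeholder identity (an assumption; use your own details in practice)
git config user.name "Example User"
git config user.email "user@example.com"

echo "# NewRepo" > README.md

# Before staging, git lists the file as untracked (marked "??")
git status --short

# Stage the file (add it to the index), then record the change
git add README.md
git commit -q -m "Add README"

# Number of commits now recorded in the history
n_commits="$(git rev-list --count HEAD)"
echo "$n_commits"
```

Only what has been staged goes into the commit, which is why add comes first.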

[Diagram: commits A, B, C]

diff

Examine changes between commits
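A short sketch of diff, assuming a throwaway repo with two commits and a hypothetical file called notes.txt. HEAD refers to the latest commit and HEAD~1 to the one before it.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Placeholder identity (assumption)
git config user.name "Example User"
git config user.email "user@example.com"

# First commit
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "A"

# Second commit: append a line to the tracked file
echo "second line" >> notes.txt
git commit -q -am "B"

# Compare the previous commit (HEAD~1) with the latest one (HEAD)
changes="$(git diff HEAD~1 HEAD)"
echo "$changes"
```

In the output, added lines start with + and removed lines start with -.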

Branch

A lineage of commits

Master Branch

The "main" history of commits, usually thought of as "stable"

Multiple Branches

You can work on multiple branches at the same time.

Checkout

Switch to a different branch

Note: make sure you have committed all changes before checking out a branch
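The branch and checkout terms can be sketched as follows, assuming a throwaway repo, a placeholder identity, and a hypothetical file feature.txt. The first branch is explicitly named master so the example matches these slides regardless of local git defaults.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Name the first branch master to match the slides (local defaults may differ)
git symbolic-ref HEAD refs/heads/master
# Placeholder identity (assumption)
git config user.name "Example User"
git config user.email "user@example.com"

echo "# NewRepo" > README.md
git add README.md
git commit -q -m "Add README"

# Create a new branch and switch to it in one step
git checkout -q -b NewBranch
echo "experimental" > feature.txt
git add feature.txt
git commit -q -m "Try a feature"

# Switch back: files committed only on NewBranch leave the working directory
git checkout -q master
current="$(git rev-parse --abbrev-ref HEAD)"
echo "$current"
ls
```

The experimental file is not lost. It reappears when you check out NewBranch again.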

Merge

Join two or more branches (histories)

Merge Conflict

When merged branches have different changes on the same line of code, a conflict is created

Merge Conflict

You must determine how to resolve merge conflicts and then commit the changes
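To see how a conflict arises, the sketch below changes the same line of README.md on two branches and then merges them. The repo, identity, and file contents are assumptions made up for the example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # match the branch name in the slides
# Placeholder identity (assumption)
git config user.name "Example User"
git config user.email "user@example.com"

echo "original line" > README.md
git add README.md
git commit -q -m "A"

# Change the line on a new branch
git checkout -q -b NewBranch
echo "branch version" > README.md
git commit -q -am "B"

# Change the same line differently on master
git checkout -q master
echo "master version" > README.md
git commit -q -am "C"

# The merge stops and reports a conflict instead of completing
git merge NewBranch || echo "merge stopped with a conflict"

# List the files that still need to be resolved
conflicted="$(git diff --name-only --diff-filter=U)"
echo "$conflicted"
```

Git leaves both versions in the file, separated by conflict markers, until you decide what to keep.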

Local Repo

A git repository stored on a local machine

Remote Repo

A git repository stored remotely

(e.g. on GitHub)

Pull

Updating a local repository with changes in a remote repo

Push

Updating a remote repository with changes in a local repo
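Push and pull can be tried without a network connection. In the sketch below a local bare repository stands in for GitHub. That stand-in, the paths, and the placeholder identity are assumptions, and with a real remote the URL would be the GitHub address instead.

```shell
work="$(mktemp -d)"

# A bare repository plays the role of the remote (stand-in for GitHub)
git init -q --bare "$work/remote.git"

# The local repository
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master   # match the branch name in the slides
git config user.name "Example User"       # placeholder identity (assumption)
git config user.email "user@example.com"

echo "# NewRepo" > README.md
git add README.md
git commit -q -m "Add README"

# Connect the two and push local commits to the remote
git remote add origin "$work/remote.git"
git push -q -u origin master

# Pull brings remote changes into the local repo (nothing new here)
git pull -q origin master

local_head="$(git rev-parse HEAD)"
remote_head="$(git --git-dir="$work/remote.git" rev-parse refs/heads/master)"
echo "$local_head"
echo "$remote_head"
```

After the push, both repositories point at the same commit.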

Clone

Copy a repo into a new directory
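A minimal clone sketch, assuming a small local repository as the source (with GitHub the source would be a URL). All paths and names here are hypothetical.

```shell
work="$(mktemp -d)"

# A small source repository to copy from
git init -q "$work/source"
cd "$work/source"
git config user.name "Example User"       # placeholder identity (assumption)
git config user.email "user@example.com"
echo "# NewRepo" > README.md
git add README.md
git commit -q -m "Add README"

# Clone copies the files and the full history into a new directory
git clone -q "$work/source" "$work/copy"
cd "$work/copy"

# The clone remembers where it came from as the remote named origin
origin_url="$(git config --get remote.origin.url)"
echo "$origin_url"
n_commits="$(git rev-list --count HEAD)"
echo "$n_commits"
```

Because origin is set automatically, a clone is ready to pull from its source right away.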

Fork

Copy a repository (without the ability to commit changes to the original)

Pull request

Request to update a remote repo with changes from a fork

There are lots more.

See:

https://git-scm.com/

Set up

Many ways to interact with Git

There are lots of ways to interact with Git

  • Terminal
  • GitHub Desktop
  • GitKraken
  • RStudio

Today we will use GitHub Desktop, RStudio (& terminal)

But feel free to work with others. You can switch between them easily.

Create a GitHub User account

Install GitHub Desktop

Open GitHub Desktop and sign in with your GitHub credentials

Make sure you have RStudio installed

Create a new Repo & Commit

Terminal

# change to the directory you want to store the repo in
cd [WHERE YOU WANT THE REPO]

# make new directory
mkdir NewRepo

# git initialize
cd NewRepo
git init

RStudio

File > New Project

Your new repo is just a folder on your computer

Add files to it and make changes to them.

It's always good to have a README.md file explaining the who, what, why, and how of the repo

# NewRepo

[YOUR NAME]

## Motivation

This repository is a test.

It should be written in Markdown with a text editor (like RStudio)

Add and commit changes

Terminal

git add .
git commit -am "new README"

RStudio

Stage and click commit

Create a Remote on GitHub & Push

On GitHub, create a new remote repo

Connect remote and local repos

Terminal

git remote add origin https://github.com/USERNAME/NewRepo.git
git push -u origin master

RStudio

git remote add origin https://github.com/USERNAME/NewRepo.git
git push -u origin master

Add GitHub Username and Password

You completed your first push

Commit and Push Again

Terminal

After saving changes to a file...

git add .
git commit -am "second commit"
git push origin master

RStudio

Stage and click commit

Commit

Push

Create and Merge a Branch

Find out what branch you are on

Terminal

RStudio

git branch

GitHub

Create and Checkout New Branch

Terminal

RStudio

git checkout -b NewBranch

[Daily Build Version: https://dailies.rstudio.com/]

Make changes in the branch and commit as before 

You can also sync the branch with GitHub by pushing as before. Just change the name from master to the branch name
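Swapping the branch name into the push command might look like the sketch below. A local bare repository stands in for GitHub, and the paths, identity, and file names are assumptions for illustration.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"      # stand-in for the GitHub remote

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master    # match the branch name in the slides
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.com"

echo "# NewRepo" > README.md
git add README.md
git commit -q -m "Add README"
git remote add origin "$work/remote.git"
git push -q -u origin master

# Work on a branch and commit as before
git checkout -q -b NewBranch
echo "work in progress" > feature.txt
git add feature.txt
git commit -q -m "Start feature"

# Same push command as before, with the branch name in place of master
git push -q -u origin NewBranch

# List the branches that now exist on the remote
remote_branches="$(git ls-remote --heads origin)"
echo "$remote_branches"
```

The -u flag records the remote branch as the upstream, so later pushes from this branch can be a plain git push.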

Merge branch into master

Make sure you have committed all changes

Terminal

# switch back to master
git checkout master

# merge NewBranch
git merge NewBranch

RStudio

# switch back to master
git checkout master

# merge NewBranch
git merge NewBranch

Collaborate

Add the person next to you as a collaborator on your GitHub repo

Add their username

Both of you make changes to the repo, commit them, and push the changes.

The collaborator may want to make changes on GitHub or clone the repo locally.

Before you can push, you may need to pull your collaborator's changes.

Terminal

RStudio

git pull origin master
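The pull-before-push situation can be rehearsed on one machine. In this sketch two clones of a local bare repository play you and your partner. The helper make_clone, the names, and the files are all hypothetical. The --no-rebase and --no-edit flags are spelled out only so the example runs without prompts; interactively, git pull origin master is the usual form.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"      # stand-in for the GitHub remote

# Hypothetical helper: clone the remote and set a placeholder identity
make_clone() {
  git clone -q "$work/remote.git" "$work/$1" 2>/dev/null
  cd "$work/$1"
  git symbolic-ref HEAD refs/heads/master  # match the branch name in the slides
  git config user.name "$1"
  git config user.email "$1@example.com"
}

# You create the first commit and push it
make_clone you
echo "# NewRepo" > README.md
git add README.md
git commit -q -m "Add README"
git push -q -u origin master

# Your partner gets the repo, adds a file, and pushes
make_clone partner
git pull -q origin master
echo "partner notes" > partner.txt
git add partner.txt
git commit -q -m "Partner adds notes"
git push -q origin master

# Meanwhile you commit something else
cd "$work/you"
echo "your notes" > you.txt
git add you.txt
git commit -q -m "You add notes"

# The remote has commits you lack, so this push is rejected
git push -q origin master 2>/dev/null || echo "push rejected: pull first"

# Pull (merging the partner's work), then push again
git pull -q --no-rebase --no-edit origin master
git push -q origin master

local_head="$(git rev-parse HEAD)"
remote_head="$(git --git-dir="$work/remote.git" rev-parse refs/heads/master)"
```

Here the two of you edited different files, so the pull merges cleanly.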

Resolve merge conflicts

Create a conflict:

both of you make changes to the same line of the same file and try to sync with the remote.

Conflict!

Merge conflict in README.md, so . . .

Conflict!

Open README.md

Delete what you don't want to keep.

Save, commit and push.
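The open, delete, save, and commit steps can be sketched in the terminal. To stay self-contained the example creates the conflict between two local branches rather than with a remote, and the resolved text ("agreed version") is a made-up choice. With a remote you would finish with a push.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # match the branch name in the slides
git config user.name "Example User"       # placeholder identity (assumption)
git config user.email "user@example.com"

# Build a conflict: the same line changed on two branches
echo "original line" > README.md
git add README.md
git commit -q -m "A"
git checkout -q -b NewBranch
echo "branch version" > README.md
git commit -q -am "B"
git checkout -q master
echo "master version" > README.md
git commit -q -am "C"
git merge NewBranch >/dev/null || true    # stops with a conflict

# The file now holds both versions between conflict markers
cat README.md

# Resolve: replace the marked-up content with the text you want to keep
echo "agreed version" > README.md

# Stage the resolved file and commit to finish the merge
git add README.md
git commit -q -m "Resolve merge conflict in README.md"

status_after="$(git status --porcelain)"
final_text="$(cat README.md)"
```

In practice you would edit the file by hand, keeping whichever lines you want and deleting the marker lines.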

Conflict!

The local version

The remote version

This hash uniquely identifies a commit
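A small sketch for looking up commit hashes yourself, assuming a throwaway repo with a single commit and a placeholder identity.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity (assumption)
git config user.email "user@example.com"

echo "# NewRepo" > README.md
git add README.md
git commit -q -m "Add README"

# Full hash of the latest commit
hash="$(git rev-parse HEAD)"
echo "$hash"

# Abbreviated hashes alongside commit messages
git log --oneline

# The hash can be used to look the commit up again
kind="$(git cat-file -t "$hash")"
echo "$kind"
```

Most git commands that take a commit accept either the full hash or an unambiguous prefix of it.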

Take a break

R package development best practices at 10:30
