Git

Keeping it under control.

  • Distributed*, rather than central, needs careful control.
  • Choose workflow that lends itself to catching problems early.

A rebase workflow rather than merging.

  • Stick to simple, well-understood commands (99% of the time).
  • Branch often, but branch only from master.
  • Rebase.
  • *everyone has the entire repo.

    A LITTLE ABOUT GIT(HUB)


    Everyone has a copy of everything.

    • GitHub stores the entire repository for each user.
    • No need for individual branches for each user upstream: they belong on the individual's GitHub repository.
    • Branches are cheap, create them often, push them often.
    • Upstream has the "one source of truth" master branch*.


    *this is where your work should begin.

    A "simple" example


     There's a lot going on, but it's not that complicated (honest).

    SUMMARY


    1. Git networks.
    2. Rules of the game.
    3. The workflow.
    4. Rebase vs Merging.
    5. Sandbox.
    6. Makes it easier to catch mistakes.

    MERGING MAYHEM


    HMMM REBASING




    Everyone has a copy of everything, madness?

    EVERYONE HAS A "COPY"


    Unfortunately that's not always the same copy!
    Which master is more up to date?

    EVERYONE SHOULD HAVE THE SAME COPY


    Which master is more up to date?




    ONE TRUE MASTER






    DISCUSS: HOW ARE YOU CURRENTLY USING GITHUB?




    RULES OF THE GAME

    COMMANDMENTS

    1. Keep master an exact copy of upstream/master, possibly an older one, so it can always fast-forward (FF).
    2. Only branch off master.
    3. Branch often, branch for each feature.
    4. Push only to origin (your GitHub repository).
    5. Do rebase.
     

    HOUSEKEEPING (UPDATING MASTER)

    git fetch upstream
    git checkout master
    git ff upstream/master  # ffum
    git push origin master  # pom
    git checkout back_to_branch


    Which branch can FF to which branch? 
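One way to answer the question above: a branch can fast-forward to another exactly when it is an ancestor of it. The sketch below checks this in a throwaway repository; the helper name `can_ff` and the commit messages are made up for illustration.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q repo
cd repo
git commit -q --allow-empty -m "base"
git checkout -q -b featureA
git commit -q --allow-empty -m "work on featureA"

# can_ff FROM TO (hypothetical helper): succeeds when FROM can
# fast-forward to TO, i.e. when FROM is an ancestor of TO.
can_ff() { git merge-base --is-ancestor "$1" "$2"; }

if can_ff master featureA; then master_to_feature=yes; else master_to_feature=no; fi
if can_ff featureA master; then feature_to_master=yes; else feature_to_master=no; fi
echo "master -> featureA: $master_to_feature"
echo "featureA -> master: $feature_to_master"
```

Here master can fast-forward to featureA, but not the other way round, because featureA holds a commit that master lacks.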

    ALWAYS BRANCH OFF MASTER

    git checkout master
    git checkout -b featureA 


    It is clear in the Network graph where branches are and how many commits they contain, and it is easy to look at the diff of a commit or branch.

    MAKE BRANCHES INDEPENDENT

    Keeping concerns separate.
    Branches are CHEAP.
    git checkout master
    git checkout -b featureA
    

    Branches should have a single* purpose.
    Otherwise you're kind-of branching off featureA.
    *or at least very related, not just related to when you happened to be working

    AVOIDMENTS


    1. Don't pull
    2. Don't merge
    3. Don't commit to master

    A NOTABLY ABSENT FRIEND


    pull ==  fetch + merge
    pull --rebase == fetch + rebase

    These are two completely distinct commands.
    Do you really need to do them in one go?

    Recipe for disaster.
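A small sandbox sketch of why this matters, assuming a master that has (against the rules) diverged from upstream. The repository names and commit messages are invented for the demonstration, and `--no-rebase --no-edit` is passed only so the merge runs unattended.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q upstream
git -C upstream commit -q --allow-empty -m "base"
git clone -q -o upstream upstream local
git -C upstream commit -q --allow-empty -m "reviewed work lands upstream"
cd local
git commit -q --allow-empty -m "unreviewed commit made directly on master"

# pull (without --rebase) is a fetch followed by a merge, so a diverged
# master quietly gains a merge commit.
git pull -q --no-rebase --no-edit upstream master
merges=$(git rev-list --merges --count HEAD)
echo "merge commits on master after pull: $merges"
```

The merge commit appears without anyone having typed `merge`, which is the habit the slide warns against.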

    NOT INVITED TO THE PARTY


    merge


    DON'T COMMIT TO MASTER


    COMMIT TO YOUR OWN FEATURE BRANCHES.

    Keep master with only reviewed and tested commits.

    STICK TO FEW COMMANDS


    • fetch
    • ff (fast forward)
    • checkout
    • checkout -b
    • commit
    • push
    • rebase

    FETCH

    Sync up your local copy of a remote.

    • Once set up, update the local copy using:
     git fetch upstream
    

    • I then recommend updating your master branch:
    git checkout master
    git ff upstream/master  # ff = merge --ff-only
    

    Note: I recommend setting up an alias for this command specifically (see the appendix), so you don't fall into the habit of merging.
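The sketch below shows what `--ff-only` buys you, in a throwaway sandbox with invented commit messages: a clean master fast-forwards, while a master that has picked up its own commit is refused rather than merged.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q upstream
git -C upstream commit -q --allow-empty -m "base"
git clone -q -o upstream upstream local
git -C upstream commit -q --allow-empty -m "new work upstream"
cd local
git fetch -q upstream

# Clean master: the fast-forward succeeds and master matches upstream/master.
git merge -q --ff-only upstream/master
if [ "$(git rev-parse master)" = "$(git rev-parse upstream/master)" ]; then synced=yes; else synced=no; fi

# Now break the rule: commit on master while upstream moves on.
git commit -q --allow-empty -m "oops, committed to master"
git -C ../upstream commit -q --allow-empty -m "more work upstream"
git fetch -q upstream
before=$(git rev-parse master)
if git merge --ff-only upstream/master 2>/dev/null; then refused=no; else refused=yes; fi
after=$(git rev-parse master)
echo "synced=$synced refused=$refused"
```

The refusal leaves master untouched, so the mistake is visible immediately instead of being buried in a merge commit.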

    CHECKOUT

    So you've synced with the latest from upstream and updated your master branch. Now it's time to create something.

    Create a new branch (FROM CURRENT BRANCH):

    git checkout -b featureA

    Move to an existing branch:

    git checkout featureA

    ADD (STAGE)

    Which files should you include in your commit?
    Look at what's affected with:
    git status
    git diff
    Add all files with:
    git add -A
    or specific files with:
    git add myfile.py myotherfile.py
    Unstage (un-add) a file with:
    git reset myotherfile.py
    Note: there's also git rm, which deletes a file and stages the deletion for the commit.
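A minimal sandbox sketch of the staging commands above, using the slide's example file names in a throwaway repository. It stages everything, un-stages one file, and then lists what would go into the commit.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q repo
cd repo
git commit -q --allow-empty -m "base"
echo "print('a')" > myfile.py
echo "print('b')" > myotherfile.py

git add -A                       # stage everything
git reset -q myotherfile.py      # unstage one file; it stays on disk

staged=$(git diff --cached --name-only)
untracked=$(git ls-files --others --exclude-standard)
git status --short
```

Only `myfile.py` remains staged; `myotherfile.py` is back to being an untracked file.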

    COMMIT (WHEN NOT IN MASTER)


    Once you've added your files just add a message to the commit.
    git commit -m "A short commit message" 
    or write a longer message in your text editor:
    git commit

    Good practice: a short summary line (about 50 characters), followed by a blank line and a longer description.
    Reason: network/history can give you a more complete story.

    Also, otherwise Linus will be upset.
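As a sketch of that message shape without opening an editor: passing `-m` twice gives a summary line, a blank line, and then a body. The commit message itself is a made-up example.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q repo
cd repo

# First -m is the summary line, second -m becomes the longer description.
git commit -q --allow-empty \
    -m "Add retry to upload client" \
    -m "Uploads failed on flaky connections. Retry up to three times
before reporting the error."

subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
echo "subject (${#subject} chars): $subject"
echo "$body"
```

One-line history views show only the summary, which is why keeping it short pays off.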

    PUSH

    Once you've done some work you should update/back up your online (GitHub) repository with your work.
    git push origin featureA
    Now it's on GitHub.

    WORKFLOW PART 1

    Update master
    git fetch upstream
    git checkout master
    git ff upstream/master  # ffum
    git push origin master

    Create new branch from master
    git checkout master
    git checkout -b feature_branch
    # do stuff
    git commit
    git push origin feature_branch
    
    # PULL REQUEST VIA GITHUB

    Continued below  ↓.

    WORKFLOW PART 2


    This means there are merge conflicts.

    Update branch with rebase

    git checkout featureA
    # optionally, create backup branch:
    git branch featureA_backup
    
    git rebase master
    # resolve merge conflicts (git add and git rebase --continue)
    # retest and check functionality is still here!
    
    git push origin featureA --force
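The rebase-and-resolve step can be rehearsed in a sandbox. The sketch below is a hypothetical single-repository stand-in (no remotes, invented file and commit names): master and featureA both edit the same line, the rebase stops on the conflict, and the resolution is added before continuing.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q repo
cd repo
echo "colour = red" > settings.txt
git add settings.txt
git commit -q -m "base"

git checkout -q -b featureA
echo "colour = green" > settings.txt
git commit -q -am "featureA: use green"

git checkout -q master        # stands in for master after ff to upstream/master
echo "colour = blue" > settings.txt
git commit -q -am "upstream: use blue"

git checkout -q featureA
git branch featureA_backup    # optional safety net
if git rebase master >/dev/null 2>&1; then conflicted=no; else conflicted=yes; fi

# Resolve by hand (here: keep the feature's value), then continue.
echo "colour = green" > settings.txt
git add settings.txt
GIT_EDITOR=true git rebase --continue >/dev/null

commits=$(git rev-list --count master..featureA)
merges=$(git rev-list --merges --count master..featureA)
echo "conflicted=$conflicted commits=$commits merges=$merges"
```

After the rebase, featureA is a single ordinary commit on top of master, with the conflict resolution folded into the commit that caused it.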
    

    WORKFLOW PART 3


    Squash N commits into 1
    git checkout featureA
    # optionally, create backup branch
    git branch featureA_backup
    
    git rebase -i HEAD~N  # N is number of commits to squash
    # this opens a file: replace all but the first pick with s
    # then opens the commit message (usually you can leave as is)
    
    git diff featureA_backup  # optionally, compare with the backup
    # this diff should be blank (otherwise you've dropped a commit*)
    
    git push origin featureA --force  # this updates the PULL REQUEST
    

    *see below
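To try the squash without an editor session, the todo-list edit can be scripted. This is a sandbox sketch, not how you would normally work: `GIT_SEQUENCE_EDITOR` is set to a `sed` command that keeps the first `pick` and turns the rest into `squash`, and `GIT_EDITOR=true` accepts the combined message unchanged. File and commit names are invented.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q repo
cd repo
git commit -q --allow-empty -m "base"
git checkout -q -b featureA
for step in 1 2 3; do
    echo "step $step" >> feature.txt
    git add feature.txt
    git commit -q -m "featureA: step $step"
done
git branch featureA_backup

# Scripted stand-in for editing the todo list by hand.
GIT_SEQUENCE_EDITOR="sed -i.bak '2,\$ s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -q -i HEAD~3

commits=$(git rev-list --count master..featureA)
if git diff --quiet featureA_backup; then same_content=yes; else same_content=no; fi
echo "commits=$commits same_content=$same_content"
```

The empty diff against the backup confirms that the three commits were combined rather than dropped.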

    WORKFLOW TROUBLESHOOTING

    *Dropped a commit in a squash?
    # you took a backup right?
    git checkout featureA_backup # go to the branch 
    git branch -D featureA  # hard delete the unsuccessful branch
    git checkout -b featureA  # create branch with the same name
    # back where we started, try rebase -i again
    

    PULL-REQUEST

    Once you've completed your feature/bug fix you should request your branch be included into upstream.

    ALL THE GOOD THINGS




    • Show off what you've done
    • Everyone alerted to request, opportunity to review
    • Feedback from colleagues (e.g. code simplification, learn)
    • "Feedback" from CI ("Good to merge")
    • Give feedback to colleagues (it goes both ways)!
    • git checkout pr/1234 
    • Ensure your features aren't being removed!
    • Ensure no-one is doing anything stupid - including you!

    • Cross link actual commits/discussion with issue/bug-tracker 

    Isn't it slow?

    Obviously pushing directly upstream without review is faster in the short-term...

    • Down the line the new code will be read and re-read
    • This cycle is more direct if all are alerted to the changes
    • If you have a merge conflict you'll know why (I remember this pull request, I understand what I'm rebasing over)
    • Fewer bugs down the line

    • In the rebase model, merge commits are the first indication that something is amiss (see also the Easier to Catch Mistakes slide)... before they're in master.

    REBASE

    If there's been an update to upstream, you should "rebase" your commits from the "Old Base" to the new master.
    git rebase master

    RE-TEST

    After you've rebased you should check that your feature's tests still pass (along with the tests from the commits you've just rebased over).

    Another layer to ensure that the merge conflicts were resolved correctly.
    Now you can force push your changes to your branch (see previous slide).

    WHAT?

    Eventually you’ll discover the Easter egg in Git: all meaningful operations can be expressed in terms of the rebase command. Once you figure that out it all makes sense. I thought the joke would be obvious: rebase, freebase, as in what was Linus smoking? -- Linus Torvalds

    PUSH WITH CARE


    • pre-push hook can stop accidental force pushes to master
    • no force pushing to an upstream (shared) branch

    USE THE FORCE (PUSH)

    That said, it's useful to force push your own branches (after a rebase)... provided no-one is working off the top of them.
    git push --force origin featureA 
    In a general workflow this should be ok:
    • People are working off of master, not random branches*
    • Pull requests are kept up to date, and discussion resumes in the same place (no need for a new pull request)
    • No merge commits in pull requests (if you're rebasing)

    *If there's been an accident, and someone has been working off a branch which has since been force-pushed, then git cherry-pick is your friend.
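A sandbox sketch of that recovery, under invented names: Betty branched off featureA, featureA was then rewritten (a `reset --soft` plus a fresh commit stands in for the squash and force push), and her commit is replayed onto the rewritten branch.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q repo
cd repo
git commit -q --allow-empty -m "base"

git checkout -q -b featureA
for step in 1 2; do
    echo "step $step" >> a.txt
    git add a.txt
    git commit -q -m "featureA: step $step"
done

git checkout -q -b bettys_work          # branched off featureA (against the rules)
echo "report" > betty.txt
git add betty.txt
git commit -q -m "betty: add report"
betty_commit=$(git rev-parse HEAD)

# featureA is rewritten: its old commits are replaced by one new commit.
git checkout -q featureA
git reset -q --soft master
git commit -q -m "featureA (squashed)"

# Recovery: start from the rewritten featureA and replay only Betty's commit.
git checkout -q -b bettys_work_fixed featureA
git cherry-pick "$betty_commit" >/dev/null
commits=$(git rev-list --count master..bettys_work_fixed)
echo "commits above master: $commits"
```

The fixed branch holds the squashed featureA commit plus Betty's commit, and none of the discarded originals.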

    BEWARE: FORCE PUSH

    Never force push to upstream master

    • Easy to delete recent commits and screw everything up
    • Recommend pre-push hook to reject force pushing upstream
    • (Enforce programmatically rather than rely on habit.)
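A minimal sketch of such a hook, tried out in a sandbox. It assumes the shared remote is named `upstream` and the protected branch is `master`. Git calls `pre-push` with the remote name as the first argument and one line per ref on standard input; the hook rejects any update to upstream master that is not a fast-forward.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q --bare upstream.git
git -c init.defaultBranch=master init -q local
cd local
git remote add upstream ../upstream.git

mkdir -p .git/hooks
cat > .git/hooks/pre-push <<'EOF'
#!/bin/sh
# stdin: <local ref> <local sha> <remote ref> <remote sha>, one line per ref
remote="$1"
while read -r local_ref local_sha remote_ref remote_sha; do
    [ "$remote" = upstream ] && [ "$remote_ref" = refs/heads/master ] || continue
    # An all-zero remote sha means the ref is new on the remote: allow it.
    [ -z "$(printf %s "$remote_sha" | tr -d 0)" ] && continue
    # Otherwise the remote commit must be an ancestor of what we push.
    if ! git merge-base --is-ancestor "$remote_sha" "$local_sha" 2>/dev/null; then
        echo "pre-push: refusing to rewrite upstream master" >&2
        exit 1
    fi
done
exit 0
EOF
chmod +x .git/hooks/pre-push

git commit -q --allow-empty -m "base"
git push -q upstream master
git commit -q --allow-empty -m "reviewed work"
if git push -q upstream master; then ff_allowed=yes; else ff_allowed=no; fi
protected=$(git rev-parse HEAD)

# Rewrite history locally, then try to force it upstream.
git reset -q --hard HEAD~1
git commit -q --allow-empty -m "rewritten history"
if git push -q --force upstream master 2>/dev/null; then blocked=no; else blocked=yes; fi
upstream_now=$(git -C ../upstream.git rev-parse master)
echo "ff_allowed=$ff_allowed blocked=$blocked"
```

A client-side hook only protects the clone it is installed in, so it guards against your own slips rather than anyone else's.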

    INTERACTIVE REBASE

    Commits shouldn't be a back and forth of how you completed the feature; they should just record the completed feature. You can squash N commits into one using rebase:
    git rebase --interactive HEAD~N
    Mark the commits you want to squash with squash (or an s):

    Finally, tweak and save the concatenated commit message.

    Don't record the entire journey, it's not how you got there, it's where you got that matters.




    REBASING VS MERGING


    MERGING


    • Merge conflict is resolved in the crazy branch.
    • The merge conflict is actual work.

    Looking at the network

    One of the great things about GitHub is how easily you can review the status of the repository.
    • See who is doing what and when.
    • Drill down to individual commits. 

    Check it out on your project's GitHub page.

    REBASING VS MERGING

     

    VS

    Note: each coloured line is a branch.

    MERGING

    • Each dot is a commit OR a merge commit.
    • Merge commits may hide merge conflict resolutions.
    • Responsibility of merge conflict resolution is unclear.
    • Complicated to reason about diffs.
    • Merge commits hide actual work... was there a conflict?

    REASONS NOT TO MERGE

    • Incorrect merge-commits can go unnoticed* (work is being concealed in merge commits).
    • Unclear where, when and by whom work was done (at least from the network graph), difficult to distinguish chunks of work.
    • Unable to "squash" commits if they're littered with merges.




    * Is it really just a merge conflict resolution or has something been lost whilst resolving, when?

    DEVIL'S ADVOCATE:

    REASONS TO PREFER MERGING

    • It feels more familiar.
    • You have a history of everything (merge commits)*.
    • More commits, more satisfying.
    • Sometimes fewer conflicts to resolve (can skip over shared diff rather than each commit)**.


    * though this can be a "hidden history", if a merge conflict has been incorrectly resolved.
    ** in practice, if you are rebasing and have few commits in a branch then this shouldn't be an issue. 

    REBASING


    • Each dot is a commit.
    • Each arrow is a merge commit (without merge conflicts).
    • We can click and drill down to the diff.
    • Network history is "clean".
    • Responsibility of merge conflict resolution (was) transparent.

    EASIER TO REASON ABOUT

    • Everyone is starting at the same place (there wasn't a rogue commit three months ago included in Betty's branch).
    • Obvious where the merge commits are.
    • Easy to "squash" entire branches into a single commit**.



    * incorrect merge-commits (from a git merge) can go unnoticed.
    Is it really just a merge conflict resolution or has something been lost whilst resolving...
    ** git rebase -i HEAD~N  # where N is number of commits to squash

    ANNOYING THINGS ABOUT REBASE

    • If you haven't been updating often, rebasing can be painful*.
    • Maintenance overhead in continuously updating branches.
    • Possible to delete your work**.
    • Can get tripped up if someone rebases something you've started working on already - so don't do that.

    BUT WORTH IT


    *The solution is to rebase regularly... If you're hundreds of commits behind, consider squashing and doing a git cherry-pick; better still, don't let it get that far.
    **Your own work, rather than other people's. If you're doing something tricky, create a backup branch first.




    SANDBOX


    REBASE: EASIER TO CATCH MISTAKES

    Significantly easier to spot mistakes in the process:

    • Merge commits appearing in pull requests.
    • Unusual activity in the network graph.
    • File changes which have nothing to do with the feature appear in the pull request (an incorrect merge resolution).

    If suspicious, then investigate... before it's in master.

    SANITY CHECK PULL REQUESTS

    1. No merge commits.
    2. Files/lines changed should make sense.
    3. Tests pass.



    Check it out and have a play if you like:
    git checkout pr/1234 
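Pull requests become available locally through an extra fetch refspec on the upstream remote. The sandbox sketch below imitates this with a local bare repository: GitHub publishes each pull request at `refs/pull/<number>/head`, and the refspec maps those refs to `upstream/pr/<number>`. The pull request number 1 and the commit messages are invented.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q --bare upstream.git
git -c init.defaultBranch=master init -q local
cd local
git remote add upstream ../upstream.git
git commit -q --allow-empty -m "base"
git push -q upstream master

# Stand-in for a pull request living on the server.
git commit -q --allow-empty -m "work proposed in pull request 1"
git push -q upstream HEAD:refs/pull/1/head
pr_commit=$(git rev-parse HEAD)
git reset -q --hard HEAD~1

# Extra fetch refspec, applied to this repository only.
git config --add remote.upstream.fetch '+refs/pull/*/head:refs/remotes/upstream/pr/*'
git fetch -q upstream
fetched=$(git rev-parse upstream/pr/1)
echo "upstream/pr/1 -> $fetched"
```

Once fetched, the pull request is an ordinary remote-tracking ref that can be diffed, tested, or checked out like any other.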


    PREDICTION: PAIN

    It's version control, someone *is* going to get hurt occasionally.
    Choosing a workflow is about minimising the pain.


    The best way to reduce pain is to heal before it's in production:
    The True Meaning Of Pain.

    SUMMARY

    1. Keep master ff-ed to upstream/master.
    2. Create a new branch (for each feature) from master.
    3. Make pull requests to upstream.
    4. Rebase branches to update.


    • Do not git pull.
    • Do not git merge.
    • Do not pass Go.




    THE END




    QUESTIONS?

    APPENDIX

    Some additional git set up:

    • Set up git aliases (e.g. git ff)
    • Show git branch in the terminal
    • Adding and changing remotes
    • Don't allow pushes to upstream remote.
    • Have the upstream remote disallow force pushes

    .GITCONFIG

    [color]
        ui = true
    [alias]
      r = rebase
      f = fetch
      ff = merge --ff-only
      co = checkout
      c = commit
      a = add
      d = diff
      p = push
      s = status
      h = log --pretty=format:\"%h %ad | %s%d [%an]\" --graph --date=short
      po = push origin
      rum = rebase upstream/master
      fu = fetch upstream
      ffum = merge --ff-only upstream/master
      dum = !git diff $(git merge-base upstream/master HEAD)
    [pull]
      rebase = true  # git ignores aliases that shadow built-in commands, so set this instead... but don't use pull
    [remote "upstream"]
      fetch = +refs/heads/*:refs/remotes/upstream/*
      fetch = +refs/pull/*/head:refs/remotes/upstream/pr/*
    The most frequently used commands can be done with less typing... and more safety.
    For example, git ff is safer than typing out git merge --ff-only (the --ff-only is critical).

    SHOW GIT BRANCH ALWAYS


    I use an oh-my-zsh theme (terminal party):


    • Use a theme
    • or set it up by tweaking your terminal's PS1, e.g. on Ubuntu.

    That way you always know where you are before committing (are you in master, are you in the correct feature branch?)
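If you go the PS1 route, a minimal bash sketch might look like this. The function name `git_branch_for_prompt` is made up, and the prompt layout is only one possibility.

```shell
# Prints " (branch)" inside a git repository and nothing elsewhere
# (including on a detached HEAD).
git_branch_for_prompt() {
    branch=$(git symbolic-ref --short HEAD 2>/dev/null) || return 0
    printf ' (%s)' "$branch"
}

# Single quotes delay the call, so it runs each time the prompt is drawn.
PS1='\w$(git_branch_for_prompt) \$ '
```

Placed in `~/.bashrc`, this shows the current branch next to the working directory.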

    LOOKING AT REMOTES

    So upstream is being synced to your repository, but where?

    Take a look with remote show:
    git remote show upstream
    • fetch and push urls
    • all branches on the remote
    • "tracked" branches

    Where "tracked" means that the local branch and the remote's branch are "the same"; usually these will have the same name.

    ADDING A REMOTE

    If you want to track Betty's repo, then just add it as a remote.
    git remote add betty https://github.com/bettys_username/project.git
    

    Now you can have a local backup of her repository.
    git fetch betty
    Change the url of the remote.
    git remote set-url origin https://github.com/your_username/project.git

    Note: this is a one-time set up, once you have your remotes set up, you shouldn't need to touch these again (one less thing).

    DISALLOW PUSHING DIRECTLY UPSTREAM

    An easy way to enforce this is to set the push URL of the upstream remote to something invalid:
    git remote set-url --push upstream no_push

    Now you won't be able to push directly to upstream.
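The effect can be checked in a sandbox. This sketch uses a local bare repository as a stand-in for upstream: after the push URL is replaced, fetching still works but pushing fails.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q --bare upstream.git
git -c init.defaultBranch=master init -q local
cd local
git remote add upstream ../upstream.git
git commit -q --allow-empty -m "base"

# Only the push URL changes; the fetch URL is left alone.
git remote set-url --push upstream no_push

if git push -q upstream master 2>/dev/null; then pushed=yes; else pushed=no; fi
if git fetch -q upstream; then fetched=yes; else fetched=no; fi
fetch_url=$(git remote get-url upstream)
push_url=$(git remote get-url --push upstream)
echo "fetch: $fetch_url"
echo "push:  $push_url"
```

The push fails because `no_push` is not a repository, which is exactly the point.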




    If I'm worried about this, I set up a second remote called upstream_push, which I use when I specifically want to push upstream; this makes it less likely that I do so accidentally.

    FURTHER READING

    Git

    By andy hayden


    A little goes a long way, and the rebase workflow
