git

 

In case of fire

git commit -am "In case of fire" && git push

  • Command line
  • GUI for git
  • What's hiding under "./.git" ?
  • Pimp my git!
  • git ecosystem
  • Best practices
  • A commit dissection

Command line

Our base

Rebase

git rebase master
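
A minimal sketch of what this command does, run in a throwaway repository. The branch name branch1 is taken from the slides; the file names, the commit messages and the demo identity are placeholders chosen for this illustration.

```shell
# Throwaway repository (placeholder identity, hypothetical file names).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "A"

git checkout -q -b branch1                # branch1 starts from A
echo one > one.txt && git add one.txt && git commit -q -m "1"

git checkout -q master                    # meanwhile, master moves on
echo two > two.txt && git add two.txt && git commit -q -m "B"

git checkout -q branch1
git rebase -q master                      # replay "1" on top of B

git log --oneline --graph                 # linear history: 1, B, A
```

After the rebase, the parent of the branch1 commit is the tip of master, so the history is linear and no merge commit is needed.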

Interactive Rebase

 

git rebase -i master

pick 1 Start my branch1
pick 2 Continue my branch1
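
The todo list above is normally edited by hand. As a sketch only, the edit can be scripted through the GIT_SEQUENCE_EDITOR environment variable, which git uses to open the todo list. The helper script fixup-second.sh, the file names and the demo identity are assumptions made for this example.

```shell
# Throwaway repository (placeholder identity, hypothetical file names).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "Base"
git checkout -q -b branch1
echo a > a.txt && git add a.txt && git commit -q -m "Start my branch1"
echo b > b.txt && git add b.txt && git commit -q -m "Continue my branch1"

# Hypothetical stand-in for the editor: it receives the todo file as "$1"
# and turns the second "pick" into "fixup" (meld into the previous commit).
cat > .git/fixup-second.sh <<'EOF'
#!/bin/sh
sed '2s/^pick/fixup/' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
EOF
chmod +x .git/fixup-second.sh

GIT_SEQUENCE_EDITOR="$PWD/.git/fixup-second.sh" git rebase -q -i master

git log --oneline master..branch1         # a single commit remains
```

Other verbs accepted in the todo list include reword, edit, squash and drop; fixup was picked here because it does not open a second editor for the commit message.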

git logbook (Reflog)

git reflog
### Output
0f8dde4 HEAD@{0}: rebase -i (finish): returning to refs/heads/master
0f8dde4 HEAD@{1}: rebase -i (squash): first + third
606813c HEAD@{2}: rebase -i (start): checkout f07ec95
b8fed7c HEAD@{3}: commit: third
e3025b3 HEAD@{4}: commit: second
606813c HEAD@{5}: commit: first
f07ec95 HEAD@{6}: commit (initial): initial

Let's try these

Our base

Let's move the head

git checkout master

Tag

git tag "Add-great-lib" 1

git with a cherry on top

git cherry-pick 1

As Louis XVI...

git checkout --detach C

git with a cherry on top (again)

git cherry-pick Add-great-lib
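
A possible end-to-end run of the tag and cherry-pick steps, in a throwaway repository. The tag name Add-great-lib comes from the slides; the file names, commit messages and demo identity are placeholders.

```shell
# Throwaway repository (placeholder identity, hypothetical file names).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "A"
git checkout -q -b branch1
echo lib > great_lib.txt && git add great_lib.txt && git commit -q -m "1"
git tag Add-great-lib                     # give commit "1" a readable name

git checkout -q master
echo two > two.txt && git add two.txt && git commit -q -m "B"

git checkout -q --detach master           # detached HEAD on the tip of master
git cherry-pick Add-great-lib >/dev/null  # copy commit "1" here, by tag name
git log --oneline -2
```

The cherry-picked commit carries the same change and message as the tagged one, but it has a different parent and therefore a different hash.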

Back to master

git checkout master

Nah, let's remove this lib

git reset --hard HEAD~1

Oh wait, I really need it...

git reflog
# reflog output...
# Find the commit xxx to which we want to reset

git reset --hard xxx
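
The recovery can be sketched as follows in a throwaway repository. Instead of copying a hash from the reflog output, this version uses the HEAD@{1} notation, which means "where HEAD was one move ago". The file names and demo identity are placeholders.

```shell
# Throwaway repository (placeholder identity, hypothetical file names).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "initial"
echo lib > great_lib.txt && git add great_lib.txt && git commit -q -m "Add great lib"
lost=$(git rev-parse HEAD)                # remembered only to check the result

git reset -q --hard HEAD~1                # the commit and great_lib.txt are gone
git reflog                                # HEAD@{1} is the position before the reset

git reset -q --hard 'HEAD@{1}'            # move back to the "lost" commit
ls                                        # great_lib.txt is back
```

The --hard flag matters here: without it, the branch moves back to the commit but the working tree still lacks the file.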

GUI for Git

SourceTree

source: https://www.sourcetreeapp.com/

GitKraken

source: https://www.gitkraken.com/git-client

Tig

ncurses-based text-mode interface (a GUI inside a terminal)

Source: https://www.atlassian.com/blog/git/git-tig

PyCharm, VS Code...

Pros

 

  • Easy to use
  • No commands to remember
  • Visual and pretty representation of git history
  • Some commands (such as cherry-pick) can be easier to run

Cons

  • You don't know exactly what's happening.
  • You usually need to create an account on the platform.
  • Errors can be confusing.
  • git processes may conflict if multiple tools are running concurrently.

What's hiding under "./.git" ?

Files

COMMIT_EDITMSG

Last commit message.

config

Override git config for this repository.

description

Description of git repository, used by Gitweb.

FETCH_HEAD

Result of latest git fetch.

HEAD

The "you are here" of git. It usually points to a git reference (a branch); in detached HEAD state it holds a commit hash directly.

index

"staging area" content.

ORIG_HEAD

Written by operations that move HEAD drastically (merge, rebase, reset).

Hash of the commit HEAD pointed to before the operation, so that the operation can be undone.

packed-refs

Updated by git garbage collection.

Stores the "sleeping" refs.

MERGE_HEAD

MERGE_MODE

MERGE_MSG

Files used in merge. They are created at merge start and deleted once merge is over.

File: FETCH_HEAD

ae4cc3ca...                branch 'develop' of xxxx
face39bb...  not-for-merge branch 'bugfix/fix-display-of-no-data-component' of xxxx

Each line holds three fields:

  • Hash of the last commit of this branch.
  • Flag (not-for-merge) used to know which branch to merge when using git pull.
  • Branch name on the remote and address of the remote.

Directories

hooks

User scripts run when specific events (before a commit, after a checkout...) occur.

info

Additional information for this repository.

Contains an exclude file which acts as a non-shared, local .gitignore.

logs

git logbook, used by git reflog

objects

git core database (see git-cat-file)

rebase-apply

Working space used by git while running git rebase and git am.

refs

References (branches, tags, stashes) most often accessed in this repository.

Can be automatically archived in the packed-refs file.

Pimp My Git!

Aliases

[alias]
  co = checkout
  st = status --short --branch
  mr = !sh -c 'git fetch $1 merge-requests/$2/head:mr-$1-$2 && git checkout mr-$1-$2' -
  out = log @{u}..
  in  = log ..@{u}
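
Aliases can also be set from the command line with git config. The sketch below stores two of the aliases above in a throwaway repository's own .git/config (no --global), so it leaves the user configuration untouched; the branch name feature and the demo identity are placeholders.

```shell
# Throwaway repository (placeholder identity).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"

# Same definitions as in the [alias] section, written by git config.
git config alias.st "status --short --branch"
git config alias.co "checkout"

git st                                    # prints "## master" on a clean tree
git co -q -b feature                      # same as: git checkout -q -b feature
git st                                    # prints "## feature"
```

Adding --global to the two git config calls would write the aliases to the user-level configuration instead.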

Hooks

  • pre-commit
  • prepare-commit-msg
  • post-checkout
  • ...
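
A hook is simply an executable file named after the event, placed in .git/hooks. The sketch below installs a hand-written pre-commit hook; the policy it enforces (refusing staged .nc files), the file names and the demo identity are made up for this illustration.

```shell
# Throwaway repository (placeholder identity, hypothetical file names).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical policy: refuse any commit that stages a "*.nc" file.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
if git diff --cached --name-only | grep -q '\.nc$'; then
    echo "pre-commit: refusing to commit .nc files" >&2
    exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

echo ok > notes.txt && git add notes.txt
git commit -q -m "Add notes"              # accepted by the hook

echo data > big.nc && git add big.nc
if git commit -q -m "Add data"; then result=accepted; else result=rejected; fi
echo "$result"                            # rejected
```

A non-zero exit status from pre-commit aborts the commit. Files in .git/hooks are local to one clone and are not versioned or shared.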

pre-commit

....
-   repo: https://github.com/PyCQA/isort
    rev: 5.10.1
    hooks:
    -   id: isort
        language_version: python3
        args: ['--profile', 'black']

Pros:

  • Language agnostic.
  • Hooks are versioned within git history.
  • Shared by all team members.
  • Easier to write and reuse than bash scripts.
  • It automates many things: linting, test running, xml/yaml checks, doc generation...

My own git plugin

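#!/bin/zsh
# Check out the only local branch whose name contains "$1".

# Bare branch names (no "*" marker, no indentation) that match the pattern.
matches=$(git branch --format='%(refname:short)' | grep -- "$1")
c=$(printf '%s' "$matches" | grep -c .)

if [[ "$c" -eq 1 ]]; then
	git checkout "$matches"
else
	echo "'$1' is ambiguous or matches no branch"
	printf '%s\n' "$matches"
	exit 1
fi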
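
How such a script becomes a git subcommand: git runs any executable named git-<name> found on the PATH when asked for git <name>. The sketch below uses a hypothetical plugin called git-hello in a temporary directory; both the name and the location are assumptions for this example.

```shell
# Hypothetical plugin name and location, for illustration only.
bin=$(mktemp -d)
cat > "$bin/git-hello" <<'EOF'
#!/bin/sh
echo "hello from a git plugin, args: $*"
EOF
chmod +x "$bin/git-hello"

PATH="$bin:$PATH"
export PATH

out=$(git hello world)                    # git finds git-hello on the PATH
echo "$out"                               # hello from a git plugin, args: world
```

In day-to-day use the script would live in a directory that is permanently on the PATH, such as a personal bin directory.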

Existing Plugins

  • git-svn
  • octopussy
  • git-git
  • many more...

Exclude local file

(icclim) icclim % cat .git/info/exclude 
# git ls-files --others --exclude-from=.git/info/exclude
# Lines that start with '#' are comments.
# For a project mostly in C, the following would be a good set of
# exclude patterns (uncomment them if you want to use them):
# *.[oa]
# *~

*-abel-*
*.nc

Useful to exclude files such as:

  • todo_list-abel-.md
  • test_script-abel-.md
  • big_cmip6_tasmax_file.nc

Or any file related to the project that should never be committed or shared.
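
A quick way to check that the patterns work, sketched in a throwaway repository. The patterns and the two excluded file names come from the slides; main.py is a hypothetical file that should stay visible.

```shell
# Throwaway repository.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master

# Append the two patterns to the local, non-shared exclude file.
mkdir -p .git/info
printf '%s\n' '*-abel-*' '*.nc' >> .git/info/exclude

touch todo_list-abel-.md big_cmip6_tasmax_file.nc main.py

git status --short                            # only main.py is listed
git check-ignore -v big_cmip6_tasmax_file.nc  # shows which rule matched
```

git check-ignore -v reports the file and line of the rule that excluded a path, which helps when a file is unexpectedly missing from git status.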

Store and version large files

Large files should not be committed as is into a source code repository:

  • Files are never truly deleted from the git database unless the git history is rewritten (for example with rebase).
  • Large files make git clones and git garbage collection slow
  • git is not a cloud storage for files

Solution: git Large File Storage (LFS)

Git Large File Storage (LFS) replaces large files such as audio samples, videos, datasets, and graphics with text pointers inside Git, while storing the file contents on a remote server like GitHub.com or GitHub Enterprise.

reference: https://git-lfs.github.com/
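
With git-lfs installed, tracking is usually set up with git lfs install followed by git lfs track "*.nc"; the second command records the pattern in a .gitattributes file. The sketch below writes that line by hand so that it runs without git-lfs, and only checks which files would go through the LFS filter. No file content is transferred; the file names are placeholders.

```shell
# Throwaway repository.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master

# Line that `git lfs track "*.nc"` adds to .gitattributes; written by hand
# here so that the sketch does not require git-lfs to be installed.
echo '*.nc filter=lfs diff=lfs merge=lfs -text' >> .gitattributes

git check-attr filter -- big_cmip6_tasmax_file.nc
# big_cmip6_tasmax_file.nc: filter: lfs
git check-attr filter -- main.py
# main.py: filter: unspecified
```

The .gitattributes file is meant to be committed, so that every clone applies the same tracking rules.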

Cons of git LFS

  • Every team member must individually set up git LFS for it to work properly
  • If you disable git LFS, the whole history will be affected

Git ecosystem

Repository management services

GitHub, GitLab, Bitbucket, Gogs, Coding...

Pros

  • Centralized source storage
  • User management
  • Pull requests
  • Forks

Continuous integration

GitHub Actions, Jenkins, Azure DevOps, GitLab CI...

Best practices

Warning, this section includes personal opinions

How to make a proper commit

Files

  • Each commit should be about one theme only
    • If some linting is necessary, commit all linting at once.
    • If working on a new feature, you can divide it into multiple independent commits.
    • If updating some configuration, commit each file independently.
    • Make use of git commit --patch
  • A commit should "work" as is and be tested.

How to make a proper commit

Message

  • Commit title should start with a theme (e.g. ENH, DOC, FIX...)
  • Commit title should complete the sentence "This commit ..."
  • Commit title should always be followed by a blank line before the commit description
  • Commit title should be around 50 characters
  • Commit description should explain "why" and not so much "how".
  • Commit description can be as long as necessary, and each line should be shorter than 100 characters
  • At least one commit of a branch should include a reference to the opened issue (e.g. #100)
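
One possible message that follows these rules, committed in a throwaway repository. The wording of the message, the feature it describes and the demo identity are invented for this illustration; only the issue number #100 is borrowed from the example above.

```shell
# Throwaway repository (placeholder identity).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Read the whole message from standard input: title, blank line, description.
git commit -q --allow-empty -F - <<'EOF'
ENH: Adds a threshold option for tasmax

Users could not change the threshold without editing the source.
Exposing it as an option lets each run set its own value.

Closes #100
EOF

git log -1 --format=%s                    # prints the title only
```

Tools that show one line per commit, such as git log --oneline, display only the title, which is why it is kept short and separated by a blank line.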

Pull requests are really good, use them

  • Code review
    • Knowledge is shared
    • Code quality improves with discussions
    • Transparency on the new code base
  • Conflict resolution
  • Broader view of the history than individual commits

Solving conflicts

  • The best conflict resolution is to avoid conflicts:
    • Delay structural refactoring when a lot of changes are in progress.
    • Introduce coding convention and automatic linting tools to avoid unnecessary conflicts.
    • Merge (or rebase) the shared branch (master/main) with your changes as often as possible to avoid handling all conflicts at once.
  • If there are conflicts:
    • Adopt a "feature first" approach: redo the refactoring on top of the new features instead of redoing the features on top of the refactored code
    • Go through the conflicts one by one and ask each developer involved whenever there is a doubt

Hidden conflicts

Once all conflicts are solved, run all the tests and make sure everything works.

For example, if a refactoring deletes a file that a merged feature uses without modifying it, git reports no conflict, yet the merged code is broken.

Branching models

Gitflow

  • Widely used in companies but rarely in open source
  • Very rigorous, but too complex for small teams
  • It uses the branches master, develop, release/..., fix/..., hotfix/...

GitHub flow

  • Simple and effective
  • Increasingly used, especially in open source projects
  • It relies on
    • A main branch (main or master).
    • Branches created from and merged back into this main branch.
    • Tags to help reading the history.
  • Scales from a one-person project to a library with thousands of developers (e.g. NumPy)

merge vs rebase

  • Rebase keeps a clean history for future maintainers
  • Rebase allows you to rework each commit
  • Merge is simple and minimizes conflict handling
  • Merge preserves the existing history

Don't

  • Rebase a pushed and shared branch (main, release...)
  • Rebase too many commits with many conflicts
  • Merge while running a rebase

Do

  • Rebase to rework and keep your own branch clean
  • Force push your own branch once the rebase is done
    • Otherwise you could rebase "to infinity" with your own work
  • Follow the repository guidelines and contribution guides

A commit dissection

git is magic!

Have you ever seen ...

A commit from Michel?

commit 8684d0560cb7c51234cbefebd83409e90cb7e29f
 Author: Michel michel.xxxx@viseo.com
 Date:   Wed Dec 6 12:40:14 2017 +0100

"Init presentation"

git hash

SHA-1 algorithm

Hashing of a commit

sha1(
  commit_message,
  committer,
  commit_date,
  author,
  authoring_date,
  Hash-Of-Entire-Working-Directory
)

Hash-Of-Entire-Working-Directory ?

Example: myProjet

What we create and see

 .
 │   main.java
 │
 ├───.git
 ├───resources
 │       logo.png
 └───components
         ...

Example: myProjet

What git sees

Let's hash our commit

				
sha1(
  "Init presentation",
  "michel.xxxx@viseo.com",
  Wed Dec 6 12:40:14 2017 +0100,
  "michel.xxxx@viseo.com",
  Wed Dec 6 12:40:14 2017 +0100,
  aa1b2fb696a831c89c53f787e03d863691d2b671
)

What's the point of hashing?

Preserve the integrity of our data

 

Parent commit hash

sha1(
  # metadata
  commit_message,
  committer,
  commit_date,
  author,
  authoring_date,
  Hash-Of-Entire-Working-Directory,
  hash-of-parent-commit
)

Each commit but the very first has a parent
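
The fields listed above can be inspected on a real commit. In practice git hashes the commit object, that is a short header followed by the tree, parent, author, committer and message lines. The sketch below prints that object and re-hashes it with git hash-object to check that the result is the commit id. The repository content and demo identity are placeholders.

```shell
# Throwaway repository (placeholder identity, hypothetical content).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo "class Main {}" > main.java && git add main.java
git commit -q -m "Init presentation"
echo "// more" >> main.java
git commit -q -am "Second commit"

# Content of the commit object: tree, parent, author, committer, message.
git cat-file -p HEAD

# Hashing that exact content again yields the commit id itself.
rehashed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)
echo "$rehashed"
git rev-parse HEAD                        # same value as the line above
```

Because the parent hash is part of the hashed content, changing any ancestor changes the hash of every commit that follows it.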

To go further: git objects

A few more useful commands

Cherry pick multiple commits

git cherry-pick cool-feature~3..cool-feature

Git rerere

"reuse recorded resolution"

git config --global rerere.enabled true
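
A sketch of what rerere does once enabled, run in a throwaway repository with the setting applied locally rather than with --global. The same conflict is produced twice; the second time git replays the recorded resolution. Branch name, file name, contents and demo identity are placeholders.

```shell
# Throwaway repository (placeholder identity, hypothetical content).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
git config rerere.enabled true            # local to this repository

echo line > f.txt && git add f.txt && git commit -q -m "base"
git checkout -q -b feature
echo feature > f.txt && git commit -q -am "feature change"
git checkout -q master
echo master > f.txt && git commit -q -am "master change"

git merge feature >/dev/null 2>&1 || true # conflict: rerere records it
echo merged > f.txt                       # resolve by hand
git commit -q -am "Merge feature"         # rerere records the resolution

git reset -q --hard HEAD~1                # undo the merge
git merge feature >/dev/null 2>&1 || true # same conflict again
cat f.txt                                 # already resolved by rerere
```

After the second merge the file still has to be staged and committed; rerere only fills in the resolved content in the working tree.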

Git worktree

Avoid cloning a repository twice

git worktree add ../myproject_2nd_wt

  • It creates a worktree and a branch with the name myproject_2nd_wt
  • It behaves like a second clone of the repository
  • Worktrees are managed with git worktree commands
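
The behaviour described above can be checked in a throwaway repository. The directory name myproject_2nd_wt comes from the slides; the parent directory, the name myproject and the demo identity are assumptions for this sketch.

```shell
# Throwaway parent directory holding the main repository and its worktree.
base=$(mktemp -d) && mkdir "$base/myproject" && cd "$base/myproject" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"

git worktree add ../myproject_2nd_wt >/dev/null 2>&1

git worktree list                         # main worktree + the new one
cat ../myproject_2nd_wt/.git              # a file pointing back to myproject/.git
git -C ../myproject_2nd_wt rev-parse --abbrev-ref HEAD   # myproject_2nd_wt
```

In the second worktree, .git is a small text file rather than a directory, which is how both working trees share one object database.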

Pros

  • You can work on two separate features "at the same time"
  • It uses the same ./.git directory, so all your configs apply to the new worktree
  • It uses the same git database:
    • All branches are synchronized
    • There is no need to download a "heavy" git repository multiple times.

Cons

  • Tools such as SourceTree do not handle worktrees very well
  • If excluded files are needed in multiple worktrees, they must be handled manually (copied or referenced)

 

License

Regarding all the content we produced in this presentation: Creative Commons Attribution-NonCommercial 4.0 International license.
Based on a work at https://github.com/barmic/viseo-take-an-hour_cassandra.git.

GIT advanced training

By bzah
