git

In case of fire
git commit --allow-empty-message -am "" && git push
- Command line
- GUI for git
- What's hiding under "./.git" ?
- Pimp my git!
- git ecosystem
- Best practices
- A commit dissection
Command line
Our base
Rebase
git rebase master
Interactive Rebase
git rebase -i master
pick 1 Start my branch1
pick 2 Continue my branch1
git logbook (Reflog)
git reflog
### Output
0f8dde4 HEAD@{0}: rebase -i (finish): returning to refs/heads/master
0f8dde4 HEAD@{1}: rebase -i (squash): first + third
606813c HEAD@{2}: rebase -i (start): checkout f07ec95
b8fed7c HEAD@{3}: commit: third
e3025b3 HEAD@{4}: commit: second
606813c HEAD@{5}: commit: first
f07ec95 HEAD@{6}: commit (initial): initial
Let's try these
Our base

Let's move the head
git checkout master

Tag
git tag "Add-great-lib" 1

Cherry-flavored git
git cherry-pick 1

As Louis XVI...
git checkout --detach C

Cherry-flavored git (again)
git cherry-pick Add-great-lib

Back to master
git checkout master

Nah, let's remove this lib
git reset --hard HEAD~1

Oh wait, I really need it...
git reflog
# reflog output...
# Find the commit xxx to which we want to reset
# --hard also restores the files in the working directory
git reset --hard xxx
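The remove-then-recover sequence above can be replayed safely in a throwaway repository. This is a minimal sketch: the file names, commit messages and identity are made up for the demo, and it assumes the reflog is enabled (the default for non-bare repositories).

```shell
#!/bin/sh
set -e
# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > app.txt
git add app.txt
git commit -q -m "initial"

echo lib > great-lib.txt
git add great-lib.txt
git commit -q -m "Add great lib"
lib_commit=$(git rev-parse HEAD)

# "Nah, let's remove this lib"
git reset -q --hard HEAD~1

# "Oh wait, I really need it...": HEAD@{1} is where HEAD pointed before the reset.
recovered=$(git rev-parse 'HEAD@{1}')
git reset -q --hard "$recovered"
```

In a real repository, reading `git reflog` first is safer than assuming the commit is at `HEAD@{1}`.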
GUI for Git
SourceTree

source: https://www.sourcetreeapp.com/
GitKraken

source: https://www.gitkraken.com/git-client
Tig
ncurses-based text-mode interface (a GUI inside a terminal)

Source: https://www.atlassian.com/blog/git/git-tig
PyCharm, VS Code...
Pros
- Easy to use
- No commands to remember
- Visual and pretty representation of git history
- Some commands (such as cherry-pick) can be easier to perform
Cons
- You don't know exactly what's happening.
- You usually need to create an account on the platform.
- Errors can be confusing.
- git processes may conflict if multiple tools are running concurrently.
What's hiding under "./.git" ?
Files
COMMIT_EDITMSG
Last commit message.
config
Override git config for this repository.
description
Description of git repository, used by Gitweb.
FETCH_HEAD
Result of latest git fetch.
HEAD
The "you are here" of git. It usually points to a git reference (the current branch), or directly to a commit when HEAD is detached.
index
"staging area" content.
ORIG_HEAD
Written by operations that move HEAD drastically (merge, rebase, reset).
Hash of the commit HEAD pointed to before the operation, which makes it easy to undo.
packed-refs
Updated by git garbage collection.
Stores the "sleeping" refs.
MERGE_HEAD
MERGE_MODE
MERGE_MSG
Files used in merge. They are created at merge start and deleted once merge is over.
File: FETCH_HEAD
ae4cc3ca... branch 'develop' of xxxx
face39bb... not-for-merge branch 'bugfix/fix-display-of-no-data-component' of xxxx
- First field: hash of the last commit of this branch.
- Second field: flag used to know which branch to merge when using git pull.
- Rest of the line: branch name on the remote and address of the remote.
Directories
hooks
User scripts run when specific events (pre-commit, pre-push...) occur.
info
Additional information for this repository.
Contains an exclude file which acts as a local, non-shared .gitignore.
logs
git logbook, used by git reflog
objects
git core database (see git-cat-file)
rebase-apply
Working space used by git while running git rebase and git-am.
refs
References (branches, tags, stashes) most often accessed in this repository.
Can be automatically archived in the packed-refs file.
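A few of these files can be read directly to see how they relate to each other. The sketch below uses a throwaway repository with a made-up identity and assumes git's default "files" ref backend, where a fresh branch is stored as a loose file under refs/heads.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git commit -q --allow-empty -m "initial"

# HEAD is a small text file naming the current branch reference.
head_content=$(cat .git/HEAD)
branch=$(git symbolic-ref --short HEAD)

# The branch reference is itself a file holding a commit hash
# (assumption: loose ref, not yet moved to packed-refs).
ref_hash=$(cat ".git/refs/heads/$branch")

# That hash names an object stored in the objects database.
object_type=$(git cat-file -t "$ref_hash")

echo "HEAD   : $head_content"
echo "ref    : $ref_hash"
echo "object : $object_type"
```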
To go further
Pimp My Git!
Aliases
[alias]
co = checkout
st = status --short --branch
mr = !sh -c 'git fetch $1 merge-requests/$2/head:mr-$1-$2 && git checkout mr-$1-$2' -
out = log @{u}..
in = log ..@{u}
Hooks
- pre-commit
- prepare-commit-msg
- post-checkout
- ...
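A hook is an executable file in .git/hooks named after the event it reacts to. As a minimal sketch, the script below installs a pre-commit hook in a throwaway repository; the "NOCOMMIT" marker policy, file names and identity are hypothetical and only serve the demo.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical policy: refuse staged changes that contain the marker "NOCOMMIT".
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
if git diff --cached | grep -q "NOCOMMIT"; then
    echo "pre-commit: remove the NOCOMMIT marker first" >&2
    exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

echo "ok" > a.txt
git add a.txt
git commit -q -m "clean commit"        # accepted by the hook

echo "NOCOMMIT debug" >> a.txt
git add a.txt
# A non-zero exit status from the hook aborts the commit.
if git commit -q -m "dirty commit" 2>/dev/null; then
    rejected=no
else
    rejected=yes
fi
echo "second commit rejected: $rejected"
```

Hooks stored in .git/hooks are local to one clone, which is one reason tools such as pre-commit keep the configuration in a versioned file instead.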
pre-commit
....
- repo: https://github.com/PyCQA/isort
  rev: 5.10.1
  hooks:
    - id: isort
      language_version: python3
      args: ['--profile', 'black']
Pros:
- Language agnostic.
- Hooks are versioned within git history.
- Shared by all team members.
- Easier to write and reuse than bash scripts.
- It automates many things: linting, test running, xml/yaml checks, doc generation...
My own git plugin
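#!/bin/zsh
# Check out the only local branch whose name contains "$1".
# --format prints bare branch names, without the "* " marker of the current branch.
branches=$(git branch --format='%(refname:short)' | grep -- "$1")
c=$(printf '%s\n' "$branches" | grep -c .)
if [[ "$c" -eq 1 ]]; then
    git checkout "$branches"
else
    echo "'$1' is ambiguous or matches no branch"
    printf '%s\n' "$branches"
    exit 1
fi
Existing Plugins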
- git-svn
- octopussy
- git-git
- many more...
Exclude local file
(icclim) icclim % cat .git/info/exclude
# git ls-files --others --exclude-from=.git/info/exclude
# Lines that start with '#' are comments.
# For a project mostly in C, the following would be a good set of
# exclude patterns (uncomment them if you want to use them):
# *.[oa]
# *~
*-abel-*
*.nc
Useful to exclude files such as:
- todo_list-abel-.md
- test_script-abel-.md
- big_cmip6_tasmax_file.nc
Or any file related to the project that should never be committed or shared.
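Whether the local patterns behave as intended can be checked with git check-ignore. The sketch below recreates the two patterns shown above in a throwaway repository; main.py is a made-up file standing in for regular source code.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Local, non-shared ignore rules (same patterns as in the slide).
mkdir -p .git/info
printf '%s\n' '*-abel-*' '*.nc' >> .git/info/exclude

touch todo_list-abel-.md big_cmip6_tasmax_file.nc main.py

# check-ignore prints the paths that match an ignore rule.
ignored=$(git check-ignore todo_list-abel-.md big_cmip6_tasmax_file.nc main.py || true)

# Untracked files that git would still offer to add.
untracked=$(git ls-files --others --exclude-standard)

echo "ignored:"
echo "$ignored"
echo "still visible: $untracked"
```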
Store and version large files
Large files should not be committed as is within a source code repository:
- Files are never truly deleted from the git database unless git history is rewritten (for example with rebase or git filter-repo).
- Large files make git clones and git garbage collection slow
- git is not a cloud storage for files
Solution: git Large File Storage (LFS)
Git Large File Storage (LFS) replaces large files such as audio samples, videos, datasets, and graphics with text pointers inside Git, while storing the file contents on a remote server like GitHub.com or GitHub Enterprise.
reference: https://git-lfs.github.com/
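Which files go through LFS is decided by attributes in .gitattributes. In the sketch below the attribute line is written by hand so that it runs without git-lfs installed; it is the line that `git lfs track "*.nc"` is expected to record. The file names are illustrative and do not need to exist.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Attribute line normally recorded by: git lfs track "*.nc"
echo '*.nc filter=lfs diff=lfs merge=lfs -text' > .gitattributes

# Ask git which filter applies to a given path.
lfs_attr=$(git check-attr filter -- big_cmip6_tasmax_file.nc)
src_attr=$(git check-attr filter -- main.py)

echo "$lfs_attr"
echo "$src_attr"
```

Because .gitattributes is committed, the tracking rules are shared, but each clone still needs git-lfs installed for the filter to do anything.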
Cons of git LFS
- Every team member must individually set up git LFS for it to work properly
- If you disable git LFS, the whole history will be affected
Git ecosystem
Repository management services
GitHub, GitLab, Bitbucket, Gogs, Coding...
Pros
- Centralized source storage
- User management
- Pull request
- Fork
Continuous integration
GitHub Actions, Jenkins, Azure DevOps, GitLab CI...
Best practices
Warning, this section includes personal opinions
How to make a proper commit
Files
- Each commit should be about one theme only
- If some linting is necessary, commit all linting at once.
- If working on a new feature, you can divide it into multiple independent commits.
- If updating some configuration, commit each file independently.
- Make use of git commit --patch
- A commit should "work" as is and be tested.
How to make a proper commit
Message
- Commit title should start with a theme (e.g. ENH, DOC, FIX...)
- Commit title should complete the sentence "This commit ..."
- Commit title should always be followed by a blank line before the commit description
- Commit title should be around 50 characters
- The commit message should explain "why" rather than "how".
- Commit description can be as long as necessary and each line should be shorter than 100 characters
- At least one commit of a branch should include a reference to the opened issue (e.g. #100)
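As an illustration of the message guidelines, the sketch below creates one commit whose title and description follow them. The repository, file, wording and issue number are made up; passing -m twice is one way to get the blank line between title and description.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "print('hello')" > main.py
git add main.py

# Each -m adds a paragraph: the first is the title,
# and git inserts a blank line before the second.
git commit -q \
    -m "ENH: adds a hello entry point" \
    -m "New contributors need a quick way to check their setup. Related to #100."

title=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
title_length=${#title}

echo "title ($title_length chars): $title"
echo "body: $body"
```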
Pull requests are really good, use them
- Code review
- Knowledge is shared
- Code quality improves with discussions
- Transparency on the new code base
- Conflict resolution
- Broader view of the history than individual commits
Solving conflicts
- The best conflict resolution is to avoid conflicts:
  - Delay structural refactoring when a lot of changes are in progress.
  - Introduce coding conventions and automatic linting tools to avoid unnecessary conflicts.
  - Merge (or rebase) the shared branch (master/main) with your changes as often as possible to avoid handling all conflicts at once.
- If there are conflicts:
  - Adopt a "feature first" approach: redo the refactoring on top of the new features instead of redoing the features on refactored code
  - Go step by step through each conflict and ask each dev involved whenever there are doubts
Hidden conflicts
Once all conflicts are solved, run all tests and make sure everything works.
For example, if a refactoring deletes a file which is not modified but is used by a merged feature, git will not detect that deleting this file breaks the feature.
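The hidden-conflict scenario can be reproduced in a few commands. In this sketch (throwaway repository, made-up branch and file names), one branch deletes helper.sh while another adds a script that sources it; git merges both without complaint, yet the result is broken.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo 'helper() { echo hi; }' > helper.sh
git add helper.sh
git commit -q -m "initial"
base=$(git symbolic-ref --short HEAD)

# Refactoring branch: deletes a file it believes is unused.
git checkout -q -b refactoring
git rm -q helper.sh
git commit -q -m "Remove unused helper"

# Feature branch: adds code that relies on that file, without modifying it.
git checkout -q "$base"
git checkout -q -b feature
echo '. ./helper.sh && helper' > feature.sh
git add feature.sh
git commit -q -m "Add feature using helper"

# Merge both into the shared branch: no textual conflict is reported.
git checkout -q "$base"
git merge -q --no-edit refactoring
git merge -q --no-edit feature && merge_status=clean || merge_status=conflict

# Only running the code (or the tests) reveals the problem.
if sh feature.sh >/dev/null 2>&1; then feature_works=yes; else feature_works=no; fi
echo "merge: $merge_status, feature works: $feature_works"
```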
Branching models
Gitflow
- Widely used in corporations but rarely in open source
- Very rigorous, but too complex for small teams
- It uses the branches master, develop, release/..., fix/..., hotfix/...
GitHub flow
- Simple and effective
- Increasingly used, especially in open source projects
- It relies on
  - A main branch (main or master).
  - Branches based on and merged into this main branch.
  - Tags to help reading the history.
- Scales from a one-person project to a library with thousands of developers (e.g. NumPy)
merge vs rebase
- Rebase keeps a clean history for future maintainers
- Rebase allows reworking each commit
- Merge is simple and minimizes conflict handling.
- Merge preserves the existing history
Don't
- Rebase a pushed and shared branch (main, release...)
- Rebase too many commits with many conflicts
- Merge while running a rebase
Do
- Rebase to rework and keep your own branch clean
- Force push your own branch once rebase is done
  - Otherwise you could rebase "to infinity" with your own work
- Follow the repository guidelines and contribution guides
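The "rebase, then force push your own branch" advice can be tried against a local bare repository standing in for the remote. All names here are illustrative, and using --force-with-lease rather than a plain --force is a choice of this sketch, not something the slides prescribe: it refuses to overwrite the remote branch if someone else pushed to it in the meantime.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git        # stands in for the shared server
git init -q local
cd local
git config user.email "demo@example.com"
git config user.name "Demo"
git remote add origin ../remote.git

git commit -q --allow-empty -m "initial"
base=$(git symbolic-ref --short HEAD)
git push -q origin "$base"

# Personal branch, already pushed once.
git checkout -q -b my-feature
echo one > feature.txt
git add feature.txt
git commit -q -m "WIP"
git push -q -u origin my-feature

# Meanwhile the shared branch moves on.
git checkout -q "$base"
echo doc > README
git add README
git commit -q -m "DOC: add README"
git push -q origin "$base"

# Rebase the personal branch: its commits are rewritten.
git checkout -q my-feature
git rebase -q "$base"

# A plain push is refused because history diverged from the remote copy.
if git push -q origin my-feature 2>/dev/null; then plain_push=accepted; else plain_push=rejected; fi

git push -q --force-with-lease origin my-feature
remote_tip=$(git ls-remote origin refs/heads/my-feature | cut -f1)
local_tip=$(git rev-parse my-feature)
echo "plain push: $plain_push"
```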
A commit dissection
git is magic!

Have you ever seen ...
A commit from Michel?
commit 8684d0560cb7c51234cbefebd83409e90cb7e29f
Author: Michel <michel.xxxx@viseo.com>
Date: Wed Dec 6 12:40:14 2017 +0100
"Init presentation"
git hash
SHA-1 algorithm


Hashing of a commit
sha1(
    commit_message,
    committer,
    commit_date,
    author,
    authoring_date,
    Hash-Of-Entire-Working-Directory
)
Hash-Of-Entire-Working-Directory ?

Example: myProject
What we create and see
.
│   main.java
│
├───.git
└───resources
    │   logo.png
    └───components
            ...
Example: myProject
What git sees

Let's hash our commit
sha1(
    "Init presentation",
    "michel.xxxx@viseo.com",
    "Wed Dec 6 12:40:14 2017 +0100",
    "michel.xxxx@viseo.com",
    "Wed Dec 6 12:40:14 2017 +0100",
    "aa1b2fb696a831c89c53f787e03d863691d2b671"
)
What's the point of hashing?
Preserve the integrity of our data
Parent commit hash
sha1(
    meta data
    commit_message
    committer
    commit_date
    author
    authoring_date
    Hash-Of-Entire-Working-Directory
    hash-of-parent-commit
)
Each commit but the very first has a parent
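That a commit hash depends only on this content can be checked by re-hashing the raw commit object. The sketch below uses a throwaway repository with a made-up identity; git cat-file prints the commit's content (tree hash, parent if any, author, committer, message) and git hash-object recomputes the hash of that content, including the small "commit <size>" header git puts in front of it.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "hello" > main.java
git add main.java
git commit -q -m "Init presentation"

# Raw content of the commit object.
commit_content=$(git cat-file commit HEAD)
first_line=$(printf '%s\n' "$commit_content" | head -n 1)

# Hash of the whole tracked directory content (the "tree").
tree_hash=$(git rev-parse 'HEAD^{tree}')

# Re-hash exactly that content as a commit object, without storing anything.
recomputed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)
actual=$(git rev-parse HEAD)

echo "$commit_content"
echo "recomputed: $recomputed"
echo "actual    : $actual"
```

Changing any field (message, dates, tree, parent) yields a different hash, which is why rewriting one commit changes the hashes of all its descendants.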

To go further: git objects
A few more useful commands
Cherry pick multiple commits
git cherry-pick cool-feature~3..cool-feature
Git rerere
"reuse recorded resolution"
git config --global rerere.enabled true
Git worktree
Avoid cloning the same repository twice
git worktree add ../myproject_2nd_wt
- It creates a worktree and a branch with the name myproject_2nd_wt
- It behaves like a second clone of the repository
- Worktrees are managed with git worktree commands
Pros
- You can work on two separate features "at the same time"
- It uses the same .git directory, thus all your configs are shared with the new worktree
- It uses the same git database:
  - All branches are synchronized
  - There is no need to download a "heavy" git repository multiple times.
Cons
- Tools such as SourceTree do not handle worktrees very well
- If excluded files are needed in multiple worktrees, they must be manually handled (copied or referenced)
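The worktree behaviour described above can be observed in a throwaway repository. The directory name myproject is made up for the sketch; the worktree name reuses the one from the slide.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
git init -q myproject
cd myproject
git config user.email "demo@example.com"
git config user.name "Demo"
git commit -q --allow-empty -m "initial"

# Creates ../myproject_2nd_wt and a branch named after the directory.
git worktree add ../myproject_2nd_wt >/dev/null 2>&1

new_branch=$(git -C ../myproject_2nd_wt symbolic-ref --short HEAD)
worktree_count=$(git worktree list | wc -l | tr -d ' ')

# In the linked worktree, .git is a small file pointing back to the
# main repository, not a second copy of the database.
if [ -f ../myproject_2nd_wt/.git ]; then git_entry=file; else git_entry=directory; fi

git worktree list
echo "branch in new worktree: $new_branch"
```

When the second working directory is no longer needed, `git worktree remove ../myproject_2nd_wt` deletes it while leaving the branch in place.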
Licence
For all the content we produced in this presentation:
Creative Commons Attribution-NonCommercial 4.0 International license.
Based on a work at https://github.com/barmic/viseo-take-an-hour_cassandra.git.
GIT advanced training
By bzah