Vladislav Shpilevoy PRO
Backend C++ developer at VirtualMinds. Database C developer at Tarantool.
An Efficient Git Workflow For High-Stakes Projects
Vladislav Shpilevoy
My context
Atomic Flow
Benchmarks
Conclusion
Enterprise Kingdom
Startup Camp
Product Tavern
C/C++: leaks, corruptions, thread races
Code size: 100k lines and more
Code age: 10 years and older
High load, high stakes
Can not keep knowledge only in people. They leave.
Must rely on code and its story. Keep them clean.
int getDiscount(Customer type, uint total)
{
if (type == Type::VIP)
return 20;
if (total > 1000)
return 15;
if (type == Type::NEW)
return 10;
return 0;
}
Latest commit
commit d4e5f6g7
Author: Vlad Shp <example@dev.com>
price: Discount for new customers
Give them 10% as an entry bonus.
Diff
@@ int getDiscount
if (total > 1000)
return 15;
+ if (type == Type::NEW)
+ return 10;
return 0;
int getDiscount(Customer type, uint total)
{
if (type == Type::VIP)
return 20;
if (total > 1000)
return 15;
if (type == Type::NEW)
return 10;
return 0;
}
commit d4e5f6g7
Author: Vlad Shp <example@dev.com>
price: Discount for new customers
Give them 10% as an entry bonus.
@@ int getDiscount
if (total > 1000)
return 15;
+ if (type == Type::NEW)
+ return 10;
return 0;
Reason
Scope
Instantly see "why"
Easy grasp what
Self-sufficient, self-explaining, brings value, nothing can be removed. Like functions.
relay: fix crash during shut down
In relay there's a message, which travels between relay and tx
threads and which is used to deliver vclock of the relay to tx
thread.
However, when the message is in tx thread, it may happen, that
relay thread dies and tx may try to send message back without
checking, whether the thread is alive.
Closes #9920
---
src/box/relay.cc | 10 ++++
src/lib/core/errinj.h | 1 +
test/test.lua | 57 +++++++++++++++++++
Rich description of "why"
Atomic change + test
fix: don't cpipe_push into a closed pipe in tx_status_update
---
src/box/relay.cc | 15 ++++++++++-----
Narrates the change, no "why"
No test and fix is overkill
60272e - box: make 'double' an alias for 'number'
0f200a - tuple: fix bad hashing of suboptimal int msgpack
995e38 - box: drop tuple hash type templates
09d116 - tuple: make tuple_bloom versioned
199be6 - tuple: fix confusing hashing of 'number' index
09fd38 - perf: introduce tuple hash bench
d82ba5 - perf: fix wrong key_part_def initialization
d83faf - perf: drop singletons from tuple bench
5e922a - perf: fix potential stack overflow
ff9acc - perf: have explicit main() in tuple bench
d38274 - perf: use more reliable random in tuple bench
8a122c - datetime: fix suboptimal msgpack decoding
b0bf27 - vinyl: fix suboptimal msgpack assert in upsert
d024fa - memtx: fix suboptimal msgpack assert in hints
a13fd2 - Bump msgpuck submodule
15 small commits
7 places to fix
8 preparations
Average commit ~50 lines*
Each has a test
xrow: add missing break statement
xrow: add missing break statement
The old (7.5.0) GCC compiler leaves the following compilation warning
(leading to the error with `-Werror`):
NO_WRAP
```
src/box/xrow.c: In function ‘xrow_decode_call’:
src/box/xrow.c:1616:31: error: this statement may fall through [-Werror=implicit-fallthrough=]
request->tuple_formats_end = data;
~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
src/box/xrow.c:1617:3: note: here
default:
^~~~~~~
cc1: all warnings being treated as errors
```
NO_WRAP
The missing statement is harmless since the `default` just does the same
thing, but to fix the error for old compilers and make the code cleaner,
the missing statement is added in this patch.
NO_DOC=cleanup
NO_TEST=cleanup
NO_CHANGELOG=cleanup
---
src/box/xrow.c | 1 +
1-line change
21-line explanation
Prepare public API
Extract a function
Rename classes
Add a benchmark
Add feature and tests
Expose new API to public
Preparatory commits. Easy and fast to review
Functional commits. Pure, without unrelated diff-noise
box: make 'double' an alias for 'number'
tuple: fix bad hashing of suboptimal int msgpack
box: drop tuple hash type templates
tuple: make tuple_bloom versioned
tuple: fix confusing hashing of 'number' index
perf: introduce tuple hash bench
perf: fix wrong key_part_def initialization
perf: drop singletons from tuple bench
perf: fix potential stack overflow
perf: have explicit main() in tuple bench
perf: use more reliable random in tuple bench
datetime: fix suboptimal msgpack decoding
vinyl: fix suboptimal msgpack assert in upsert
memtx: fix suboptimal msgpack assert in hints
Bump msgpuck submodule
Small atomic commits with tests and explanation
- alias double to number
- clang-format apply
- support for double
- msgpack fix
- bench fix
- review fixes
- bench
- address MR comments
- fix msgpack bugs
Unclear messages
Non-atomic changes like "review fix"
2 x Fix blocking bugs
11 x Prepare the code
1 x Feature part-1
6 x Prepare the code
5 x Feature part-2
25 commits
358 lines - the biggest commit
2400 lines total*
Preparation part-1
Feature
CD new deploy
CI checking
Preparation part-2
6 commits
6 lines - the feature
2800 lines total*
Speed up billing
Reuse code in HTTP
New reporting
SSL support
Memory optimization
Coroutines support
Add logs
Explicit "happens before" relations between commits
➡️ Easy to understand
Can checkout the whole repository at any older commit
➡️ Deep investigations
Each commits is transferable and revertible
➡️ Trivial cherry-pick and revert
master
feature ☑️
Based on latest master
master
feature ⚠️
Has an outdated base
❌
Can't merge
master
> merge-commit
> tombstone-commits
Actual changes
Placeholders in history
master
> merge-commit
Actual changes + conflict fixes
> conflict fixes
⚠️
Need to fix all merge conflicts of the branch at once
⚠️ Large context
Commit atomicity is lost
⚠️ Dependency on the merge-commit
Merging right into master is unsafe
⚠️ Need to merge in both directions for CI
master
feature ⚠️
The source commits are joined into one
master
> merge-commit
The source commits are joined into one
☑️ Simple
☑️ Linear history
⚠️ Low atomicity
Literally change the branch's base
master
feature ⚠️
Literally change the branch's base
master
feature ☑️
Literally change the branch's base
master
☑️ Simple
☑️ Linear history
☑️ High atomicity
☑️ Small scope of conflicts
| | X | | | | | | | | | | | | | | | | | | | | | | | 66af85f1c
| | X | | | | | | | | | | | | | | | | | | | | | | | e17bdad3f
| | |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | |/ / / / / / / / / / / / / / / / / / / / / / / /
| |/| | | | | | | | | | | | | | | | | | | | | | | |
| X | | | | | | | | | | | | | | | | | | | | | | | | be1ba6a1a
| X | | | | | | | | | | | | | | | | | | | | | | | | f4e493fc8
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | X | | | | | | | | | | | | | | | | | | | | | | | | 52303d151
| | | |/ / / / / / / / / / / / / / / / / / / / / / /
| | |/| | | | | | | | | | | | | | | | | | | | | | |
| | | X | | | | | | | | | | | | | | | | | | | | | | 39c44e780
| | | X | | | | | | | | | | | | | | | | | | | | | | 6935f3035
| | | |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | |_|/ / / / / / / / / / / / / / / / / / / / / / /
| |/| | | | | | | | | | | | | | | | | | | | | | | |
| X | | | | | | | | | | | | | | | | | | | | | | | | 416aed6a4
| |/ / / / / / / / / / / / / / / / / / / / / / / /
| X | | | | | | | | | | | | | | | | | | | | | | | cb3e8ee19
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | X | | | | | | | | | | | | | | | | | | | | | | | 60eeae3c8
| | X | | | | | | | | | | | | | | | | | | | | | | | def206020
| X | | | | | | | | | | | | | | | | | | | | | | | | 5eaa330aa
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| |/ / / / / / / / / / / / / / / / / / / / / / / / /
|/| / / / / / / / / / / / / / / / / / / / / / / / /
| |/ / / / / / / / / / / / / / / / / / / / / / / /
X | | | | | | | | | | | | | | | | | | | | | | | | 662ece335
|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| X | | | | | | | | | | | | | | | | | | | | | | | | 2ccfcaf55
|/ / / / / / / / / / / / / / / / / / / / / / / / /
X | | | | | | | | | | | | | | | | | | | | | | | | 316d2b85c
|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
X \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 075e00a81
|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| |_|_|_|/ / / / / / / / / / / / / / / / / / / / / /
|/| | | | | | | | | | | | | | | | | | | | | | | | |
| | | X | | | | | | | | | | | | | | | | | | | | | | 47b8749e6
| | | X | | | | | | | | | | | | | | | | | | | | | | 627de687
| | | |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | X | | | | | | | | | | | | | | | | | | | | | | 32abf8bee
| | | | X | | | | | | | | | | | | | | | | | | | | | | ce216e2f4
| | | | |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | |/ / / / / / / / / / / / / / / / / / / / / / /
| | | |/| | | | | | | | | | | | | | | | | | | | | | |
| | | X | | | | | | | | | | | | | | | | | | | | | | | 910cbb25b
The lines - merge-commit tails with tombstone commits 😭
X | 51b21786b
X | 01d85bb09
X | fad12311a
X | 6e27b80bb
X | 952caf783
X | 7e557dfad
X | a7a58eb04
X | 1581e6f0d
X | 2b0d16176
X | 3e4f1e940
X | 810c67f72
X | d19cd9c7d
X | acd97a083
X | faf760e0e
X | 3699e6e8b
X | 82a1e50ca
|\|
| X 44a8f3702
X | 3752f10ab
X | 05760e0c0
X | e77060027
|\|
| X 69b147117
| |\
X | | 51c1cd782
X | | fecd8c691
X | | 801656ba3
X | | f3a981cc5
X | | 87efbe896
X | | 6327a904c
X | | 802116277
X | | de513ca3c
X | | a98cf6de7
X | | 5e5f8d953
| |/
|/|
X | 60bcb60c1
X | c1236048c
Maximum 2 lines
Tarantool
Database
git-blame
and git-bisect
in <investigations>
Virtual Minds
AdTech
git-blame
to investigate old codegit-bisect
+ git-revert
just recently for:
Ubisoft
GameDev
master
feature
> wip part 1
> part 1 tests
> wip part 2
> formatting
> fix for part 1
> part 2 tests
Commits on a dev branch are rarely atomic from the start
master
feature
> wip part 1
> part 1 tests
> wip part 2
> formatting
> fix for part 1
> part 2 tests
But they can be reordered ...
master
feature
> Feature Part 1
> Feature part 2
..., edited, and squashed
git rebase -i
feature
master
git rebase -i <base hash>
pick ca04a4d part 2 tests
> base
Your text editor
pick 3625929 formatting
pick b74900a wip part 2
pick fc54a10 fix for part 1
pick 6c8faf1 part 1 tests
pick 37857cc wip part 1
> wip part 1
> part 1 tests
> wip part 2
> formatting
> fix for part 1
> part 2 tests
feature
master
git rebase -i <base hash>
> base
Your text editor
pick ca04a4d part 2 tests
pick 3625929 formatting
pick b74900a wip part 2
pick fc54a10 fix for part 1
pick 6c8faf1 part 1 tests
pick 37857cc wip part 1
> wip part 1
> part 1 tests
> wip part 2
> formatting
> fix for part 1
> part 2 tests
feature
master
git rebase -i <base hash>
> base
Your text editor
fixup ca04a4d part 2 tests
fixup 3625929 formatting
reword b74900a wip part 2
fixup fc54a10 fix for part 1
fixup 6c8faf1 part 1 tests
reword 37857cc wip part 1
> wip part 1
> part 1 tests
> wip part 2
> formatting
> fix for part 1
> part 2 tests
feature
master
git rebase -i <base hash>
> base
Your text editor
> Feature Part 1
> Feature Part 2
Save
Tutorial on rebasing by the <link>
23 people: from juniors to CTOs
10 open questions
About atomicity, merging, rebasing, usefulness and harm of certain practices
"Atomic Flow" takes ~10% of dev time.
But < 30 mins per task.
✅ When I first started making commits atomic, it took me 50% of my dev time. At this moment it takes me about 10% of my dev time.
⚠️ I usually don't keep my commits atomic, because it requires too much time, and sometimes there are commits like "wip", "without tests" etc.
✅ It doesn't take time, that's how I always work. Maybe 1%.
✅ Usually it is easy enough. Maybe about 10% of dev time is spent on rebasing and reorganizing commits, but there are many advantages.
⚠️ From zero for a simple bug fix, to 50%+ for a full-fledged feature, that exposes number of cross-component dependencies.
✅ A huge impact on the maintainability of the product: track reasoning, bisecting, reverting etc.
⚠️ I think it brings a value in some environments. It is inconclusive to me.
✅ Structured change sets are a godsend to reviewers. It's particularly valuable for large refactoring changes, because it's easier to rule out accidental behavior changes or even introducing bugs as the code mutates.
⚠️ Commit granularity is important, but sometimes I feel like the project rules push my work back -- instead of simply merging something, I spend time working on commit messages and code style
AF - 10% / 30min
overhead
"Atomic Flow" boosts the project.
Faster and better reviews and investigations.
✅ I follow this approach, especially in my open source projects, and encourage PR authors / ask my team to follow it
⚠️ Does not matter in general, for large orgs with legacy code I believe it is not possible.
✅ Atomicity matters in large and long project, where employees come and go, and need a structured clear dev history + rules to quickly onboard new devs
⚠️ I don't care about atomicity, because no rule on that in the team
AF - 10% / 30min
overhead
"Atomic Flow" is high value for "Product" project.
Low value for "Enterprise" and "Startup".
AF - boost project
dev
⚠️ My experience - pilot projects. Git is just a tracking instrument. 1 commit = 1 hypothesis.
✅ That's about ability to forget yourself and try to read your changes like somebody else, take the reader's seat.
✅ Linear atomic commit history definitely provided a better experience for code review overall. It helps to complete review faster making it easier to understand.
⚠️ I think the main rule is that a commit mustn't be too large. It doesn't matter much if it's atomic or not.
✅ I think my review speed and quality increase about 5x when the commit change exibits atomicity and linearity.
⚠️ I don't know
AF - 10% / 30min
overhead
"Atomic Flow" review is quantified to be faster 1.5 to 5 times
Due to how easy it is to read the patches.
AF - boost project
dev
⚠️ Does not matter for me much, number of unrelated changes and the overall size of a change do matter
✅ If changes are structurized in some way, it has much more chances to bring me understanding without additional communication over chat, calls and so on.
AF - high value for product
⚠️ I always look at the full diff only. If there are commits like "lint fix", I think they should be squashed at merge
✅ I prefer a list of atomic commits and review them one by one
AF - 10% / 30min
overhead
Half devs look at whole diff
, half - at each commit when possible.
Some do both.
AF - boost project
dev
AF - high value for product
AF - faster review x1.5-5 times
✅ Atomic commits make reviews easier, faster, and more effective by providing clear, logical steps that build upon each other. Each commit tells a story: which makes reviews more intuitive. Commits like "fix typo", "fix review comments" e.g. spams history. Squashing these keeps things clean.
⚠️ Even when I review 'one by one' I still need to have a full picture - so I usually have an overview of the whole diff
⚠️ Default git flow without strict rules about merging and rebasing
✅ Jira-task ID in commit messages, small and logically split commits. Pre-commit hooks for lint, format, and tests.
AF - 10% / 30min
overhead
Linear history is better in every sense. People choose it when have a choice.
Varying in details like if merge-commits are needed.
AF - boost project
dev
AF - high value for product
AF - faster review x1.5-5 times
Reviews approaches vary a lot
✅ Always rebase, split into individual commits, support always-green master, 2 reviewers at least (to make it between three of you).
✅ I still (even with 1 long-lived branch) prefer to use the strict atomic commit structure. I believe it helps for maintenance and code support.
⚠️ I’d lean towards a Git Flow-inspired workflow, emphasizing on clean commit history and minimal merge commits.
⚠️ I usually go with relaxed approach - a dedicated branch for a task/issue with mandatory squash merge. So inside the branch anything goes, but after merging the PR it's just a single commit.
⚠️ Never
✅ git blame is integrated in my IDE, I used it every day.
AF - 10% / 30min
overhead
git-blame
and git-bisect
are used often.
Many use it daily or weekly.
AF - boost project
dev
AF - high value for product
AF - faster review x1.5-5 times
Reviews approaches vary a lot
Linear history is always nice
✅ Git blame — every day! It helps me understand who made a change and why. I also use git bisect (not daily, but I like it — it has helped me find a broken commit more than once)
✅ Of course I use git blame and git bisect extensively, they have helped me a ton of times.
⚠️ Bisect is rarely useful in complex projects and large monorepos
⚠️ I don't use them very often.
⚠️ Sometimes they are helpful, cause I can find specific discussions, pipelines etc
✅ They are not useful. They just add noise and impair readability.
AF - 10% / 30min
overhead
Merge-commits are inconvenient.
Hated by many, ignored by most, liked by few.
AF - boost project
dev
AF - high value for product
AF - faster review x1.5-5 times
Reviews approaches vary a lot
Linear history is always nice
git-blame and -bisect are cool
✅
✅ F***g s***y! How to f***g backport it?
⚠️ In large teams, constantly rebasing can be painful. Merge commits let you integrate changes without rewriting history. This requires less efforts but with rebasing, even though it needs a bit more time, makes things easier in the long run.
AF - 10% / 30min
overhead
AF - boost project
dev
AF - high value for product
AF - faster review x1.5-5 times
Reviews approaches vary a lot
Linear history is always nice
git-blame and -bisect are cool
Merge-commits are trash
"Atomic Flow" takes ~10% of dev time.
But < 30 mins per task.
"Atomic Flow" boosts the project.
Faster and better reviews and investigations.
"Atomic Flow" is high value for "Product" project.
Low value for "Enterprise" and "Startup".
Half devs look at whole diff
, half - at each commit when possible.
Some do both.
"Atomic Flow" review is quantified to be faster 1.5 to 5 times.
Due to how easy it is to read the patches.
Linear history is better in every sense. People choose it when have a choice.
Varying in details like if merge-commits are needed.
git-blame
and git-bisect
are used often.
Many use it daily or weekly.
Merge-commits are inconvenient.
Hated by many, ignored by most, liked by few.
Atomic Commits. The smallest logical unit. Nothing can be removed, nothing needs to be added.
Patchsets. Work on dev branches contains a sequence of atomic commits.
Linear history. Merge commits are eliminated or empty. Rebase before merge always.
Interactive rebase. The main universal tool in your "Atomic Flow" toolbox.
Focus on atomicity, history, reviews, longevity.
Focus on simplicity, branch structure and linearity.
Focus on merging, branching, releases.
By Adam Ruka - see the wonderful article series.
By Vladislav Shpilevoy
The presentation shows a Git workflow - "Atomic Flow" - which helps to manage the so called high-stakes projects. It shows the cost of inefficient Git usage for the team, explains the pros and cons of merge commits and rebasing, shows the huge importance of the "atomic commits", and closes up with some benchmarks and real life examples how "Atomic Flow" was helping people.
Backend C++ developer at VirtualMinds. Database C developer at Tarantool.