Continuous Integration
- and -
Continuous Delivery
Continuous Integration (CI)
the practice of merging all developer working copies to a shared mainline several times a day
Let's talk about branches
- Time is wasted on understanding change
- Time is wasted on explaining change
- Time is wasted on resolving conflict
- And just when you are ready to push the resolution...
- Someone else has already merged another branch
Why is this so bad?
- Merge conflicts waste a lot of time
- Merge conflicts cause duplicate work
- Merge conflicts are frustrating to deal with
- We take shortcuts when resolving conflicts:
  - skip tests
  - don't look for the best solution
  - just want to get it over with
- We are afraid to refactor a large code base because we know that chances are we'll have to deal with merge conflicts when we are finally ready to push.
How does CI solve this?
- CI means that everybody pushes all of the code in their working folder to master as often as they can.
- The more often you push, the lower your risk of hitting a conflict.
- Your local master is also a branch. If it takes you a few days to push your changes, you will have the same problems.
- If you don't pull the latest code before starting to work, you will have the same problems.
CI and automated builds
- The most important thing that lets us integrate code into master so often without fear of breaking everything is tests.
- We cannot rely only on unit tests. Tests must verify that the integration between all system components works.
- CI servers such as Jenkins, TeamCity and Travis are responsible for running builds and tests after every pushed commit and giving quick feedback to the author.
What needs to happen for this to work?
- Tests Tests Tests & Integration Tests
- Push small commits
- Push as fast as possible
- Run tests before you push
- CI builds should be fast
- CI builds should be stable
- CI builds should be identical to local builds
- CI server should report to committer on failures
- Failures should be handled immediately
- "Only fails in CI server" issues must be handled
- "Only fails locally" issues must be handled
Let's talk about dependencies
- When you change some code, how do you know your change didn't break anyone using that code?
- You know the answer! It's tests. You rely on everyone using your code to have integration tests
- When will you know that you broke someone using your code?
- Immediately! Because your local tests are always running thanks to a watch task and tools like Wallaby
- When you change some code in a library, how will you know you broke someone? When will you know?
CI & dependencies
- CI means that a new version of a library is published automatically on every push and all of its dependents are built on the CI server
- CI means that the person who updates the library is responsible for fixing any build that fails because of the update
- Hardcoded versions are much worse than branches. Bumping a hardcoded version is like merging a long-lived branch, only worse
- We happily pay the price of always maintaining backward compatibility and deprecations. We rarely bump the major version
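To build "all dependents" after a library is published, the CI server needs to know who uses what. A rough sketch of that lookup, assuming a simple in-memory dependency map (the project names and the `DEPENDENCIES` structure are invented for illustration, not taken from any real CI server):

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical dependency map: each project lists the libraries it uses.
DEPENDENCIES: Dict[str, List[str]] = {
    "ui-lib": [],
    "editor": ["ui-lib"],
    "dashboard": ["ui-lib"],
    "site": ["editor"],
}


def dependents_to_rebuild(changed: str, dependencies: Dict[str, List[str]]) -> List[str]:
    """Return every project that directly or transitively uses `changed`,
    in breadth-first order (direct dependents first)."""
    # Invert the map: library -> projects that use it.
    used_by: Dict[str, List[str]] = {name: [] for name in dependencies}
    for project, libs in dependencies.items():
        for lib in libs:
            used_by.setdefault(lib, []).append(project)

    order: List[str] = []
    seen: Set[str] = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for project in used_by.get(current, []):
            if project not in seen:
                seen.add(project)
                order.append(project)
                queue.append(project)
    return order
```

With this map, a push to `ui-lib` triggers builds of `editor`, `dashboard` and `site`, and whoever pushed to `ui-lib` owns any of those builds that turn red.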
Pull Requests are Also Branches, right?
- Yes, pull requests allow an online review of a branch before it is merged
- We promote the use of pull requests only in libraries that affect many projects. We recommend that others review offline
- Reviewers are instructed to review pull requests as soon as possible
- Reviewers are instructed to avoid pushing to code areas that are affected by pending pull requests
- We have a dedicated TeamCity setup that builds and tests all pull requests before they are merged.
If I already mentioned pull requests...
Disable merge commits. They are good for nothing, they help with nothing, they just fuck up your git history.
git pull --rebase
Continuous Delivery (CD)
the practice of ensuring that the latest build can be reliably released at any given moment
This story is going to sound familiar...
- Once upon a time, not so long ago, I worked for a company that released a version every month
- For the first two weeks we worked on adding features
- Then we went into feature freeze, where no one was allowed to add new features and only bug fixes were allowed
- Then in the last week we were in code freeze, where no one was allowed to push any code unless it was a critical bug fix
- Then we released the version
- Somewhere in the middle of this process we also created an additional branch for the next version, and some of the people started to work on features for the next release.
- BTW, I'm completely lying, we never ever released on time
What's so wrong about this?
- Releasing a new feature took too long
- Shipping a bug fix took too long
- Important bug fixes had to be made in multiple version branches
- Sometimes in different ways, since the code was already very different
- A lot of "hidden unemployment" during freeze periods
- A huge load on QA, who were always testing the current version, the in-progress version and the next version
How does CD solve this?
- All developers can deploy a new version any time.
- The system is often broken down into smaller deployables.
- All developers can monitor all of the main measurements of every part of the application (both technical and business).
- All developers can roll back to the previous version if they feel something went wrong.
- Any version that is marked as GA ready can be deployed.
- Any new feature, whether complete or in progress, is closed behind a toggle. Enabling features is not part of deployment.
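The deploy and rollback rules above fit in a few lines. This is only a sketch of the bookkeeping, assuming one deployable and an in-memory history; the `Deployable` class is hypothetical and is not how any particular deployment tool works.

```python
from typing import List, Optional, Set


class Deployable:
    """Tracks which versions may be deployed and which one is live."""

    def __init__(self) -> None:
        self.ga_ready: Set[str] = set()
        self.history: List[str] = []  # deployed versions, oldest first

    @property
    def live(self) -> Optional[str]:
        """The version currently in production, or None before the first deploy."""
        return self.history[-1] if self.history else None

    def mark_ga(self, version: str) -> None:
        self.ga_ready.add(version)

    def deploy(self, version: str) -> None:
        # Only versions marked as GA ready may go to production.
        if version not in self.ga_ready:
            raise ValueError(f"{version} is not marked as GA ready")
        self.history.append(version)

    def rollback(self) -> Optional[str]:
        """Return to the previously deployed version and report what is live now."""
        if len(self.history) < 2:
            raise RuntimeError("no previous version to roll back to")
        self.history.pop()
        return self.live
```

Note that nothing here knows about features: a deploy only changes which code is running, and toggles decide what that code does.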
Create a release candidate with the click of a button
Deploy with the click of a button
Roll back with the click of a button
Only a single module is deployed
Acceptance tests run automatically on RC (soon)
Artifacts actively maintained by a team of 6 developers. Deployed at least once a week.
Watching New Relic carefully after deploy
Constantly watching mission-critical events and transactions in Anodot
What is a toggle?
We don't mind pushing to master and deploying to production. Even if the code is not ready. Even if it is totally broken.
As long as it is closed behind a toggle, we happily CI/CD without fear and save a shitload of time and effort that would otherwise be wasted on developing it in a branch.
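In code, a toggle is just a condition around the new path. A minimal sketch, assuming toggles arrive as a plain name-to-boolean mapping; the toggle name and the discount logic are invented for illustration.

```python
from typing import Dict, List


def is_open(toggles: Dict[str, bool], name: str) -> bool:
    """A toggle that is missing from the configuration counts as closed."""
    return toggles.get(name, False)


def checkout_total(prices: List[float], toggles: Dict[str, bool]) -> float:
    if is_open(toggles, "new-discount-engine"):
        # Unfinished code path: already merged and deployed,
        # but unreachable while the toggle is closed.
        return round(sum(prices) * 0.9, 2)
    # Existing behavior, served to everyone while the toggle is closed.
    return float(sum(prices))
```

Defaulting to closed is the important design choice: code that ships before its toggle is configured behaves exactly like the old version.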
Lifecycle of a toggle
- We create a toggle for every change we make. It can be a new feature, but it can also be something completely technical like refactoring some code.
- We open it gradually and carefully, usually to employees first, and depending on the toggle we consult monitoring systems and business analysts about the results.
- Getting a toggle completely open to all users can take anywhere from an hour to a month.
- At any time we can close a toggle in order to fix things and then reopen it.
- Once a toggle has been fully open for long enough, we remove the condition from the code, leave only the wanted behavior and deploy again.
- Every time you decide to branch instead of toggle, you are wrong. Guaranteed by the countless times people did the wrong thing and regretted it.
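Gradual opening can be sketched as follows, assuming two stages (employees, then a percentage of all users). The `Toggle` fields and the hashing scheme are assumptions for illustration and do not describe any specific toggle system.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Toggle:
    name: str
    open_to_employees: bool = False
    percent_of_users: int = 0  # 0 = closed, 100 = fully open


def bucket(toggle_name: str, user_id: str) -> int:
    """Map a user to a stable bucket in 0..99, so the same user
    gets the same answer on every request."""
    digest = hashlib.sha256(f"{toggle_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_open_for(toggle: Toggle, user_id: str, is_employee: bool) -> bool:
    # Stage 1: employees see the change before anyone else.
    if is_employee and toggle.open_to_employees:
        return True
    # Stage 2: open to a growing share of all users.
    return bucket(toggle.name, user_id) < toggle.percent_of_users
```

Closing the toggle means setting `open_to_employees` back to `False` and `percent_of_users` back to `0`, which needs no deploy. Once it has sat at 100 for long enough, the `if` and the old path are deleted from the code.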
Statics Override
A Chrome extension that lets you see how production will look with your RC deployed
Great for making sure everything works well before you deploy
Great tool for manual QA
Same technology is used for acceptance tests
Petri Sidekick
A Chrome extension that lets you see how production will look with your toggle enabled
Great for making sure everything works well before you enable
Great tool for manual QA
Same technology is used for acceptance tests
Elgar
Manage toggles and versions without a Chrome extension
Good for other browsers or mobile
You can also generate links with overrides to send to people
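A link with overrides is just a normal URL with extra query parameters. Here is a sketch of how such a link could be generated. The parameter names are hypothetical, since the slides do not show the real URL format these tools use.

```python
from typing import Dict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def link_with_overrides(url: str, overrides: Dict[str, str]) -> str:
    """Return `url` with the overrides merged into its query string.

    Existing query parameters are kept. An override with the same
    name as an existing parameter replaces it.
    """
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(overrides)
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, `link_with_overrides("https://example.com/editor?lang=en", {"toggle.new-editor": "true"})` produces a link that opens the page with that (made-up) toggle override applied for whoever clicks it.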
Is CI/CD Perfect?
- Making builds reliable, fast & stable is hard
- Huge dependency trees with unstable builds are an incredible burden that takes enormous effort to stabilize
- Keeping services backward compatible is hard
- Having so much responsibility in developers' hands is hard:
  - Unit tests
  - Integration tests
  - Acceptance tests
  - Deployment
  - Monitoring
  - Toggles
- Getting everyone in the organization to be positive about solving problems instead of running back to the worse alternative is HARD
- But it is worth it!
Questions?
CI/CD
By Shahar Talmi