Continuous Integration

- and -

Continuous Delivery

Continuous Integration (CI)

the practice of merging all developer working copies to a shared mainline several times a day

Let's talk about branches

Time is wasted on understanding change
Time is wasted on explaining change
Time is wasted on resolving conflict
And just when you are ready to push the resolution...
Someone else has already merged another branch

Why is this so bad?

Merge conflicts waste a lot of time
Merge conflicts cause duplicate work
Merge conflicts are frustrating to deal with
We take shortcuts when resolving conflicts:
- skip tests
- don't look for the best solution
- just want to get it over with
We are afraid to refactor a large code base because we know that chances are we'll have to deal with merge conflicts when we're finally ready to push.

How does CI solve this?

CI means that everybody needs to push all of the code in their working folder to master as often as they can.
The more often you push, the lower your risk of a conflict.
Your local master is also a branch. If it takes you a few days to push your changes, you will have the same problems.
If you don't pull the latest code before starting work, you will have the same problems.

CI and automated builds

The most important thing that lets us integrate code into master so often without fear of breaking everything is tests.
We cannot rely on unit tests alone. Tests must verify that the integration between all system components works.
CI servers such as Jenkins, TeamCity and Travis are responsible for running builds and tests after every pushed commit and giving quick feedback to the author.
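
The feedback loop above can be sketched in a few lines. This is only an illustration, not how Jenkins, TeamCity or Travis are implemented: `BuildReport`, `run_build` and `notify` are hypothetical names, and the test command is whatever your project uses.

```python
import subprocess
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BuildReport:
    author: str   # who pushed the commit
    passed: bool  # True when the command exited with code 0
    output: str   # combined stdout and stderr of the build


def run_build(author: str, command: Sequence[str]) -> BuildReport:
    """Run the build/test command for one pushed commit and capture the result."""
    result = subprocess.run(command, capture_output=True, text=True)
    return BuildReport(
        author=author,
        passed=result.returncode == 0,
        output=result.stdout + result.stderr,
    )


def notify(report: BuildReport) -> str:
    """Build the message a CI server would send back to the commit author."""
    status = "passed" if report.passed else "FAILED"
    return f"To {report.author}: build {status}"
```

A real server would trigger `run_build` from a push webhook and deliver the `notify` message by mail or chat. The point is only that every push gets a verdict addressed to its author.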

What needs to happen for this to work?

Tests Tests Tests & Integration Tests
Push small commits
Push as fast as possible
Run tests before you push
CI builds should be fast
CI builds should be stable
CI builds should be identical to local builds
CI server should report to committer on failures
Failures should be handled immediately
"Only fails in CI server" issues must be handled
"Only fails locally" issues must be handled

Let's talk about dependencies

When you change some code, how do you know your change didn't break anyone using that code?
You know the answer! It's tests. You rely on everyone using your code to have integration tests.
When will you know that you broke someone using your code?
Immediately! Because your local tests are always running, thanks to a watch task and tools like Wallaby.
When you change some code in a library, how will you know you broke someone? When will you know?

CI & dependencies

CI means that a new version of each library is published automatically on every push, and all dependents are built on the CI server
CI means that the person who updates the library is responsible for fixing any build that fails because of it
Hardcoded versions are much worse than branches. Bumping a hardcoded version is like merging a long-lived branch, only worse
We happily pay the price of always maintaining backward compatibility and deprecations. We rarely bump the major version
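
To make the hardcoded-versus-floating difference concrete, here is a minimal sketch. It assumes semantic versioning and an npm-style caret range with a major version of 1 or higher; the slides do not name a package manager, and `satisfies_caret` and `satisfies_pinned` are hypothetical helpers, not a real library API.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse "MAJOR.MINOR.PATCH" into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def satisfies_caret(version: str, base: str) -> bool:
    """Floating dependency: any release >= base within the same major version.

    Assumes base has a major version of 1 or higher.
    """
    v, b = parse(version), parse(base)
    return v[0] == b[0] and v >= b


def satisfies_pinned(version: str, pinned: str) -> bool:
    """Hardcoded dependency: only the exact version is accepted."""
    return parse(version) == parse(pinned)
```

With a floating range, a dependent picks up 1.4.2 on its next build and fails right away if the library broke it. With a pinned version nothing happens until someone bumps the pin by hand, by which point many changes have piled up.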

Pull Requests are Also Branches, right?

Yes, pull requests allow an online review of a branch before it is merged
We promote pull requests only in libraries that affect many projects. For the rest, we recommend reviewing offline
Reviewers are instructed to review pull requests as soon as possible
Reviewers are instructed to avoid pushing to code areas that are affected by pending pull requests
We have a special TeamCity that builds and tests all pull requests before they are merged.

If I already mentioned pull requests...

Disable merge commits. They are good for nothing, they help with nothing, they just fuck up your git history

git pull --rebase

Continuous Delivery (CD)

the practice of ensuring that the latest build can be reliably released at any given moment

This story is going to sound familiar...

Once upon a time, not so long ago, I worked for a company that released a version every month
In the first two weeks we worked on adding features
Then we went into feature freeze, where no one was allowed to add new features and only bug fixes were allowed
Then, in the last week, we were in code freeze, where no one was allowed to push any code unless it was a critical bug fix
Then we released the version
Somewhere in the middle of this process we also created an additional branch for the next version, and some people started to work on the next release's features.
BTW, I'm completely lying, we never ever released on time

What's so wrong about this?

Releasing a new feature took too long
Shipping a bug fix took too long
Important bug fixes needed to be made in multiple version branches
Sometimes in different ways, since the code had already diverged
A lot of "hidden unemployment" during freeze periods
Huge load on QA, always testing the current version, the in-progress version and the next version

How does CD solve this?

All developers can deploy a new version at any time.
The system is often broken down into smaller deployables.
All developers can monitor all of the main measurements of every part of the application (both technical and business).
All developers can roll back to the previous version if they feel something went wrong.
Any version that is marked as GA ready can be deployed.
Any new feature, whether complete or in progress, is closed behind a toggle. Enabling features is not part of deployment.
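
The deploy and rollback rules above fit in a small model. This is a sketch under assumptions, not the actual deployment system: `Deployable`, `mark_ga`, `deploy` and `rollback` are hypothetical names, and the history is kept in memory only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Deployable:
    name: str
    ga_ready: Set[str] = field(default_factory=set)   # versions allowed to ship
    history: List[str] = field(default_factory=list)  # deployed versions, oldest first

    @property
    def live(self) -> Optional[str]:
        """The version currently in production, if any."""
        return self.history[-1] if self.history else None

    def mark_ga(self, version: str) -> None:
        self.ga_ready.add(version)

    def deploy(self, version: str) -> None:
        """Any developer may deploy, but only a version marked GA ready."""
        if version not in self.ga_ready:
            raise ValueError(f"{version} is not marked GA ready")
        self.history.append(version)

    def rollback(self) -> Optional[str]:
        """Drop the live version and return the one that is live afterwards."""
        if len(self.history) < 2:
            raise RuntimeError("no previous version to roll back to")
        self.history.pop()
        return self.live
```

Nothing here turns a feature on or off. Deploying only changes which code is running, and toggles decide what users see.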

Create a release candidate with the click of a button

Deploy with the click of a button

Roll back with the click of a button

Only a single module is deployed

Acceptance tests run automatically on the RC (soon)

Artifacts actively maintained by a team of 6 developers. Deployed at least once a week.

Watching New Relic carefully after deploy

Constantly watching mission-critical events and transactions in Anodot

What is a toggle?

We don't mind pushing to master and deploying to production. Even if it is not ready. Even if it is totally broken.

As long as it is closed, we happily CI/CD without fear, saving a shit load of time and effort that would have been wasted on developing it in a branch.
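
In code, a toggle is just a condition around the new path. A minimal sketch follows; the `Toggles` class, the `new-pricing` toggle name and the `checkout_total` function are made up for illustration and are not the toggle system used in this talk.

```python
from typing import Dict, Sequence


class Toggles:
    """In-memory toggle store. Unknown toggles are treated as closed."""

    def __init__(self) -> None:
        self._state: Dict[str, bool] = {}

    def open(self, name: str) -> None:
        self._state[name] = True

    def close(self, name: str) -> None:
        self._state[name] = False

    def is_open(self, name: str) -> bool:
        return self._state.get(name, False)


def checkout_total(prices: Sequence[float], toggles: Toggles) -> float:
    if toggles.is_open("new-pricing"):
        # New, possibly unfinished path: deployed, but unreachable while closed.
        return round(sum(prices) * 0.9, 2)
    # Existing behavior stays the default until the toggle is opened.
    return sum(prices)
```

Because an unknown toggle counts as closed, half-finished code can sit in master and in production without any user ever reaching it.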

Lifecycle of a toggle

We create a toggle for every change we make. It can be a new feature, but it can also be something completely technical, like refactoring some code.
We open gradually and carefully, usually to employees first, and, depending on the toggle, consult monitoring systems and business analysts on the results.
Getting a toggle completely open to all users can take anywhere from an hour to a month.
At any time we can close a toggle in order to fix things and then reopen it.
Once a toggle has been fully open for long enough, we remove the condition from the code, leave only the wanted behavior and deploy again.
Every time you decide to branch instead of toggle, you are wrong. Guaranteed by the endless times people did the wrong thing and regretted it.
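
The gradual opening described in this lifecycle could look roughly like the sketch below. It assumes a percentage rollout keyed on a stable hash of the user id, which the slides do not specify; `Rollout` and its fields are hypothetical names.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Rollout:
    name: str
    open_to_employees: bool = False  # first stage: employees only
    percent: int = 0                 # share of all users, 0..100

    def bucket(self, user_id: str) -> int:
        """Map a user to a stable bucket in 0..99 so their experience does not flip."""
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % 100

    def is_open_for(self, user_id: str, is_employee: bool) -> bool:
        if is_employee and self.open_to_employees:
            return True
        # With percent == 0 nobody passes; with percent == 100 everybody does.
        return self.bucket(user_id) < self.percent
```

Closing the toggle to fix something means setting `open_to_employees` to False and `percent` to 0. Once `percent` has stayed at 100 for long enough, the condition is deleted from the code.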

Statics Override

Chrome extension allows you to see how production will look with your RC deployed

Great for making sure everything works well before you deploy

Great tool for manual QA

Same technology is used for acceptance tests

Petri Sidekick

Chrome extension allows you to see how production will look with your toggle enabled

Great for making sure everything works well before you enable

Great tool for manual QA

Same technology is used for acceptance tests

Elgar

Manage toggles and versions without a Chrome extension

Good for other browsers or mobile

You can also generate links with overrides to send to people

Is CI/CD Perfect?

Making builds reliable, fast & stable is hard
Huge dependency trees with unstable builds are an incredible burden which requires enormous efforts to stabilize
Keeping services backward compatible is hard
Having so much responsibility in developers' hands is hard:
- Unit tests
- Integration tests
- Acceptance tests
- Deployment
- Monitoring
- Toggles
Getting everyone in the organization to be positive about solving problems instead of running back to a worse alternative is HARD
But it is worth it!

Questions?

https://github.com/wix-private/nothing-to-prod

CI/CD

By Shahar Talmi
