CONTINUOUS
INTEGRATION
Software Engineering Lab.
Spring 2017
definition
- Continuous Integration (CI) is a development practice that requires developers to integrate code into a shared repository several times a day. Each check-in is then verified by an automated build, allowing teams to detect problems early.
- It is NOT 'yet another testing method' (not to be mistaken for Integration Testing)
- It defines when and how to run our test cases
what we'll do
- TL;DR: talk through the basics and answer some fundamental questions:
- What exactly is it?
- Why should it be considered?
- For what cost? Is it worth it?
- How should it be applied in practice?
- Example tools:
- Travis CI
- Jenkins
- Jenkins
- Recap by looking at the advantages and disadvantages
The basics
the holy integration process
- Integration Testing: ensuring that the different components/services that make up an artifact cooperate with each other without errors
- Continuous Integration: ensuring that a new piece of code will not break what is already stable once it is merged
- Continuous Integration will probably include both unit and integration testing
The basics
the holy integration process
- An inevitable part of software development
- Even if you are developing by yourself
- Long, Frustrating, Redundant
- Reason: Accumulation of a large amount of codebase changes, waiting to be merged into the original codebase
- Solution: what we intuitively do against accumulation
- Instead of postponing a ponderous task (on the false pretext that it is 'for the greater good'), do it so frequently that it becomes a normal event
The basics
an example
- Let's assume I have to change a piece of software, and that the change is small and can be done in a few hours
1. Get a clone of the stable version from source control
2. Do whatever needs to be done. Should include:
- Changes in source code
- Changes in test cases
3. Build and test locally
4. Pull the latest version from the mainline stream (why?)
5. Rebuild and re-test
6. Fix the conflicts if step 5 fails
The basics
an example
7. Push the changes to the integration machine to be built and tested again
8. Wait for test results
9. Fix the problems if step 8 fails
10. It's Friday. Go home.
This example might seem both naive and overcomplicated.
Let's look at some practices used in adopting this testing habit.
practice #1: Maintain a Single Source Repository
- EVERYTHING must be included in the core source code repository
- "The basic rule of thumb is that you should be able to walk up to the project with a virgin machine, do a checkout, and be able to fully build the system" -- Martin Fowler
- Even IDE configurations are recommended
- Build scripts are mandatory
- Makefiles
- Committing the build result (generated artifacts) is not recommended
- Note that 3rd party libs. also require build processes
- An absolute must-have for maintaining stable/staging versions of the source code
practice #2: Automate the Build
- Elaborating on the last quote:
- "The basic rule of thumb is that you should be able to walk up to the project with a virgin machine, do a checkout, and be able to fully build the system with a single command" -- Martin Fowler
- An initial goal was speed; automation will indeed help with that.
- IDE build tools should be avoided in main tests.
- Long builds should be avoided
- Ideally, a good build tool analyzes what has changed and recompiles only the affected parts.
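A minimal sketch of the incremental idea in the last bullet, assuming a make-style rule: a target is rebuilt only when it is missing or older than one of its sources. The name `stale_targets` and the way timestamps are passed in are assumptions for illustration; real build tools track dependencies in far more detail.

```python
from typing import Dict, List, Optional


def stale_targets(
    deps: Dict[str, List[str]],
    mtimes: Dict[str, Optional[float]],
) -> List[str]:
    """Return the targets that need rebuilding, sorted by name.

    deps maps each target to the sources it is built from.
    mtimes maps a file name to its last-modified time (None or absent = missing).
    """
    stale = []
    for target, sources in deps.items():
        target_time = mtimes.get(target)
        if target_time is None:
            # The target was never built, so it must be built now.
            stale.append(target)
            continue
        # Rebuild when any source is newer than the target.
        # A missing source is treated as time 0.0 (an assumption of this sketch).
        if any((mtimes.get(src) or 0.0) > target_time for src in sources):
            stale.append(target)
    return sorted(stale)
```

Given hypothetical files where only `app.c` changed after the last build, only the object file built from it is reported; everything else is left alone, which is what keeps the build short.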
practice #3: Make Your Build Self-Testing
- Traditionally a build means compiling, linking, and all the additional stuff required to get a program to execute. A program may run, but that doesn't mean it does the right thing.
- A good way to catch bugs more quickly and efficiently is to include automated tests in the build process.
- The rise of TDD and XP has had a great impact on what's called self-testing code/builds
- They both emphasize writing tests before code
- We have weaker requirements for self-testing code:
  - Good coverage
  - Simple to run
  - Embeddable in the build process
practice #3: Make Your Build Self-Testing
- Of course you can't count on tests to find everything
- "Imperfect tests, run frequently, are much better than perfect tests that are never written at all"
practice #4: Everyone Commits To the Mainline Every Day
- Integration is primarily about communication and responsibility
- Integration allows developers to tell other developers about the changes they have made. Frequent communication lets people know quickly as changes develop.
- By keeping the time between changes short, bugs are easier to find, because the changes are not widespread
- "The key to fixing problems quickly is finding them quickly"
- The more frequently you commit, the fewer places you have to look for conflict errors, and the more rapidly you fix conflicts.
- Frequent commits encourage developers to break down their work into small chunks of a few hours each
practice #5: Every Commit Should Build the Mainline on an Integration Machine
- With daily commits, a team gets frequent tested builds. This ought to mean that the mainline stays in a healthy state. In practice, however, things still do go wrong:
  - An untested commit
  - Development machine variation
- As a result, each commit should be tested on a separate machine (a.k.a. Continuous Integration server)
- A CI server monitors the mainline source code
  - Checks out a new version after every commit, then builds, tests, and notifies the developer
  - Scheduled builds and tests
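What a CI server does for each commit can be sketched as a small function: run the steps in order, stop at the first failure, and notify the developer either way. This is only an illustration; `verify_commit`, the step names, and the `notify` callback are hypothetical and not the API of any particular CI server.

```python
from typing import Callable, List, Tuple

# A step is a (name, action) pair; the action returns True on success.
Step = Tuple[str, Callable[[str], bool]]


def verify_commit(
    commit: str,
    steps: List[Step],
    notify: Callable[[str], None],
) -> bool:
    """Run each step against the commit; stop and notify at the first failure."""
    for name, run in steps:
        if not run(commit):
            notify(f"{commit}: {name} failed")
            return False
    notify(f"{commit}: all steps passed")
    return True
```

A server would call this once per pushed commit with steps such as checkout, build, and test, and could call it on a schedule as well.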
practice #6: Fix Broken Builds Immediately
- The mainline is the Holy Grail of the development focus
- If it fails against a commit, it should be fixed immediately
  - Most of the time, due to the short gap between the current commit and the last one, the reason is obvious
  - Might lead to reverting the mainline and giving the commit a second thought.
- "nobody has a higher priority task than fixing the build" -- Kent Beck
- This does not mean everyone has to stop what they are doing and struggle to fix the build
- It further encourages strong communication and collaboration toward a unified goal
practice #7: Keep the Build Fast
- The whole point of Continuous Integration is to provide rapid feedback
- A build time of at most 10 minutes is within reason
- Most small projects will build within minutes, but enterprise applications will not. Typical slow parts:
  - End-to-end testing
  - Service discovery scenarios
  - DB testing: inserting and modifying millions of records
  - Load testing: benchmarking the server's response time under a huge load for a long time
- An accepted solution is to have multiple build stages
practice #7: Keep the Build Fast
- Stage 1:
  - A commit is tested against fast unit tests
  - The mainline is updated for other developers, although the end product might not be. The committer can go home afterwards
- Stage 2:
  - Long tests run later, perhaps in parallel
- Bugs caught by stage n should be transformed into small, fast tests and migrated into stage m, where m < n
- Tests must improve over time
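The staged build can be sketched as a list of stages, each a list of checks, run in order until one stage fails. This is a simplified model under stated assumptions: the name `run_stages` is hypothetical, checks are plain callables, and stages run sequentially rather than in parallel.

```python
from typing import Callable, List, Optional

# A check is any callable that returns True when it passes.
Check = Callable[[], bool]


def run_stages(stages: List[List[Check]]) -> Optional[int]:
    """Run stages in order.

    Returns the index of the first failing stage, or None if all pass.
    Later stages are not run once a stage has failed.
    """
    for index, checks in enumerate(stages):
        # all() stops at the first failing check within a stage.
        if not all(check() for check in checks):
            return index
    return None
```

Stage 0 would hold the fast unit tests that gate the mainline. When a later stage returns a failure, the bug it exposed is a candidate for a new fast check in an earlier stage.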
practice #8: Test in a Clone of the Production Environment
- The CI server was introduced primarily for two reasons:
  - Avoiding dependency on the development environment
  - 24/7 monitoring of the mainline
- Taking this further leads to striving to duplicate everything from the production machine.
- This might not always be possible. Mimicking every single parameter of the production environment is time consuming (which we talked about in #7)
- Nowadays, most CI servers do this to a certain degree
practice #9: Everyone can see what's happening
- One of the most important things to communicate is the state of the mainline build
- Most open source projects have multiple GitHub badges
- Some companies that use internal source control and CI servers change the ambience of their room based on build status.
- Keeping everyone informed encourages them to stay involved with the project (simple, yet effective gamification)
continuous integration tools
jenkins
travis ci
ci tools
travis ci
- Minimal
- Simple configuration
- Vast Language support
- Sufficient reporting
- Used for most independent and open source projects
- Unlimited and free for public repositories
- Supports ONLY GitHub
ci tools
Jenkins
- Comprehensive
- It comes at a price: Learning curve
- Many plugins and reporters:
- Code quality
- Code style
- Self-Contained executable
- You need a private server
ci tools: travis ci
build pipeline: node-js
- Everything starts with a very simple .travis.yml file
language: node_js
node_js:
- "7"
- This file must be added to the root of the source code
ci tools: travis ci
build pipeline: node-js
- Each build has two phases
- install: default setup script of the language (npm install)
- script: the default test script of the language (npm test)
- Both can be overridden:
ci tools: travis ci
build pipeline: node-js
language: node_js
node_js:
- "7"
install: ./install-dependencies.sh
# or
install:
- bundle install --path vendor/bundle
- npm install
script: ./custom-test.sh
# or
script:
- mytest --run
- npm test
ci tools: travis ci
build pipeline: node-js
install: ./foo.sh
before_script:
- apt-get install redis
- redis-server
after_success:
- ./yoohoo.sh
after_failure:
- ./revert-all.sh
Hooks can be added to different phases of the build process (install, script)
ci tools: travis ci
build pipeline: node-js
deploy:
  provider: npm
after_deploy: ./update-doc.sh
before_deploy: ./clean-up.sh
Optional deploys can be added using continuous deployment providers (e.g. npm)
ci tools: travis ci
build pipeline: node-js
before_install:
- sudo apt-get update -qq
- sudo apt-get install -qq [packages list]
Packages can / should be installed using one of the hooks, depending on their type
ci tools: travis ci
build pipeline: target branch
# blocklist
branches:
  except:
    - legacy
    - experimental
# safelist
branches:
  only:
    - master
    - stable
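The effect of the two lists can be sketched as a small decision function. This is an illustration, not Travis CI's implementation: `should_build` is a hypothetical name, branch names are matched exactly, and checking the safelist first when both lists are given is an assumption of this sketch.

```python
from typing import Optional, Sequence


def should_build(
    branch: str,
    only: Optional[Sequence[str]] = None,
    except_: Optional[Sequence[str]] = None,
) -> bool:
    """Decide whether a push to `branch` triggers a build."""
    if only is not None:
        # Safelist: build only the listed branches.
        return branch in only
    if except_ is not None:
        # Blocklist: build everything except the listed branches.
        return branch not in except_
    # No filter configured: build every branch.
    return True
```

With the safelist above, a push to `master` is built and a push to a feature branch is not; with the blocklist, everything except `legacy` and `experimental` is built.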
ci tools: travis ci
build pipeline: skipping a commit
git commit -am "this will be ignored [ci skip]"
git commit -am "also this [skip ci]"
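As a rough sketch, the skip check amounts to looking for a marker in the commit message. The function below covers only the two markers shown above and matches them exactly; `is_skipped` is a hypothetical name, and the real service may accept other variants.

```python
# The two skip markers shown in the commands above.
SKIP_MARKERS = ("[ci skip]", "[skip ci]")


def is_skipped(commit_message: str) -> bool:
    """Return True if the message contains one of the skip markers."""
    return any(marker in commit_message for marker in SKIP_MARKERS)
```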
ci tools: travis ci
build pipeline: Build matrix
language: ruby
rvm:
- 1.9.3
- 2.0.0
- 2.1.0
env:
- DB=mongodb
- DB=redis
- DB=mysql
gemfile:
- Gemfile
- gemfiles/rails4.gemfile
- gemfiles/rails31.gemfile
- gemfiles/rails32.gemfile
ci tools: travis ci
build pipeline: Build matrix
- The last .travis.yml file included 3 * 3 * 4 tasks
- This can be further modified
matrix:
  exclude:
    - rvm: 2.0.0
      gemfile: Gemfile
# which is equivalent to
matrix:
  exclude:
    - rvm: 2.0.0
      gemfile: Gemfile
      env: DB=mongodb
    - rvm: 2.0.0
      gemfile: Gemfile
      env: DB=redis
    - rvm: 2.0.0
      gemfile: Gemfile
      env: DB=mysql
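The arithmetic behind the matrix can be checked with a short sketch. It expands the three axes into every combination and drops the jobs matched by an exclude entry. The name `expand_matrix` is hypothetical, and treating an entry as matching every job that agrees on the keys it names is an assumption that mirrors the two exclude forms above.

```python
from itertools import product
from typing import Dict, List


def expand_matrix(
    axes: Dict[str, List[str]],
    exclude: List[Dict[str, str]],
) -> List[Dict[str, str]]:
    """Expand the axes into jobs, dropping any job that matches an exclude entry."""
    keys = list(axes)
    jobs = [dict(zip(keys, combo)) for combo in product(*(axes[k] for k in keys))]

    def excluded(job: Dict[str, str]) -> bool:
        # An entry matches when every key it names has the same value in the job.
        return any(
            all(job.get(k) == v for k, v in entry.items()) for entry in exclude
        )

    return [job for job in jobs if not excluded(job)]


axes = {
    "rvm": ["1.9.3", "2.0.0", "2.1.0"],
    "env": ["DB=mongodb", "DB=redis", "DB=mysql"],
    "gemfile": [
        "Gemfile",
        "gemfiles/rails4.gemfile",
        "gemfiles/rails31.gemfile",
        "gemfiles/rails32.gemfile",
    ],
}
all_jobs = expand_matrix(axes, exclude=[])
remaining = expand_matrix(axes, exclude=[{"rvm": "2.0.0", "gemfile": "Gemfile"}])
print(len(all_jobs), len(remaining))  # prints: 36 33
```

The full matrix has 3 * 3 * 4 = 36 jobs; excluding `rvm: 2.0.0` with `gemfile: Gemfile` removes one job per `env` value, leaving 33.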
ci tools: travis ci
example
- Start with an existing project
- Create a new branch
- Add a commit -> Test locally
- Push to new branch
- Observe build status online
- Create a pull request
ci tools: travis ci
additional
- Many features were omitted:
- Docker support
- GUI testing: Sauce Labs
- Cron Jobs
- 1 Build -> n Tasks
- Pull request integration
- Badges
- Command line interface
recap: CI benefits
- The trouble with deferred integration is that it's very hard to predict how long it will take to do, and worse it's very hard to see how far you are through the process (complete blind spot).
- With CI there is no long integration; you completely eliminate the blind spot.
- Continuous Integration doesn't get rid of bugs, but it does make them dramatically easier to find and remove.
- As a result, projects with Continuous Integration tend to have dramatically fewer bugs, both in production and in process
recap: CI benefits
- CI removes one of the biggest barriers to frequent deployment.
- Significant effect on team efficiency.
CI - Software Engineering Lab
By Kian Peymani