DevOps
Technologies for Tomorrow
Rúben dos Santos Barros
Summary
- Traditional Software Development
- DevOps
- Research Question
- Goals
- The Road So Far
- Conclusions
Traditional Software Development
- Development
- Test/QA
- Operations
Traditional Software Development
Development goals:
- Faster development.
Need for Change
Operations goals:
- Server uptime;
- Application response time.
Fear of Change
Source: ITSM/Serena.com 2012 study of IT professionals
Traditional Software Development
- 75% say that Ops is a roadblock;
- 72% say that Dev is not supportive.
What if we...
- Were faster in time-to-market by deploying more often?
- Didn't have to choose between stability and new features?
- Could increase our effectiveness?
DevOps
August 2008, Toronto, Canada: Agile Infrastructure
October 2009, Ghent, Belgium: Developers + System Administrators
Andrew Shafer and Patrick Debois
DevOps (Development and Operations) describes a culture in which business owners and the development, operations, and quality assurance departments collaborate to deliver software continuously. It encourages practices to evolve to fit that culture, focusing on business objectives instead of departmental ones [Sharma and Coyne, 2015].
DevOps
DevOps = New Mindset + New Tools + New Skills
DevOps
Automation
- Automate Code Testing;
- Automate Workflows;
- Automate Infrastructure.
Automate Everything
Source: Stephen Elliot, 2014 - DevOps and the Cost of Downtime: Fortune 1000 Best Practice Metrics Quantified
DevOps metrics from 20+ Fortune 1000 organizations:
- Infrastructure failure costs $100,000 per hour;
- Critical application failure costs $500,000 to $1 million per hour;
- The number of deployments per month is expected to double in two years;
- During an application’s development, testing, deployment, and operations life cycle, 25% of the time spent is considered wasteful and unnecessary;
- DevOps-led projects will accelerate the delivery of functionalities to the customer by 15–20%.
IT operations statistics
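As a rough worked example of the hourly figures above, the cost of an outage can be estimated by multiplying its duration by the hourly rate. This is only a sketch that assumes a constant rate; the function name `downtime_cost` and the 30-minute outage are illustrative and do not come from the study.

```python
def downtime_cost(hours: float, cost_per_hour: float) -> float:
    """Estimate outage cost, assuming a constant hourly rate (a simplification)."""
    if hours < 0 or cost_per_hour < 0:
        raise ValueError("hours and cost_per_hour must be non-negative")
    return hours * cost_per_hour


# Hourly rates quoted in the statistics above (USD).
INFRASTRUCTURE_FAILURE = 100_000
CRITICAL_APP_FAILURE_LOW = 500_000
CRITICAL_APP_FAILURE_HIGH = 1_000_000

# Hypothetical 30-minute outage.
outage_hours = 0.5
print("Infrastructure:", downtime_cost(outage_hours, INFRASTRUCTURE_FAILURE))
print("Critical application (low):", downtime_cost(outage_hours, CRITICAL_APP_FAILURE_LOW))
print("Critical application (high):", downtime_cost(outage_hours, CRITICAL_APP_FAILURE_HIGH))
```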
Research Question
What information can we capture and how can we formalize it so that we can improve how software teams practice DevOps?
Goals
- Research DevOps and DevOps tools;
- Identify categories to aggregate DevOps tools in;
- Identify key functionalities for each category;
- Discuss forces influencing the adoption of the available tools;
- Identify the requirements of a new project that will influence technological decisions;
- Cooperate with Software Development Teams;
- Elaborate a DevOps Knowledge Map;
- Validate the captured information;
- Publish conclusions.
The Road So Far
DevOps Categories
Cloud
- Cloud Computing allows companies to consume computing resources as a utility;
- Companies no longer require investment in hardware and people operating it;
- The same server can serve multiple applications, depending on the working hours of a country or continent.
Region / Database | AWS | Azure | Google CP | Digital Ocean |
---|---|---|---|---|
Europe | ✔ * 2 | ✔ * 2 | ✔ | ✔ * 3 |
Asia | ✔ * 4 | ✔ * 7 | ✔ | ✔ |
North America | ✔ * 4 | ✔ * 6 | ✔ * 6 | ✔ * 3 |
South America | ✔ | ✔ | ✘ | ✘ |
Africa | ✘ | ✘ | ✘ | ✘ |
Oceania | ✔ | ✔ * 2 | ✘ | ✘ |
MySQL | ✔ | ✔ | ✔ | ✘ |
PostgreSQL | ✔ | ✘ | ✘ | ✘ |
SQL Server | ✔ | ✔ | ✘ | ✘ |
MongoDB | ✘ (DynamoDB) | ✔ | ✘ (Cloud BigTable) | ✘ |
About Cloud Providers
Offering | AWS | Azure | Google CP | Digital Ocean |
---|---|---|---|---|
Free | 1 year (t2.micro) | 60 minutes/CPU daily | ✘ | ✘ |
Linux VM (1GHz, 1GB RAM) | $10.08 (monthly) | $14.40 (0.75GB RAM) | $4.03 (shared CPU, 0.6GB RAM) | $10.80 |
Linux VM (2GHz, 4GB RAM) | $40.32 (monthly) | $86.40 (3.5GB RAM) | $50.40 (7.5GB RAM) | $43.20 |
Linux VM (4GHz, 16GB RAM) | $190.08 (monthly) | $210.24 (14GB RAM) | $100.80 (15GB RAM) | $85.68 (8GB RAM) |
Windows VM (1GHz, 1GB RAM) | $13.68 | $12.96 (0.75GB RAM) | $18.43 (shared CPU, 0.6GB RAM) | ✘ |
Windows VM (2GHz, 4GB RAM) | $54.72 | $108 (3.5GB RAM) | $108 (7.5GB RAM) | ✘ |
Windows VM (4GHz, 16GB RAM) | $351.36 | $371.52 (14GB RAM) | $216 (15GB RAM) | ✘ |
Object Storage | S3 - $0.03/GB | Blob - $0.08/GB | Disk - $0.04/GB | ✘ |
About Cloud Providers
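Because the smallest Linux VMs in the table do not all come with the same amount of RAM, comparing raw prices can mislead. The sketch below normalizes the listed prices to a price per GB of RAM. The numbers are copied from the table; the helper names are illustrative, and the comparison ignores CPU differences (for example, the Google CP instance uses a shared CPU).

```python
# Smallest Linux VM from the table above: (listed price in USD, RAM in GB).
SMALL_LINUX_VM = {
    "AWS": (10.08, 1.0),
    "Azure": (14.40, 0.75),
    "Google CP": (4.03, 0.6),
    "Digital Ocean": (10.80, 1.0),
}


def price_per_gb(offers):
    """Return {provider: price per GB of RAM}, rounded to cents."""
    return {
        provider: round(price / ram_gb, 2)
        for provider, (price, ram_gb) in offers.items()
    }


def cheapest_per_gb(offers):
    """Return the provider with the lowest price per GB of RAM."""
    normalized = price_per_gb(offers)
    return min(normalized, key=normalized.get)


for provider, value in price_per_gb(SMALL_LINUX_VM).items():
    print(f"{provider}: ${value:.2f} per GB of RAM")
print("Lowest price per GB:", cheapest_per_gb(SMALL_LINUX_VM))
```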
Infrastructure Management
Feature | Xen | KVM | VMware vSphere |
---|---|---|---|
x86 | ✔ | ✔ | ✔ |
x86_64 | ✔ | ✔ | ✔ |
ARM | ✔ | ✘ | ✘ |
Linux | ✔ (Host - special distro/XenServer) | ✔ | ✔ |
Windows | ✔ | ✔ (Guest) | ✔ |
Solaris | ✔ | ✔ (Guest) | ✔ |
Full Virtualization | ✔ | ✔ | ✔ |
Hypervisor | ✔ | ✔ | ✔ (vSphere) |
Paravirtualization | ✔ | ✔ | ✔ (ESXi Hypervisor) |
Live Migration | ✔ | ✔ | ✔ |
Used by | Amazon, Linode | ? | Adobe, Vodafone |
Scheduling
- Cron
- Chronos (Mesos)
UNIX's Cron Example:
0 6 * * 1-5
"At 06:00 on Mon, Tue, Wed, Thu and Fri."
5 0 * 8 *
"At 00:05 every day in Aug."
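To make the five cron fields (minute, hour, day of month, month, day of week) concrete, here is a minimal sketch that checks whether a given moment matches one of the two expressions on this slide. It is not how cron itself is implemented: it only understands `*`, single numbers, ranges, and comma lists, and it requires every field to match, whereas real cron also supports steps and names and treats the two day fields specially when both are restricted. The function names are illustrative.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Check one cron field ("*", "5", "1-5" or "1,3,5") against a value."""
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = (int(bound) for bound in part.split("-"))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression: str, moment: datetime) -> bool:
    """Return True if `moment` falls on a minute selected by `expression`."""
    minute, hour, day, month, weekday = expression.split()
    cron_weekday = moment.isoweekday() % 7  # cron numbers Sunday as 0
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        and field_matches(weekday, cron_weekday)
    )


# 14 June 2016 was a Tuesday, so the weekday rule applies at 06:00.
print(cron_matches("0 6 * * 1-5", datetime(2016, 6, 14, 6, 0)))
# 18 June 2016 was a Saturday, which is outside the 1-5 range.
print(cron_matches("0 6 * * 1-5", datetime(2016, 6, 18, 6, 0)))
# 3 August at 00:05 falls under the "every day in August" rule.
print(cron_matches("5 0 * 8 *", datetime(2016, 8, 3, 0, 5)))
```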
Automation
Automation - Provisioning
The development environment should be as similar as possible to the production environment
- Ubuntu 16.04 LTS => python 3.5
- > Ubuntu 15.10 => python >3.4
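One way to notice that a development environment has drifted from production is to compare the versions each one reports. The sketch below assumes both environments can be described as simple name-to-version dictionaries; the function name and the version strings are illustrative only.

```python
def version_mismatches(development, production):
    """Return {name: (dev_version, prod_version)} for every entry that differs."""
    names = sorted(set(development) | set(production))
    return {
        name: (development.get(name), production.get(name))
        for name in names
        if development.get(name) != production.get(name)
    }


# Illustrative descriptions of the two environments.
development = {"os": "Ubuntu 16.04 LTS", "python": "3.4", "nginx": "1.10"}
production = {"os": "Ubuntu 16.04 LTS", "python": "3.5"}

for name, (dev, prod) in version_mismatches(development, production).items():
    print(f"{name}: development={dev}, production={prod}")
```

An entry that exists in only one environment shows up with `None` on the other side, which makes missing packages as visible as version differences.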
Automation - Provisioning
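# Vagrantfile: describes the development VM and how it is provisioned.
Vagrant.configure(2) do |config|
  config.vm.box = "hashicorp/precise64"
  # Run bootstrap.sh inside the guest, then forward guest port 80 to host port 4567.
  config.vm.provision :shell, path: "bootstrap.sh"
  config.vm.network :forwarded_port, guest: 80, host: 4567

  # VirtualBox-specific limits: cap CPU usage at 50% and memory at 256 MB.
  config.vm.provider :virtualbox do |vb|
    vb.customize [
      "modifyvm", :id,
      "--cpuexecutioncap", "50",
      "--memory", "256",
    ]
  end

  # Apply the Puppet manifest puppet/manifests/site.pp.
  config.vm.provision :puppet do |puppet|
    puppet.manifests_path = "puppet/manifests"
    puppet.manifest_file = "site.pp"
    puppet.module_path = "puppet/modules"
  end
end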
Automation - Test
E2E Tools
Unit Testing Tools
TESTNG
Automation - Test
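import static org.junit.Assert.assertEquals;

import org.junit.Before;
import org.junit.Test;

// JUnit 4 test. Calculator is the class under test (not shown here);
// it is assumed to have a no-argument constructor.
public class CalculatorTest {
    private Calculator classUnderTest;

    @Before
    public void setUp() {
        // Create a fresh instance before each test to avoid a null reference.
        classUnderTest = new Calculator();
    }

    @Test
    public void testSubtract() {
        assertEquals("subtract", 2, classUnderTest.subtract(5, 3));
    }

    @Test
    public void testMultiply() {
        assertEquals("multiply", 56, classUnderTest.multiply(7, 8));
    }
}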
Automation - Test
describe('adding user to application', function() {
  var random = browser.params.random;

  it('should fill and submit a user', function() {
    element(by.id('a_user_dropdown')).click();
    element(by.id('a_user_add')).click();
    expect(browser.getCurrentUrl()).toContain('users/add');
    element(by.model('form_user.name')).sendKeys('user' + random);
    element(by.model('form_user.username')).sendKeys('user' + random);
    browser.params.handleSelect2('form_user.country', 'Portugal');
    element(by.model('form_user.email')).sendKeys('user' + random + '@gmail.com');
    element(by.model('form_user.password')).sendKeys('user' + random);
    element(by.id('btn_user_submit')).click();
    expect(browser.getCurrentUrl()).toEqual(browser.params.url + '/#/users');
  });
});
Automation - Deploy
"If anything can go wrong, it will go wrong." - Murphy's Law
Manual deployments are error-prone.
Anyone in the team is able to deploy software.
Engineers spend more time developing.
Deploying to somewhere new is a matter of configuration.
This way we can release more often!
Automation - Deploy
site.yml - Ansible Script
---
- hosts: webservers
  vars:
    http_port: 80
    max_clients: 200
  remote_user: root
  tasks:
    - name: ensure apache is at the latest version
      yum: name=httpd state=latest
    - name: write the apache config file
      template: src=/srv/httpd.j2 dest=/etc/httpd.conf
      notify:
        - restart apache
    - name: ensure apache is running (and enable it at boot)
      service: name=httpd state=started enabled=yes
  handlers:
    - name: restart apache
      service: name=httpd state=restarted
hosts.example
[webservers]
webserver1
webserver2
webserver3
[dbservers]
dbserver1
dbserver2
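The inventory file (hosts.example) is what lets a play such as `hosts: webservers` resolve to concrete machines. As a rough illustration of that mapping, the sketch below reads the same group layout with Python's standard `configparser`. This is not how Ansible parses inventories, and it only covers bare host names under group headers; host variables, ranges, and child groups are out of scope. The function name is illustrative.

```python
from configparser import ConfigParser

# Same groups and hosts as the hosts.example inventory on the slide.
INVENTORY = """\
[webservers]
webserver1
webserver2
webserver3

[dbservers]
dbserver1
dbserver2
"""


def parse_inventory(text):
    """Return {group: [hosts]} for a simple INI-style inventory."""
    parser = ConfigParser(allow_no_value=True)  # host lines have no "= value"
    parser.optionxform = str  # keep host names exactly as written
    parser.read_string(text)
    return {group: list(parser[group]) for group in parser.sections()}


groups = parse_inventory(INVENTORY)
print("Play targets for 'webservers':", groups["webservers"])
```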
Automation - Roll Back
puppi rollback <project_name>
Puppi command - Puppet Module
deploy_revision 'name' do
  ...
  action :rollback
end
Chef Recipe
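Conceptually, a rollback needs a record of earlier releases so that the previous one can be made current again. The sketch below models only that idea; it is not how Puppi or Chef implement rollbacks, and the class name, method names, and version labels are all hypothetical.

```python
class ReleaseHistory:
    """Keep deployed versions in order so the previous one can be restored."""

    def __init__(self):
        self._releases = []

    def deploy(self, version):
        """Record a newly deployed version as the current release."""
        self._releases.append(version)

    @property
    def current(self):
        """Return the current release, or None if nothing was deployed."""
        return self._releases[-1] if self._releases else None

    def rollback(self):
        """Discard the current release and return the one before it."""
        if len(self._releases) < 2:
            raise RuntimeError("no previous release to roll back to")
        self._releases.pop()
        return self._releases[-1]


history = ReleaseHistory()
history.deploy("v1")
history.deploy("v2")
print("Current release:", history.current)
print("After rollback:", history.rollback())
```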
Automation - Applications
Feature | Chef | Puppet | Ansible | Atlas |
---|---|---|---|---|
AWS | ✔ | ✔ | ✔ | ✔ |
Azure | ✔ | ✔ | ✔ | ✔ |
Google CP | ✔ | ✔ | ✔ | ✔ |
OpenStack | ✘ | ✔ | ✘ | ✔ |
Script Language | Ruby | Puppet Specific | YAML | K\V (Json) |
Node Agent | ✔ | ✘ (SSH) | ||
Middleman Server | ✔ | ? | ✘ | ✘ |
Push Commands | ✘ | ✔ | ||
Free plan | ✔ (5 nodes) | ✔ (10 nodes) | ✔ (10 nodes) - no WebUI | ✘ (Packer + Terraform + Consul) |
Basic Plan (monthly) | $1,440 (20 nodes) | $120 (node) | $5,000 (100 nodes) | ? |
Used by | Google, Harvard | Apple, NASA |
About Automation Tools
Monitoring
Monitoring - Servers/VMs
Information about:
- CPU;
- RAM;
- Disk space;
- Network traffic;
- Processes;
- Services;
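As a small illustration of the disk space item, the sketch below reads disk usage with Python's standard library and reports the percentage used. It is a minimal example, not a monitoring agent; the function names and the 90% threshold are assumptions made for illustration.

```python
import os
import shutil


def usage_percent(used, total):
    """Return used/total as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def disk_report(path="."):
    """Collect disk usage for the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    return {
        "path": os.path.abspath(path),
        "total_bytes": usage.total,
        "used_bytes": usage.used,
        "used_percent": round(usage_percent(usage.used, usage.total), 1),
    }


ALERT_THRESHOLD_PERCENT = 90.0  # hypothetical alerting threshold

report = disk_report()
status = "ALERT" if report["used_percent"] >= ALERT_THRESHOLD_PERCENT else "ok"
print(f"{report['path']}: {report['used_percent']}% used ({status})")
```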
Monitoring - Applications
- Application performance impacts:
  - SQL statements;
  - Specific code segments;
- Flag key transactions.
Monitoring - R.U.M.
- Geolocation and load times of users;
- User's movement through the application;
- Find application bottlenecks;
- Application performance in real time;
- Javascript errors.
Monitoring - Applications
Feature | RUXIT | New Relic | Status Cake | Pingdom |
---|---|---|---|---|
North America | ✔ * 7 | ✔ * 7 | ✔ * 9 | ✔ * 3 |
South America | ✔ | ✔ | ✔ * 3 | ✘ |
Europe | ✔ * 3 | ✔ * 3 | ✔ * 17 | ✔ * 2 |
Africa | ✘ | ✘ | ✔ | ✘ |
Asia | ✔ | ✔ * 2 | ✔ * 4 | ✘ |
Oceania | ✔ | ✔ | ✔ * 2 | ✔ |
Application Monitoring | ≃ ✔ | ✔ | ✘ | ✘ |
Mobile Monitoring | ✘ | ✔ | ✘ | ✘ |
Node Monitoring | ✔ | ✔ | ✘ | ✘ |
R.U.M. | ✔ | ✔ | ✔ | ✔ |
Free account | ✘ | ✔ (1 day rec's) | ✔ | ✘ |
HTTP Health Check | $0.20 (100 checks) | $99 (10k checks) | €20 (43k checks) | €13 (430k checks) |
R.U. sessions | $0.20 (500 sessions) | $200 (500k pageviews) | Free | ▲ (100k pageviews) |
Node/App Pricing | $0.20 (node/hour) | Free/$0.20 (hour) | ✘ | ✘ |
Mobile Pricing | ✘ | $1,499 (per app) | ✘ | ✘ |
About Monitoring Tools
Supervision
"In the early days of Reddit, we didn’t really have any crash protection. I used to have to sleep with my laptop and I would wake up every couple of hours and see if Reddit was working, and restart it. It was the worst feeling in the world." - Steve Huffman, Reddit co-founder
- Start Application/Service at Boot;
- Ensure Application/Service is Running;
- Restart Application/Service if it fails.
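The restart-on-failure behaviour listed above can be sketched as a small loop: start the program, look at its exit code, and start it again if it failed. This is only an illustration of the idea, not how Upstart, systemd, Supervisor, or Circus are implemented. The names `supervise` and `run_command`, the restart limit, and the simulated service are assumptions.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def run_command(command: Sequence[str]) -> int:
    """Run a real program and return its exit code."""
    return subprocess.run(command).returncode


def supervise(start: Callable[[], int], max_restarts: int = 3,
              delay_seconds: float = 0.0) -> List[int]:
    """Call `start` until it exits with 0 or the restart limit is reached.

    Returns the exit code of every run, in order.
    """
    exit_codes = []
    while True:
        code = start()
        exit_codes.append(code)
        # Stop on a clean exit, or once the first start plus all restarts are used.
        if code == 0 or len(exit_codes) > max_restarts:
            return exit_codes
        time.sleep(delay_seconds)


# Simulated service that crashes twice and then exits cleanly.
simulated_exit_codes = iter([1, 1, 0])
print(supervise(lambda: next(simulated_exit_codes)))
```

To supervise a real program, `start` can wrap `run_command`, for example `supervise(lambda: run_command(["my-service"]))`, where `my-service` stands in for an actual executable.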
Supervision
Feature | Upstart | systemd | Supervisor | Circus |
---|---|---|---|---|
Act as UNIX's init | ✔ | ✔ | ✘ | ✘ |
Log rotation | ✔ (logrotate + copytruncate) | ✔ | ✔ | ✔ |
Host | Ubuntu | Redhat/Fedora | ||
Start several instances of a program | ✘ | ✘ | ✔ | ✔ |
Script Language | Configuration + Shell | Configuration+ Shell | Configuration | Python (WSGI) |
HTTP Server | ✘ | ✘ | ✘ | ✔ |
Loggers
2016-06-14 17:44:15,814 DEBUG TcpListener - New connection accepted
2016-06-14 17:44:15,820 ERROR HttpServerConnection - Aborting encrypted
connection to hostname.pt/192.168.1.1:46650 due to
[SSLHandshakeException:Client requested protocol SSLv3 not enabled or not supported] ->
[SSLHandshakeException:Client requested protocol SSLv3 not enabled or not supported]
2016-06-14 17:44:15,820 DEBUG HttpServerConnection - Connection was Aborted, awaiting TcpConnection termination...
2016-06-14 17:44:15,820 DEBUG HttpServerConnection - TcpConnection terminated, stopping
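Log lines like the ones above follow a regular layout (timestamp, level, logger name, message), which is what allows logging tools to index and search them. The sketch below parses that layout with a regular expression. The pattern is inferred from the sample lines only, so it is an assumption about the format; continuation lines that do not begin with a timestamp are returned as `None`.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Layout inferred from the sample: "<date> <time>,<ms> <LEVEL> <logger> - <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<logger>\S+) - (?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: datetime
    level: str
    logger: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one log line, or return None if it does not start a new entry."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    timestamp = datetime.strptime(match["timestamp"], "%Y-%m-%d %H:%M:%S,%f")
    return LogEntry(timestamp, match["level"], match["logger"], match["message"])


sample = [
    "2016-06-14 17:44:15,814 DEBUG TcpListener - New connection accepted",
    "2016-06-14 17:44:15,820 ERROR HttpServerConnection - Aborting encrypted",
    "connection to hostname.pt/192.168.1.1:46650 due to",
]
entries = [parse_line(line) for line in sample]
errors = [entry for entry in entries if entry is not None and entry.level == "ERROR"]
print(len(errors), "error entry found from logger", errors[0].logger)
```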
Loggers
Feature | Splunk | Loggly | LogStash (ELK) |
---|---|---|---|
Free Plan | ✔ (500MB/day) | ✔ (200MB/day + 7 days retention) | Open Source |
Basic Plan | $170/mo (1GB/day) | $55/mo (1GB/day + 7 days retention) | $30 (1GB/day + 7 days retention - logit.io) |
In-House | ✔ (Splunk Enterprise) | ✔ | ✔ |
Cloud based | ✔ (Splunk Cloud) | ✘ | ✔ (Outsource) |
Target | Medium/Big Enterprise | Find and fix operational problems | Small Companies |
Service Discovery
Feature | ZooKeeper | etcd | Consul |
---|---|---|---|
Depends On | ✘ | Third-party tools (Registrator + confd) | ✘ |
Client-side | Server Active Connection + Keep-Alive | Gossip Protocol | |
Node Health Check | Ping | HTTP 200, RAM and Disk Check | |
Built with | Java | Go | Go |
Embedded Service Discovery System | ✘ | ✘ | ✔ |
Validation
Evaluated Metrics
- Time testing the product?
- Time to go from development to deploy?
- Time to deploy the product?
- Downtime for the update?
- Deploys per week/month?
- Time to notice an error?
- Time to roll back an update?
- Configuration issues in production?
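A metric such as "deploys per week" can be derived from nothing more than a list of deployment dates. The sketch below groups dates by ISO week; the function name and the sample dates are illustrative, not data from the study.

```python
from collections import Counter
from datetime import date


def deploys_per_week(deploy_dates):
    """Count deployments per ISO (year, week number) pair."""
    return Counter(tuple(day.isocalendar()[:2]) for day in deploy_dates)


# Hypothetical deployment dates.
deploys = [date(2016, 6, 13), date(2016, 6, 14), date(2016, 6, 20)]

for (year, week), count in sorted(deploys_per_week(deploys).items()):
    print(f"{year} week {week}: {count} deploy(s)")
```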