DevOps
Technologies for Tomorrow
Rúben Barros - ISEP - 1100667@isep.ipp.pt
Advisor: Dr. Ângelo Martins
Summary
- Context
- DevOps - Introduction
- Thesis Proposal
- DevOps - Categories & Tools
- Validation
- Conclusions
Context
Development
Traditional Software Development
Test/QA
Operations
Need for Change
Operations goals:
- Server uptime;
- Application response time.
Traditional Software Development
Development goals:
- Faster development.
Fear of Change
Source: ITSM/Serena.com 2012 study of IT professionals
Traditional Software Development
75% of Devs say that
Ops is a Roadblock
72% of Ops say that
Dev is not Supportive
What if they...
- Were faster in time-to-market by deploying more often?
- Didn't have to choose between stability and new features?
- Could increase their effectiveness?
The Internet
Worldwide Network
Distributed computing resources
Empowers computer processing and scaling
Cloud Computing
- Cloud Computing allows organizations to consume computing resources as a utility;
- Organizations no longer require investment in hardware and people operating it;
DevOps
Introduction
DevOps
Patrick Debois
Agile Infrastructure
by: Andrew Shafer
10+ deploys per day:
Dev and Ops cooperation at Flickr
by: Paul Hammond & John Allspaw
DevOps (Development and Operations) describes a culture in which the departments of development, operations, and quality assurance collaborate to deliver software in a continuous manner [Sharma and Coyne, 2015].
DevOps
DevOps = New Mindset + New Tools + New Skills
DevOps
- Automate Code Testing;
- Automate Workflows;
- Automate Infrastructure.
Automate Everything
Source: Stephen Elliot, 2014 - DevOps and the Cost of Downtime: Fortune 1000 Best Practice Metrics Quantified
DevOps metrics from 20+ Fortune 1000 organizations:
- Failures cost between $100,000 and $1 million per hour;
- Monthly deployments are expected to double in two years;
- 25% of the time during an application’s life cycle is considered unnecessary;
- DevOps will accelerate the delivery of functionalities by 15–20%.
IT operations statistics
Thesis Proposal
Problem
Research Question
What information can we capture and how can we formalize it so that we can improve how software teams practice DevOps?
Goals
- Research DevOps and DevOps tools;
- Identify categories to aggregate DevOps tools in;
- Identify key functionalities for each category;
- Discuss forces influencing the adoption of the available tools;
- Cooperate with Software Development Teams;
- Elaborate a DevOps Knowledge Map;
- Validate the captured information.
DevOps
Categories & Tools
Knowledge Map
DevOps Categories - Diagram
Infrastructure - Cloud
| | | | |
---|---|---|---|---|
Europe | ✔ * 2 | ✔ * 2 | ✔ | ✔ * 3 |
Asia | ✔ * 4 | ✔ * 7 | ✔ | ✔ |
North America | ✔ * 4 | ✔ * 6 | ✔ * 6 | ✔ * 3 |
South America | ✔ | ✔ | ✘ | ✘ |
Africa | ✘ | ✘ | ✘ | ✘ |
Oceania | ✔ | ✔ * 2 | ✘ | ✘ |
MySQL | ✔ | ✔ | ✔ | ✘ |
PostgreSQL | ✔ | ✘ | ✘ | ✘ |
SQL Server | ✔ | ✔ | ✘ | ✘ |
MongoDB | ✘ (DynamoDB) | ✔ | ✘ (Cloud BigTable) | ✘ |
Free | 1 year (t2.micro) | 60 minutes/CPU daily | ✘ | ✘ |
Infrastructure - Cloud
Infrastructure - In-House
| | | |
---|---|---|---|
x86 | ✔ | ✔ | ✔ |
x64 | ✔ | ✔ | ✔ |
ARM | ✔ | ✘ | ✔ |
Linux | ✔ | ✔ | ✔ |
Windows | ✔ | ✔ | Guest |
Solaris | ✔ | ✔ | Guest |
Full Virtualization | ✔ | ✔ | ✔ |
Paravirtualization | ✔ | ✔ (ESXi Hypervisor) | ✔ |
Live Migration | ✔ | ✔ | ✔ |
Infrastructure - In-House
Virtualization and Provisioning
- Easily create and destroy servers;
- Create servers through configuration files;
- Dev and Prod environment parity.
Virtualization and Provisioning
The development environment should be as similar as possible to the production environment.
- Ubuntu 16.04 LTS => Python 3.5
- Ubuntu 15.10 => Python 3.4
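The parity concern above can be checked automatically. The following is a minimal sketch, assuming production runs Python 3.5 (the version paired with Ubuntu 16.04 LTS on this slide); `PRODUCTION_PYTHON` and `matches_production` are hypothetical names introduced only for illustration.

```python
import sys

# Assumption: production runs the Python shipped with Ubuntu 16.04 LTS.
PRODUCTION_PYTHON = (3, 5)


def matches_production(expected, actual=None):
    """Return True when the major.minor version equals the expected one."""
    if actual is None:
        actual = sys.version_info
    return tuple(actual[:2]) == tuple(expected)


if __name__ == "__main__":
    running = "{}.{}".format(sys.version_info[0], sys.version_info[1])
    wanted = "{}.{}".format(*PRODUCTION_PYTHON)
    if matches_production(PRODUCTION_PYTHON):
        print("Development interpreter matches production: " + wanted)
    else:
        print("Version drift: running " + running + ", production expects " + wanted)
```

Running such a check at the start of a build is one way to notice drift between environments before it reaches production.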
Virtualization and Provisioning
Scheduling
- Update servers;
- Run scripts.
0 6 * * 1-5
"At 06:00 on Mon, Tue, Wed, Thu and Fri."
5 0 * 8 *
"At 00:05 every day in Aug."
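To make the two expressions above concrete, here is a minimal sketch of how a five-field cron expression can be matched against a point in time. It is not how cron or Chronos is implemented; `cron_matches` is a hypothetical helper, and it only handles `*`, single numbers, ranges, and comma-separated lists (no step values, no month or weekday names, and not cron's special rule for combining day-of-month with day-of-week).

```python
from datetime import datetime


def _field_matches(field, value):
    """Check one cron field: '*', 'a', 'a-b', or a comma-separated mix."""
    if field == "*":
        return True
    for part in field.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression, moment):
    """Return True when `moment` satisfies all five fields of `expression`."""
    minute, hour, day, month, weekday = expression.split()
    # cron counts weekdays from Sunday = 0; isoweekday() is Monday = 1 .. Sunday = 7.
    cron_weekday = moment.isoweekday() % 7
    return (_field_matches(minute, moment.minute)
            and _field_matches(hour, moment.hour)
            and _field_matches(day, moment.day)
            and _field_matches(month, moment.month)
            and _field_matches(weekday, cron_weekday))


if __name__ == "__main__":
    monday_morning = datetime(2016, 8, 1, 6, 0)  # 1 August 2016 was a Monday
    print(cron_matches("0 6 * * 1-5", monday_morning))
    print(cron_matches("5 0 * 8 *", datetime(2016, 8, 6, 0, 5)))
```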
Scheduling
Cron
Chronos
(Mesos, open-source cluster manager)
Minutes | Hours | Days | Months | Weekdays |
---|---|---|---|---|
Testing
E2E Tools
Unit Testing Tools
TestNG
Deployment Automation
Murphy's Law: "If anything can go wrong, it will go wrong."
Manual deployments are error-prone.
Anyone on the team is able to deploy software.
Engineers spend more time developing.
Deploying to somewhere new is a matter of configuration.
This allows more frequent updates!
| | | | |
---|---|---|---|---|
AWS, Azure, Google CP | ✔ | ✔ | ✔ | ✔ |
Script Language | Ruby | Puppet DSL, Ruby DSL | YAML | K/V (JSON) |
Linux | ✔ | ✔ | ✔ | ✔ |
Windows | ✔ | ✔ | PowerShell 3.0 | ✘ |
Middleman Server | ✔ | ✔ | ✘ | ✔ |
Node Agent | ✔ | ✔ | ✘ (SSH) | ✘ |
Push Commands | ✔ (not natively) | ✔ (MCollective) | ✔ | ✔ |
Immutable Infrastructure | ✘ | ✘ | ✘ | ✔ |
Free plan | Unlimited nodes; services limited to 25 nodes | Limited to 10 nodes | ✔ (no WebUI) | ✘ (Packer, Consul, Terraform) |
Basic Plan | $72 per node (min. of 20 nodes) | $120 per node | Ansible Tower: $5,000/year for 100 nodes | Price undisclosed |
Deployment Automation - Applications
Supervision
- Start the application at boot;
- Ensure the application is running;
- Restart the application if it fails.
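The last two responsibilities above can be sketched as a small restart loop. This is only an illustration of the idea, not how any of the supervision tools compared here works internally; `supervise`, `max_restarts`, and `delay_seconds` are hypothetical names.

```python
import subprocess
import sys
import time


def supervise(command, max_restarts=3, delay_seconds=0.0):
    """Run `command`; restart it each time it exits with a non-zero status.

    Returns the number of restarts performed once the command exits cleanly.
    Raises RuntimeError when the restart budget is exhausted.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(command)
        if exit_code == 0:
            return restarts
        if restarts >= max_restarts:
            raise RuntimeError(
                "command kept failing after {} restarts".format(restarts))
        restarts += 1
        time.sleep(delay_seconds)  # small pause before the next attempt


if __name__ == "__main__":
    # A command that exits cleanly needs no restart.
    print(supervise([sys.executable, "-c", "print('service ran')"]))
```

A real supervisor also starts the program at boot and usually keeps long-running services alive indefinitely; the restart budget here exists only to keep the sketch from looping forever.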
Supervision
systemd | | | | |
---|---|---|---|---|
Host | Ubuntu | Redhat/Fedora | UNIX | UNIX |
Act as UNIX's init | ✔ | ✔ | ✘ | ✘ |
Log rotation | ✔ (logrotate + copytruncate) | ✔ | ✔ | ✔ |
Script Language | Shell | Shell | Configuration | Python |
Start several instances of a program | ✘ | ✘ | ✔ | ✔ |
GUI | ✘ | ✘ | ✘ | ✔ |
Service Discovery
- Replaces hardcoded addresses;
- Stores server metadata.
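Both points can be illustrated with a toy in-memory registry: services are looked up by name instead of by a hardcoded address, and each instance carries metadata. This is a sketch under simplifying assumptions (single process, no health checks, no replication); `ServiceRegistry` and its methods are hypothetical and do not mirror the API of any tool compared here.

```python
class ServiceRegistry:
    """Toy registry mapping a service name to its known instances."""

    def __init__(self):
        self._services = {}

    def register(self, name, host, port, **metadata):
        """Record one instance of a service together with its metadata."""
        instance = {"host": host, "port": port, "metadata": metadata}
        self._services.setdefault(name, []).append(instance)

    def deregister(self, name, host, port):
        """Forget the instance of `name` listening on host:port."""
        self._services[name] = [
            instance for instance in self._services.get(name, [])
            if (instance["host"], instance["port"]) != (host, port)
        ]

    def lookup(self, name):
        """Return a copy of the instance list for `name` (empty if unknown)."""
        return list(self._services.get(name, []))


if __name__ == "__main__":
    registry = ServiceRegistry()
    registry.register("backend", "10.0.0.5", 8080, version="1.2")
    # Clients ask for "backend" by name rather than hardcoding 10.0.0.5:8080.
    print(registry.lookup("backend"))
```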
Service Discovery
| | | |
---|---|---|---|
Linux | ✔ | ✔ | ✔ |
Windows | ✔ | ✔ | ✘ |
Mac OS | ✔ | ✔ | ✘ |
Client-side | Server Active Connection + Keep-Alive | - | Gossip Protocol |
Node Health Check | Ping | - | HTTP Ping |
Built with | Java | Go | Go |
Intrinsic support for multiple datacenters | ✘ | ✘ | ✔ |
Monitoring
Type | Goal |
---|---|
Server Monitoring | CPU, RAM, Disk Space, Network Traffic, ... |
Application Monitoring | Performance impact of specific code segments and SQL statements |
R.U.M. (Real User Monitoring) | Application performance in real time; geolocation and load times of users; JavaScript errors |
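As an illustration of the Application Monitoring row, the performance impact of a specific code segment can be measured by timing it. The sketch below uses only the standard library; `timed` and the `store` dictionary are hypothetical names, and the commercial tools compared in this section collect far more than wall-clock time.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, store):
    """Append the wall-clock duration of the wrapped block to store[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        store.setdefault(label, []).append(time.perf_counter() - start)


if __name__ == "__main__":
    timings = {}
    with timed("sum_of_squares", timings):
        total = sum(n * n for n in range(100000))
    for label, durations in timings.items():
        print(label, "took", round(durations[-1] * 1000, 3), "ms")
```

Collecting durations per label makes it possible to compare the same code segment across releases.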
| | | | |
---|---|---|---|---|
North America, Asia, Europe, Oceania | ✔ | ✔ | ✔ | ✔ |
South America | ✔ | ✔ | ✔ | ✘ |
Africa | ✘ | ✘ | ✔ | ✘ |
Application Monitoring | ✔ | ✔ | ✘ | ✘ |
Mobile Monitoring | ✔ | ✔ | ✘ | ✘ |
Server Monitoring | ✔ | ✔ | ✘ | ✘ |
R.U.M. | ✔ | ✔ | ✘ | ✔ |
Free account | ✘ | ✔ (1 day retention) | ✔ | ✘ |
HTTP Health Check | $0.20 (100 checks) | $99 (10k checks) | $20 (430k checks) | $13 (430k checks) |
R.U.M. sessions | $0.20 (500 sessions) | $199 (500k pageviews) | ✘ | ▲ (100k pageviews) |
Monitoring - Applications
Logging
- Application;
- Services and processes;
- Servers.
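To keep entries from these three kinds of sources distinguishable once they are aggregated, each log line can carry a timestamp, its source, and a severity level. The following is a minimal sketch using Python's standard `logging` module; `build_logger` and the source name `app.checkout` are hypothetical.

```python
import io
import logging


def build_logger(source, stream):
    """Create a logger that tags every line with time, source, and level."""
    logger = logging.getLogger(source)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(name)s %(levelname)s %(message)s"))
    # Note: calling this twice for the same source adds a second handler.
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate output via the root logger
    return logger


if __name__ == "__main__":
    buffer = io.StringIO()
    log = build_logger("app.checkout", buffer)
    log.info("order accepted")
    log.debug("not emitted, below INFO")
    print(buffer.getvalue(), end="")
```

In practice the stream would be a file or a forwarder that ships lines to one of the logging services compared in this section.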
| | | |
---|---|---|---|
Free Plan | ✔ (500 MB/day) | ✔ (200 MB/day + 7 days retention) | Open Source |
Basic Plan | $170/mo (1 GB/day) | $55/mo (1 GB/day + 7 days retention) | ✘ |
In-House | Splunk Enterprise, Splunk Light | ✘ | ✔ |
Cloud based | Splunk Cloud | ✔ | ✔ (Outsource) |
Logging
Validation
Methodology & Strategy
- Find organizations
- Interview
- Provide the Knowledge Map
- Interview and compare
Quasi-experiment
Pre-existing groups of developers and operators in an organization
Metrics
Parameters | Type |
---|---|
Dev and Prod environment parity | Qualitative |
Reduce the time someone provisions the infrastructure | Minutes |
Reduce the time someone is testing the application | Minutes |
Reduce the time for the application to go from development to production | Days |
Reduce the time spent when deploying | Minutes |
Reduce downtime when deploying | Minutes |
Increase the number of deploys within a month | Quantitative |
Reduce the time to notice an error | Hours |
Reduce the time to find an error | Hours |
Reduce the concern for the well-being of the servers | Qualitative |
Increase the trust level when updating the application | Qualitative |
The Team
Composed of 2 people:
- One recent master's graduate in Medical Informatics;
- One with ten years of experience in software development;
- Both without knowledge of agile development and testing.
Metric | Observed Value |
---|---|
Time for the team to test the application | 90 minutes |
How do they notice errors | User feedback and manual testing |
Trust level when updating the application | 4 |
Motivation to learn DevOps | 1 |
The Experiment
Objectives | Results |
---|---|
Duration | 4 weeks |
Meetings | Weekly |
Backend Service | |
Frontend Service | |
Infrastructure | |
Provisioning and Virtualization | |
Unit Testing | |
End-to-end Testing | |
Deployment | |
Yet to implement
Discussion
The team lacked the motivation and skills to adopt DevOps
Identification of base competencies
Conclusions
Contributions
Objective | Result |
---|---|
Collection of knowledge concerning DevOps | Centralized information regarding the DevOps work cycle; market-leading and innovative tools aggregated in categories |
Concerns regarding DevOps tools | Features, price, and learning curve |
Choosing the tools | Different answers for different scenarios |
Team improvement | ✘ |
Future Work
- Complete Validation
- Research Update
- Take the study to a wider population