aws && docker && chef
adam meghji
Co-Founder and CTO at @universe
Entrepreneur & hacker.
Rails APIs, EmberJS, DevOps.
http://djmarmalade.com.
Always inspired!
The Social Marketplace For Events
Ticketing platform enabling event organizers to sell directly on their page(s), and through Universe.
Smart social integrations & cross-sell effect boost ticket sales.
Sell more tickets through better tools :)
today's talk
How Universe uses AWS AutoScaling + Docker + Chef
for happy DevOps
GOAL: Share one possible infrastructure architecture & the necessary tooling
ACTUAL GOAL: Inspiration!
Go home and play around.
CHALLENGES for devs
Different languages: Ruby, Node, Python
Different frameworks: Rails, Sinatra, Express, Flask
Each Microservice API runs slightly differently!
Jumping in requires learning each API's unique configuration
We achieved consistency through Containers
CHALLENGES for ops
Ticketing business suffers from flash traffic surges
Need to scale microservice APIs independently
Microservice Architectures need to tolerate failure
It's the cloud -- sh*t happens!
We solve this through AWS AutoScaling Groups
Why containerize?
Fast & reliable server provisioning
Consistency in development & production
Dependencies are organized
(private keys, certificates, libs)
Changes to dependencies can be tested & staged
5 tips when containerizing
1. All app servers should be ephemeral
2. Separate your app servers & databases
3. Encrypt your secrets!
4. Team should connect via a Jump Server
5. Get a wildcard SSL certificate
TL;DR
The Twelve-Factor App
http://12factor.net
(covers ~90%? of what you need)
the original idea:
fat containers
FAT CONTAINERS
"Container runs a fully-baked app server VM"
1. CI server builds a Docker image if build passes
2. CI server tags the image by git commit
3. To deploy, pull specific image by commit tag, and restart container.
.. didn't work well!
Way too heavy!
Containers aren't VMs.
The CI container is ephemeral, so it could not build the images incrementally.
Added 20m to each CI job!!
docker ADD, push, and pull are SLOWWW
a better idea:
thin containers
THIN CONTAINERS
"Container runs a fully-baked app server PROCESS"
DOES NOT include application code.
DOES NOT include application gems, node_modules, etc.
Instead, app code & vendorized libs reside on the host instance's FS,
and are exposed to the container's process via shared volumes.
Containers are much lighter, and infrequently built.
docker tip: baseimage-docker
http://phusion.github.io/baseimage-docker
Provides Ubuntu 14.04 LTS as base system
Provides a correct init process
(init @ PID 1, not docker CMD)
Includes patches for open issues (apt incompatibilities, /etc/hosts, etc)
Helpful daemons & tools: syslog, cron, runit, setuser, etc.
docker tip: passenger-docker
https://github.com/phusion/passenger-docker
Built on baseimage-docker
Ruby 1.9.3, 2.0.0, 2.1.5, and 2.2.0; JRuby 1.7.18 (optional)
JRuby + OpenJDK 8 from the openjdk-r PPA (optional)
Python 2.7, 3.0 (optional)
NodeJS 0.10 (optional)
nginx (optional)
Passenger 4 (optional)
example docker run
docker run --name web_staging -v /home/ubuntu/apps/web/staging:/home/app/web -w /home/app/web -p 80:80 uniiverse/web-staging
--name web_staging, makes it easy to work with running container
docker exec web_staging tail -f /some/log/file
-v <host path>:<container path>, mounts codebase on host
-w <container path>, sets the cwd to the app root
-p 80:80, expose nginx in container to port 80 on host VM
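The flags above follow a pattern that could be wrapped in a small helper. This is a minimal sketch: the function name docker_run_cmd is hypothetical, and the path and image naming rules are inferred from the single example on this slide, not from a documented convention.

```shell
# Sketch: build the `docker run` command for one <app>_<environment> pair.
# ASSUMPTION: host path, container path and image name all follow the
# pattern seen in the web/staging example above.
docker_run_cmd() {
  local app="$1"
  local env="$2"
  printf 'docker run --name %s_%s -v /home/ubuntu/apps/%s/%s:/home/app/%s -w /home/app/%s -p 80:80 uniiverse/%s-%s\n' \
    "$app" "$env" "$app" "$env" "$app" "$app" "$app" "$env"
}

# Print the command instead of running it, so it can be reviewed first.
docker_run_cmd web staging
```

Printing rather than executing keeps the helper safe to try on a laptop without Docker installed.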
The challenge
Application code & packages are stored on host VM's FS
During deployment, changes happen to app code & packages
The app server process running within the Docker container needs to reload the app responsibly:
- For HTTP, need zero downtime
- For job workers, need safe restarts
ZERO-DOWNTIME RELOADS:
PASSENGER
Super easy to enable via passenger-docker image
Passenger handles app reloads automatically!
Supports Ruby, Node, Python
ZERO-DOWNTIME RELOADS:
OTHER WEB SERVERS
When not handled automatically,
manually issue a reload via webserver CLI
e.g. for puma
docker exec web_staging bundle exec pumactl reload
SAFE RESTARTS: SIDEKIQ, KUE, etc.
Send a SIGTERM, then SIGINT, then SIGKILL
rerun gem
launches your program, then watches the filesystem. If a relevant file changes, then it restarts your program.
e.g. sidekiq
docker run web_staging rerun --background "bundle exec sidekiq"
e.g. kue
docker run kaiju_staging rerun --background "node kue.js"
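The signal escalation mentioned on this slide (SIGTERM, then SIGINT, then SIGKILL) can be sketched by hand for workers that are not managed by rerun. The function name stop_gracefully and the grace period between signals are assumptions made for illustration.

```shell
# Sketch: stop a process by escalating TERM -> INT -> KILL.
# ASSUMPTION: the grace period (seconds to wait after each signal)
# is a hypothetical parameter, defaulting to 10.
stop_gracefully() {
  local pid="$1"
  local grace="${2:-10}"
  local sig waited
  for sig in TERM INT KILL; do
    # If the signal cannot be delivered, the process is already gone.
    kill -s "$sig" "$pid" 2>/dev/null || return 0
    waited=0
    while [ "$waited" -lt "$grace" ]; do
      # kill -0 only checks whether the process can still be signalled.
      kill -0 "$pid" 2>/dev/null || return 0
      sleep 1
      waited=$((waited + 1))
    done
  done
  return 0
}
```

A well-behaved worker exits on the first signal and the later, harsher ones are never sent.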
aws autoscaling in 1 slide
AUTO SCALING GROUP:
CloudWatch events + EC2 Launch Configuration + Elastic Load Balancer (optional)
== healthy EC2 instances
UNIVERSE AUTOSCALING
AUTO SCALING GROUP
1 per <app>_<environment>
(same as Docker images)
Defines which ELB is used for new instances
Defines min/max cluster size, and CloudWatch events which trigger scaling up/down
UNIVERSE AUTOSCALING
LAUNCH CONFIGURATION
Specifies machine type, disk config, etc.
Associates instance with IAM role
user_data.sh, executed on first boot
UNIVERSE AUTOSCALING
EC2 INSTANCES
Multi-AZ
Tagged with Name=<app>_<environment>
Runs the Docker container against app codebase
UNIVERSE AUTOSCALING
ELASTIC LOAD BALANCER
1 per <app>_<environment>
Terminates SSL via wildcard cert
New EC2 instances are automatically linked
(optional, only for HTTP services)
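The four pieces above map onto a single AWS CLI call. The sketch below only prints that call. The function name asg_create_cmd, the cluster sizes and the availability zone names are hypothetical, and reusing <app>_<environment> as the launch configuration name is an assumption. The ELB name uses a hyphen because ELB names cannot contain underscores.

```shell
# Sketch: print (not run) the AWS CLI call that would create one Auto Scaling
# Group for an <app>_<environment> pair.
# ASSUMPTION: zone names and the launch configuration name are illustrative.
asg_create_cmd() {
  local name="${1}_${2}"
  local elb="${1}-${2}"
  local min="$3"
  local max="$4"
  local cmd="aws autoscaling create-auto-scaling-group"
  cmd="$cmd --auto-scaling-group-name $name"
  cmd="$cmd --launch-configuration-name $name"
  cmd="$cmd --min-size $min --max-size $max"
  cmd="$cmd --availability-zones us-east-1a us-east-1b"
  cmd="$cmd --load-balancer-names $elb"
  # Propagating the tag gives every new instance Name=<app>_<environment>.
  cmd="$cmd --tags Key=Name,Value=$name,PropagateAtLaunch=true"
  printf '%s\n' "$cmd"
}

asg_create_cmd web staging 2 6
```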
user_data.sh
#!/bin/bash -v
# install pre-requisites
curl -L https://www.opscode.com/chef/install.sh | bash
DEBIAN_FRONTEND=noninteractive apt-get update -y
DEBIAN_FRONTEND=noninteractive apt-get install -y awscli
aws s3 --region=us-east-1 cp s3://uniiverse-bucket/uniiverse-validator.pem /etc/chef/
# write first-boot.json
(
cat << 'EOF'
{"run_list": ["role[boxoffice]"]}
EOF
) > /etc/chef/first-boot.json
# write client.rb
(
cat << 'EOF'
environment 'production'
log_level :info
log_location STDOUT
client_key '/etc/chef/uniiverse.pem'
chef_server_url 'https://api.opscode.com/organizations/uniiverse'
validation_client_name 'uniiverse-validator'
validation_key '/etc/chef/uniiverse-validator.pem'
EOF
) > /etc/chef/client.rb
# bootstrap via chef
chef-client -j /etc/chef/first-boot.json
USER_DATA.SH
1. installs Chef Client
2. writes Chef Client configuration to /etc/chef
(incl. Chef Role and Environment)
3. installs awscli (for S3 access)
4. Downloads the Chef validation key from S3 via IAM role
(set in Launch Configuration)
5. runs Chef Client!
BOOTSTRAPPING AN INSTANCE
LET'S ZOOM IN ..
CHEF COOKBOOK: unii-base
1. Hardens a new instance (SSH configuration, firewalling, etc.)
2. Applies any security updates (HeartBleed, etc.)
3. Configures helpful daemons (fail2ban, log aggregation, etc.)
Consistently applied to all servers across any roles and environments
CHEF COOKBOOK: unii-docker
1. Installs Docker daemon
2. Pulls private keys to authenticate with the private Docker repository
CHEF COOKBOOK: unii-app-server
1. Determines the `docker run` command
2. Pulls latest Docker image for <app>_<environment>
3. Downloads <app>_<environment>.tgz from S3 and extracts
4. Generates upstart configuration
5. Starts service, which launches Docker container
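Step 4 could produce something like the Upstart job rendered below. The cookbook's real template is not shown in this talk, so the job layout and the function name render_upstart_conf are assumptions.

```shell
# Sketch: render an Upstart job that keeps one app container running.
# ASSUMPTION: this layout is illustrative, not the cookbook's actual template.
render_upstart_conf() {
  local name="$1"      # container name, e.g. web_staging
  local run_cmd="$2"   # the full `docker run` command chosen in step 1
  cat <<EOF
description "$name Docker container"
start on filesystem and started docker
stop on runlevel [!2345]
respawn
pre-start script
  docker rm -f $name || true
end script
exec $run_cmd
EOF
}

# Typical use would redirect the output into Upstart's job directory:
#   render_upstart_conf web_staging "docker run ..." > /etc/init/web_staging.conf
```

Removing any stale container in pre-start lets `docker run --name` reuse the same container name after a crash or redeploy.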
Zooming out again ..
RE-APPLY UNII-APP-SERVER!
HIPCHAT + HUBOT CHATOPS
Deployments are executed by "Uniiverse ☃"
DEPLOYMENT:
BEHIND THE SCENES
DEPLOY SCRIPT: 4 PHASES
1. BUILD
an app.tgz
2. UPLOAD
to S3 bucket
3. SYNC
with all app servers in cluster
4. MIGRATE
any data
STEP 1: BUILD
if Gemfile?
bundle install --path=vendor/bundle
(for vendorized gems)
if "app/assets/"?
RAILS_ENV=staging RAILS_GROUPS=assets bundle exec rake assets:precompile
(i.e. Rails assets)
if package.json?
NODE_ENV=staging make dist
(builds node_modules/, compiles coffeescript, etc)
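The file checks above can be sketched as one function that lists which build commands apply to a checkout. The function name build_steps and its dry-run style (printing instead of executing) are illustrative choices.

```shell
# Sketch: print the build commands that apply to a checkout, based on which
# files and directories exist in it.
# ASSUMPTION: commands are printed, not executed, to keep the sketch safe.
build_steps() {
  local dir="$1"
  local env="$2"
  if [ -f "$dir/Gemfile" ]; then
    echo "bundle install --path=vendor/bundle"
  fi
  if [ -d "$dir/app/assets" ]; then
    echo "RAILS_ENV=$env RAILS_GROUPS=assets bundle exec rake assets:precompile"
  fi
  if [ -f "$dir/package.json" ]; then
    echo "NODE_ENV=$env make dist"
  fi
}
```

Calling `build_steps . staging` in a Rails app prints the first two commands, and in a Node-only service it prints just the last one.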
STEP 2: UPLOAD
1. Creates app.tgz archive, including app codebase, vendorized gems, etc. from STEP 1.
2. Excludes certain stuff (logs, caches, temp directories, etc)
3. Uploads app.tgz to special S3 bucket accessible only by app servers via IAM role
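Items 1 and 2 can be sketched with plain tar. Which directories to leave out is an assumption here: ./log and ./tmp follow Rails conventions, and the function name build_archive is hypothetical.

```shell
# Sketch: pack a built checkout into an archive, leaving out runtime noise.
# ASSUMPTION: only ./log and ./tmp are excluded; a real list would be longer.
build_archive() {
  local src="$1"
  local out="$2"
  # --exclude must come before the file arguments it applies to.
  tar -czf "$out" --exclude='./log' --exclude='./tmp' -C "$src" .
}

# The upload would then be one AWS CLI call (bucket name is hypothetical):
#   aws s3 cp app.tgz s3://example-deploy-bucket/web_staging.tgz
```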
STEP 3: SYNC
1. Discovers all EC2 instances tagged with Name=<app>_<environment>
2. Issues 1 Ansible ad-hoc command to reapply Chef cookbooks
ansible all -i server1,server2, -a "chef-client -j /etc/chef/app-server.json"
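Both steps can be sketched in a few lines. The function names are hypothetical, and addressing instances by private IP is an assumption. The trailing comma matters: Ansible treats an `-i` value containing a comma as a host list instead of an inventory file path.

```shell
# Sketch: find running instances by Name tag.
# ASSUMPTION: instances are reached by private IP address.
discover_hosts() {
  aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=${1}_${2}" "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].PrivateIpAddress' \
    --output text
}

# Join hosts into Ansible's inline inventory form: "host1,host2,".
inline_inventory() {
  local hosts=""
  local host
  for host in "$@"; do
    hosts="${hosts}${host},"
  done
  printf '%s\n' "$hosts"
}

# Possible use:
#   ansible all -i "$(inline_inventory $(discover_hosts web staging))" \
#     -a "chef-client -j /etc/chef/app-server.json"
```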
STEP 4: MIGRATE
if "db/migrate/"?
ansible all -i server3, -a "sudo docker exec web_staging bundle exec rake db:migrate"
(i.e. runs any Rails migrations)
Only runs on 1 randomly-chosen server in the cluster
./ssh.sh
quick CLI to SSH into any instance:
./ssh.sh <app> <environment> [command]
./ssh.sh web staging free -m
Resolves EC2 instances by Name tag
Solves the problem of server discovery for remote access
Uses the Jump Server to access the instance
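The jump-server hop could look like the command printed below. The script itself is not shown in this talk, so the function name ssh_via_jump_cmd, the jump host name and the `ubuntu` login are assumptions.

```shell
# Sketch: print the ssh command that reaches an instance through the
# Jump Server, using ssh's ProxyCommand option with -W forwarding.
# ASSUMPTION: the login user is `ubuntu`; host names are placeholders.
ssh_via_jump_cmd() {
  local jump="$1"
  local target="$2"
  shift 2
  # %%h and %%p print as %h and %p, which ssh replaces with host and port.
  printf 'ssh -o ProxyCommand="ssh -W %%h:%%p %s" ubuntu@%s %s\n' "$jump" "$target" "$*"
}

ssh_via_jump_cmd jump.example.com 10.0.1.12 free -m
```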
./attach.sh
quick CLI to SSH into any app container:
./attach.sh <app> <environment> [command]
./attach.sh web staging bundle exec rails console staging
./attach.sh web staging bundle exec rake cache:clear
Uses ./ssh.sh and docker exec
Simplifies connecting to the running container.
Perfect for opening an interactive console, etc.
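Since attach.sh combines ./ssh.sh with docker exec, its core might be a one-line composition like this sketch. The function name attach_cmd is hypothetical, and it assumes ./ssh.sh allocates a terminal so that `-it` works for interactive consoles.

```shell
# Sketch: print the command attach.sh might delegate to.
# ASSUMPTION: the container is named <app>_<environment> and docker
# needs sudo on the instance, as in the migration example.
attach_cmd() {
  local app="$1"
  local env="$2"
  shift 2
  printf './ssh.sh %s %s sudo docker exec -it %s_%s %s\n' "$app" "$env" "$app" "$env" "$*"
}

attach_cmd web staging bundle exec rails console staging
```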
BENEFITS
A consistent way to ship microservices in Containers despite underlying language, framework, or dependencies
Services are multi-AZ, can auto-scale with traffic, and auto-heal during failure
Developers have 1 way of shipping code, with gory details neatly abstracted. No longer daunting to add a new microservice.
Helpful tools and scripts can be written once and reused everywhere.
AREA OF OPTIMIZATION #1
OPTIMIZE COST:
Requires lots of ELB instances
(2 environments * N microservices)
ONE SOLUTION?
1 ELB & 1 ASG of HAProxy machines
incl. subdomain routing,
health detection, etc.
AREA OF OPTIMIZATION #2
OPTIMIZE UTILIZATION:
EC2 instances are single-purpose
(only run 1 docker container)
ONE SOLUTION?
Marathon: execute long-running tasks via Mesos & REST API
All instance CPU & RAM is pooled, and tasks (i.e. containers) are evenly distributed by resource utilization
AREA OF OPTIMIZATION #3
A FEW NEW POINTS OF FAILURE:
- Hosted Chef
- Docker Hub Private Repo
- GitHub
SOLUTION:
Build redundancy into the 3rd party services you rely on
IS THIS totally over-engineered?
We spiked working implementations of these awesome alternatives, all of which support Docker.
- AWS Elastic Beanstalk
- AWS OpsWorks
- AWS EC2 Container Service
- Google Container Engine (via Kubernetes)
- Marathon
We cherry-picked the parts we liked and rolled our own!
For us, it makes sense. YMMV
THANK YOU :)
https://universe.com
@AdamMeghji
adam@universe.com