Cluster!

A Case Study
 
 
Ryan Walls
@ryanwalls
 
 
 

The plan

  • Talk about our problems
  • Discuss and play with contenders for solving our problems
  • The Decision
  • Wrap up with some general advice for you

Our use case



Web front end

Backing API

Pipeline of workers

Problems to Requirements




Problem


Deployments = 

Requirement

 
 
Must support blue/green or rolling deployments, preferably both
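The difference between the two deployment styles can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a "fleet" is a plain list of version strings, and `rolling_update` and `blue_green` are hypothetical helper names, not part of any tool discussed in this talk.

```python
def rolling_update(fleet, new_version, batch_size=1):
    """Yield a snapshot of the fleet after each batch is replaced in place."""
    current = list(fleet)
    for start in range(0, len(current), batch_size):
        for i in range(start, min(start + batch_size, len(current))):
            current[i] = new_version
        yield list(current)


def blue_green(live, idle, new_version):
    """Deploy to the idle fleet, then swap roles; returns (new_live, new_idle)."""
    upgraded = [new_version] * len(idle)
    return upgraded, list(live)


# Rolling: old and new versions serve traffic side by side during the rollout.
for snapshot in rolling_update(["v1", "v1", "v1"], "v2"):
    print(snapshot)

# Blue/green: the whole idle fleet is upgraded first, then traffic is switched.
print(blue_green(["v1", "v1"], ["v0", "v0"], "v2"))
```

Rolling needs no spare capacity but mixes versions for a while; blue/green needs a second fleet but allows an instant switch back.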

Problem

Servers die spontaneously

Requirement



Must support auto-healing

Problem



GPU instances are expensive, costing us ~$500/month each


Requirement

 
Must support dynamic scale-in of hosts and containers

Problem


Long-running jobs (several days) are stopped when deploying new code

Requirement



Must keep machines/containers from being terminated before jobs complete during deployments or scale-in
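One way to picture this requirement: a scale-in step may only pick hosts that have no running jobs. The sketch below assumes a hypothetical `Host` record and a hypothetical `pick_hosts_to_terminate` helper; it is not how any particular tool implements protection.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical record for one worker machine."""
    name: str
    running_jobs: int


def pick_hosts_to_terminate(hosts, excess):
    """Return up to `excess` idle hosts; hosts with running jobs are never selected."""
    idle = [h for h in hosts if h.running_jobs == 0]
    return idle[:max(excess, 0)]


hosts = [Host("gpu-1", 2), Host("gpu-2", 0), Host("gpu-3", 0)]
victims = pick_hosts_to_terminate(hosts, excess=2)
print([h.name for h in victims])
```

If every host is busy the function returns an empty list, so scale-in simply waits until jobs finish.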

Problem


Users' jobs can queue for a long time if other jobs are already running

Requirement


Must support dynamic scale-out of hosts and containers
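A scale-out rule driven by queue depth could look roughly like this. The function name `desired_workers` and its parameters (`jobs_per_worker`, `max_workers`) are assumptions made for illustration only.

```python
import math


def desired_workers(queued_jobs, running_jobs, jobs_per_worker, max_workers):
    """Return how many workers are needed for all known jobs, capped at max_workers."""
    needed = math.ceil((queued_jobs + running_jobs) / jobs_per_worker)
    return min(needed, max_workers)


# 5 queued + 3 running jobs, 2 jobs per worker, at most 10 workers.
print(desired_workers(5, 3, jobs_per_worker=2, max_workers=10))
```

The cap matters when each worker is an expensive GPU instance.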

Problem


Some of our customers


Requirement


Must run in AWS GovCloud

Our requirements so far...

 
  • Must support blue/green or rolling deployments, preferably both
  • Must support auto-healing
  • Must support dynamic scale-in of hosts and containers
  • Must keep machines/containers from being terminated before jobs complete during deployments or scale-in
  • Must support dynamic scale-out of hosts and containers
  • Must run in AWS GovCloud

The contenders


AWS EC2 Container Service

The Contenders


Kubernetes

The Contenders


Docker Swarm (with Consul and Docker Compose)
  

The Contenders

 
Why not Mesos/Marathon?
More flexibility (read: complication) than we needed.  It would be a good fit if you need to manage non-containerized applications.
 
Why not CoreOS/etcd/Fleet/Flannel (or pay for Tectonic)?
Nothing wrong with this stack... Kubernetes itself uses etcd/flannel.
I just don't think the "CoreOS way" will win the container wars.
 
 

Contender #1

 
AWS EC2 Container Service

What is it?

Amazon's native solution for managing containerized applications across multiple hosts in a cluster.

 

Provides mechanisms for application deployment, scheduling, updating, maintenance, and scaling.

 

Provided out of the box on AWS; cannot be used on non-AWS systems.  No extra cost.

 

1.0 release (general availability): April 2015

Key features

Container auto-recovery

Container load balancing with ELBs

Zero downtime deploys

Integrates with Docker Compose files

Integrates with CloudWatch monitoring and CloudTrail logging

Nice UI for management with AWS Console

Setup

  1. Set up an AWS account
  2. Get IAM credentials and set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in your environment
  3. Install the AWS CLI

Demo

Pros

  • Built into AWS - tight integration
  • Easy to understand
  • No management of tool itself
  • Good documentation
  • Works with Docker Compose
  • Nice GUI to play with

Cons

  • AWS only
  • Prebuilt AMI doesn't support Docker networking yet (introduced in Docker 1.9)
  • Not available in all regions
  • Docker Compose support is in its early days
  • ELBs only allow mapping to a single port for all container instances (therefore, 1 task per instance)

How it fares against our requirements...

  • Must support blue/green or rolling deployments, preferably both
    • Rolling deployments out of the box.
  • Must support auto-healing
    • Yes, services out of the box
  • Must support dynamic scale-in of hosts and containers
    • Yes, combined with AWS Auto Scaling.
  • Must keep machines/containers from being terminated before jobs complete during deployments or scale-in
    • Yes, with the new instance protection on AWS.
  • Must support dynamic scale-out of hosts and containers
    • Yes, see scale-in above
  • Must run in AWS GovCloud
    • No

Good links

Contender #2

 
Kubernetes

What is it?

http://kubernetes.io/v1.1/docs/user-guide/overview.html

 

Open source system for managing containerized applications across multiple hosts in a cluster.

 

Provides mechanisms for application deployment, scheduling, updating, maintenance, and scaling.

 

Provided out of the box on Google Compute Engine, but can be used on any system

 

1.0 release: July 2015

Key Features

Opinionated framework that handles a lot for you

Supports Docker and rkt (Rocket) containers

Everything runs inside Pods that are managed by replication controllers -- monitoring state and restarting/creating as necessary

Uses labels to handle grouping

Every pod gets its own virtual IP address, networking is flat

Pods can be grouped into services, which give the group a single IP address and a DNS name

Kube-proxy keeps track of pod changes for a service
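The combination of labels and replication controllers can be pictured as a reconcile loop: find the pods that match a label selector, then create or drop pods until the desired count is reached. The sketch below is a toy model in plain Python, not the Kubernetes API; `matches` and `reconcile` are hypothetical names and a "pod" is just a dict with a `labels` entry.

```python
def matches(pod, selector):
    """True if the pod carries every key/value pair in the selector."""
    return all(pod["labels"].get(k) == v for k, v in selector.items())


def reconcile(pods, selector, desired):
    """Return a new pod list with exactly `desired` pods matching the selector."""
    matching = [p for p in pods if matches(p, selector)]
    others = [p for p in pods if not matches(p, selector)]
    kept = matching[:desired]                      # drop extras when scaling down
    missing = desired - len(kept)                  # replace dead or missing pods
    created = [{"labels": dict(selector)} for _ in range(missing)]
    return others + kept + created


pods = [{"labels": {"app": "api"}}, {"labels": {"app": "worker"}}]
pods = reconcile(pods, {"app": "worker"}, desired=3)
print(len([p for p in pods if matches(p, {"app": "worker"})]))
```

Running the same function again after a pod disappears restores the count, which is the auto-healing behaviour described above.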

 

 

 

 

Setup

http://kubernetes.io/v1.1/docs/getting-started-guides/README.html

AWS: 

export KUBERNETES_PROVIDER=aws; wget -q -O - https://get.k8s.io | bash

Local: Can run in Docker

GCE: Provided 

Anything else: Complicated

 

Demo

Pros

  • Run on any platform
  • Pretty good documentation
  • Handles almost everything required out of the box (does not handle scaling hosts) 
  • Can use on top of other clustering platforms such as Mesos
  • Every container in a pod can talk to the others on localhost
  • Google

Cons

  • Many moving parts.  Setup is either hard or opaque unless on GKE
  • Concepts are unique to Kubernetes
  • No support for native Docker APIs

How it fares against our requirements...

  • Must support blue/green or rolling deployments, preferably both
    • Rolling deployments out of the box.
  • Must support auto-healing
    • Yes, replication controllers out of the box
  • Must support dynamic scale-in of hosts and containers
    • Combination of horizontal pod autoscaling and AWS Auto Scaling - doable but complex
  • Must keep machines/containers from being terminated before jobs complete during deployments or scale-in
    • Yes, with the new instance protection on AWS.
  • Must support dynamic scale-out of hosts and containers
    • Yes, see scale-in above
  • Must run in AWS GovCloud - Yes!

Contender #3

 
Docker Swarm (with Consul and Docker Compose)
  
 

What is it?

Docker's native clustering platform 

https://docs.docker.com/swarm

 

Simple tool for clustering a group of hosts that supports the full Docker API

 

1.0 release in Nov 2015

Key Features

Supports the full Docker API, which means any existing Docker tools can work with the cluster

 

Integrates with Docker Machine (for easy swarm creation), Docker Compose, and Docker networking (introduced in Docker 1.9)

 

Can schedule container placement based on "spread", "binpack", or "random" as well as filters such as general constraints (OS, kernel version, etc), health, affinity, dependency, or port
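The three placement strategies can be illustrated with a toy scheduler. This is a simplified sketch, not Swarm's implementation: it ranks nodes by container count only, ignores filters, and assumes a hypothetical per-node `capacity` limit. The `pick_node` name is made up for this example.

```python
import random


def pick_node(nodes, strategy, capacity=4, rng=random):
    """Pick a node name from {name: running_container_count} using a strategy."""
    candidates = {name: count for name, count in nodes.items() if count < capacity}
    if not candidates:
        raise RuntimeError("no node has free capacity")
    if strategy == "spread":       # least loaded node
        return min(candidates, key=candidates.get)
    if strategy == "binpack":      # most loaded node that still has room
        return max(candidates, key=candidates.get)
    if strategy == "random":
        return rng.choice(sorted(candidates))
    raise ValueError(f"unknown strategy: {strategy}")


nodes = {"node-a": 3, "node-b": 1, "node-c": 0}
print(pick_node(nodes, "spread"))
print(pick_node(nodes, "binpack"))
```

Spread favours resilience (one host failure loses fewer containers), while binpack keeps hosts free so they can be scaled in.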

 

"Batteries removable" philosophy - users choose what to add/subtract or how to use

Setup

 

Download Docker Toolbox:

https://www.docker.com/docker-toolbox

 

Run "docker-machine" to create instances

 

 

 

 

Demo

Pros

  • Run on any platform
  • Each component is simple to understand and deploy
  • Full Docker tooling support
  • Future proof... Docker

Cons

  • Have to compose/build a lot of solutions yourself
  • Auto-healing and monitoring, e.g. Consul watches that update the scale in docker-compose
  • Lots of flux
  • Relies on an external tool for service discovery
  • Less mature
  • No "out of the box" solution comparable to Kubernetes on GKE or ECS on AWS

How it fares against our requirements...

  • Must support blue/green or rolling deployments, preferably both
    • Can do it.  But we have to build it using Consul, HAProxy, etc.
  • Must support auto-healing
    • Can do it... but we have to build it using Consul.  Docker will auto-restart a container if it dies.
  • Must support dynamic scale-in of hosts and containers
    • AWS Auto Scaling will handle this
  • Must keep machines/containers from being terminated before jobs complete during deployments or scale-in
    • Yes, with the new instance protection on AWS.
  • Must support dynamic scale-out of hosts and containers
    • Can do it... but we have to build it using Consul health checks.
  • Must run in AWS GovCloud - Yes!

Good links

Decision Tree

Are you running in GKE?  Yes: Kubernetes

Are you running in AWS EC2 exclusively?  Yes: ECS

If no to both... it's complicated.

Do you want a lot of power now?  Yes --> Are you willing to build a lot of your own plumbing?  Yes: Docker Swarm.  No: Kubernetes

Do you want a one-framework solution?  Yes: Kubernetes

Do you want the most future-proof solution?  Yes: probably Docker Swarm
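The questions above can be written down as a small function. This is only one reading of the slide: the order in which the last three questions are checked is an assumption, and the `recommend` name and its keyword arguments are made up for this sketch.

```python
def recommend(on_gke=False, aws_only=False, want_power_now=False,
              will_build_plumbing=False, want_one_framework=False,
              want_future_proof=False):
    """Walk the decision tree and return a suggestion, or None if nothing applies."""
    if on_gke:
        return "Kubernetes"
    if aws_only:
        return "ECS"
    # "It's complicated" territory: question order below is an assumption.
    if want_power_now:
        return "Docker Swarm" if will_build_plumbing else "Kubernetes"
    if want_one_framework:
        return "Kubernetes"
    if want_future_proof:
        return "Docker Swarm"
    return None


print(recommend(aws_only=True))
print(recommend(want_power_now=True, will_build_plumbing=True))
```

A hard constraint such as GovCloud support would need to be checked before any of these questions.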

 

Wondering what we picked?  I'll tell you next month.
