Docker: Testing to Production

Edwin Fuquen

efuquen@google.com

@efuquen

About Me

  • Queens, NY => Florida => Queens, NY
  • University of Florida 2004 to 2009
  • Grooveshark, Livestream, Getty Images, Bloomberg, and Google
  • Backend Development and Infrastructure
    • Server Administration to Distributed Systems
    • Some Frontend (mostly personal)
    • Python, JS/Node, Scala, Java

Managing a Datacenter

  • Mid-90s to early 2000s
  • Expensive equipment
  • Specialized knowledge
  • Time consuming
  • Slow to increase capacity
  • Not very fun

Virtual Machine

  • Fully mimics an OS
  • Securely isolated from each other
  • Many virtual servers on the same bare metal
  • A large pool of servers can host many more VMs
  • Easily balance workloads

The Problem(s)

  • VMs need to be configured, which is complicated
  • As software, they are still heavyweight and slow
  • Not good for application deployment
  • We solve the Ops problem, not the Dev one

Containers: A solution

  • A process runs in isolation but shares the same OS kernel as the host
  • Does not mimic an entire machine
  • Done via two mechanisms
    • Namespaces - per-process resource isolation
    • Cgroups - per-process resource management
  • This provides a completely separate environment for an application without the weight of a virtual machine
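
Both mechanisms are visible from userspace on a Linux host. The sketch below is a minimal illustration, assuming a Linux /proc filesystem: it lists the namespaces a process belongs to and parses the documented `hierarchy-ID:controller-list:path` lines of /proc/<pid>/cgroup. The function and class names are hypothetical, chosen for this example.

```python
import os
from typing import NamedTuple, Tuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: Tuple[str, ...]
    path: str


def parse_cgroup_line(line: str) -> CgroupEntry:
    """Parse one line of /proc/<pid>/cgroup (hierarchy-ID:controller-list:path)."""
    # Split at most twice: the cgroup path itself may contain colons.
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    # cgroup v2 entries have an empty controller list ("0::/some/path").
    names = tuple(c for c in controllers.split(",") if c)
    return CgroupEntry(int(hierarchy_id), names, path)


def read_cgroups(pid="self"):
    """Return the cgroup membership of a process (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as f:
        return [parse_cgroup_line(line) for line in f if line.strip()]


def read_namespaces(pid="self"):
    """Map namespace type (pid, net, mnt, ...) to its identifier (Linux only)."""
    ns_dir = f"/proc/{pid}/ns"
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in os.listdir(ns_dir)}
```

Two processes in the same container should report the same namespace identifiers from `read_namespaces`, while a process on the host typically reports different ones.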

Docker

  • User friendly command line interface to containers
  • Dockerfile - Rules describe what goes in a container
  • Layered File System - applies rules to FS, saving final image
  • Daemon - Tracks running containers and images
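
The layering idea can be sketched with a toy model. This is not how Docker's storage drivers work internally; it only assumes that each build step contributes one layer, represented here as a plain dict of path to content, and that upper layers shadow lower ones. The `WHITEOUT` marker and the function names are hypothetical.

```python
from collections import ChainMap

# Marker meaning "this path was deleted in this layer" (hypothetical name).
WHITEOUT = object()


def build_image(layers):
    """Stack layers given bottom-to-top; upper layers shadow lower ones."""
    # ChainMap searches its first mapping first, so put the top layer first.
    return ChainMap(*reversed(layers))


def read_file(image, path):
    """Look a path up through the layer stack, honoring deletions."""
    content = image.get(path, WHITEOUT)
    if content is WHITEOUT:
        raise FileNotFoundError(path)
    return content


# Three layers, as three Dockerfile steps might produce them.
base = {"/etc/os-release": "debian", "/tmp/cache": "junk"}
app = {"/app/server.py": "print('hi')"}
cleanup = {"/tmp/cache": WHITEOUT}

image = build_image([base, app, cleanup])
```

Because lower layers are never modified, two images built on the same `base` can share it, which is the main saving a layered file system offers.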

Docker Architecture

CoreOS

  • Linux OS based on Gentoo distribution.
  • No package manager and few preinstalled tools.
  • The most essential tools are docker, etcd, and fleetd.
  • An OS fully built around managing containers in a distributed, fault tolerant cluster of machines.

etcd

  • A distributed key/value store.
  • Meant for config data, not for high-throughput or latency-sensitive workloads.
  • Strongly consistent, very reliable.

fleet

  • Uses etcd as distributed config store.
  • Runs distributed services on many nodes.
  • Uses standard Linux (systemd) service files, with some fleet-specific options.
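
To make the "service file plus custom options" idea concrete, here is a small sketch that parses an illustrative unit file with Python's standard configparser. The unit itself is a made-up example, not taken from the talk: the [X-Fleet] section holds the fleet-specific scheduling options, and everything else is an ordinary systemd unit.

```python
import configparser

# Illustrative unit file; the service name and docker command are assumptions.
EXAMPLE_UNIT = """\
[Unit]
Description=Example web container
After=docker.service

[Service]
ExecStart=/usr/bin/docker run --rm --name web-%i -p 8080:80 nginx

[X-Fleet]
Conflicts=web@*.service
"""


def parse_unit(text):
    """Parse unit-file text into {section: {option: value}}."""
    # interpolation=None keeps systemd specifiers such as %i untouched.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    # Unit file keys are case-sensitive; configparser lowercases by default.
    parser.optionxform = str
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


unit = parse_unit(EXAMPLE_UNIT)
```

In this example `Conflicts=web@*.service` asks the scheduler not to place two instances of the web service on the same machine. Note that configparser rejects repeated keys by default, which real unit files may contain, so this is only suitable as an illustration.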

Load Balancer

  • Application IP & port are registered in etcd.
  • confd gets notified when certain keys in etcd are modified.
  • Then haproxy.cfg templates get updated with the added or removed application IP & port.
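
The rendering step can be sketched as follows. confd itself uses Go templates; this Python stand-in only illustrates the idea of turning the current set of registered keys into an HAProxy backend section. The key layout (/services/web/<name>) and the addresses are assumptions made for the example.

```python
def render_backend(name, entries):
    """Render an HAProxy backend from a mapping of etcd key -> "ip:port"."""
    lines = [f"backend {name}"]
    # Sort keys so the generated config is stable between runs.
    for key in sorted(entries):
        server = key.rsplit("/", 1)[-1]  # last path segment is the server name
        lines.append(f"    server {server} {entries[key]} check")
    return "\n".join(lines) + "\n"


# Hypothetical registrations, as they might appear in etcd.
registered = {
    "/services/web/web2": "10.0.0.6:8080",
    "/services/web/web1": "10.0.0.5:8080",
}

config = render_backend("web", registered)
```

When an application instance disappears from etcd, re-rendering with the smaller mapping drops its `server` line, after which the load balancer would be reloaded with the new config.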

Production Problems

  • Many competing cluster/cloud solutions
  • Docker, Layered Filesystems, and kernel features all very new. Still maturing and changing rapidly.
  • Logging is a mess.
  • Security
    • No VM-level isolation.
    • Easy to mistakenly store sensitive information in images.
    • Daemon requires privileged control.
    • Community has historically not focused on it.

The Future

  • Standards: appc, runc, and the Open Container Initiative.
    • Will allow container alternatives.
  • Docker Compose for production deployments with Swarm
  • All the issues mentioned are actively being worked on by Docker & the community.

Questions?

Fullstack Docker

By Edwin Fuquen
