Rancher

Ron Kurr

What

We'll be looking at Rancher, a recently released Docker deployment tool.  We'll also be briefly looking at RancherOS, one of the operating systems you can use to run Rancher.

Why

We need a tool that helps with our Docker deployments and accommodates our scheduling, monitoring and self-service needs.

How

We'll have some slides that provide an overview but most of the presentation will be done via a live demonstration.

November 2015

Rancher is an open source solution that allows deployment of containers into a cluster of machines, which is becoming an increasingly common scenario. It provides services such as lifecycle management, monitoring, health checks and discovery. Also included is a completely containerized operating system based on Docker. The broad focus on containerization and very small footprint are key advantages for Rancher. A similar solution in this space is Kubernetes.

Recommendation: Assess

April 2016

The emerging Containers as a Service (CaaS) space is seeing a lot of movement and provides a useful option between basic IaaS (Infrastructure as a Service) and more opinionated PaaS (Platform as a Service). While Rancher creates less noise than some other players, we have enjoyed the simplicity that it brings to running Docker containers in production. It can run stand-alone as a full solution or in conjunction with tools like Kubernetes.

Recommendation: Trial

What It Does

From the ground up, Rancher was designed to solve all of the critical challenges necessary to run all of your applications in containers. Rancher provides a full set of infrastructure services for containers, including networking, storage services, host management, load balancing and more. All of these services work across any infrastructure, and make it simple to reliably deploy and manage applications.

What It Does

Scheduling is at the heart of container management. With Rancher 1.0 we’ve brought together the two most popular container scheduling tools into a single platform. Users can now choose between Kubernetes and Swarm when they deploy environments. Rancher automatically stands up the cluster, enforces access control policies, and provides a complete UI for managing the cluster.

What It Does

Enterprise application catalogs are almost always awful. Templates rarely work, apps are out of date, and god forbid you ever want to upgrade. So when we decided to build an app catalog in Rancher, we rethought the entire experience from the ground up. Rancher’s app catalog stores app templates in native Docker Compose or Kubernetes files, and keeps them in a central Git repo. Catalogs can be private or public, and they allow users to configure exactly how they want their services deployed, and when they want them to upgrade.

What It Does

As an open-source platform, Rancher is popular with companies of all sizes. However, our largest users typically have a long list of requirements they need to make sure Rancher supports in order to satisfy auditors and security teams. We’ve made sure Rancher supports all of these, including role-based access control, integration with LDAP and Active Directory, detailed audit logs, high-availability management servers, encrypted networking, and of course the option to purchase enterprise-grade 24x7x365 support.

Key Features

  • Cross-host networking. Rancher creates a private software defined network for each environment, allowing secure communication between containers across hosts and clouds.

  • Container load balancing. Rancher provides an integrated, elastic load balancing service to distribute traffic between containers or services. The load balancing service works across multiple clouds.

Key Features

  • Persistent Storage Services. Rancher supports orchestrating Persistent Storage Services for Docker, making it possible for developers to deploy storage reliably in conjunction with containerized applications. The new feature builds on Docker 1.9 volume plugin capabilities, and makes it easier for developers to run applications that require stateful databases and persistent storage.

  • Service discovery: Rancher implements a distributed DNS-based service discovery function with integrated health checking that allows containers to automatically register themselves as services, as well as services to dynamically discover each other over the network.

Key Features

  • Service upgrades: Rancher makes it easy for users to upgrade existing container services, by allowing service cloning and redirection of service requests. This makes it possible to ensure services can be validated against their dependencies before live traffic is directed to the newly upgraded services.

  • Resource management: Rancher supports Docker Machine, a powerful tool for provisioning hosts directly from cloud providers. Rancher then monitors host resources and manages container deployment.

Key Features

  • Multi-tenancy & user management: Rancher is designed for multiple users and allows organizations to collaborate throughout the application lifecycle. By connecting with existing directory services, Rancher allows users to create separate development, testing, and production environments and invite their peers to collaboratively manage resources and applications.

  • Multi Orchestration Engines. Rancher supports the ability for users to select the default Cattle, Kubernetes, or Docker Swarm as their container orchestration engine of choice when creating environments. This will allow users to select market leading scheduling frameworks while still leveraging Rancher features such as the app catalog, enterprise user management, container networking, and storage technologies.

Interfaces

  • Users can interact with Rancher through the native Docker CLI or API. Rancher is not another orchestration or management layer that shields users from the native Docker experience. As the Docker platform grows over time, a wrapper layer will likely be superseded by native Docker features. Rancher instead works in the background so that users can continue to use the native Docker CLI and Docker Compose templates. Rancher uses Docker labels–a Docker 1.6 feature contributed by Rancher Labs–to pass additional information through the native Docker CLI. Because Rancher supports the native Docker CLI and API, third-party tools like Kubernetes work on Rancher automatically.

Interfaces

  • Users can interact with Rancher using a command-line tool called rancher-compose. The rancher-compose tool enables users to stand up multiple containers and services based on the Docker Compose templates on Rancher infrastructure. The rancher-compose tool supports the standard docker-compose.yml file format. An optional rancher-compose.yml file can be used to extend and overwrite service definitions in docker-compose.yml.

Interfaces

  • Users can interact with Rancher using the Rancher UI. Rancher UI is required for one-time configuration tasks such as setting up access control, managing environments, and adding Docker registries. Rancher UI additionally provides a simple and intuitive experience for managing infrastructure and services.

All The Pieces

RancherOS

RancherOS is the smallest, easiest way to run Docker in production. Everything in RancherOS is a container managed by Docker. This includes system services such as udev and rsyslog. RancherOS is dramatically smaller than most traditional operating systems, because it only includes the services necessary to run Docker. This keeps the binary download of RancherOS to less than 30 MB. The size may fluctuate as we adapt to Docker. By removing unnecessary libraries and services, requirements for security patches and other maintenance are dramatically reduced. This is possible because with Docker, users typically package all necessary libraries into their containers.

RancherOS

Another way in which RancherOS is designed specifically for running Docker is that it always runs the latest version of Docker. This allows users to take advantage of the latest Docker capabilities and bug fixes.

RancherOS

Everything in RancherOS is a Docker container. We accomplish this by launching two instances of Docker. One is what we call System Docker, which runs the latest Docker daemon as PID 1, the first process on the system. All other system services, like ntpd, rsyslog, and console, are running in Docker containers. System Docker replaces traditional init systems like systemd, and can be used to launch additional system services.

RancherOS

System Docker runs a special container called User Docker, which is another Docker daemon responsible for managing all of the user’s containers. Any containers that you launch as a user from the console will run inside this User Docker. This creates isolation from the System Docker containers, and ensures normal user commands don’t impact system services.

We created this separation because it seemed logical and also it would really be bad if somebody did docker rm -f $(docker ps -qa) and deleted the entire OS.

Rancher Server

  • Active Directory or OpenLDAP integration
  • MySQL for persistence
  • HA Configuration
  • Shared MySQL DB instance
  • Redis
  • Zookeeper
  • Load balancer to spread traffic across the Rancher instances
  • A host to run the websocket-proxy on.

Environment

All hosts and any Rancher resources, such as containers, load balancers, and so on are created in and belong to an environment. Access control permissions for viewing and managing these resources are then defined by the owner of the environment. Rancher currently supports the capability for each user to manage and invite other users to their environment and allows for the ability to create multiple environments for different workloads. For example, you may want to create a “dev” environment and a separate “production” environment with its own set of resources and limited user access for your application deployment.

Users

Users govern who has the access rights to view and manage Rancher resources within their Environment. Rancher allows access for a single tenant by default. However, multi-user support can also be enabled.

Hosts

  • Hosts are the most basic unit of resource within Rancher. A host can be any Linux server, virtual or physical, that meets the following requirements:
  • Any modern Linux distribution that supports Docker 1.9.1+
  • Ability to communicate with a Rancher server via http or https through the pre-configured port. Default is 8080.
  • Ability to be routed to any other hosts under the same environment to leverage Rancher’s cross-host networking for Docker containers
  • Rancher also supports Docker Machine and allows you to add your host via any of its supported drivers.

Networking

  • Rancher supports cross-host container communication by implementing a simple and secure overlay network using IPsec tunneling.  Most of Rancher’s network features, such as load balancer or DNS service, require the container to be in the managed network.
  • Under Rancher’s network, a container will be assigned both a Docker bridge IP (172.17.0.0/16) and a Rancher managed IP (10.42.0.0/16) on the default docker0 bridge. Containers within the same environment are then routable and reachable via the managed network.
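
As a rough illustration of the two address ranges above, the following sketch classifies a container address using Python's standard ipaddress module. The CIDR blocks are the ones quoted on this slide; the function name and the sample addresses are made up for the example.

```python
import ipaddress

# Ranges quoted on the slide above.
DOCKER_BRIDGE = ipaddress.ip_network("172.17.0.0/16")
RANCHER_MANAGED = ipaddress.ip_network("10.42.0.0/16")


def classify(address: str) -> str:
    """Return which of the two networks a container address belongs to."""
    ip = ipaddress.ip_address(address)
    if ip in RANCHER_MANAGED:
        return "rancher-managed"
    if ip in DOCKER_BRIDGE:
        return "docker-bridge"
    return "other"


if __name__ == "__main__":
    for sample in ("10.42.13.7", "172.17.0.2", "192.168.1.10"):
        print(sample, "->", classify(sample))
```

Only addresses in the managed range are the ones that Rancher features such as the load balancer and DNS service rely on.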

Service Discovery

Rancher adopts the standard Docker Compose terminology for services and defines a basic service as one or more containers created from the same Docker image. Once a service (consumer) is linked to another service (producer) within the same stack, a DNS record mapped to each container instance is automatically created and discoverable by containers from the “consuming” service.
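
From the consuming container's point of view, discovery is then an ordinary DNS lookup. The sketch below assumes the producer service resolves under its service name and that the lookup returns one address per container; the name "producer" and the injectable resolver parameter are illustrative only.

```python
import socket
from typing import Callable, List


def discover(service_name: str,
             resolver: Callable[[str], tuple] = socket.gethostbyname_ex) -> List[str]:
    """Return the sorted IP addresses a DNS lookup yields for a service name."""
    # gethostbyname_ex returns (hostname, alias list, IP address list).
    _hostname, _aliases, addresses = resolver(service_name)
    return sorted(addresses)


if __name__ == "__main__":
    # "producer" is a hypothetical linked service name; this only resolves
    # from inside a container that is linked to such a service.
    try:
        print(discover("producer"))
    except socket.gaierror as error:
        print("lookup failed:", error)
```

Passing the resolver as a parameter keeps the sketch usable outside a Rancher environment, for example in a unit test.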

Service Discovery

  • Service High Availability (HA) - the ability to have Rancher automatically monitor container states and maintain a service’s desired scale.
  • Health Monitoring - the ability to set basic monitoring thresholds for container health.
  • Add Load Balancers - the ability to add a simple load balancer for your services using HAProxy.
  • Add External Services - the ability to add any-IP as a service to be discovered.
  • Add Service Alias - the ability to add a DNS record for your services to be discovered.

Load Balancer

Rancher implements a managed load balancer using HAProxy that can be manually scaled to multiple hosts. A load balancer can be used to distribute network and application traffic to individual containers by directly adding them or “linked” to a basic service. A basic service that is “linked” will have all its underlying containers automatically registered as load balancer targets by Rancher.
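
The idea of a linked service can be sketched as follows: every container behind the linked service becomes a target, and requests rotate across the targets. Round-robin is assumed here purely for illustration (the balancing policy HAProxy actually applies depends on its configuration), and the service names and addresses are invented.

```python
from itertools import cycle
from typing import Dict, Iterator, List


def linked_targets(services: Dict[str, List[str]], linked: List[str]) -> List[str]:
    """Collect every container address of the services linked to the balancer."""
    return [ip for name in linked for ip in services.get(name, [])]


def round_robin(targets: List[str]) -> Iterator[str]:
    """Yield targets in rotation; round-robin is an assumption of this sketch."""
    return cycle(targets)


if __name__ == "__main__":
    services = {"web": ["10.42.0.2", "10.42.0.3"], "db": ["10.42.0.9"]}
    rotation = round_robin(linked_targets(services, ["web"]))
    for _ in range(4):
        print(next(rotation))
```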

Health Checks

  • Rancher implements a health monitoring system by running managed network agents across its hosts to coordinate the distributed health checking of containers and services. These network agents internally use HAProxy to validate the health status of your applications. When health checks are enabled on either an individual container or a service, each container is monitored by up to three network agents running on hosts other than that container's parent host. The container is considered healthy if at least one HAProxy instance reports a “passed” health check.
  • Rancher handles network partitions and is more efficient than client-based health checks. By using HAProxy to perform health checks, Rancher enables users to specify the same health check policy across applications and load balancers.
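
The decision rule described above (up to three checkers on other hosts, healthy if at least one passes) can be written down in a few lines. This is a sketch of the rule only, not of Rancher's implementation; the host names are borrowed from the demonstration setup and the function names are invented.

```python
from typing import List

MAX_CHECKERS = 3  # "up to three network agents", per the slide


def pick_checkers(agent_hosts: List[str], container_host: str) -> List[str]:
    """Choose up to three agent hosts, excluding the container's own host."""
    return [host for host in agent_hosts if host != container_host][:MAX_CHECKERS]


def is_healthy(reports: List[bool]) -> bool:
    """A container is healthy if at least one checker reports a passed check."""
    return any(reports)


if __name__ == "__main__":
    checkers = pick_checkers(["alpha", "bravo", "charlie", "delta"], "charlie")
    print(checkers, is_healthy([False, True, False]))
```

Requiring only one passing report means a single partitioned agent cannot mark a container unhealthy on its own.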

Service HA

  • Rancher constantly monitors the state of the containers within a service and actively works to maintain the service's desired scale. Corrective action is triggered when there are fewer (or more) healthy containers than the desired scale, when a host becomes unavailable, or when a container fails or does not pass a health check.
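
Conceptually this is a reconciliation loop: compare the desired scale with the healthy containers and act on the difference. The sketch below shows one pass of such a loop under that assumption; the function and container names are hypothetical and this is not Rancher's code.

```python
from typing import Dict, List, Tuple


def reconcile(desired_scale: int, health: Dict[str, bool]) -> Tuple[List[str], int]:
    """Return (containers to remove, number of new containers to start)."""
    unhealthy = sorted(name for name, ok in health.items() if not ok)
    healthy = sorted(name for name, ok in health.items() if ok)
    surplus = healthy[desired_scale:]  # healthy containers beyond the desired scale
    to_start = max(desired_scale - len(healthy), 0)
    return unhealthy + surplus, to_start


if __name__ == "__main__":
    # One of three containers failed its health check: remove it, start one.
    print(reconcile(3, {"tlo-1": True, "tlo-2": False, "tlo-3": True}))
```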

Service Upgrades

  • Rancher supports the notion of service upgrades by allowing users to either load balance or apply a service alias for a given service. Either feature creates a static destination for existing workloads that require that service. Once this is established, the underlying service can be cloned from Rancher as a new service, validated through isolated testing, and added to either the load balancer or service alias when ready. The existing service can be removed when obsolete. Subsequently, all network or application traffic is automatically distributed to the new service.

Rancher Compose

  • Rancher implements and ships a command-line tool called rancher-compose that is modeled after docker-compose. It takes in the same docker-compose.yml templates and deploys the stacks onto Rancher. The rancher-compose tool additionally takes in a rancher-compose.yml file, which extends docker-compose.yml to allow specification of attributes such as scale, load balancing rules, health check policies, and external links that are not currently supported by docker-compose.
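
One way to picture the relationship between the two files is as an overlay: per-service attributes from rancher-compose.yml are layered on top of the definitions in docker-compose.yml. The sketch below models both files as plain dictionaries; the merge itself and the sample keys ("image", "scale") are a conceptual illustration, not a description of how rancher-compose works internally.

```python
from typing import Any, Dict

Services = Dict[str, Dict[str, Any]]


def combine(docker_compose: Services, rancher_compose: Services) -> Services:
    """Overlay per-service Rancher attributes onto docker-compose definitions."""
    merged = {name: dict(spec) for name, spec in docker_compose.items()}
    for name, extras in rancher_compose.items():
        merged.setdefault(name, {}).update(extras)
    return merged


if __name__ == "__main__":
    docker_compose = {"web": {"image": "nginx"}}   # stands in for docker-compose.yml
    rancher_compose = {"web": {"scale": 3}}        # stands in for rancher-compose.yml
    print(combine(docker_compose, rancher_compose))
```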

Stacks

  • A Rancher stack mirrors the concept of a docker-compose project. It represents a group of services that make up a typical application or workload.

Container Scheduling

  • Rancher supports container scheduling policies that are modeled closely after Docker Swarm.
  • In addition, Rancher supports scheduling service triggers that allow users to specify rules, such as on “host add” or “host label”, to automatically scale services onto hosts with specific labels.
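
Label-based placement boils down to filtering hosts by the labels a service requires. The sketch below shows that filter in isolation; the host names come from the demonstration setup, while the label keys and values are invented and do not reflect Rancher's actual label syntax.

```python
from typing import Dict, List


def eligible_hosts(hosts: Dict[str, Dict[str, str]], required: Dict[str, str]) -> List[str]:
    """Return the hosts whose labels include every required key/value pair."""
    return sorted(
        name for name, labels in hosts.items()
        if all(labels.get(key) == value for key, value in required.items())
    )


if __name__ == "__main__":
    hosts = {
        "bravo": {"role": "locked-down"},
        "charlie": {"role": "workload"},
        "delta": {"role": "workload", "disk": "ssd"},
    }
    print(eligible_hosts(hosts, {"role": "workload"}))
```

Re-running the filter whenever a host is added or relabeled is the essence of the "host add" and "host label" triggers mentioned above.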

Sidekicks

  • Rancher supports the colocation, scheduling, and lockstep scaling of a set of services by allowing users to group these services using the notion of sidekicks. A service with one or more sidekicks is typically created to support shared volumes (i.e. --volumes-from) and networking (i.e. --net=container) between containers.

Metadata Service

  • Rancher exposes data about both your services and containers through a metadata service, accessed directly through an HTTP-based API, which can be used to manage your running Docker instances. The data can include static information supplied when creating your Docker containers or Rancher services, as well as runtime data such as discovery information about peer containers within the same service.
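
A container could query such a service with nothing more than the standard library. In the sketch below, the base URL, the "self/container" path, and the JSON Accept header are assumptions made for illustration and should be checked against the Rancher documentation for the version in use.

```python
import json
import urllib.request

# Assumption: hostname and version prefix of the metadata service.
BASE_URL = "http://rancher-metadata/latest"


def metadata_url(path: str, base: str = BASE_URL) -> str:
    """Join the base URL and a metadata path with exactly one slash."""
    return f"{base.rstrip('/')}/{path.lstrip('/')}"


def fetch(path: str, timeout: float = 2.0):
    """GET a metadata path and decode the JSON response."""
    request = urllib.request.Request(
        metadata_url(path), headers={"Accept": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only reachable from inside a container managed by Rancher.
    print(metadata_url("self/container"))
```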

Demonstration

  • Try to simulate a TLO and Mold-E deployment
  • Running in the Amazon cloud
  • Services locked down by IT, such as MySQL, RabbitMQ, Graylog and MongoDB, are not managed by Rancher
  • Infrastructure services that are under our control are deployed to all containers as System Services
  • Alpha host runs the Rancher Server
  • Bravo host runs the locked down services
  • Charlie and Delta hosts run the simulation workload

Demonstration

  1. Create a new environment
  2. Talk about Docker Machine
  3. Show Infrastructure->Hosts
  4. Show monitoring console for a host
  5. Talk about the Catalog
  6. Show the contents of the Prometheus stack
  7. Add the TL Catalog
  8. Deploy Configuration Service (system service)
  9. Show containers spinning up
  10. Show container logs
  11. Show container exec
  12. Show hosts
  13. Deploy Reporting Service
  14. Show logs
  15. Deploy TLO
  16. Show logs
  17. Increase scaling
  18. Deploy load balancer
  19. Show containers
  20. Get Charlie's IP address
  21. Watch Reporting's log
  22. Send PUT request
  23. Show TLO API view
  24. Show the TLO upgrade
  25. Show rollback
  26. Show containers
  27. Note that the scale is back down to 1
  28. Poke around console

Cons

  1. minor UI bugs. Rollback, images in catalog not getting refreshed.
  2. to be safe, might want to run the Server in high availability mode, which means more infrastructure.  At the very least, point it to a stable MySQL instance that is backed up regularly
  3. testing was done with the proprietary Cattle scheduling module.  Unsure how solid Swarm or Kubernetes is.
  4. does not seem to support alerting when something is seriously wrong. Still need an alerting solution. DataDog?
  5. Not sure how we should handle an ever-growing list of releases.  Maybe keep only the N most recent releases in Git and prune the rest?

Pros

  1. puts lots of capabilities under one roof -- monitoring, scheduling, self-service deployment, integration with CI stream -- things we have to solve anyway
  2. supports non-Rancher schedulers.  We might want to use Swarm now and move to Kubernetes when we hit Netflix scale.  Cattle works but might be vendor lock in?
  3. Authentication and auditing support should make operations happy
  4. Possible to deploy to developer laptops, providing the same convenience and rollback capabilities
  5. Template Catalog is an awesome feature. Pick what you want to deploy and click.  CLI is also possible.
  6. Rancher load balancing is cheaper than Amazon's. Maybe AL can get rid of its internal proxies?
  7. Can integrate with Amazon's Route53 for DNS resolution
  8. Open source and the support community seems responsive
  9. Thoughtworks suggests we trial it

Pricing

One of the beautiful things behind open source is that the project’s control is in the hands of the community when all is said and done.  The data remains your data, free from the lock-in of proprietary solutions but built on repeatable standards (Apache License 2.0 / https://github.com/rancher/rancher).  Rancher is just that, a completely open source platform with over a million downloads and in production with enterprises (including federal) all over the world. (can send you some examples if you wish)

Additionally, we of course provide Enterprise Support/licensing (exact same code base) and I’m sorry that this was not very clear on the site.


  • Rancher is licensed based on the number of logical CPUs (LCPU) on Rancher hosts that are in use by a customer.  An LCPU includes a processor in a single core processor, a core in a multi-core processor, or a hyperthreading sibling. The total number of logical CPUs is determined by how they are reported by Linux in /proc/cpuinfo, on all hosts under the management of the Rancher server.

  • There is a minimum purchase commitment of 2,000 LCPUs across two support levels:

  • License + Standard support: $50,000/year ($25/LCPU)
  • License + Platinum support: $90,000/year ($45/LCPU)
  • Additional discounts available for higher LCPU tiers and multi-year terms.
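
To estimate what this would mean for us, the arithmetic can be sketched as below. The per-LCPU rates and the 2,000 LCPU minimum are taken from the quote above; treating the minimum as a floor on the billed quantity, and counting "processor" entries in /proc/cpuinfo as logical CPUs, are assumptions of this sketch. Discounts are ignored.

```python
MINIMUM_LCPUS = 2000
RATES = {"standard": 25, "platinum": 45}  # USD per LCPU per year, from the quote


def count_lcpus(cpuinfo_text: str) -> int:
    """Count 'processor' entries (one per logical CPU) in /proc/cpuinfo text."""
    return sum(
        1 for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


def annual_cost(total_lcpus: int, level: str) -> int:
    """List-price annual cost, assuming the minimum commitment acts as a floor."""
    return max(total_lcpus, MINIMUM_LCPUS) * RATES[level]


if __name__ == "__main__":
    with open("/proc/cpuinfo") as handle:  # Linux only
        local = count_lcpus(handle.read())
    print(local, "LCPUs on this host;", annual_cost(local, "standard"), "USD/year minimum")
```

At the 2,000 LCPU minimum this reproduces the quoted figures: 2,000 × $25 = $50,000 and 2,000 × $45 = $90,000 per year.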

 

We can of course tailor this for you, so please do use me as a point of contact moving forward.  I’d be happy to set up a call to talk financials, our funding, etc…with one of the founders and myself.  Would Friday work by chance?

Next Steps

  • Do we continue with the current plan of using Rundeck + Amazon ECS to manage our deployments or do we want to course correct and trial Rancher?
  • Both systems use Docker Compose files
  • Using Rancher means we don't have to write the Rundeck/AWS integration piece
  • Using Rancher/Swarm gives us one less vendor-specific piece. Should be possible to switch to an entirely Swarm-based solution if Rancher doesn't work out
  • Rundeck/ECS doesn't give us the integrated feedback loop that Rancher does.  Once you deploy, you have to switch to another tool to see if your deployment "took".
  • Neither ECS nor Rancher has specific support for rollbacks that involve database changes -- nobody seems to want to tackle that problem!

Rancher

By Ronald Kurr
