Elastic Load Balancing with Docker and Mesos
Stephen Salinas
@shsalinas2012

Agenda
- Who are we?
- What is HubSpot?
- HubSpot's architecture
- Load balancing at HubSpot
  - Before...
  - And after...


HubSpot
- An All-In-One marketing (and sales) platform
Stephen Salinas (ssalinas@hubspot.com)
- Infrastructure Engineer on the HubSpot PaaS team
PaaS (Platform as a Service)
- Manage build and deploy

- Developer productivity


PaaS At HubSpot
- Build and deploy support for...
- 1484 Deployable artifacts
- 150+ Production deploys every day
- ~8900 Req/sec across our load balancers
- 100+ developers
- 250+ web services / 200+ static apps

The HubSpot Deploy Infrastructure

Singularity
- HubSpot's open source mesos scheduler

Deploy Service
- Stores build and deploy data

Orion
- Deploy UI/API, talks to Singularity + Baragon

Baragon
- Open source API for manipulating load balancers

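The HubSpot Infrastructure
[Diagram: a request for app.hubspot.com/hello is split into its domain (app.hubspot.com) and path (/hello). The load balancer maps /hello to the service 'app' and forwards to a host:port for that service.]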
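
The mapping from app.hubspot.com/hello to the 'app' service and then to a host:port can be sketched as a two-step lookup. This is only an illustration in Python, not HubSpot's or Baragon's actual configuration: the `ROUTES` and `UPSTREAMS` tables, the `route` function, and the upstream addresses are all hypothetical.

```python
# Hypothetical routing table: (domain, path prefix) -> service name.
ROUTES = {
    ("app.hubspot.com", "/hello"): "app",
}

# Hypothetical upstreams: service name -> list of host:port strings.
UPSTREAMS = {
    "app": ["10.0.0.1:31000", "10.0.0.2:31005"],
}


def route(url):
    """Split 'domain/path' and return (service, upstreams), or None if no route matches."""
    domain, _, rest = url.partition("/")
    path = "/" + rest
    # Prefer the longest matching path prefix for this domain.
    candidates = [
        (prefix, service)
        for (d, prefix), service in ROUTES.items()
        if d == domain and path.startswith(prefix)
    ]
    if not candidates:
        return None
    prefix, service = max(candidates, key=lambda c: len(c[0]))
    return service, UPSTREAMS.get(service, [])
```

Calling `route("app.hubspot.com/hello")` looks up the domain and path first, then the upstream list for the matched service, which mirrors the three labels on the slide (/hello, app, host:port).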

How Can We Improve?
- Decrease number of servers (almost 200 at HubSpot)
- Speed up replacement
  1) condense a new instance
  2) configure instance
  3) wait for puppet to run...
  ...
  VS
  1) Start new mesos task
- Mesos is great for app reliability, why not load balancers?
- Efficient resource usage (individual machines were using 50% or less of available memory)
How Do We Get There?





Main Problems to Solve
- How do I docker?

- Can we smush it all into an image?
- Restarts must be seamless


- Run nginx on a consistent port for the ELB


Seamless Restarts
- Baragon knows about...
  - Agent starting up
  - Agent shutting down
  - How long since agent was last seen as 'active'
  - The current 'state of the world'
- Keep instances with inactive Baragon agents out of the ELB, add new ones the moment they are ready
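
The rule of keeping inactive agents out of the ELB and adding ready ones can be sketched as a small reconciliation step. This is a rough Python illustration, not Baragon's implementation: the state names, the `should_be_in_elb` and `reconcile` functions, and the 30 second threshold are all assumptions made for the example.

```python
import time

# Hypothetical agent states; the names are illustrative, not Baragon's API.
STARTING, ACTIVE, SHUTTING_DOWN = "starting", "active", "shutting_down"


def should_be_in_elb(state, last_active_at, now=None, max_inactive_seconds=30):
    """Return True if the instance running this agent should receive ELB traffic.

    state: one of the hypothetical states above.
    last_active_at: epoch seconds when the agent was last seen as active, or None.
    max_inactive_seconds: illustrative threshold, not a value from the talk.
    """
    if now is None:
        now = time.time()
    if state != ACTIVE:
        # Agents that are still starting or are shutting down stay out.
        return False
    if last_active_at is None:
        return False
    return (now - last_active_at) <= max_inactive_seconds


def reconcile(agents, in_elb, now=None):
    """Compare desired and current ELB membership.

    agents: dict of instance id -> (state, last_active_at).
    in_elb: set of instance ids currently registered with the ELB.
    Returns (to_add, to_remove) as sets of instance ids.
    """
    desired = {
        instance for instance, (state, seen) in agents.items()
        if should_be_in_elb(state, seen, now=now)
    }
    return desired - in_elb, in_elb - desired
```

Running `reconcile` whenever an agent reports a state change would give the set of instances to register and deregister, so a new instance joins as soon as its agent is active.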


Singularity and Docker
- Use mesos' docker containerizer?
- Singularity custom mesos executor
  - Embedded artifacts
  - S3 download service
  - thread count monitoring
  - logrotate
  - S3 upload service
** Can't get all this and use mesos docker containerizer
- Containerizer vs Executor
  - custom executor writes to files outside of the mesos sandbox for the s3 uploader to use
[Diagram labels: Slave Machine, Containerizer, Executor]

Docker Support for Singularity Executor
- Container lifecycle
  - Pull, create, start, attach, stop, remove
- Map ENV, volumes, and ports
- Resource Monitoring
  - Thread count
    - docker inspect to find the root pid
  - CPU / memory usage
    - mesos looks at cgroup of the executor
    - manually moving processes around cgroups...
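
The thread count step (docker inspect to find the root pid) might look roughly like the following Python sketch. It assumes a Linux host where `/proc/<pid>/status` contains a `Threads:` line, and it counts threads of the root process only, not of any child processes. The function names are hypothetical and this is not the SingularityExecutor code.

```python
import subprocess


def root_pid_command(container):
    """Build the docker CLI call that prints the container's root process pid."""
    return ["docker", "inspect", "--format", "{{.State.Pid}}", container]


def parse_thread_count(status_text):
    """Extract the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no Threads: line found")


def container_root_thread_count(container):
    """Thread count of the container's root process only (Linux, needs docker)."""
    pid = subprocess.check_output(root_pid_command(container), text=True).strip()
    with open("/proc/{}/status".format(pid)) as f:
        return parse_thread_count(f.read())
```

Keeping the parsing separate from the docker call means the parsing can be checked without a running container.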




Docker 1.6 FTW!
- use --cgroup-parent to run as a child of the executor cgroup
- enable hierarchy in the executor cgroup for proper reporting
- with upgraded registry, image pulls are super fast too!
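
As a rough illustration of the `--cgroup-parent` approach combined with the ENV, volume, and port mapping mentioned earlier, the Python sketch below builds an argument list for `docker create`. The flags `--cgroup-parent`, `-e`, `-v`, and `-p` are standard docker CLI options; the function name, the cgroup path, and the example values are hypothetical and not taken from the SingularityExecutor.

```python
def docker_create_args(image, cgroup_parent, env=None, volumes=None, ports=None):
    """Build an argument list for `docker create`.

    cgroup_parent: cgroup path of the executor (hypothetical value in the example).
    env: dict of name -> value.
    volumes: dict of host path -> container path.
    ports: dict of host port -> container port.
    """
    args = ["docker", "create", "--cgroup-parent", cgroup_parent]
    for name, value in sorted((env or {}).items()):
        args += ["-e", "{}={}".format(name, value)]
    for host_path, container_path in sorted((volumes or {}).items()):
        args += ["-v", "{}:{}".format(host_path, container_path)]
    for host_port, container_port in sorted((ports or {}).items()):
        args += ["-p", "{}:{}".format(host_port, container_port)]
    args.append(image)
    return args


# Example with made-up values: run under a hypothetical executor cgroup.
example = docker_create_args(
    "hubspot/baragonagent",
    "/mesos/example-executor",
    env={"PORT": "31000"},
    ports={31000: 80},
)
```

Because the container's cgroup is created under the executor's cgroup, resource usage measured on the executor cgroup can include the container once hierarchical accounting is enabled, which is the reporting behavior the slide describes.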

So What's In This Image???



functionality
management
monitoring
Yes, we like our Kaijū on the PaaS team...

Attempt #1
- Let's mimic what the server looks like!
- nginx
- java
- python
- yum install ...
- yum install ...
- yum install ...
- baragon app
- rodan app
- configuration

2GB+ image...

Put That Image on a Diet...
- centos base image
- Minimal library installs
- Templating start script
- distributed deploy updater

~450MB!




Distributed (Inverse) Deploys
- Small python package already used in other places at HubSpot
- Watches the Deploy Service for new deploys

- Downloads artifacts and starts the app
- Don't need to store app artifacts in the image
- Can update the managing / monitoring apps without unnecessarily interrupting nginx
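
The watch, download, and start loop described above can be sketched as a single polling pass. This is a minimal Python illustration under stated assumptions, not the actual HubSpot package: `plan_updates`, `watch_once`, and the injected `fetch_latest`, `download`, and `restart` callables are hypothetical names.

```python
def plan_updates(running, latest):
    """Decide which apps need a new artifact.

    running: dict of app name -> deploy id currently running locally.
    latest: dict of app name -> newest deploy id reported by the deploy service.
    Returns the sorted app names whose deploy id changed or that are not running yet.
    """
    return sorted(app for app, deploy in latest.items() if running.get(app) != deploy)


def watch_once(running, fetch_latest, download, restart):
    """One polling pass: fetch the latest deploys, then update only the changed apps.

    fetch_latest, download and restart are injected callables (hypothetical hooks),
    so processes that did not change, such as nginx, are never touched.
    """
    latest = fetch_latest()
    for app in plan_updates(running, latest):
        artifact = download(app, latest[app])
        restart(app, artifact)
        running[app] = latest[app]
    return running
```

Only apps whose deploy id differs are downloaded and restarted, which is what lets the managing and monitoring apps be updated independently of nginx.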



Managing Ports
- ELBs expect nginx to always be on the same port
- Mesos is more dynamic, assigns free port(s) within a range
- Strategy
  - Terminate SSL at the ELB (only 1 port needed for nginx)
  - each 'group' is assigned a port outside the mesos range
  - nginx also listens on mesos-assigned port for healthcheck
  - baragon listens on mesos-assigned port
  - ensure no two of same group run on the same slave
- Docker port mapping (EXPOSE / -P) would be consistent inside the container, but not outside
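
Two of the rules in the strategy (a fixed per-group port outside the mesos range, and no two tasks of the same group on one slave) can be expressed as simple checks. The Python sketch below is illustrative only: the port range, the group names, the port numbers, and both function names are assumptions, not values from HubSpot's setup.

```python
# Illustrative values only: the mesos port range and group ports are assumptions.
MESOS_PORT_RANGE = range(31000, 32001)
GROUP_PORTS = {"internal": 8080, "external": 8081}


def validate_group_ports(group_ports, mesos_range):
    """Every group's fixed nginx port must be unique and outside the mesos range."""
    ports = list(group_ports.values())
    if len(set(ports)) != len(ports):
        return False
    return all(port not in mesos_range for port in ports)


def can_place(group, slave, placements):
    """Allow a task on a slave only if no task of the same group already runs there.

    placements: iterable of (group, slave) pairs for running tasks.
    """
    return (group, slave) not in set(placements)
```

The second check matters because two nginx processes from the same group would both try to bind the same fixed port on one machine.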



Time To Profit!
- Reliability
  - No more getting paged for hardware issues
  - Remove (most) human intervention
- Scalability
  - Extra LB capacity is one click away
  - Fast replacement (10s of seconds vs 10s of minutes)
- Cost
  - Run mixed in with existing infrastructure
  - Extra resources on big mesos slave machines
  - Better utilization of available resources





Check It Out!
Baragon - github.com/HubSpot/Baragon

Working example of...
- mesos cluster with Singularity
- BaragonService + BaragonAgent/Nginx
...using docker-compose!

Singularity - github.com/HubSpot/Singularity
- SingularityExecutor docker support is not officially released... yet...

hubspot/baragonservice
hubspot/baragonagent
hubspot/singularity

...Or come talk to us!
ssalinas@hubspot.com
tpetr@hubspot.com

Questions???
Elastic Load Balancing with Docker and Mesos
By Stephen Salinas