Elastic Load Balancing with Docker and Mesos
Stephen Salinas
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445842/docker.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445845/mesos_logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
+
@shsalinas2012
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
- Who are we?
- What is HubSpot?
- HubSpot's architecture
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445997/sproket.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446006/singularity.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446007/mesoslogoonly.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446029/nginx-logo.png)
- Load balancing at HubSpot
Before...
And after...
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446029/nginx-logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446029/nginx-logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446052/dockerlogoonly.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446007/mesoslogoonly.png)
Agenda
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446006/singularity.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
HubSpot
- An All-In-One marketing (and sales) platform
Stephen Salinas (ssalinas@hubspot.com)
- Infrastructure Engineer on the HubSpot PaaS team
PaaS (Platform as a Service)
- Manage build and deploy
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446110/stephen.jpeg)
- Developer productivity
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445997/sproket.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
PaaS At HubSpot
- Build and deploy support for...
- 1484 Deployable artifacts
- 150+ Production deploys every day
- ~8900 Req/sec across our load balancers
- 100+ developers
- 250+ web services / 200+ static apps
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
The HubSpot Deploy Infrastructure
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446006/singularity.jpg)
Singularity
- HubSpot's open source mesos scheduler
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446537/deployrocket.png)
Deploy Service
- Stores build and deploy data
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446435/orion.jpg)
Orion
- Deploy UI/API, talks to Singularity + Baragon
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446575/baragon.jpeg)
Baragon
- Open source API for manipulating load balancers
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
The HubSpot Infrastructure
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446409/github.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446417/jenkins.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446421/s3.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446435/orion.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446006/singularity.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446007/mesoslogoonly.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446440/server.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446440/server.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446440/server.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446519/computerdog.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446531/deploy.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446537/deployrocket.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
app.hubspot.com/hello
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446584/elb.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1458282/computerkid.jpg)
app.hubspot.com
/hello
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446006/singularity.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446007/mesoslogoonly.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446440/server.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446440/server.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446440/server.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446029/nginx-logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446029/nginx-logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446029/nginx-logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446575/baragon.jpeg)
'app'
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1455234/zookeeper_logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446584/elb.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446440/server.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446440/server.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446440/server.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446029/nginx-logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446029/nginx-logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446029/nginx-logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446575/baragon.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446575/baragon.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446575/baragon.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446006/singularity.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446007/mesoslogoonly.png)
/hello
app.hubspot.com
- /hello
- app
- host:port
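The three pieces of information above (a path, a load balancer group, and a host:port upstream) are enough to render a proxy configuration. The sketch below is illustrative only: the `Route` class, its field names, the example addresses, and the output template are assumptions made for this example, not Baragon's actual request format or templates.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Route:
    """Hypothetical record of what a load balancer needs to know about a service."""
    service_id: str        # assumed name, used only to label the upstream block
    base_path: str         # e.g. "/hello"
    group: str             # e.g. "app" (the load balancer group)
    upstreams: List[str]   # "host:port" strings for the running tasks


def render_nginx(route: Route) -> str:
    """Render minimal nginx upstream and location fragments for one route."""
    servers = "\n".join(f"    server {u};" for u in route.upstreams)
    return (
        f"upstream {route.service_id} {{\n"
        f"{servers}\n"
        f"}}\n"
        f"location {route.base_path} {{\n"
        f"    proxy_pass http://{route.service_id};\n"
        f"}}\n"
    )


# Example addresses are made up for illustration.
route = Route("hello", "/hello", "app", ["10.0.0.5:31001", "10.0.0.6:31017"])
config = render_nginx(route)
print(config)
```

In a real nginx setup the `upstream` and `location` fragments live in different contexts (`http` and `server`), so they would be written to separate include files rather than one string.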
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
How Can We Improve?
- Decrease number of servers
(almost 200 at HubSpot)
- Speed up replacement
1) condense a new instance
2) configure instance
3) wait for puppet to run...
...
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1449246/turtle-slow.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1449078/dollars.jpeg)
-
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1449078/dollars.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1449078/dollars.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1449078/dollars.jpeg)
- Mesos is great for app reliability, why not load balancers?
VS
1) Start new mesos task
- Efficient resource usage
(individual machines were using 50% or less of available memory)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
How Do We Get There?
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446029/nginx-logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446575/baragon.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446007/mesoslogoonly.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446052/dockerlogoonly.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
Main Problems to Solve
- How do I docker?
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446656/noidea.jpg)
- Can we smush it all into an image?
- Restarts must be seamless
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1448355/containersmush.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1448358/restart.png)
- Run nginx on a consistent port for the ELB
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446584/elb.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
Seamless Restarts
- Baragon knows about...
- Agent starting up
- Agent shutting down
- How long since agent was last seen as 'active'
- Keep instances with inactive Baragon agents out of the ELB, add new ones the moment they are ready
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1449252/Thumbs_up.jpeg)
- The current 'state of the world'
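The rule on this slide can be sketched as a small filter: only instances whose agent is active, and was seen recently, belong in the ELB. The `AgentStatus` class, the state names, and the 30 second threshold are assumptions made for this example, not Baragon's real data model.

```python
import time
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class AgentStatus:
    """Hypothetical view of one agent as tracked by the service."""
    instance_id: str
    state: str          # assumed values: "STARTING", "ACTIVE", "SHUTTING_DOWN"
    last_active: float  # unix timestamp when the agent was last seen active


def instances_for_elb(
    agents: Iterable[AgentStatus],
    max_idle_seconds: float = 30.0,   # assumed threshold
    now: Optional[float] = None,
) -> List[str]:
    """Return the instance ids that should currently be registered with the ELB."""
    now = time.time() if now is None else now
    return sorted(
        a.instance_id
        for a in agents
        if a.state == "ACTIVE" and now - a.last_active <= max_idle_seconds
    )


agents = [
    AgentStatus("i-1", "ACTIVE", 995.0),
    AgentStatus("i-2", "STARTING", 999.0),
    AgentStatus("i-3", "SHUTTING_DOWN", 999.0),
    AgentStatus("i-4", "ACTIVE", 900.0),   # active, but not seen for too long
]
in_elb = instances_for_elb(agents, now=1000.0)
print(in_elb)
```

Comparing this list against the ELB's current membership gives the instances to register and deregister.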
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446575/baragon.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
Singularity and Docker
- Singularity custom mesos executor
- Embedded artifacts
- S3 download service
- thread count monitoring
- logrotate
- S3 upload service
Note: can't get all of this and still use the mesos docker containerizer
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1448474/sadpuppy.jpg)
- Use mesos' docker containerizer?
- Containerizer vs Executor
- custom executor writes to files outside of the mesos sandbox for the s3 uploader to use
Slave Machine
Containerizer
Executor
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
Docker Support for Singularity Executor
- Container lifecycle
- Pull, create, start, attach, stop, remove
- Map ENV, volumes, and ports
- Resource Monitoring
- Thread count
- docker inspect to find the root pid
- CPU / memory usage
- mesos looks at cgroup of the executor
- manually moving processes around cgroups...
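As a rough illustration of the thread count step, the sketch below asks `docker inspect` for the container's root pid and reads the `Threads:` field from `/proc/<pid>/status`. The function names are hypothetical, and this version only counts the threads of the root process, not of every process in the container; it is not the SingularityExecutor implementation.

```python
import subprocess
from typing import List


def inspect_pid_command(container: str) -> List[str]:
    """Build the docker inspect call that prints the container's root pid."""
    return ["docker", "inspect", "--format", "{{.State.Pid}}", container]


def parse_thread_count(status_text: str) -> int:
    """Extract the Threads: value from the contents of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    raise ValueError("no Threads: line found")


def root_process_thread_count(container: str) -> int:
    """Thread count of the container's root process (Linux host with docker only)."""
    pid = subprocess.check_output(inspect_pid_command(container), text=True).strip()
    with open(f"/proc/{pid}/status") as status_file:
        return parse_thread_count(status_file.read())


# Parsing can be checked without docker by feeding it sample status text.
sample = "Name:\tnginx\nState:\tS (sleeping)\nThreads:\t4\n"
print(parse_thread_count(sample))
```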
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1448605/badpokerface.jpg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1448619/ohgodwhy.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1448668/dockerman.png)
Docker 1.6 FTW!
use --cgroup-parent to run as a child of the executor cgroup
enable hierarchy in the executor cgroup for proper reporting
with upgraded registry, image pulls are super fast too!
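A minimal sketch of what the create step might look like with `--cgroup-parent`, along with the ENV, volume, and port mappings mentioned earlier. The helper name, the image name, and the cgroup path are made up for this example; the real executor may well use a Docker client library rather than the CLI, and enabling hierarchy on the executor cgroup is not shown.

```python
from typing import Dict, List, Optional


def docker_create_command(
    image: str,
    name: str,
    cgroup_parent: str,
    env: Optional[Dict[str, str]] = None,
    volumes: Optional[Dict[str, str]] = None,   # host path -> container path
    ports: Optional[Dict[int, int]] = None,     # host port -> container port
) -> List[str]:
    """Build a `docker create` argv that places the container under a parent cgroup."""
    cmd = ["docker", "create", "--name", name, "--cgroup-parent", cgroup_parent]
    for key, value in sorted((env or {}).items()):
        cmd += ["-e", f"{key}={value}"]
    for host_path, container_path in sorted((volumes or {}).items()):
        cmd += ["-v", f"{host_path}:{container_path}"]
    for host_port, container_port in sorted((ports or {}).items()):
        cmd += ["-p", f"{host_port}:{container_port}"]
    cmd.append(image)
    return cmd


cmd = docker_create_command(
    image="example/lb:latest",          # hypothetical image name
    name="lb-1",
    cgroup_parent="/mesos/abc",         # hypothetical executor cgroup path
    env={"PORT": "8080"},
    ports={8080: 80},
)
print(" ".join(cmd))
```

With the container's cgroup nested under the executor's, usage from processes inside the container is accounted for under the cgroup that mesos already watches, which is the point of the slide.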
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
So What's In This Image???
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446029/nginx-logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446575/baragon.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1448768/rodan.png)
functionality
management
monitoring
Yes, we like our Kaijū on the PaaS team...
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
Attempt #1
- Let's mimic what the server looks like!
- nginx
- java
- python
- yum install ...
- yum install ...
- yum install ...
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1448803/stuffed.jpg)
2GB+ image...
- baragon app
- rodan app
- configuration
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
Put That Image on a Diet...
Templating start script
distributed deploy updater
~450MB !
Minimal library installs
centos base image
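One way to read "templating start script" is that configuration is rendered when the container starts instead of being baked into the image. The sketch below shows that idea with Python's `string.Template`; the template contents and the environment variable names (`LB_PORT`, `PORT0`) are assumptions for illustration, not the actual script.

```python
from string import Template
from typing import Mapping

# Hypothetical config fragment; real templates would be larger.
NGINX_TEMPLATE = Template(
    "server {\n"
    "    listen $lb_port;\n"
    "    listen $healthcheck_port;\n"
    "}\n"
)


def render_config(env: Mapping[str, str]) -> str:
    """Fill the template from environment-style values at container start."""
    return NGINX_TEMPLATE.substitute(
        lb_port=env["LB_PORT"],            # assumed variable name
        healthcheck_port=env["PORT0"],     # assumed variable name
    )


rendered = render_config({"LB_PORT": "8080", "PORT0": "31005"})
print(rendered)
```

At start time the same function would be called with `os.environ`. A template that also contains nginx `$variables` would need them escaped as `$$` when using `string.Template`.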
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1458998/supervisor.gif)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446029/nginx-logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446537/deployrocket.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
Distributed (Inverse) Deploys
- Small python package already used in other places at HubSpot
- Watches the Deploy Service for new deploys
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446537/deployrocket.png)
- Downloads artifacts and starts the app
- Don't need to store app artifacts in the image
- Can update the managing / monitoring apps without unnecessarily interrupting nginx
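The watch, download, and start cycle described above can be sketched as a single polling step. Everything here is hypothetical: `fetch_latest_build` stands in for a call to the Deploy Service and `download_and_start` for the artifact download and app restart; neither reflects the real package's API.

```python
from typing import Callable, List, Optional


def check_for_deploy(
    current_build: Optional[str],
    fetch_latest_build: Callable[[], Optional[str]],
    download_and_start: Callable[[str], None],
) -> Optional[str]:
    """Run one polling step and return the build that is now running."""
    latest = fetch_latest_build()
    if latest is not None and latest != current_build:
        download_and_start(latest)   # only act when a new deploy appears
        return latest
    return current_build


# Simulate four polls against a fake deploy service.
responses = iter([None, "build-1", "build-1", "build-2"])
started: List[str] = []
current: Optional[str] = None
for _ in range(4):
    current = check_for_deploy(current, lambda: next(responses), started.append)
print(started, current)
```

A real updater would wrap this step in a loop with a sleep between polls. Because only the managed app is restarted, nginx in the same container keeps running.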
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446575/baragon.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1448768/rodan.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
Managing Ports
- ELBs expect nginx to always be on the same port
- Mesos is more dynamic, assigns free port(s) within a range
- Strategy
- Terminate SSL at the ELB (only 1 port needed for nginx)
- each 'group' is assigned a port outside the mesos range
- nginx also listens on mesos-assigned port for healthcheck
- baragon listens on mesos-assigned port
- ensure no two of same group run on the same slave
- Docker port mapping (EXPOSE / -P) would be consistent inside the container, but not outside
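The port strategy can be sketched as two checks: every group gets a unique fixed port outside the mesos range, and a group is never placed twice on the same slave. The group names, port numbers, and function names are assumptions for this example, and the range used is the Mesos default of 31000-32000, which a given cluster may override.

```python
from typing import Dict, List, Tuple

# Assumed to match the Mesos default port resource range.
MESOS_PORT_RANGE = range(31000, 32001)

# Hypothetical group-to-port assignments.
GROUP_PORTS: Dict[str, int] = {"app": 8080, "internal": 8081}


def validate_group_ports(
    group_ports: Dict[str, int], mesos_range: range = MESOS_PORT_RANGE
) -> None:
    """Raise if two groups share a port or a port falls inside the mesos range."""
    ports = list(group_ports.values())
    if len(set(ports)) != len(ports):
        raise ValueError("each group needs its own port")
    clashes = [p for p in ports if p in mesos_range]
    if clashes:
        raise ValueError(f"ports inside the mesos range: {clashes}")


def can_place(group: str, slave: str, running: List[Tuple[str, str]]) -> bool:
    """True if no task of this group already runs on the given slave."""
    return (group, slave) not in running


validate_group_ports(GROUP_PORTS)
running = [("app", "slave-1")]
print(can_place("app", "slave-1", running), can_place("app", "slave-2", running))
```

The placement rule matters because the group's port is fixed: a second task of the same group on one slave would fail to bind it.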
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446584/elb.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446029/nginx-logo.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
Time To Profit!
- Reliability
- No more getting paged for hardware issues
- Remove (most) human intervention
- Scalability
- Extra LB capacity is one click away
- Fast replacement (10s of seconds vs 10s of minutes)
- Cost
- Run mixed in with existing infrastructure
- Extra resources on big mesos slave machines
- Better utilization of available resources
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1449049/profit.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1449078/dollars.jpeg)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1449087/scalable_img.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1449105/reliable.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
Check It Out!
Baragon - github.com/HubSpot/Baragon
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446575/baragon.jpeg)
Working example of...
- mesos cluster with Singularity
- BaragonService + BaragonAgent/Nginx
...using docker-compose!
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446006/singularity.jpg)
Singularity - github.com/HubSpot/Singularity
- SingularityExecutor docker support is not officially released... yet...
...Or come talk to us!
ssalinas@hubspot.com
tpetr@hubspot.com
hubspot/baragonservice
hubspot/baragonagent
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446052/dockerlogoonly.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1446052/dockerlogoonly.png)
hubspot/singularity
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1445849/HubSpot_logo-14.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/336369/images/1449159/dog_question.jpg)
Questions???
Elastic Load Balancing with Docker and Mesos
By Stephen Salinas