Managing Microservices Effectively
Daniel Hall (@smarthall)
About Me
- Systems Engineer at LIFX
- Making the 'Internet' in the Internet of Things
- Wrote Ansible Configuration Management, published by Packt
- Currently updating that book for a second edition
- Wrote RatticDB, a team-based open source password manager
About This Talk
- This is how we do things at LIFX
- Feel free to ask questions as we go
- It works for us, it might not work for you
- Think about how each bit fits into your situation
Step One: Write your apps
- You may not get input into this part
- Microservices are popular at the moment
- Design pattern that works with continuous delivery
Microservices
- Try to keep as much state as possible outside your apps
- Don't make them too small, they're not nanoservices
- Don't make them too big, they're not milliservices
- Each service should be:
  - Replaceable
  - Independently deployable
  - Limited to a single capability (billing, authentication)
- Think about information flow and circular dependencies
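One way to think about circular dependencies is to write down which service calls which and search that graph for loops. The sketch below is only an illustration: the service names and the `find_cycle` helper are made up for this example, and it assumes you can list each service's direct dependencies by hand.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of service names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {name: WHITE for name in deps}
    path: List[str] = []

    def visit(name: str) -> Optional[List[str]]:
        state[name] = GREY
        path.append(name)
        for dep in deps.get(name, []):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path, so we have looped back
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[name] = BLACK
        return None

    for name in list(deps):
        if state[name] == WHITE:
            found = visit(name)
            if found:
                return found
    return None


# Hypothetical services: billing calls auth, auth calls users, users calls billing.
services = {
    "billing": ["auth"],
    "auth": ["users"],
    "users": ["billing"],
    "web": ["billing", "auth"],
}
cycle = find_cycle(services)
print(cycle)
```

A loop like this usually means the services cannot be deployed or replaced independently, which is the property the bullets above ask for.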
The Hype Curve
Jeremy Kemp CC-BY-SA
(http://commons.wikimedia.org/wiki/File:Gartner_Hype_Cycle.svg)
Microservices
Step Two: Packaging
- All dependencies need to be available
- Needs to be small or cacheable
- Faster install means faster deployments
- You might want multiple versions on the same machine
- Preferably it works in several environments
Docker
- Filesystem layers stacked on top of each other
- Uses Linux containers to isolate applications
- You can run a local Docker registry
  - Security
  - Speed
- You can run it locally in dev and on your servers
- Less of 'it works on my laptop'
- Minuscule performance hit compared to VMs
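To make "filesystem layers stacked on top of each other" concrete, here is a toy model in Python. It is not how Docker is implemented (Docker relies on a storage driver in the kernel or on disk); it only shows the lookup rule, where the topmost layer that contains a path wins. All file names and contents are invented for the example.

```python
from collections import ChainMap

# Each layer maps a file path to its contents. All names here are made up.
base_layer = {"/etc/os-release": "debian", "/usr/bin/python": "python 2.7"}
deps_layer = {"/app/requirements.txt": "flask"}
app_layer = {"/app/server.py": "v1", "/etc/os-release": "debian (patched)"}

# ChainMap searches left to right, so the newest layer goes first.
image_v1 = ChainMap(app_layer, deps_layer, base_layer)

# A new release only replaces the top layer; the lower layers are shared,
# which is why cached layers make pulls and deployments faster.
image_v2 = ChainMap({"/app/server.py": "v2"}, deps_layer, base_layer)

print(image_v1["/etc/os-release"])
print(image_v2["/app/server.py"])
```

Because `image_v2` reuses the same `deps_layer` and `base_layer` objects, only the small top layer differs between versions, which is the property that keeps installs fast.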
The Hype Curve
Docker
Step Three: Deployment
- As fast as possible
- Preferably minimal interaction
- Recovery from failures
Mesos/Marathon
- Mesos manages tasks running on a cluster
- Marathon coordinates long running jobs
- You submit a JSON job description to Marathon
- Marathon handles switching from the old app to new
- Marathon will also handle task failure and recovery
- Health checks ensure broken tasks get replaced
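As a rough idea of what a JSON job description can look like, the sketch below builds one in Python. The field names follow Marathon's app definition format as I understand it for Docker containers, so check them against the documentation for your Marathon version. The app id, image name, resource sizes and `/health` path are all placeholders, and `marathon_app` is a helper invented for this example.

```python
import json


def marathon_app(app_id: str, image: str, instances: int, port: int) -> dict:
    """Build a Marathon-style app definition for a Docker container."""
    return {
        "id": app_id,
        "instances": instances,
        "cpus": 0.25,  # placeholder resource sizes
        "mem": 128,
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                # hostPort 0 asks for any free port on the host
                "portMappings": [{"containerPort": port, "hostPort": 0}],
            },
        },
        "healthChecks": [
            {
                "protocol": "HTTP",
                "path": "/health",  # assumed endpoint exposed by the app
                "portIndex": 0,
                "gracePeriodSeconds": 30,
                "intervalSeconds": 10,
                "maxConsecutiveFailures": 3,
            }
        ],
    }


app = marathon_app("/api", "registry.example.com/api:1.2.3", instances=3, port=8080)
payload = json.dumps(app, indent=2)
print(payload)
```

Submitting is then a matter of sending `payload` to Marathon's apps endpoint over HTTP. Deploying a new version means sending the same definition with a different image tag and letting Marathon replace the old tasks.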
The Hype Curve
Mesos/Marathon
Extra Credit: Scheduling
- Some things need to run repeatedly
- Cron works, but it's not really HA
- HA crons exist but can be complex
- Your cluster probably has spare capacity
Chronos
- Made by Airbnb
- Chronos runs your scheduled tasks in Mesos
- Uses ISO 8601 repeating intervals to specify schedules
- Use your spare capacity for repeating tasks
- Can rerun failing jobs
- Can handle job dependencies
- Records stats on run times for jobs
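An ISO 8601 repeating interval has three parts separated by slashes: a repeat count (`R` alone means repeat forever), a start time, and a duration between runs. The sketch below builds such strings in Python. The helper names are invented for this example, the job fields are what I believe Chronos expects, and the job name, command and owner are placeholders. It assumes `start` is a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def iso8601_duration(delta: timedelta) -> str:
    """Format a timedelta as an ISO 8601 duration such as PT1H30M."""
    total = int(delta.total_seconds())
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    date_part = f"{days}D" if days else ""
    time_part = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    if not date_part and not time_part:
        return "PT0S"
    return "P" + date_part + ("T" + time_part if time_part else "")


def repeating_interval(
    start: datetime, every: timedelta, repeats: Optional[int] = None
) -> str:
    """Build R[n]/<start>/<duration>; repeats=None means repeat forever."""
    count = "" if repeats is None else str(repeats)
    stamp = start.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"R{count}/{stamp}/{iso8601_duration(every)}"


start = datetime(2015, 6, 1, 2, 0, 0, tzinfo=timezone.utc)
nightly = repeating_interval(start, timedelta(days=1))
limited = repeating_interval(start, timedelta(hours=1, minutes=30), repeats=5)

# Placeholder job description; the field names are an assumption to verify.
job = {
    "name": "nightly-report",
    "command": "/usr/local/bin/build-report",
    "schedule": nightly,
    "owner": "ops@example.com",
}
print(nightly)
print(limited)
```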
The Hype Curve
Chronos
How we did it at LIFX
- All our services are stateless
- This made them all easy to Dockerise
- Mesos manages the resources
- Zookeeper helps Mesos choose a master
- Marathon makes sure daemons are running
- Chronos runs scheduled and repeating jobs
- Databases and other things storing state run outside
What does this look like?
Finding Things
- You have lots of microservices
- Marathon keeps moving them
- Whole machines are going up and down
- Where is this API running?
- Which copy of the API do I connect to?
Service Discovery
- etcd, consul, synapse
- Marathon comes with an example
- Marathon knows where things are running
- Uses HAProxy as a load balancer in front of services
- You run HAProxy on every slave and configure everything to use localhost
- Not always perfect
- We use a custom script
- HTTP routing by putting hostnames in Marathon metadata
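The HAProxy approach boils down to turning "where is each task running" into a config file with one local port per service. The sketch below shows that step only, and it is not the custom script we run or the example that ships with Marathon. The task list is hard-coded here, where a real script would fetch it from Marathon, and the app name, hosts and ports are invented.

```python
from typing import Dict, List, Tuple

# app id -> (local port services connect to, list of (host, port) backends)
Routes = Dict[str, Tuple[int, List[Tuple[str, int]]]]


def render_haproxy(routes: Routes) -> str:
    """Render one HAProxy 'listen' section per app, bound to localhost."""
    lines: List[str] = []
    for app_id in sorted(routes):
        service_port, backends = routes[app_id]
        name = app_id.strip("/").replace("/", "_")
        lines.append(f"listen {name}")
        lines.append(f"    bind 127.0.0.1:{service_port}")
        lines.append("    mode tcp")
        lines.append("    balance leastconn")
        for index, (host, port) in enumerate(backends):
            # 'check' makes HAProxy stop sending traffic to dead backends
            lines.append(f"    server {name}-{index} {host}:{port} check")
    return "\n".join(lines) + "\n"


# Made-up data standing in for what Marathon reports about running tasks.
routes = {
    "/api": (10000, [("10.0.0.11", 31002), ("10.0.0.12", 31517)]),
}
config = render_haproxy(routes)
print(config)
```

Every client then connects to `127.0.0.1:10000` regardless of where the API tasks currently live. The script regenerates the file and reloads HAProxy whenever Marathon moves things.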
Collecting Logs
- Docker currently has no great logging solution
- You can mount /dev/log but don't restart rsyslog
- Mesos collects stdout, stderr
- No easy way to access it
- No timestamps
- Correlating logs is great for debugging
Centralised Logs
- Make rsyslog log to 127.0.0.1
- Configure a queue to store messages, but drop if full
- Mount /dev/log into the container
- You'll need systemd
- Run several Logstash tasks in Marathon
- Run Elasticsearch on Mesos (or separately)
- Set up a few small nginx tasks running Kibana
- TADA! Centralised fault tolerant logs
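The "store messages, but drop if full" policy is worth spelling out, because the alternative is an application that blocks when the log pipeline is down. rsyslog does this with its own queue settings. The Python sketch below only illustrates the policy, with a made-up `DropIfFullBuffer` class and a tiny capacity so the behaviour is visible.

```python
import queue
from typing import List


class DropIfFullBuffer:
    """Bounded buffer that discards new messages instead of blocking."""

    def __init__(self, max_messages: int) -> None:
        self._queue: "queue.Queue[str]" = queue.Queue(maxsize=max_messages)
        self.dropped = 0

    def offer(self, message: str) -> bool:
        """Try to store a message; return False if it was dropped."""
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self) -> List[str]:
        """Remove and return everything currently buffered, oldest first."""
        messages: List[str] = []
        while True:
            try:
                messages.append(self._queue.get_nowait())
            except queue.Empty:
                return messages


buffer = DropIfFullBuffer(max_messages=2)
results = [buffer.offer(f"log line {n}") for n in range(3)]
print(results, buffer.dropped)
```

The trade-off is deliberate: losing some log lines during an outage is preferable to stalling the services that produce them.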
What that looks like
Troubleshooting
- Similar to the service discovery problem
- Breaking in is easier than breaking out
- Logs inside the image can be hard to get to
Troubleshooting Techniques
- Find a container in Marathon
- Use docker exec to run a shell in the container
- Older versions of Docker without exec can use nsenter
- This won't work for a single executable container
- You also need tools in there
- Some debugging tools work from outside
- pprof for Go
- jconsole for Java
- gdb, strace for almost anything
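For the docker exec step, the command is short enough to script. The sketch below only builds the argument list; the container id is a placeholder you would look up on the host where Marathon placed the task, and `docker_exec_shell` is a helper invented for this example. It assumes the image contains `/bin/sh`, which is exactly what a single-executable container lacks.

```python
import shlex
from typing import List


def docker_exec_shell(container: str, shell: str = "/bin/sh") -> List[str]:
    """Build the argv for an interactive shell inside a running container."""
    return ["docker", "exec", "-i", "-t", container, shell]


command = docker_exec_shell("3f2a9c1b7d4e")  # placeholder container id
printable = " ".join(shlex.quote(part) for part in command)
print(printable)
```

Passing `command` to `subprocess.call` on the right host would open the shell. Building the list first keeps quoting problems out of the picture.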
Demo
https://github.com/smarthall/ansible-mesos
Demo Time!
- All the code is on GitHub
- https://github.com/smarthall/ansible-mesos
- 'vagrant up' will give you a development cluster
- './init-cluster.sh' will add some sample apps
Thank you
Any Questions?
Managing Microservices Effectively - Microservices Meetup
By Daniel Hall