Apache Mesos

resource sharing in the data center
@palmerabollo · 2015

Static partitioning

Apache Mesos abstracts CPU, memory, storage, and ports away from individual machines.

The cluster looks like one big computer.

Distributed resource scheduler

Architecture

A master process manages slave daemons running on each cluster node.

Frameworks are software systems that run tasks on these slaves
(examples: Hadoop, Spark, Jenkins, Cassandra, Elasticsearch, etc.)

Resource offers

Resource isolation

pluggable: cgroups, LXC, ...

CPU, memory, network, IO

Other

  • Offer rejection, Filters (node list, min resources)
  • Data locality: delay scheduling?
  • Rate limits (qps)
  • Authorization (ACLs)
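
The offer cycle (the master offers resources, and each framework accepts or rejects them according to its filters) can be sketched as a toy model. This is not the Mesos API: `Offer`, `Framework`, and `run_offer_cycle` are hypothetical names, and the real master also handles fair sharing, offer timeouts, and partial acceptance.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Resources available on one node (hypothetical structure)."""
    node: str
    cpus: float
    mem: int  # MB


@dataclass
class Framework:
    """A framework with a min-resources filter and an optional node list."""
    name: str
    min_cpus: float
    min_mem: int
    nodes: tuple = ()  # empty tuple means "any node"

    def accepts(self, offer: Offer) -> bool:
        if self.nodes and offer.node not in self.nodes:
            return False  # node-list filter
        return offer.cpus >= self.min_cpus and offer.mem >= self.min_mem


def run_offer_cycle(offers, frameworks):
    """Offer each node to the frameworks in order; the first to accept wins."""
    placements = {}
    for offer in offers:
        placements[offer.node] = None  # stays None if every framework rejects
        for fw in frameworks:
            if fw.accepts(offer):
                placements[offer.node] = fw.name
                break
    return placements


offers = [Offer("node1", 4.0, 8192), Offer("node2", 0.5, 512)]
frameworks = [
    Framework("spark", min_cpus=2.0, min_mem=4096),
    Framework("jenkins", min_cpus=0.5, min_mem=256, nodes=("node2",)),
]
print(run_offer_cycle(offers, frameworks))
```

Here `spark` rejects the small offer from `node2`, so the master passes it on to `jenkins`.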

DEMO. Mesos Portal

Local installation (vagrant): https://github.com/mesosphere/playa-mesos

Datacenter Operating System :|
http://mesosphere.com/

Mesosphere

DEMO. Cluster on DigitalOcean

https://digitalocean.mesosphere.com/

Marathon

  • Scaling
  • HA
  • Node constraints (rack, host)
  • Application health checks (HTTP, TCP)
  • HTTP REST/JSON API
  • Event subscription
  • Docker support
  • ...

DEMO. Marathon Portal & API

Simple Task from UI

echo `date`; sleep 5;
curl $MARATHON_API/v2/apps

Health check (HTTP)

curl -X POST -H "Content-Type: application/json" $MARATHON_API/v2/apps -d@health-check.json
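
The contents of `health-check.json` are not shown in the slides. A minimal sketch of what such a payload might look like, built in Python: the app id, command, and all numeric values are assumptions for illustration, while the `healthChecks` field names follow the Marathon v2 app definition.

```python
import json


def health_checked_app(app_id, cmd, path="/"):
    """Build a Marathon app definition with one HTTP health check.

    Resource sizes and check intervals are illustrative assumptions.
    """
    return {
        "id": app_id,
        "cmd": cmd,
        "cpus": 0.1,
        "mem": 32,
        "instances": 1,
        "ports": [0],  # 0 asks Marathon to pick a port, exposed as $PORT0
        "healthChecks": [
            {
                "protocol": "HTTP",
                "path": path,
                "portIndex": 0,
                "gracePeriodSeconds": 30,
                "intervalSeconds": 15,
                "timeoutSeconds": 5,
                "maxConsecutiveFailures": 3,
            }
        ],
    }


app = health_checked_app("health-check-demo", "python3 -m http.server $PORT0")
print(json.dumps(app, indent=2))
```

Redirecting the output to `health-check.json` gives a file that can be posted with the curl command above.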

Docker Container

curl -X POST -H "Content-Type: application/json" $MARATHON_API/v2/apps -d@redis.json
docker images, docker ps, ...
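
`redis.json` is not reproduced in the slides either. A hedged sketch of a Docker app definition in the style Marathon accepted around 2015 (the exact schema varies between Marathon versions; the helper name and resource values are assumptions):

```python
import json


def docker_app(app_id, image, container_port, cpus=0.25, mem=64):
    """Build a Marathon app definition that runs a Docker image.

    hostPort 0 lets Marathon choose a free host port for the mapping.
    """
    return {
        "id": app_id,
        "cpus": cpus,
        "mem": mem,
        "instances": 1,
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                "portMappings": [
                    {"containerPort": container_port, "hostPort": 0, "protocol": "tcp"}
                ],
            },
        },
    }


redis_app = docker_app("redis", "redis", 6379)
print(json.dumps(redis_app, indent=2))
```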

HAProxy

cat /etc/haproxy/haproxy.cfg (regenerated automatically every minute)

Flock

Low cost webapps in the cloud
PoC

https://github.com/flock-cloud

"PaaS" on Mesos/Marathon

one-click deploy

no usage = no resources reserved at all

deployments are slow

~90% of webapps are idle

Architecture

DEMO. Deploy apps with a GitHub hook

Application descriptor (package.json)

  "flock": {
      "instances": 4,
      "cpus": 0.25,
      "mem": 32,
      "ports": 1
  },
  "checks": [
    {
      "path": "/",
      "intervalSeconds": 15,
      "maxConsecutiveFailures": 3
    }
  ]
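
How Flock turns this descriptor into a Marathon app is not shown. One plausible mapping, as a sketch only: `to_marathon_app` is a hypothetical helper, the `name-version` id mirrors the backend names in the HAProxy config, and `npm start` as the command is an assumption.

```python
def to_marathon_app(package):
    """Map a package.json dict with "flock" and "checks" keys to a Marathon app."""
    flock = package.get("flock", {})
    return {
        "id": "{}-{}".format(package["name"], package["version"]),
        "cmd": "npm start",  # assumption: a Node.js app started via npm
        "instances": flock.get("instances", 1),
        "cpus": flock.get("cpus", 0.1),
        "mem": flock.get("mem", 32),
        # "ports": 1 in the descriptor becomes one Marathon-assigned port
        "ports": [0] * flock.get("ports", 1),
        "healthChecks": [
            dict(check, protocol="HTTP", portIndex=0)
            for check in package.get("checks", [])
        ],
    }


package = {
    "name": "flock-demo",
    "version": "1.0.1",
    "flock": {"instances": 4, "cpus": 0.25, "mem": 32, "ports": 1},
    "checks": [{"path": "/", "intervalSeconds": 15, "maxConsecutiveFailures": 3}],
}
marathon_app = to_marathon_app(package)
print(marathon_app["id"], marathon_app["instances"], marathon_app["ports"])
```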

DEMO. Autoscale apps

frontend http-in
    bind *:80
    acl acl_flock_demo_1_0_1 hdr(host) -i flock_demo_1_0_1.flock.com
    use_backend flock-demo-1.0.1 if acl_flock_demo_1_0_1


backend flock-demo-1.0.1
    balance leastconn
    http-request set-header X-Flock-App /flock-demo-1.0.1
    server server_1 10.132.73.13:31915 check
    server server_2 10.132.73.14:31677 check
    server server_backup_1 10.132.73.14:31006 check backup


backend flock-backup-1.0.0
    balance leastconn
    server server_1 10.132.73.14:31006 check
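
A backend section like the ones above can be generated from the list of running tasks. The following is a minimal sketch, not Flock's actual generator: `render_backend` is a hypothetical function, and the host/port pairs are the ones from the config above.

```python
def render_backend(app_id, tasks, backup_tasks=()):
    """Render an HAProxy backend for one app from (host, port) task pairs."""
    lines = [
        "backend {}".format(app_id),
        "    balance leastconn",
        "    http-request set-header X-Flock-App /{}".format(app_id),
    ]
    for i, (host, port) in enumerate(tasks, start=1):
        lines.append("    server server_{} {}:{} check".format(i, host, port))
    # Backup servers only receive traffic when no regular server is up.
    for i, (host, port) in enumerate(backup_tasks, start=1):
        lines.append(
            "    server server_backup_{} {}:{} check backup".format(i, host, port)
        )
    return "\n".join(lines)


cfg = render_backend(
    "flock-demo-1.0.1",
    tasks=[("10.132.73.13", 31915), ("10.132.73.14", 31677)],
    backup_tasks=[("10.132.73.14", 31006)],
)
print(cfg)
```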

flock-scaler: downscale + upscale, driven by HAProxy stats
                      http://$ANY_NODE_IP:9090/stats;csv

flock-backup: upscale (warm-up) via the HAProxy backup server
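
A scaling decision based on the HAProxy stats CSV could look roughly like this. It is a sketch under assumptions: the sample rows are made up and truncated to the first eight CSV columns, and the threshold and the `decide` rule are hypothetical, not flock-scaler's actual policy.

```python
import csv
import io

# Made-up sample in the HAProxy stats CSV layout (first 8 columns only).
SAMPLE_STATS = """\
# pxname,svname,qcur,qmax,scur,smax,slim,stot
http-in,FRONTEND,,,12,40,2000,950
flock-demo-1.0.1,server_1,0,0,7,20,,480
flock-demo-1.0.1,server_2,0,0,5,18,,470
flock-demo-1.0.1,BACKEND,0,0,12,38,200,950
"""


def sessions_per_backend(stats_csv):
    """Return {backend: (current sessions, server count)} from stats CSV text."""
    reader = csv.DictReader(io.StringIO(stats_csv.lstrip("# ")))
    totals = {}
    for row in reader:
        if row["svname"] in ("FRONTEND", "BACKEND"):
            continue  # skip aggregate rows, keep individual servers
        scur, servers = totals.get(row["pxname"], (0, 0))
        totals[row["pxname"]] = (scur + int(row["scur"] or 0), servers + 1)
    return totals


def decide(scur, servers, high=5):
    """Hypothetical rule: idle apps shrink, busy apps grow."""
    if scur == 0:
        return "downscale"
    if scur / servers > high:
        return "upscale"
    return "keep"


for backend, (scur, servers) in sessions_per_backend(SAMPLE_STATS).items():
    print(backend, decide(scur, servers))
```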

TODO

More resources

  • Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. http://mesos.berkeley.edu/mesos_tech_report.pdf
  • An Introduction to Mesosphere. https://www.digitalocean.com/community/tutorials/an-introduction-to-mesosphere
  • Google Borg. http://www.wired.com/2013/03/google-borg-twitter-mesos/all/
  • Microsoft Autopilot. http://research.microsoft.com/pubs/64604/osr2007.pdf
  • Google Kubernetes. http://kubernetes.io/
  • Apache Aurora. http://aurora.incubator.apache.org/
  • An Apache Mesos Framework Example. http://www.opencredo.com/2015/02/16/write-mesos-framework/