The Grid
YARN & Mesos
Avishai Ish-Shalom (@nukemberg)
Fewbytes
What's a Grid
- A collection of computer nodes
- Shared workload
- General purpose
(Short) History of Grids
- Appeared in the mid-1990s
- Commonly used in scientific computations
- "Distributed Supercomputer"
- BOINC
- Beowulf
Grid architecture
- Scheduler assigns tasks to nodes
- Assignment based on attributes (architecture) and load
- Unassigned tasks wait in queue
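The three points above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular grid implements scheduling: the `Node`, `Task` and `schedule` names are hypothetical, "load" is simplified to running tasks divided by a made-up capacity, and the only attribute matched is CPU architecture.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    arch: str       # node attribute, e.g. "x86_64"
    capacity: int   # hypothetical limit on concurrent tasks
    running: list = field(default_factory=list)

    @property
    def load(self) -> float:
        return len(self.running) / self.capacity


@dataclass
class Task:
    name: str
    arch: str       # attribute the task requires from a node


def schedule(queue: deque, nodes: list) -> deque:
    """Place each queued task on the least-loaded matching node.

    Returns the tasks that could not be placed; they keep waiting.
    """
    waiting = deque()
    while queue:
        task = queue.popleft()
        candidates = [n for n in nodes if n.arch == task.arch and n.load < 1.0]
        if not candidates:
            waiting.append(task)  # no matching node with spare capacity
            continue
        target = min(candidates, key=lambda n: n.load)
        target.running.append(task)
    return waiting


nodes = [Node("n1", "x86_64", 2), Node("n2", "arm64", 1)]
queue = deque([Task("a", "x86_64"), Task("b", "arm64"), Task("c", "arm64")])
waiting = schedule(queue, nodes)
```

Here task `c` stays in the queue because the only arm64 node is already full; a real scheduler would retry it once a task finishes.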
Grid responsibilities
- Schedule and distribute tasks to nodes
- Load balance nodes
- Retry failed tasks
- Monitor nodes
- Collect stdout, stderr
- Maintain job history
HPC Grids
- Offline batch jobs
- No API
- Simple, generic task placement
- Non-preemptive
- Slot based resource assignment
- No code distribution mechanism
- No isolation/containment
And then....
Google Borg
- In production circa 2003
- Online apps & Offline batch tasks
- Granular resource control
- Dense
- API, sophisticated scheduling logic
- Application specific logic
- Preemptive
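To make the difference between slot based assignment (HPC grids) and granular resource control (Borg style) concrete, here is a rough sketch that packs the same small tasks onto one node under both models. The node size, slot count and task sizes are invented numbers, and `pack_slots` / `pack_granular` are hypothetical helpers, not part of any real scheduler.

```python
from dataclasses import dataclass


@dataclass
class Request:
    cpus: float
    mem_gb: float


# Hypothetical node: 16 CPUs, 64 GB RAM
NODE_CPUS, NODE_MEM_GB = 16.0, 64.0
# Slot model: the node is cut into 4 equal slots (4 CPUs, 16 GB each)
SLOTS_PER_NODE = 4


def pack_slots(requests) -> int:
    """Each task occupies one whole slot, however small the task is."""
    slot_cpus = NODE_CPUS / SLOTS_PER_NODE
    slot_mem = NODE_MEM_GB / SLOTS_PER_NODE
    placed = 0
    for req in requests:
        fits = req.cpus <= slot_cpus and req.mem_gb <= slot_mem
        if fits and placed < SLOTS_PER_NODE:
            placed += 1
    return placed


def pack_granular(requests) -> int:
    """Each task takes exactly the CPU and memory it asked for."""
    cpus, mem = NODE_CPUS, NODE_MEM_GB
    placed = 0
    for req in requests:
        if req.cpus <= cpus and req.mem_gb <= mem:
            cpus -= req.cpus
            mem -= req.mem_gb
            placed += 1
    return placed


small_tasks = [Request(cpus=1.0, mem_gb=2.0) for _ in range(10)]
slot_count = pack_slots(small_tasks)
granular_count = pack_granular(small_tasks)
```

With these numbers the slot model runs 4 tasks and leaves most of the node idle, while granular accounting fits all 10. That gap is what "Dense" refers to.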
And everyone followed
- Facebook - Tupperware
- Twitter - Mesos
- Yahoo - YARN
It ain't easy
Input, output
- Binaries, data
- Copy to local
- Logs
Elastic workload
- Add/release resources dynamically
- Resume dead tasks
- E.g. MapReduce, stateless web apps
Rigid workload
- Can't change number of tasks
- Task termination halts job
- Partitioned data/state
- E.g. MPI, Sharded database
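The practical difference between elastic and rigid workloads shows up when a task dies. The sketch below is only an illustration of the two policies described above; `Kind` and `on_task_death` are hypothetical names, and real systems have more states than "running" and "halted".

```python
from enum import Enum


class Kind(Enum):
    ELASTIC = "elastic"   # e.g. MapReduce, stateless web apps
    RIGID = "rigid"       # e.g. MPI, sharded database


def on_task_death(kind: Kind, tasks: set, dead: str) -> dict:
    """Decide what the grid should do when one task of a job dies."""
    survivors = tasks - {dead}
    if kind is Kind.ELASTIC:
        # Only the dead task is resumed; the others keep running.
        return {"job": "running", "restart": {dead}, "stop": set()}
    # Rigid: the job cannot continue with a missing task, so it halts
    # and the surviving tasks are stopped as well.
    return {"job": "halted", "restart": set(), "stop": survivors}


tasks = {"t0", "t1", "t2"}
elastic_result = on_task_death(Kind.ELASTIC, tasks, "t1")
rigid_result = on_task_death(Kind.RIGID, tasks, "t1")
```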
Task lifespan
- Short tasks easier to balance
- Preemption (!!!!)
- Task creation overhead
Automation
- Deployment*
- Process management
- Supervision
- Server management
Density
- Shared resources
- Complementary workloads
- Priorities
Abstraction
Why do you care about:
- IP addresses
- Servers
- Disks
- Racks
YARN
One grid to rule them all
One grid to find them
Motivation
- MRv1 didn't scale
- Bad resource utilization
- Multitenancy
- Not only MR
History
- Hadoop On Demand successor
- Development started 2011
- Released in Hadoop 2.2
Architecture
Architecture (summary)
- Monolithic scheduler
- App specific callbacks (app master)
- Single (pluggable) executor
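A toy model of this split, assuming nothing about the real YARN API: one central scheduler makes every placement decision, and the app specific logic lives in an app master that says how many containers it needs and reacts through a callback. All class and method names below are hypothetical.

```python
class MonolithicScheduler:
    """Central scheduler: sees every request, makes every placement."""

    def __init__(self, free_containers: dict):
        self.free = dict(free_containers)  # node name -> free containers

    def allocate(self, count: int) -> list:
        granted = []
        for node in sorted(self.free):
            while self.free[node] > 0 and len(granted) < count:
                self.free[node] -= 1
                granted.append(node)
        return granted


class WordCountAppMaster:
    """App specific logic: what to ask for and what to run where."""

    def __init__(self, splits):
        self.splits = list(splits)

    def containers_needed(self) -> int:
        return len(self.splits)

    def on_containers_allocated(self, nodes: list) -> dict:
        # Callback: map input splits onto whatever containers were granted.
        return dict(zip(self.splits, nodes))


def run_app(scheduler, app_master) -> dict:
    nodes = scheduler.allocate(app_master.containers_needed())
    return app_master.on_containers_allocated(nodes)


scheduler = MonolithicScheduler({"node-a": 2, "node-b": 1})
app = WordCountAppMaster(["split-0", "split-1", "split-2", "split-3"])
plan = run_app(scheduler, app)
```

Only three containers are free, so `split-3` is left unplaced; in this sketch the app master would have to ask again later.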
Mesos
One grid to bring them all
and in the darkness bind them
History
- Berkeley research project (2008)
- Lots of input from the Google Borg guys
- Early adoption by Twitter and Airbnb (2010)
- Apache top-level project since 2013
Architecture (cont)
Architecture (summary)
- Two stage scheduler
- Frameworks, resource offers
- Multiple executors
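The two stage idea can be sketched as follows: the master decides which resources to offer, and each framework decides for itself whether to use an offer. This is a simplified model, not the Mesos API: `Offer`, `Framework`, `Master` and `resource_offer` are hypothetical names, and passing the leftover of an offer straight to the next framework is an assumption made to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    node: str
    cpus: float
    mem_gb: float


class Framework:
    """Second stage: framework specific scheduling over offered resources."""

    def __init__(self, name, task_cpus, task_mem_gb, tasks_wanted):
        self.name = name
        self.task_cpus = task_cpus
        self.task_mem_gb = task_mem_gb
        self.tasks_wanted = tasks_wanted

    def resource_offer(self, offer: Offer) -> int:
        """Return how many tasks to launch on this offer (0 declines it)."""
        fit = int(min(offer.cpus // self.task_cpus,
                      offer.mem_gb // self.task_mem_gb))
        launch = min(fit, self.tasks_wanted)
        self.tasks_wanted -= launch
        return launch


class Master:
    """First stage: decides which framework is offered which resources."""

    def __init__(self, frameworks):
        self.frameworks = frameworks

    def offer_round(self, offers):
        placements = []
        for offer in offers:
            remaining = offer
            for fw in self.frameworks:
                n = fw.resource_offer(remaining)
                if n:
                    placements.append((fw.name, offer.node, n))
                    # Whatever is left is offered to the next framework.
                    remaining = Offer(offer.node,
                                      remaining.cpus - n * fw.task_cpus,
                                      remaining.mem_gb - n * fw.task_mem_gb)
        return placements


web = Framework("web", task_cpus=1.0, task_mem_gb=2.0, tasks_wanted=3)
batch = Framework("batch", task_cpus=4.0, task_mem_gb=8.0, tasks_wanted=2)
master = Master([web, batch])
placements = master.offer_round([Offer("n1", 8.0, 16.0), Offer("n2", 4.0, 8.0)])
```

The master never needs to know what a "web" or "batch" task is; it only hands out resources, which is what keeps it small and lets many different frameworks share one grid.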
Fight!
Workload
YARN
- Data processing workloads
- Many data apps already support it
- Quirky Docker support
- Generic apps (Apache Slider)
Mesos
- Generic workloads
- Hadoop (MRv1)
- Docker support
- Spark, Storm
- Plenty of frameworks
- But some apps need extra coding to work
- No file distribution mechanism*
Ecosystem
YARN
- Hadoop sidekick
- Partial docs
- Apache Slider
Mesos
- Independent product
- Mesosphere DCOS
- Good docs
- Aurora, Marathon, Chronos
- Project Myriad*
Extending
YARN
- Quirky API
- Not well documented
- Monolithic
Mesos
- Good API
- Built to be extended
Bottom line
YARN
- You already have it ;-)
- Almost all big data apps support it out of the box
- Works pretty well for Hadoop
Mesos
- Generic workload
- Easy to extend
- Production workhorse
Mesos: the right choice if you want a grid for everything
YARN: the right choice if you want a grid for data processing
To be continued...
- APIs
- Containers
- Behavior
- Demos
- Alternatives (?)
The Grid - YARN & Mesos
By Avishai Ish-Shalom
An introduction to grids and comparison of YARN & Mesos