The Grid

YARN & Mesos

(part II)

Avishai Ish-Shalom (@nukemberg)

Fewbytes

Agenda

  • Containers
  • Basic grid services
  • Operations
  • Deep dive

Containers

  • A group of processes
  • Isolated
  • Resource constrained

Containers (Linux)

  • cgroups
  • Namespaces
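
Both building blocks are visible from /proc on any Linux box. A minimal sketch, assuming the standard /proc/<pid>/cgroup and /proc/<pid>/ns layouts; the function names are made up for illustration:

```python
import os


def parse_cgroup_file(text):
    """Parse the content of /proc/<pid>/cgroup into {controllers: path}.

    Each line looks like 'hierarchy-id:controller-list:path'.
    On cgroup v2 the controller list is empty ('0::/some/path'),
    which is stored here under the key 'unified'.
    """
    membership = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        _hierarchy_id, controllers, path = line.split(":", 2)
        membership[controllers or "unified"] = path
    return membership


def list_namespaces(pid="self"):
    """Return {namespace: identifier}, e.g. {'pid': 'pid:[4026531836]'}.

    Returns an empty dict when /proc/<pid>/ns is not available.
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    if os.path.isdir(ns_dir):
        for name in sorted(os.listdir(ns_dir)):
            try:
                result[name] = os.readlink(os.path.join(ns_dir, name))
            except OSError:
                pass  # entry not readable for this process
    return result
```

Two processes share a namespace when their identifiers match; a containerized process typically shows different pid/mnt/net identifiers than the host's init.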

Deep dive into YARN

Executor resource limits

  • cgroups (optional)
  • CPU (via cgroups)
  • Memory (via ContainersMonitor)
  • No limits for network, local disk
  • Kill misbehaving task
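
Memory enforcement by a monitor is poll-and-kill rather than a kernel hard limit. A toy sketch of the decision, not YARN's actual code: the class and function names are hypothetical, and the 2.1 virtual-to-physical ratio is assumed to mirror the default of yarn.nodemanager.vmem-pmem-ratio:

```python
from dataclasses import dataclass


@dataclass
class ContainerSample:
    container_id: str
    rss_bytes: int    # physical memory of the whole process tree
    vmem_bytes: int   # virtual memory of the whole process tree
    limit_bytes: int  # physical memory granted to the container


def containers_to_kill(samples, vmem_pmem_ratio=2.1):
    """Return ids of containers over their physical or virtual memory budget.

    vmem_pmem_ratio is an assumption (see lead-in), not read from a cluster.
    """
    doomed = []
    for s in samples:
        over_pmem = s.rss_bytes > s.limit_bytes
        over_vmem = s.vmem_bytes > s.limit_bytes * vmem_pmem_ratio
        if over_pmem or over_vmem:
            doomed.append(s.container_id)
    return doomed
```

Because the check runs periodically, a task can overshoot its grant between two polls; a cgroups memory limit does not have that window.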

Container isolation

None, really

Log collection/aggregation

  • Container stdout/stderr saved locally
  • Copied to HDFS and aggregated
  • By default after application end
  • Can be periodic (>= 1 hour)
  • Aggregation is optional
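
The three modes above (off, after application end, periodic) can be read off the NodeManager configuration. A sketch under assumptions: the two property names are the ones I recall from yarn-default.xml, and describe_log_aggregation is a hypothetical helper, so check both against your Hadoop version:

```python
MIN_ROLL_INTERVAL_SECONDS = 3600  # the ">= 1 hour" floor for periodic aggregation

ENABLE_KEY = "yarn.log-aggregation-enable"
ROLL_KEY = "yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds"


def describe_log_aggregation(conf):
    """Describe log handling for a dict of yarn-site.xml string properties."""
    if conf.get(ENABLE_KEY, "false") != "true":
        return "disabled: logs stay on the node's local disk"
    interval = int(conf.get(ROLL_KEY, "-1"))
    if interval <= 0:
        return "enabled: logs are copied to HDFS after the application ends"
    # Intervals shorter than the floor are raised to it in this sketch.
    effective = max(interval, MIN_ROLL_INTERVAL_SECONDS)
    return f"enabled: logs are copied to HDFS every {effective} seconds"
```

Once logs are aggregated, `yarn logs -applicationId <application id>` reads them back from HDFS.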

Artifact distribution

  • AKA Resource Localization
  • Copy artifacts to HDFS
  • Cache locally
  • Copy to container work dir
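
The cache-then-copy flow can be sketched on a local filesystem. This is an illustration only: a plain file copy stands in for the HDFS download, and localize and demo are hypothetical names, not YARN APIs:

```python
import os
import shutil
import tempfile


def localize(source_path, cache_dir, work_dir):
    """Fetch source_path into cache_dir once, then copy it into work_dir."""
    stat = os.stat(source_path)
    name = os.path.basename(source_path)
    # Key the cache on mtime + size + name so a changed artifact is re-fetched.
    cached = os.path.join(cache_dir, f"{int(stat.st_mtime)}-{stat.st_size}-{name}")
    if not os.path.exists(cached):
        shutil.copy2(source_path, cached)  # stands in for the HDFS download
    target = os.path.join(work_dir, name)
    shutil.copy2(cached, target)
    return target


def demo():
    """Localize one artifact for two containers; return (cache entries, contents)."""
    with tempfile.TemporaryDirectory() as root:
        source = os.path.join(root, "job.jar")
        with open(source, "w") as f:
            f.write("artifact bytes")
        cache = os.path.join(root, "cache")
        os.mkdir(cache)
        contents = []
        for container in ("container_1", "container_2"):
            work = os.path.join(root, container)
            os.mkdir(work)
            with open(localize(source, cache, work)) as f:
                contents.append(f.read())
        return len(os.listdir(cache)), contents
```

The second container on the same node is served from the cache, so the artifact is downloaded once per node rather than once per container.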

YARN Auxiliary services

  • Long running service
  • Per application
  • Not a task/container
  • E.g.
    • MR Shuffle
    • Spark Shuffle

Service discovery

  • You wish...
  • The ApplicationMaster knows its containers

YARN as General Purpose grid

YARN is built for data processing

  • Highly integrated with HDFS
  • Auxiliary services
  • Hordes of short tasks
  • No affinity/anti-affinity, container groups, etc.
  • No gang scheduling
  • Job starvation
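
Why the missing gang scheduling hurts: jobs that need all their containers at once can starve each other. A toy model, not YARN's scheduler; every name here is made up for illustration:

```python
def incremental_allocate(capacity, demands):
    """Hand out containers one at a time, round-robin across jobs."""
    granted = {job: 0 for job in demands}
    free = capacity
    while free > 0:
        progressed = False
        for job, need in demands.items():
            if free > 0 and granted[job] < need:
                granted[job] += 1
                free -= 1
                progressed = True
        if not progressed:
            break  # every job is fully served
    return granted


def gang_allocate(capacity, demands):
    """Grant a job all of its containers or none of them."""
    granted = {job: 0 for job in demands}
    free = capacity
    for job, need in demands.items():
        if need <= free:
            granted[job] = need
            free -= need
    return granted


def runnable(granted, demands):
    """Jobs that hold every container they asked for."""
    return [job for job in demands if granted[job] == demands[job]]
```

With 4 slots and two jobs that each need 3 containers, incremental allocation leaves both holding 2 and neither can start, while all-or-nothing allocation lets the first job run and the second wait.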

YARN Operations

Daily tasks

  • Admin queues
  • Deal with problematic nodes
  • Deal with misbehaving jobs (!)

Misbehaving jobs

  • Can take out entire node(s)
  • And the job never finishes
  • Never seen a job that kills the cluster
  • But it is possible

Service restarts

  • ResourceManager H/A
  • Persistent NodeManager
  • Container supervision recovery

YARN APIs

Actually 3 APIs

  • ResourceManager, NodeManager Client
  • Protobuf with HadoopRPC
  • SASL authentication
  • Java, Go (unofficial, no Kerberos support)
  • Poll based
  • REST API is coming
  • Do yourself a favor and use Slider or Twill
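
"Poll based" means the application master keeps heartbeating the ResourceManager until its requests are met. A sketch of that loop with a hypothetical FakeResourceManager; the real client API is Java and looks different:

```python
class FakeResourceManager:
    """Hypothetical stand-in for the RM: grants at most one container per heartbeat."""

    def __init__(self, available):
        self.available = available

    def allocate(self, outstanding):
        """Return how many containers are granted on this heartbeat."""
        grant = min(1, outstanding, self.available)
        self.available -= grant
        return grant


def run_application_master(rm, wanted, max_heartbeats=10):
    """Heartbeat until `wanted` containers are held or we give up.

    Returns (containers allocated, heartbeats sent).
    """
    allocated = 0
    heartbeats = 0
    while allocated < wanted and heartbeats < max_heartbeats:
        allocated += rm.allocate(wanted - allocated)
        heartbeats += 1  # a real loop would sleep between heartbeats
    return allocated, heartbeats
```

Nothing is pushed to the application: a grant is only learned on the next heartbeat, which is part of why wrappers such as Slider or Twill are easier to live with.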

Deep dive into Mesos

Containerizer resource limits

  • CPU (via cgroups)
  • Memory (via cgroups)
  • Local disk quota
  • Network rate limiting
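
For CPU and memory, a task's resources end up as cgroup v1 control-file values. A sketch of the mapping as I understand it; the constants (1024 shares per CPU, a floor of 2 shares, a 100 ms CFS period) are assumptions to verify against your Mesos version, and cgroup_settings is a hypothetical helper:

```python
CPU_SHARES_PER_CPU = 1024  # assumed shares per full CPU
MIN_CPU_SHARES = 2         # assumed lower bound for cpu.shares
CFS_PERIOD_US = 100_000    # assumed CFS period (100 ms)


def cgroup_settings(cpus, mem_mb, enable_cfs=False):
    """Map a task's cpus/mem to cgroup v1 control-file values."""
    settings = {
        "cpu.shares": max(int(cpus * CPU_SHARES_PER_CPU), MIN_CPU_SHARES),
        "memory.limit_in_bytes": int(mem_mb * 1024 * 1024),
    }
    if enable_cfs:
        # Hard cap: the task may use `cpus` worth of CPU time per period.
        settings["cpu.cfs_period_us"] = CFS_PERIOD_US
        settings["cpu.cfs_quota_us"] = int(cpus * CFS_PERIOD_US)
    return settings
```

Shares alone are a relative weight, so an idle node lets a task burst past its `cpus`; the CFS quota turns it into a hard ceiling.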

Container isolation

  • Modular isolators
  • Chroot (0.24)
  • Pid namespace
  • Docker

Log aggregation/collection

  • Framework's problem
  • Collection on nodes
  • Redirect from server

Artifact distribution

  • Framework's problem until 0.23.0
  • Mesos fetcher (with local cache)
  • Marathon artifact store
  • Docker repo
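
With Marathon, the artifacts to fetch are listed in the app definition and downloaded into the task sandbox before the command runs. A minimal sketch, assuming the `uris` field of the Marathon app JSON (later versions also offer `fetch`); the app id, command and URL are placeholders:

```python
import json


def marathon_app(app_id, cmd, artifact_urls, cpus=0.5, mem=128, instances=1):
    """Build a Marathon app definition that fetches artifacts before running cmd."""
    return {
        "id": app_id,
        "cmd": cmd,
        "cpus": cpus,
        "mem": mem,
        "instances": instances,
        "uris": list(artifact_urls),  # fetched into the sandbox
    }


# Placeholder values, for illustration only.
app = marathon_app(
    "/demo/web",
    "./run.sh",
    ["http://artifacts.example.com/run.sh"],
    instances=2,
)
payload = json.dumps(app, indent=2)
```

The resulting JSON is what would be POSTed to Marathon's /v2/apps endpoint.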

Service discovery

  • Framework's problem
    (yeah, yeah, I got the idea)
  • But Mesos helps - API & store
  • Marathon
  • Aurora
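
A common do-it-yourself approach is to poll Marathon for running tasks and turn them into host:port endpoints. A sketch assuming the response shape of Marathon's GET /v2/tasks (`tasks` entries with `appId`, `host`, `ports`); endpoints_by_app is a hypothetical helper:

```python
import json


def endpoints_by_app(tasks_json):
    """Turn a /v2/tasks response body into {appId: ['host:port', ...]}."""
    doc = json.loads(tasks_json)
    result = {}
    for task in doc.get("tasks", []):
        for port in task.get("ports", []):
            result.setdefault(task["appId"], []).append(f"{task['host']}:{port}")
    return result
```

Feeding this map into a load balancer or DNS config on a timer is roughly what the discovery add-ons around Marathon do.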

Mesos Operations

Daily tasks

  • Manage frameworks
  • Deal with problematic nodes

Service restarts

  • Multiple masters (quorum)
  • Persistent slave
  • Container supervision recovery
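
The quorum is a strict majority of the masters, which fixes how many can fail. A small sketch of the arithmetic; the function names are made up:

```python
def quorum_size(masters):
    """Smallest strict majority of `masters`."""
    if masters < 1:
        raise ValueError("need at least one master")
    return masters // 2 + 1


def tolerated_failures(masters):
    """How many masters can be lost while a quorum still exists."""
    return masters - quorum_size(masters)
```

Three masters need a quorum of 2 and survive one failure; five need 3 and survive two. An even count adds cost without adding tolerance.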

Mesos API

Actually N+1 APIs

  • Mesos master, Framework(s)
  • Master API - Protobuf, no auth, simple
  • HTTP based
  • Java, Python, Go, C++ bindings
  • Marathon - HTTP REST
  • Aurora - Thrift

Other stuff you should know

  • Apache REEF
  • Docker Native grids:
    • Kubernetes
    • Docker Swarm
  • Grid vs PaaS

Final words:

With great power come
great problems
