Distributed Systems Expectation vs Reality

Alan Braithwaite

@caust1c

https://abraithwaite.net

Segment Inc

Who Am I?

What are Distributed Systems?

Image by Jan Vašek

Distributed Systems We Use

Distributed Systems Crash Course

Concepts

High Availability

Consistency

Scalability

High Availability

Consistency

  • How do we handle hosts coming back online?
  • Where do we write data to?
  • Multiple Connecting Clients?

Single Writer Principle

Autonomous Leader Elections
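
The single writer principle and leader elections fit together: only the node that currently holds leadership accepts writes, and a new leader takes over when the old one stops renewing. A minimal sketch, assuming an in-memory lease table stands in for an external coordinator. `LeaseStore`, `try_acquire`, `write`, and the TTL value are hypothetical names chosen for illustration, not any real library's API, and time is passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: str
    expires_at: float


class LeaseStore:
    """In-memory stand-in (an assumption) for a coordinator granting one lease."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self.lease: Optional[Lease] = None

    def try_acquire(self, node: str, now: float) -> bool:
        """Grant or renew the lease if it is free, expired, or already ours."""
        current = self.lease
        if current is None or current.expires_at <= now or current.holder == node:
            self.lease = Lease(holder=node, expires_at=now + self.ttl)
            return True
        return False

    def leader(self, now: float) -> Optional[str]:
        """Return the current leader, or None if the lease has expired."""
        if self.lease is not None and self.lease.expires_at > now:
            return self.lease.holder
        return None


def write(store: LeaseStore, node: str, now: float, log: list, value: str) -> bool:
    """Accept a write only from the node that holds the lease (single writer)."""
    if store.leader(now) != node:
        return False
    log.append((node, value))
    return True


store = LeaseStore(ttl=5.0)
log: list = []
store.try_acquire("a", now=0.0)       # "a" becomes leader
store.try_acquire("b", now=1.0)       # rejected: lease still held by "a"
write(store, "a", 1.0, log, "x=1")    # accepted
write(store, "b", 1.0, log, "x=2")    # rejected: "b" is not the leader
store.try_acquire("b", now=6.0)       # "a" stopped renewing, so "b" takes over
write(store, "b", 6.0, log, "x=3")    # accepted
```

This sketch ignores clock drift and the case of a deposed leader that has not yet noticed it lost the lease; real deployments have to handle both.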

Sharding

  • Scale Out instead of Up
  • Data Management & Shuffling
  • Single Writer Principle within each shard
    • Autonomous Leader Election
    • ZooKeeper & etcd are the most common coordinators
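
The bullets above can be sketched as hash-based routing, where every key maps to exactly one shard and each shard has one leader that takes its writes. This is an illustrative sketch only: `shard_for`, `Cluster`, and the node names are assumptions, and the per-shard stores are plain dictionaries rather than real replicas.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard with a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class Cluster:
    """Each shard has one leader, and only that shard's store takes the write."""

    def __init__(self, leaders: list) -> None:
        self.leaders = leaders                   # leaders[i] owns shard i
        self.stores = [{} for _ in leaders]      # one key-value store per shard

    def put(self, key: str, value: str) -> str:
        shard = shard_for(key, len(self.leaders))
        self.stores[shard][key] = value
        return self.leaders[shard]               # the node that took the write

    def get(self, key: str):
        shard = shard_for(key, len(self.leaders))
        return self.stores[shard].get(key)


# Hypothetical three-node cluster, one leader per shard.
cluster = Cluster(leaders=["node-a", "node-b", "node-c"])
owner = cluster.put("user:42", "alice")
```

Plain modulo hashing remaps most keys whenever the shard count changes, which is one source of the data shuffling mentioned above.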

Linearizability

https://jepsen.io/consistency/models/linearizable

Every operation on an object appears to take place atomically, in some order, consistent with the real-time ordering of those operations.
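
One consequence of this definition is that a read which starts after a write has completed must not return an older value. The sketch below checks only that one necessary condition, not full linearizability, and assumes a single writer issuing increasing version numbers. `Op` and `stale_reads` are hypothetical names, and the history is made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Op:
    kind: str      # "write" or "read"
    version: int   # version written, or version the read returned
    start: float   # time the client issued the operation
    end: float     # time the client saw it complete


def stale_reads(history: list) -> list:
    """Return reads that came back older than a write already completed.

    Assumes a single writer issuing versions in increasing order. This is a
    necessary condition for linearizability, not a full checker.
    """
    writes = [op for op in history if op.kind == "write"]
    bad = []
    for op in history:
        if op.kind != "read":
            continue
        # Newest version whose write finished before this read began.
        floor = max((w.version for w in writes if w.end < op.start), default=0)
        if op.version < floor:
            bad.append(op)
    return bad


history = [
    Op("write", 1, start=0.0, end=1.0),
    Op("write", 2, start=2.0, end=3.0),
    Op("read", 2, start=4.0, end=5.0),   # fine: sees the latest completed write
    Op("read", 1, start=6.0, end=7.0),   # stale: version 2 finished at t=3
]
violations = stale_reads(history)
```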

A Note About CAP Theorem

  • Consistency
  • Availability
  • Partition tolerance

Expectation vs Reality

Expectation

Reality

  • HA is standard now
  • Operations Overhead
  • Pushes Complexity Client-Side
  • Hot Shard Problem
  • Late Arriving Data Problem
  • Disk Failures
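
The hot shard problem from the list above is easy to reproduce: with hash-based sharding, every request for one popular key lands on the same shard. A minimal sketch under assumed numbers; `shard_for`, `load_per_shard`, the key names, and the 9,000/1,000 request split are all made up for illustration.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Stable hash-based shard assignment."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def load_per_shard(requests: list, num_shards: int) -> Counter:
    """Count how many requests land on each shard."""
    return Counter(shard_for(key, num_shards) for key in requests)


# Hypothetical workload: one very popular key plus 1,000 ordinary keys.
requests = ["celebrity"] * 9000 + [f"user:{i}" for i in range(1000)]
load = load_per_shard(requests, num_shards=8)
hot_shard, hot_count = load.most_common(1)[0]
```

Adding shards does not help here, because every request for the hot key still hashes to the same shard.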

Advice

General Advice

  • Learn Networking
    • TCP, DNS, HTTP and basic routing at a minimum
  • Dr. Kredo's advanced networking is great
  • Beej's Guide to Network Programming (another alum!)
  • Reboot CSLUG!  Chico State Linux User's Group
  • Install Linux on Everything
  • Homelab!
    • reddit.com/r/homelab

https://www.reddit.com/r/homelab/comments/8vm9vm/raspberry_pi_datacenter/

Doesn't take a lot of money or a fancy lab
(although those things do help)

Job Advice

  • Everything on the previous slide
  • Attend (virtual) meetups, talk to people
  • Read Hacker News (but not the comments)
    • HN Who's Hiring Threads (posted on the first weekday of each month)
  • Read tech blogs & papers
    • https://jepsen.io/
    • Papers We Love
  • Open source work isn't necessary, but it definitely helps!

https://xkcd.com/979/
