Scalability


Skeepers Influence Workshop

Your Hosts Tonight


Alex Fernández (pinchito)
Director of Eng, Influence
 
Platform Lead, Influence

Alfredo López

What We Will See


What is Scalability?


Horizontal and Vertical Scaling


Scaling Strategies

🚀 What Is Scalability?



🪜 Definition of Scalability


The capacity to be changed in size or scale.

The ability of a computing process to be used or produced in a range of capabilities.

Lexico (Oxford Dictionary)

⛷️ Scale Up and Down!

📚 Use in Literature

🐧 Example: Linux

From embedded
to supercomputers

✋ What Makes Something Scalable?


The right question is: why doesn't it scale?

  • Scarcity of a resource
  • Growing wait times
  • Inability to respond
  • Blocking on a resource
  • ...

📝 Exercise: Non-Scalable Service


Install Node.js

Install loadtest
$ npm install -g loadtest 

Run the command
$ loadtest https://gorest.co.in/public/v1/users -n 2000 -c 100 --keepalive
Write down the average for requests per second (rps),
average latency and number of errors


📝 Exercise +


Vary rps from 10 to 100 in steps of 10

$ loadtest http://service.pinchito.es:3000/a -n 2000 -c 100 --rps 30 -k
$ loadtest http://service.pinchito.es:3000/a -n 2000 -c 100 --rps 40 -k
...
Write down rps, latency and errors
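
The sweep above can also be generated by a short script. This is a minimal sketch, assuming Python is available on your machine; `sweep_commands` is a hypothetical helper, and it only uses the loadtest flags that already appear in this exercise.

```python
# Hypothetical helper: builds the loadtest command lines for the sweep.
# Only flags shown in the exercise are used (-n, -c, --rps, -k).
URL = "http://service.pinchito.es:3000/a"


def sweep_commands(url, start=10, stop=100, step=10):
    """Return one loadtest command line per target rps."""
    return [
        f"loadtest {url} -n 2000 -c 100 --rps {rps} -k"
        for rps in range(start, stop + 1, step)
    ]


commands = sweep_commands(URL)
for command in commands:
    # Paste each line into a terminal and note rps, latency and errors.
    print(command)
```

Printing the commands instead of running them keeps the script harmless; run each line by hand so you can write down the results between runs.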

Go up to 100, then 1000
$ loadtest http://service.pinchito.es:3000/a -n 2000 -c 100 --rps 1000 -k


📝 Exercise +

Draw a graph with rps sent and result

Another graph with rps vs latency


📝 Exercise +


Now test against:
$ loadtest https://www.google.com -n 2000 -c 100 -k

What is the difference?

How do latency and rps behave now?



👏 Success!


🥛 Which Resource Ran Out?


Graph with CPU from AWS:

300 rps
400 rps
1000 rps

🚂 rps vs throughput



📈 Scalability Profiles


Brendan Gregg: Systems Performance

🚒 Latency vs rps



⚖️ Little's Law

Little, 1952 - 1960

The average number of requests in flight L equals:
the rate of requests per second λ
multiplied by the average request time W.


Wikipedia

If throughput λ stays constant (for example, on a saturated server),
increasing concurrency L makes the average time per request W grow proportionally.
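
A worked example of L = λ × W may help here. This is a minimal sketch with made-up numbers: the 30 rps, 0.5 s and 50 rps figures are assumptions for illustration, not measurements from the exercise.

```python
def requests_in_flight(rate_rps, avg_time_s):
    """Little's Law: L = lambda * W."""
    return rate_rps * avg_time_s


def avg_request_time(concurrency, rate_rps):
    """Little's Law rearranged: W = L / lambda."""
    return concurrency / rate_rps


# Assumed: 30 requests per second, each taking 0.5 s on average.
in_flight = requests_in_flight(rate_rps=30, avg_time_s=0.5)

# Assumed: the server saturates at 50 rps. With 100 concurrent
# requests (-c 100) each request then takes 2 s on average;
# doubling concurrency to 200 doubles the time per request.
time_at_100 = avg_request_time(concurrency=100, rate_rps=50)
time_at_200 = avg_request_time(concurrency=200, rate_rps=50)

print(in_flight, time_at_100, time_at_200)
```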

⇕ Vertical and ⇔ Horizontal Scaling

🧓 Hard Beginnings


IBM mainframe

💽 Specialized Servers


🗄️ The Usual Cabinets

🤖 And Then Google Arrived

Google storage cabinet, 1996

⇕ Vertical Scaling


Buy a bigger machine


And bigger


Until you run out of machines


Hard to go back to a smaller machine 😅

⇕ Vertical Scaling


🤫 Sshhh...

For decades now, supercomputers have been just...
clusters of smaller machines
IBM Blue Gene/P: 164k cores, 2007

⇔ Horizontal Scaling


Use many similar machines for a given function

("provisioning")


Add or remove machines to scale


When one machine is failing it is removed from service

⇔ Horizontal Scaling


📝 Exercise: Storage


Design a corporate storage system with 15 TB


Option 1 ⇕: storage area network (SAN)

Best option as of December 2008


Option 2 ⇔: raw hard drives

Best option as of July 2009



📝 Exercise +


Add controllers


RAID options (Redundant Array of Inexpensive Disks)


Compute the price difference ($) between option 1 ⇕ and option 2 ⇔

Final price?



📝 Exercise +


Consider redundancy strategies

Fault tolerance

Redundancy options: 2x, 3x, ?


Consider scaling strategies


How do they affect the price?



👏 Success!



⇔ Horizontal Strategies


🤹 Balancing (server-side)

🕵️ Balancing (client-side)

💏 Affinity

🔱 Independence

🍇 Clustering

🔑 Sharding

🧬 Replication

⌛ Queues

🤹 Server-Side Balancing



Example: AWS ELB, Google Cloud Load Balancing

🕵️ Client-Side Balancing


Example: Facebook client
Chooses the API endpoint in the browser

Example: DNS balancing
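
Client-side balancing can be as simple as the client picking one endpoint from a known list. This is a minimal sketch; the endpoint URLs and `pick_endpoint` are hypothetical, and random choice is just one possible policy.

```python
import random

# Hypothetical endpoints; a real client would get this list from
# configuration or from a DNS answer with several addresses.
ENDPOINTS = [
    "https://api-1.example.com",
    "https://api-2.example.com",
    "https://api-3.example.com",
]


def pick_endpoint(endpoints):
    """The client itself chooses which server to talk to."""
    return random.choice(endpoints)


chosen = pick_endpoint(ENDPOINTS)
print(chosen)
```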

💏 Affinity ⇔


By cookie or by geography

Needs a sophisticated router
Client-side or server-side
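
Cookie affinity can be pictured as a router that remembers where each session went. This is a minimal sketch under stated assumptions: `StickyRouter`, the server names and the in-memory table are all hypothetical, and a real router would also need to expire sessions and handle failed servers.

```python
class StickyRouter:
    """Sends every request of a session to the same server."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._assigned = {}  # session cookie -> server
        self._next = 0

    def route(self, session_cookie):
        if session_cookie not in self._assigned:
            # First request of a session: assign servers in turn.
            server = self._servers[self._next % len(self._servers)]
            self._assigned[session_cookie] = server
            self._next += 1
        return self._assigned[session_cookie]


router = StickyRouter(["app-1", "app-2"])
first = router.route("alice")
second = router.route("bob")
again = router.route("alice")  # same server as the first request
print(first, second, again)
```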

🔱 Independence ⇔


Neutral (or blind) balancing
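
Round-robin is one common form of blind balancing: the balancer ignores who is asking and simply takes servers in turn. This is a minimal sketch; `RoundRobinBalancer` and the server names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in turn, ignoring the request contents."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def pick(self):
        return next(self._servers)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [balancer.pick() for _ in range(6)]
print(picks)
```

This only works well when any server can answer any request, which is what independence means here.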

🍇 Clustering ⇔

Generic term "cluster":
create one machine out of many

In databases it usually means having several servers,
all equivalent

🔑 Sharding ⇔


Balancing by key

Needs a sharding algorithm (usually with hashing)
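
A hash-based sharding function can be sketched in a few lines. This is a minimal sketch; `shard_for`, the key format and the shard count of 4 are assumptions, and hash-modulo is only one of several possible sharding algorithms.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard number in [0, shard_count)."""
    # A stable hash is used on purpose: Python's built-in hash() for
    # strings changes between processes, which would move keys around.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


shard = shard_for("user:1234", shard_count=4)
print(shard)
```

Note that with plain modulo, changing the number of shards moves most keys to a different shard.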

🧬 Replication ⇔


A primary server (read + write) and several replicas (read-only)

Useful when reads outnumber writes
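
The split between a primary and its read-only replicas can be modeled with plain dictionaries. This is a toy sketch, not a real database: `ReplicatedStore` is hypothetical, and it assumes replication is synchronous and never fails.

```python
from itertools import cycle


class ReplicatedStore:
    """Toy model: one primary takes writes, replicas serve reads."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = cycle(self.replicas)

    def write(self, key, value):
        self.primary[key] = value
        # Assumption: every replica is updated immediately.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Reads are spread over the replicas, never the primary.
        return next(self._next_replica).get(key)


store = ReplicatedStore(replica_count=3)
store.write("user:1", "Ada")
reads = [store.read("user:1") for _ in range(6)]
print(reads)
```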

🐫 Active Replication ⇔

Active-active, multiple primary...

Needs a reconciliation algorithm
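
One simple way to reconcile two primaries is "last write wins". This is a minimal sketch of that single policy; `merge_last_write_wins` and the timestamped record format are assumptions, and real systems often need something more careful since this policy silently discards the older write.

```python
def merge_last_write_wins(replica_a, replica_b):
    """Merge two replicas that map key -> (timestamp, value)."""
    merged = dict(replica_a)
    for key, (timestamp, value) in replica_b.items():
        # Keep the entry with the most recent timestamp.
        if key not in merged or timestamp > merged[key][0]:
            merged[key] = (timestamp, value)
    return merged


primary_1 = {"color": (10, "red"), "size": (12, "L")}
primary_2 = {"color": (11, "blue"), "shape": (5, "round")}
merged = merge_last_write_wins(primary_1, primary_2)
print(merged)
```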

⌛ Queues ⇔


Tasks are produced independently of their consumption

Mechanism for polling
(NOT pooling 🙏)
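
A queue between producer and consumers can be sketched with the Python standard library. This is a minimal in-process sketch; the squaring "task", the three workers and the `None` stop marker are assumptions, and a production system would use a separate queue service.

```python
import queue
import threading

tasks = queue.Queue()
results = []
results_lock = threading.Lock()


def consumer():
    """Keep taking tasks from the queue until the stop marker arrives."""
    while True:
        task = tasks.get()
        if task is None:  # stop marker
            tasks.task_done()
            break
        with results_lock:
            results.append(task * task)
        tasks.task_done()


workers = [threading.Thread(target=consumer) for _ in range(3)]
for worker in workers:
    worker.start()

# The producer does not know or care which worker takes each task.
for number in range(10):
    tasks.put(number)
for _ in workers:
    tasks.put(None)

tasks.join()
for worker in workers:
    worker.join()
print(sorted(results))
```

To scale, add or remove consumers; the producer stays unchanged.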

📝 Exercise: Scalable Storage


You work for the search engine Fooble in January 2000

You have to store the search index

Design a scaling strategy

Assume index size = total size of the pages

Use contemporary disk drives


📝 Exercise +


10 KB per page

50 million pages

4 million searches per day

10 search terms max

Target time of 0.1 seconds per search


📝 Exercise +


50M pages × 10 KB = 500 GB

Cheapest disk drive: Seagate ST317242A, 17.2 GB, $152

32 disks × 16 GB = 512 GB, $4864

8 servers × 4 disks = 32 disks

4M searches × 100 ms = 400k seconds per day; ÷ 86,400 seconds in a day ≈ 4.6 servers
Adding peak time: at least 8 servers
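
The sizing arithmetic on this slide can be checked with a few lines. This is a minimal sketch that only uses the figures given in the exercise; the variable names are mine, and the 16 GB per disk is the figure used on the slide.

```python
import math

PAGES = 50_000_000
PAGE_KB = 10
DISK_GB = 16            # per-disk capacity used on the slide
DISK_PRICE_USD = 152
SEARCHES_PER_DAY = 4_000_000
SEARCH_TIME_MS = 100
SECONDS_PER_DAY = 86_400

index_gb = PAGES * PAGE_KB / 1_000_000            # KB -> GB
disks = math.ceil(index_gb / DISK_GB)
disk_cost_usd = disks * DISK_PRICE_USD

# Total seconds of search work per day, spread over one day.
busy_seconds = SEARCHES_PER_DAY * SEARCH_TIME_MS / 1000
servers_on_average = busy_seconds / SECONDS_PER_DAY

print(index_gb, disks, disk_cost_usd, round(servers_on_average, 1))
```

The result is an average over the whole day, which is why the slide rounds up to 8 servers for peak time.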


📝 Exercise +



100 ms for ~5 search terms
Average query time to storage < 20 ms

📝 Exercise +


Query time: seek time + 1/2 turn + formatting

Seek time: ~8 ms
7200 rpm disk drive: ~8 ms per turn, so ~4 ms for half a turn
Query total: >12 ms

Seems doable; better add some caching
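
The disk timing estimate can be reproduced as follows. This is a minimal sketch using the slide's figures (8 ms seek, 7200 rpm, 100 ms for about 5 terms); formatting time is left out, which is why the total is a lower bound.

```python
SEEK_MS = 8
RPM = 7200
TARGET_MS = 100
SEARCH_TERMS = 5

ms_per_turn = 60_000 / RPM          # one full rotation
half_turn_ms = ms_per_turn / 2      # average wait for the sector
query_ms = SEEK_MS + half_turn_ms   # lower bound, no formatting time

budget_ms = TARGET_MS / SEARCH_TERMS  # time allowed per storage query

print(round(ms_per_turn, 1), round(query_ms, 1), budget_ms)
```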



👏 Well done!



📚 Bibliography



Brendan Gregg: Systems Performance: Enterprise and the Cloud


John Allspaw: Web Operations: Keeping the Data On Time


HighScalability.com: Favorite posts on HighScalability