Feeding locusts at

🦗 1K QPS 🦗

per restaurant

(while serving quality food)

Guillaume Gelin 

ramnes.eu 🇪🇺 🇫🇷


😊

08:30 AM

Bob wakes up.

💺🖥️
🤗🖥️
💺🖥️
💺🖥️
09:00 AM

Bob arrives at his restaurant.

He is the one taking orders.

💺🖥️
😁🖥️🦗
💺🖥️
💺🖥️
10:00 AM

The first client comes in!

💺🖥️
🙂🖥️🦗🦗
💺🖥️
💺🖥️
11:00 AM

A queue starts.

💺🖥️
😱🖥️🦗🦗🦗🦗🦗🦗🦗🦗
💺🖥️
💺🖥️
11:30 AM

Locusts are hungry early.

🙂🖥️🦗🦗
😅🖥️🦗🦗
🙂🖥️🦗🦗
🙂🖥️🦗🦗
12:00 PM

Coworkers arrive! Phew.

12:30 PM

Queues are full: global panic!

😱🖥️🦗🦗🦗🦗🦗🦗🦗🦗
😭🖥️🦗🦗🦗🦗🦗🦗🦗🦗
😱🖥️🦗🦗🦗🦗🦗🦗🦗🦗
😱🖥️🦗🦗🦗🦗🦗🦗🦗🦗
12:35 PM

Crisis meeting

🤔
🤔  🤔
🤔


12:40 PM

Bob has an idea...

🤔
😀  🤔
🤔

12:41 PM

What if we used robots?


😀🤖 

😎🤖🦗🦗
😍🤖🦗🦗
😎🤖🦗🦗
😎🤖🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗
12:45 PM

Locusts are served much faster;
the queue starts to disappear

1:00 PM

But wait, no!
More and more locusts are coming in...

😱🤖🦗🦗🦗🦗🦗🦗🦗🦗
😭🤖🦗🦗🦗🦗🦗🦗🦗🦗
😱🤖🦗🦗🦗🦗🦗🦗🦗🦗
😱🤖🦗🦗🦗🦗🦗🦗🦗🦗

Scaling Python to

1K QPS

per server

(not doing Hello Worlds)

Quick summary

🦗 = ?
💺🖥️ = ?  
😊 = ?
🤖 = ?

It's simple until you

make it complicated.

Step 0

Have something to scale

Have something to scale

😊
  • 4 CPU - 16 GB

  • Nginx :80 → :5000

  • Flask

  • SQLite

  • Flask-SQLAlchemy

  • Flask-WTF

  • Flask-Bcrypt

  • Register

  • Log in and log out

  • List tweets

  • Write new tweets

  • List all users

  • Follow and unfollow

Step 1

Identify threats and
define a protocol

Identify threats

Reads

  • List tweets

  • List all users

Writes

  • Register

  • Log in and log out

  • Write new tweets

  • Follow and unfollow


Define a protocol

  • Only one server (no horizontal scaling allowed)

  • Response time's 95th percentile should stay under 2 seconds

  • Failure rate should stay at 0%

  • Register 1x

  • Log in and log out 2x

  • List tweets 200x

  • Write new tweets 20x

  • List all users 5x

  • Follow and unfollow 5x
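In Locust these weights map naturally onto `@task(weight)` decorators; as a dependency-free sketch (endpoint names are only illustrative), the same traffic shape can be drawn with the standard library:

```python
import random
from collections import Counter

# Relative request weights from the protocol above.
WEIGHTS = {
    "register": 1,
    "log in/out": 2,
    "list tweets": 200,
    "write tweet": 20,
    "list users": 5,
    "follow/unfollow": 5,
}

def draw_requests(n, seed=42):
    """Sample n requests according to the weighted mix."""
    rng = random.Random(seed)
    names = list(WEIGHTS)
    return rng.choices(names, weights=[WEIGHTS[name] for name in names], k=n)

mix = Counter(draw_requests(100_000))
# Reads ("list tweets") dwarf every write path, which is why the later
# steps focus on caching reads first.
```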

$ python run.py

50 QPS

Step 2

Parallelize

$ uwsgi --http :5000 --processes 4 --file run.py --callable app
$ gunicorn --bind :5000 --workers 4 run:app
$ cat tweepy.ini
[uwsgi]
master = true
protocol = http
socket = 0.0.0.0:5000
file = run.py
callable = app
processes = 4

$ uwsgi tweepy.ini

180 QPS

Stateful versus stateless

Step 3

Check the database usage

Leverage RAM

Paging is hard

but necessary

Explain

Transaction logs
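A hedged sketch of the "Explain" and paging ideas (not taken from the slides): SQLite's EXPLAIN QUERY PLAN shows whether a query scans the whole table or searches an index, and keyset paging avoids the cost of deep OFFSETs. Table and index names here are assumptions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE tweet (id INTEGER PRIMARY KEY, user_id INTEGER, text TEXT)"
)

def plan(sql, params=()):
    """Return the human-readable detail column of the query plan."""
    rows = con.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[-1] for row in rows)

before = plan("SELECT * FROM tweet WHERE user_id = ?", (1,))  # full table scan
con.execute("CREATE INDEX ix_tweet_user ON tweet (user_id)")
after = plan("SELECT * FROM tweet WHERE user_id = ?", (1,))   # index search

# Keyset paging: resume from the last id seen instead of OFFSET,
# so deep pages stay as cheap as the first one.
page = con.execute(
    "SELECT id, text FROM tweet WHERE user_id = ? AND id < ?"
    " ORDER BY id DESC LIMIT 20",
    (1, 10**9),
).fetchall()
```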

250 QPS

Step 4

Cache reads

There are only two hard things in computer science: cache invalidation and naming things.

tweets_cache = {}


@app.route("/tweets", methods=["GET"])
def get_tweets():
    ...
    tweets = tweets_cache.get(user_id)
    if tweets is None:  # "not tweets" would re-query users with zero tweets
        tweets = query_tweets(...)
        tweets_cache[user_id] = tweets
    ...


@app.route("/tweets", methods=["POST"])
def post_tweets():
    ...
    tweet.save()
    tweets_cache.pop(user_id, None)  # default avoids KeyError on cache miss
    ...

@cache()
def query_tweets(user_id, ...):
    ...


@app.route("/tweets", methods=["GET"])
def get_tweets():
    ...
    query_tweets(user_id, ...)
    ...


@app.route("/tweets", methods=["POST"])
def post_tweets():
    ...
    query_tweets.invalidate(user_id)
    ...
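The `@cache()` decorator and its `.invalidate()` are not shown on the slides; a minimal per-process sketch (keyed on `user_id` only, which is an assumption) could look like this. A real deployment would put the store in something shared like Redis or memcached rather than a dict local to each worker.

```python
import functools

def cache():
    """Minimal memoizing decorator with explicit invalidation.

    The cache lives in the worker process, keyed on the function's
    first argument (assumed to be user_id).
    """
    def decorator(func):
        store = {}

        @functools.wraps(func)
        def wrapper(user_id, *args, **kwargs):
            if user_id not in store:
                store[user_id] = func(user_id, *args, **kwargs)
            return store[user_id]

        def invalidate(user_id):
            store.pop(user_id, None)

        wrapper.invalidate = invalidate
        return wrapper
    return decorator
```

Decorating `query_tweets` with this makes repeated GETs hit the dict, while the POST handler calls `query_tweets.invalidate(user_id)` to drop the stale entry.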

But, the application is stateful now?

350 QPS

Step 5

Achieve concurrency

In the future:

asyncio + ASGI

CPU versus I/O

Database writes
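On CPU versus I/O: while one request waits on the database, the process could be serving another. The slides only name asyncio + ASGI as "the future", so the exact mechanism used is not shown; as a dependency-free illustration, plain threads already overlap I/O waits (sleep stands in for a database call):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def io_bound_request():
    # Stand-in for a database or network wait; the GIL is released here.
    time.sleep(0.2)

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    for _ in range(4):
        pool.submit(io_bound_request)
elapsed = time.perf_counter() - start
# The four 0.2 s waits overlap: wall time stays near 0.2 s, not 0.8 s.
```

The same overlap does not happen for CPU-bound work, which is why the earlier multi-process step still matters.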

550 QPS

Step 6

???

😱🤖🦗🦗🦗🦗🦗🦗🦗🦗
😭🤖🦗🦗🦗🦗🦗🦗🦗🦗
😱🤖🦗🦗🦗🦗🦗🦗🦗🦗
😱🤖🦗🦗🦗🦗🦗🦗🦗🦗

Step 6

Scale vertically!

😎🤖🦗🦗
😎🤖🦗🦗
😎🤖🦗🦗
😎🤖🦗🦗
😎🤖🦗🦗
😎🤖🦗🦗
😎🤖🦗🦗
😎🤖🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗🦗🦗
🦗🦗
🦗🦗
🦗🦗

 

🦗🦗
🦗🦗
🦗🦗
🦗🦗🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗
🦗🦗🦗🦗
🦗🦗
🦗🦗
🦗🦗

1K QPS

But remember...

Thank you!

Scaling Python to 1K QPS per server, not doing Hello Worlds

By Guillaume Gelin


I will present a sample web application inspired by the real world (so not an application doing hello worlds) and showcase several ways of scaling it up, layer after layer, doing benchmarks at every step, up to 1,000 queries per second — or 86.4 million per day — on one Amazon server.
