Zootopia!
[Chart: traffic on usual days vs. BBD]
Problem
Scaling the Modern Web App to 100x!
But first, let's see the modern web stack!
Top level Load balancer (Nginx)
Front End Load balancer (Nginx)
PM2
Node Instances
flipkart.com
API boxes, mSite etc.
n instances
PM2
Node Instances
..........................
[Chart: traffic vs. BBD]
All good you think? But we're still stuck at 10x!
Bottleneck:
Network Bandwidth
Top level Load balancer (Nginx)
Front End Load balancer (Nginx)
Node
Content
1MB
Let's understand this with a scenario
Your target: 10,000 qps
Response payload: 1MB
Total bandwidth required: 10,000 qps x 1MB = 10,000 MB/s
Assume 1GB = 1000MB*
So do the math :P
For 10,000 qps we need 10GBps
But what we have is a 1 GBps line
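The same math as a tiny script, in case you want to plug in your own numbers. The figures are the ones from this scenario; the helper name `requiredBandwidthGBps` is made up for illustration, and the 1GB = 1000MB rounding is kept.

```javascript
// Scenario figures from the slides; the function name is hypothetical.
const MB_PER_GB = 1000; // simplification used in the deck: 1GB = 1000MB

function requiredBandwidthGBps(qps, payloadMB) {
  return (qps * payloadMB) / MB_PER_GB;
}

const payloadMB = 1;
const needed = requiredBandwidthGBps(10000, payloadMB); // bandwidth for the target load
const lineGBps = 1;                                     // the line we actually have
const maxQps = (lineGBps * MB_PER_GB) / payloadMB;      // qps the line can carry at 1MB each

console.log(`need ${needed} GBps, but a ${lineGBps} GBps line carries only ${maxQps} qps`);
```

Halve the payload and the same line carries twice the qps, which is the whole point of the next slide.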
[Diagram: n Node instances, each serving 1MB of content]
Remember!
- Small initial payload
- Be extremely stingy, network wise
- Anything extra degrades the performance
- And we all have seen this #hashtag so often #perfmatters :P
Meanwhile Road Widening happening in Zootopia...
Uh! Wait!
- What if I can't buy more bandwidth?
- And I can't cut down on the content because, you know, uh, Product Managers :P
TA DA! Use Compression!
GZIP it!
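A quick way to see what gzip buys you on the wire, using only Node's built-in `zlib`. The payload here is a made-up, highly repetitive sample, so it compresses far better than a real page would; treat the printed sizes as an illustration, not a benchmark.

```javascript
const zlib = require('zlib');

// Hypothetical ~1MB payload built by repeating one JSON row.
const row = JSON.stringify({ id: 12345, title: 'Some product', price: 999 });
const payload = Buffer.from(row.repeat(Math.ceil(1000000 / row.length)));

// Compress once and compare the sizes that would travel over the network.
const gzipped = zlib.gzipSync(payload);

console.log(`${payload.length} bytes -> ${gzipped.length} bytes after gzip`);
```

Fewer bytes per response means more responses fit in the same line.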
But, before I tell you the-big-deal!
I like to create suspense and build-up :P
How many of you have written this?
var compression = require('compression');
var express = require('express');
var app = express();
app.use(compression());
Raise your hands please!
And exactly how many times have we read or heard...
Node is SINGLE THREADED
Always
- Let Node take care of just the application-specific tasks, always! (No GZIPing at Node, ever!)
- Let a reverse proxy do the Gzip compression!
- Read "Performance Best Practices" on Express's official site (if you do use Express like us!).
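To get a feel for why compression is worth moving off the Node box, here is a rough sketch that measures the CPU time gzip costs per response. It uses `zlib.gzipSync` only to make the cost easy to see; the `compression` middleware streams through zlib instead, so this is not a measurement of the middleware itself. The payload and the loop count are arbitrary assumptions.

```javascript
const zlib = require('zlib');

// Hypothetical 1MB response body, filled with a repeating string.
const payload = Buffer.alloc(1000000, 'some product listing markup ');
const runs = 20; // arbitrary sample size

const before = process.cpuUsage();
for (let i = 0; i < runs; i++) {
  zlib.gzipSync(payload); // blocks this thread until compression finishes
}
const used = process.cpuUsage(before); // microseconds spent since `before`

const cpuMsPerResponse = (used.user + used.system) / 1000 / runs;
console.log(`~${cpuMsPerResponse.toFixed(2)} ms of CPU per gzipped response`);
```

Whatever number you see, multiply it by your qps: that is CPU the application is not spending on application work.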
Updated Architecture
Gzip at reverse proxy (Nginx)
Top level Load balancer (Nginx)
Front End Load balancer (Nginx)
PM2 managed Node instances
PM2 managed Node instances
.................. n instances
Now, also GZIPing for us
Pumped up with our latest finding we ran the load tests again...
Thinking nothing can put us down now...
[Chart: traffic vs. BBD]
And... we could only go till 28x
Why?
Top level Load balancer (Nginx)
Front End Load balancer (Nginx)
PM2 managed Node instances
PM2 managed Node instances
.................. n instances
GZIPing
BOTTLENECK!
Co-hosted Node and Nginx
Top level Load balancer (Nginx)
Front End Load balancer (Nginx)
PM2 managed Node instances
PM2 managed Node instances
.................. n instances
GZIPing
Nginx
Nginx
Co-hosted Nginx and Node on the same box
We ran the load tests again...
[Chart: traffic vs. BBD]
We could scale to 61x!
61x is super cool, but remember our goal?
100x
With better roads come bigger cars!
Server Side Rendering (partially)
What actually happens in Server Side Rendering?
1. req arrives at Node
2. from Node, call APIs to get the data
3. generate HTML
Suspicious candidate for bottleneck? 1, 2 or 3?
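The flow above (request comes in, Node calls the APIs, Node generates HTML) can be sketched roughly like this. `handleRequest`, `renderHtml` and `fetchData` are hypothetical names, and the API call is faked so the snippet runs on its own.

```javascript
// Generate HTML from the API data (pure CPU work inside Node).
function renderHtml(data) {
  const items = data.items.map((name) => `<li>${name}</li>`).join('');
  return `<ul>${items}</ul>`;
}

// One server-side render: take the request, call the API, build the HTML.
// fetchData is a stand-in for the real outbound API call.
async function handleRequest(url, fetchData) {
  const data = await fetchData(url); // outbound call made from Node
  return renderHtml(data);
}

// Usage with a fake API that returns two items.
handleRequest('/home', async () => ({ items: ['a', 'b'] }))
  .then((html) => console.log(html));
```

Note that every incoming request triggers at least one outgoing call from the Node box.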
What happens when you make AJAX calls from Node?
http://api.server:80/
API Server
Node
http://my.server:8080/
http://p.q.r.s:80/
API Server
Node
http://a.b.c.d:8080/
Ephemeral Ports (32768 - 61000)

Src IP     Src Port   Dest IP    Dest Port
a.b.c.d    32769      p.q.r.s    80
a.b.c.d    32770      p.q.r.s    80
a.b.c.d    32771      p.q.r.s    80
a.b.c.d    32772      p.q.r.s    80
So it's possible that you might run out of Ephemeral Ports
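A back-of-the-envelope version of that claim. The port range is the one shown in this deck; the 60 second TIME_WAIT is an assumption (the usual Linux value), and the model is simplified: it assumes every API call opens a fresh connection to the same destination IP and port, and that Node's side closes it.

```javascript
// Ephemeral port range from the slides.
const low = 32768;
const high = 61000;
const ports = high - low + 1; // usable source ports towards one destination

// Assumption: a closed connection holds its port in TIME_WAIT for 60 seconds.
const TIME_WAIT_SECONDS = 60;

// Rough ceiling on brand-new connections per second to one API server.
const maxNewConnPerSec = Math.floor(ports / TIME_WAIT_SECONDS);

console.log(`${ports} ports, ~${maxNewConnPerSec} new connections/s to one API server`);
```

A few hundred fresh connections per second is not much when every rendered page makes API calls.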
And anything that is possible is practical at Flipkart!
Solution Strategies
1. Connection Pooling
2. Increase Ephemeral Ports
# Linux
$ cat /proc/sys/net/ipv4/ip_local_port_range
32768 61000
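For strategy 1, a minimal sketch of connection pooling with Node's built-in `http.Agent`. The host `api.server` comes from the earlier slide; `apiAgent`, `callApi` and the `maxSockets` value of 50 are illustrative assumptions, not the setup actually used here.

```javascript
const http = require('http');

// One shared agent: sockets stay open after a response and are reused,
// instead of opening a new socket (and a new ephemeral port) per API call.
const apiAgent = new http.Agent({
  keepAlive: true,
  maxSockets: 50, // hypothetical cap on concurrent sockets per host
});

// Hypothetical helper: GET a path from the API server through the pool.
function callApi(path, onBody) {
  const req = http.get({ host: 'api.server', port: 80, path, agent: apiAgent }, (res) => {
    let body = '';
    res.setEncoding('utf8');
    res.on('data', (chunk) => { body += chunk; });
    res.on('end', () => onBody(null, body));
  });
  req.on('error', (err) => onBody(err));
}
```

With the pool in place, the number of source ports in use is bounded by `maxSockets` rather than by the request rate.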
And we could finally scale to 94x!
Car Pooling to the rescue in Zootopia!
Intuition doesn't work at Scale
By Ankeet Maini