PHP at Scale:
Knowing enough to be dangerous!
Oleksii Petrov
Skelia Ukraine / ETWater

Who am I
System Architect
Team Lead
PHP Developer

Find me on
@alexhelkar



alexhelkar
https://github.com/alexhelkar
What does it mean "to scale"?
Does it mean?

Or does it mean?

Often it means

"To scale"
Improve system metrics without changing the system*
* (dramatically)
The scaling roadmap is predefined
by your stack and software architecture

Let's go!
Performance
Story
PHP 7 facts
- isset 1.55 times faster than array_key_exists
- is_file 26 times faster than file_exists
- single quotes are slower than double quotes
- instanceof is faster than is_a
- etc.
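Note on the first fact: the two calls are not interchangeable, so swapping one for the other for speed can change behavior. A minimal sketch, assuming a hypothetical `$reading` array in which one key is present but null:

```php
<?php
// Hypothetical sample row: 'humidity' is present but explicitly null.
$reading = ['deviceId' => 231, 'humidity' => null];

// isset() is true only if the key exists AND its value is not null.
$issetDevice   = isset($reading['deviceId']);   // true
$issetHumidity = isset($reading['humidity']);   // false: value is null

// array_key_exists() only checks that the key itself is present.
$existsHumidity = array_key_exists('humidity', $reading); // true
$existsPressure = array_key_exists('pressure', $reading); // false

var_dump($issetDevice, $issetHumidity, $existsHumidity, $existsPressure);
```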
Code performance?
Blackfire.io & Symfony Blog

Current results:
Total request time: 18.4 ms
file_exists called: 7 times
Exec. time for file_exists: 460 µs
Expected results:
Exec. time for is_file: ~17.7 µs
Total request time: ~17.96 ms
PHP lang performance

...овно?
OR

...амно?
Why is a system slow?
Because of the database!

Scaling strategies
- Caching
- Queueing
- Read/Write Splitting
- Sharding
Caching
Typical Request Flow

DB Cache

Application Cache

Web Cache

Cache: Rule of thumb
IF Application is SLOW
Enable Caching
IF Application has GLITCHES
Disable Caching
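Application-level caching usually follows the cache-aside pattern: look in the cache first, and only on a miss call the slow source and store the result. A minimal sketch, assuming a hypothetical in-process `ArrayCache` class standing in for Memcached, Redis or APCu; the key name and TTL are illustrative only:

```php
<?php
// Hypothetical in-memory cache; a real setup would use a shared cache server.
final class ArrayCache
{
    private $items = [];

    // Return the cached value, or call $loader on a miss and cache its result.
    public function remember(string $key, int $ttlSeconds, callable $loader)
    {
        $now = microtime(true);
        if (isset($this->items[$key]) && $this->items[$key]['expiresAt'] > $now) {
            return $this->items[$key]['value'];   // cache hit
        }
        $value = $loader();                        // cache miss: hit the slow source
        $this->items[$key] = ['value' => $value, 'expiresAt' => $now + $ttlSeconds];
        return $value;
    }

    // Invalidate a key, e.g. after the underlying data has changed.
    public function forget(string $key)
    {
        unset($this->items[$key]);
    }
}

// Stand-in for an expensive database query; counts how often it runs.
$dbCalls = 0;
$loadForecast = function () use (&$dbCalls) {
    $dbCalls++;
    return ['temperature' => 2.99];
};

$cache  = new ArrayCache();
$first  = $cache->remember('forecast:231', 60, $loadForecast); // miss
$second = $cache->remember('forecast:231', 60, $loadForecast); // hit
$cache->forget('forecast:231');
$third  = $cache->remember('forecast:231', 60, $loadForecast); // miss again

echo $dbCalls, PHP_EOL;
```

The `forget()` call is where most "glitches" come from: forgetting to invalidate leaves stale data in place until the TTL expires.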
Queueing
Queue as The Equalizer

Problems?
Delays
Queue as Load Balancer

Problems?
Unbounded buffers!
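One common answer to unbounded buffers is to give the queue a hard capacity and push back on producers when it is full. A minimal in-process sketch, assuming a hypothetical `BoundedQueue` wrapper around PHP's `SplQueue`; a real broker would enforce its own limits:

```php
<?php
// Hypothetical bounded buffer: rejects new jobs instead of growing forever.
final class BoundedQueue
{
    private $queue;
    private $capacity;

    public function __construct(int $capacity)
    {
        $this->queue = new SplQueue();
        $this->capacity = $capacity;
    }

    // Returns false when the buffer is full so the producer can back off.
    public function offer($job): bool
    {
        if (count($this->queue) >= $this->capacity) {
            return false;
        }
        $this->queue->enqueue($job);
        return true;
    }

    // Returns the oldest job, or null when the queue is empty.
    public function poll()
    {
        return $this->queue->isEmpty() ? null : $this->queue->dequeue();
    }
}

$queue     = new BoundedQueue(2);
$acceptedA = $queue->offer('a');
$acceptedB = $queue->offer('b');
$acceptedC = $queue->offer('c');   // rejected: buffer is full
$oldest    = $queue->poll();       // frees one slot
$retryC    = $queue->offer('c');   // accepted now

var_dump($acceptedA, $acceptedB, $acceptedC, $oldest, $retryC);
```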
Read/Write Splitting
Master-Slave replication
async/semi-sync

Writes
Reads
Reads
Problems?
Replication lag
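A common way to live with replication lag is to send reads to the master for a short window after a write ("read your own writes"). A minimal routing sketch, assuming a hypothetical `ConnectionRouter` class, illustrative server names, and a naive rule that only statements starting with SELECT count as reads:

```php
<?php
// Hypothetical router: returns the name of the server a query should go to.
final class ConnectionRouter
{
    private $master;
    private $replicas;
    private $stickySeconds;
    private $lastWriteAt = null;

    public function __construct(string $master, array $replicas, float $stickySeconds)
    {
        $this->master = $master;
        $this->replicas = $replicas;
        $this->stickySeconds = $stickySeconds;
    }

    // $now is passed in (seconds) to keep the sketch deterministic.
    public function forQuery(string $sql, float $now): string
    {
        $isRead = preg_match('/^\s*SELECT\b/i', $sql) === 1;
        if (!$isRead) {
            $this->lastWriteAt = $now;   // remember when we last wrote
            return $this->master;
        }
        // Shortly after a write, replicas may still be behind: read from master.
        if ($this->lastWriteAt !== null && ($now - $this->lastWriteAt) < $this->stickySeconds) {
            return $this->master;
        }
        return $this->replicas[array_rand($this->replicas)];
    }
}

$replicas = ['replica-1', 'replica-2'];
$router   = new ConnectionRouter('master', $replicas, 2.0);

$coldRead  = $router->forQuery('SELECT * FROM readings', 0.0);           // a replica
$write     = $router->forQuery('INSERT INTO readings VALUES (1)', 10.0); // master
$freshRead = $router->forQuery('SELECT * FROM readings', 10.5);          // master (sticky)
$laterRead = $router->forQuery('SELECT * FROM readings', 13.0);          // a replica

var_dump($coldRead, $write, $freshRead, $laterRead);
```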
Galera Replication
aka sync replication

Problems?
Writes Performance
What is IoT?
IoT is like

Canvas
- REST API
- Weather Data
- Avg. request size: 325 bytes
Data Sample
{
  "deviceId": 231,
  "lat": "45.22838254",
  "lng": "-114.23725403",
  "timestamp": 1459509524,
  "temperature": 2.99,
  "precipProbability": 0.386,
  "humidity": 0.055,
  "pressure": 1035.617
}
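On the API side, a payload like the sample above needs decoding and a few sanity checks before it is stored. A minimal sketch, assuming a hypothetical `parseReading()` helper; which fields are mandatory is an assumption here, not something the slides state:

```php
<?php
// Hypothetical helper: decode one reading and normalise the field types.
function parseReading(string $json): array
{
    $data = json_decode($json, true);
    if (!is_array($data)) {
        throw new InvalidArgumentException('Payload is not a JSON object');
    }
    // Assumed to be mandatory for this sketch.
    foreach (['deviceId', 'lat', 'lng', 'timestamp'] as $field) {
        if (!array_key_exists($field, $data)) {
            throw new InvalidArgumentException("Missing field: $field");
        }
    }
    return [
        'deviceId'    => (int) $data['deviceId'],
        'lat'         => (float) $data['lat'],   // sent as a string in the sample
        'lng'         => (float) $data['lng'],   // sent as a string in the sample
        'timestamp'   => (int) $data['timestamp'],
        'temperature' => isset($data['temperature']) ? (float) $data['temperature'] : null,
    ];
}

$sample = <<<'JSON'
{"deviceId": 231, "lat": "45.22838254", "lng": "-114.23725403",
 "timestamp": 1459509524, "temperature": 2.99, "precipProbability": 0.386,
 "humidity": 0.055, "pressure": 1035.617}
JSON;

$reading = parseReading($sample);
print_r($reading);
```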
Load Generator
Yandex Tank

Tasks
- Create platform setup
- Gather performance metrics
- Calculate budgets
Plan 1: 86k req/day (1 rps)

Plan 2: 8.64M req/day (100 rps)
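The plan sizes are just requests per second multiplied by the 86,400 seconds in a day. A small sketch of that conversion, assuming a hypothetical `requestsPerDay()` helper and a perfectly flat load (real traffic has peaks):

```php
<?php
const SECONDS_PER_DAY = 86400;

// Convert a constant request rate into a daily request count.
function requestsPerDay(int $rps): int
{
    return $rps * SECONDS_PER_DAY;
}

foreach ([1, 100] as $rps) {
    printf("%d rps = %s req/day\n", $rps, number_format(requestsPerDay($rps)));
}
```

The same multiplication sizes every later plan.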

Setup: 20$

Response Time

Limitation

Same Setup: 200rps

Why 3 database servers?
Probability?

Heads: 50% | Tails: 50%
Probability of 2 consecutive Heads?

25%
Probability of 3 consecutive Heads?

12.5%
0.5^3
Server reliability (chance that all servers are up at once)
1 Server: 99%
10 Servers: 90.4%
100 Servers: 36.6%
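These figures follow the same rule as the coin flips: if each server is up 99% of the time and failures are independent (an assumption), the chance that all N are up at once is 0.99^N. A quick sketch with a hypothetical `allServersUp()` helper:

```php
<?php
// Probability that every one of $servers machines is up at the same moment,
// assuming independent failures.
function allServersUp(float $perServerAvailability, int $servers): float
{
    return pow($perServerAvailability, $servers);
}

foreach ([1, 10, 100] as $n) {
    printf("%3d servers: %.1f%%\n", $n, allServersUp(0.99, $n) * 100);
}
```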
Any Distributed System
should consider High Availability
Plan 4: 43.2M req/day (500rps)

Setup Variant 1: 380$

Quantiles

Setup Variant 2: 150$

Quantiles

Microservices
230 Reasons to choose
380$ - 150$ = 230$
Plan 5: 86.4M req/day (1k rps)

Plan 6: 432M req/day (5k rps)

Started to think about bulk inserts?
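A bulk insert sends many rows in one statement instead of one round trip per row. A minimal sketch that only builds the SQL and the flat parameter list, assuming a hypothetical `buildBulkInsert()` helper and illustrative table and column names (these must come from trusted code, never from user input):

```php
<?php
// Hypothetical helper: build one multi-row INSERT with positional placeholders.
function buildBulkInsert(string $table, array $columns, array $rows): array
{
    if ($rows === []) {
        throw new InvalidArgumentException('No rows to insert');
    }
    // One "(?, ?, ...)" group per row.
    $rowPlaceholder = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, count($rows), $rowPlaceholder))
    );
    // Flatten row values in column order to match the placeholders.
    $params = [];
    foreach ($rows as $row) {
        foreach ($columns as $column) {
            $params[] = $row[$column];
        }
    }
    return [$sql, $params];
}

list($sql, $params) = buildBulkInsert('readings', ['device_id', 'temperature'], [
    ['device_id' => 231, 'temperature' => 2.99],
    ['device_id' => 232, 'temperature' => 3.5],
]);

echo $sql, PHP_EOL;
```

With PDO the result would be run as `$pdo->prepare($sql)->execute($params)`.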
Plan 7: 1.3B req/day (15k rps)

Quantiles

TCP/IP
TCP Flow

Packets Rate

1 Request: ~10 Packets
15k Requests: ~150k Packets
Load Balancer Trick

Why?
75k packets
75k packets
LB received: 150k packets
Plan 8: 4.32B req/day (50k rps)

Commodity Hardware
$5 Server
1 CPU
20 CPU
$100 Server
$640 Server
c4.8xlarge: $1,222.75/month

DNS Load Distribution

DNS Cache

Rule of Thumb: Add LB

What if?

Floating IP

Sharding?
Last Resort
MySQL: Manual Sharding
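Manual sharding means the application itself decides which database holds a given key. A minimal sketch, assuming a hypothetical `shardFor()` helper, non-negative device IDs, and illustrative shard names; simple modulo is only one possible scheme:

```php
<?php
// Hypothetical helper: pick a shard by device ID modulo the shard count.
// Assumes $deviceId is non-negative and $shards is a non-empty list.
function shardFor(int $deviceId, array $shards): string
{
    return $shards[$deviceId % count($shards)];
}

$shards = ['shard-0', 'shard-1', 'shard-2'];

$shardA = shardFor(231, $shards); // 231 % 3 = 0
$shardB = shardFor(232, $shards); // 232 % 3 = 1

echo $shardA, ' ', $shardB, PHP_EOL;
```

The catch with modulo: changing the number of shards remaps most keys, which is one reason sharding is a last resort.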

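MySQL: Fabric (pre-alpha)
<?php
// Requires the mysqlnd_ms PECL extension; "myapp" is the plugin config section, not a host name.
$mysqli = new mysqli("myapp", "user", "password", "database");
// Route the connection to the shard that owns key 10, then write.
mysqlnd_ms_fabric_select_shard($mysqli, "test.fabrictest", 10);
$mysqli->query("INSERT INTO test.fabrictest(id) VALUES (10)");
// Select the shard again before reading the row back.
mysqlnd_ms_fabric_select_shard($mysqli, "test.fabrictest", 10);
$mysqli->query("SELECT id FROM test.fabrictest WHERE id = 10");
http://php.net/manual/ru/mysqlnd-ms.quickstart.mysql_fabric.php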
MariaDB: Spider

MySQL Cluster (NDB)
MongoDB Sharding

Apache Cassandra

RabbitMQ
1 million messages per second on 32 machines
https://blog.pivotal.io/pivotal/products/rabbitmq-hits-one-million-messages-per-second-on-google-compute-engine
Apache Kafka
2 million writes per second on 3 machines
https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
Instead of a Summary
- Scalability as an afterthought doesn't work
- Database is not a queue
- Database is not a lock system
- Redis is not a queue
- Load balancers are a must
- Linear disk writes win
- Performance in PHP is determined by everything except PHP
- Being distributed is freaking hard
Questions?
Find me on
@alexhelkar



alexhelkar
https://github.com/alexhelkar