PHP at Scale:

Knowing enough to be dangerous!

Oleksii Petrov

Skelia Ukraine / ETWater

Who am I

System Architect

Team Lead

PHP Developer

Find me on

@alexhelkar

alexhelkar

https://github.com/alexhelkar

What does it mean

"to scale"?

Does it mean?

Or does it mean?

Often it means

"To scale"

Improve system metrics without changing the system*

* (dramatically)

Scale roadmap is predefined

by your stack and software architecture

Let's go!

Performance

Story

PHP 7 facts

  • isset 1.55 times faster than array_key_exists
  • is_file 26 times faster than file_exists
  • single quotes are slower than double quotes
  • instanceof is faster than is_a
  • etc.
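Claims like these are easy to measure locally. A minimal sketch, assuming a hypothetical `benchmark` helper (not from the talk); the closure call overhead is included in both timings, so only the difference between the two numbers is meaningful, and results will vary by machine and PHP version:

```php
<?php
// Hypothetical helper: time $iterations calls of $fn and return seconds elapsed.
function benchmark(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$data = ['key' => 1];
$iterations = 1000000;

$issetTime = benchmark(function () use ($data) {
    return isset($data['key']);
}, $iterations);

$akeTime = benchmark(function () use ($data) {
    return array_key_exists('key', $data);
}, $iterations);

printf("isset:            %.4f s\n", $issetTime);
printf("array_key_exists: %.4f s\n", $akeTime);
printf("difference per call: %.1f ns\n", ($akeTime - $issetTime) / $iterations * 1e9);
```

Note that the two functions are not interchangeable: `isset` returns false for a key whose value is null, while `array_key_exists` returns true.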

Code performance?

Blackfire.io & Symfony Blog

Current results:

Total request time: 18.4 ms

file_exists called: 7 times

Exec. time for file_exists: 460 µs

 

Expected results:

Exec. time with is_file: ~17.7 µs

Total request time: ~17.96 ms
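The expected figures can be rechecked with a few lines of arithmetic. A sketch, assuming every file_exists call gets the full 26x speedup quoted earlier and nothing else in the request changes; the variable names are illustrative:

```php
<?php
// Figures taken from the profile above.
$totalMs      = 18.4;   // total request time, in milliseconds
$fileExistsUs = 460.0;  // time spent in file_exists, in microseconds
$speedup      = 26.0;   // claimed is_file vs file_exists speedup

$optimizedUs = $fileExistsUs / $speedup;              // time after the swap
$savedMs     = ($fileExistsUs - $optimizedUs) / 1000; // microseconds -> milliseconds
$newTotalMs  = $totalMs - $savedMs;
$gainPercent = $savedMs / $totalMs * 100;

printf("after the swap: %.1f us\n", $optimizedUs);
printf("new total: %.2f ms (%.1f%% faster)\n", $newTotalMs, $gainPercent);
```

Even in the best case the whole request gets only a few percent faster, which is why the rest of the talk looks outside the PHP code.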

PHP lang performance

...rap?

OR

...rud?

Why is a system slow?

Because of the database!

Scaling strategies

  • Caching
  • Queueing
  • Read/Write Splitting
  • Sharding

Caching

Typical Request Flow

DB Cache

Application Cache

Web Cache

Cache: Rule of thumb

IF Application is SLOW

Enable Caching

IF Application has GLITCHES

Disable Caching
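Both halves of the rule show up in a small cache-aside sketch: the hit path skips the slow call, and a forgotten invalidation is where the glitches come from. `TtlCache` is a hypothetical in-process class written for illustration; a real setup would more likely use APCu, Memcached or Redis so entries outlive a single request:

```php
<?php
// Hypothetical in-process cache with per-entry expiry (not a real library).
final class TtlCache
{
    /** @var array<string, array{0: mixed, 1: float}> value and expiry time per key */
    private $entries = [];

    /** Return the cached value, or compute, store and return it on a miss. */
    public function remember(string $key, int $ttlSeconds, callable $loader)
    {
        $now = microtime(true);
        if (isset($this->entries[$key]) && $this->entries[$key][1] > $now) {
            return $this->entries[$key][0]; // hit: skip the slow call
        }
        $value = $loader();                 // miss: do the slow call once
        $this->entries[$key] = [$value, $now + $ttlSeconds];
        return $value;
    }

    /** Drop an entry after the underlying data changes. */
    public function forget(string $key): void
    {
        unset($this->entries[$key]);
    }
}

$cache = new TtlCache();
$dbCalls = 0;
$loadDevice = function () use (&$dbCalls) {
    $dbCalls++;                             // stands in for a database query
    return ['deviceId' => 231, 'name' => 'sensor'];
};

$cache->remember('device:231', 60, $loadDevice);
$cache->remember('device:231', 60, $loadDevice);
echo "database calls: $dbCalls\n";

$cache->forget('device:231');
$cache->remember('device:231', 60, $loadDevice);
echo "database calls after invalidation: $dbCalls\n";
```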

Queueing

Queue as The Equalizer

Problems?

Delays

Queue as Load Balancer

Problems?

Unbounded buffers!
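One common answer to unbounded buffers is to give the queue a hard capacity and push back on producers when it is full. A minimal sketch, assuming a hypothetical `BoundedQueue` wrapper around PHP's built-in `SplQueue`; a real broker would enforce the limit on the server side:

```php
<?php
// Hypothetical bounded buffer: refuses new work instead of growing forever.
final class BoundedQueue
{
    private $queue;
    private $capacity;

    public function __construct(int $capacity)
    {
        $this->queue = new SplQueue();
        $this->capacity = $capacity;
    }

    /** Returns false when the queue is full so the caller can shed or retry. */
    public function tryPush($job): bool
    {
        if (count($this->queue) >= $this->capacity) {
            return false;
        }
        $this->queue->enqueue($job);
        return true;
    }

    /** Returns the oldest job, or null when the queue is empty. */
    public function pop()
    {
        return $this->queue->isEmpty() ? null : $this->queue->dequeue();
    }
}

$queue = new BoundedQueue(2);
$accepted = 0;
$rejected = 0;
foreach (['a', 'b', 'c'] as $job) {
    $queue->tryPush($job) ? $accepted++ : $rejected++;
}
echo "accepted: $accepted, rejected: $rejected\n";
```

Rejecting early keeps memory flat and makes overload visible to the producer instead of hiding it as ever-growing delay.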

Read/Write Splitting

Master-Slave replication

async/semi-sync

Writes

Reads

Reads

Problems?

Replication lag
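A common way to soften replication lag is to pin reads to the master (primary) for a short window after a write. The sketch below is illustrative only: `ReadWriteRouter`, the connection names and the 2-second window are assumptions, and it returns a connection name rather than talking to a real database:

```php
<?php
// Hypothetical router: picks a connection name per query; no real DB driver involved.
final class ReadWriteRouter
{
    private $primary;
    private $replicas;
    private $pinSeconds;
    private $lastWriteAt = null;

    public function __construct(string $primary, array $replicas, float $pinSeconds)
    {
        $this->primary = $primary;
        $this->replicas = $replicas;
        $this->pinSeconds = $pinSeconds;
    }

    /** Choose a target for $sql; $now is injectable to keep the sketch testable. */
    public function route(string $sql, float $now): string
    {
        $isRead = preg_match('/^\s*SELECT\b/i', $sql) === 1;
        if (!$isRead) {
            $this->lastWriteAt = $now;
            return $this->primary;
        }
        // Shortly after a write, read from the primary to hide replication lag.
        if ($this->lastWriteAt !== null && $now - $this->lastWriteAt < $this->pinSeconds) {
            return $this->primary;
        }
        return $this->replicas[array_rand($this->replicas)];
    }
}

$router = new ReadWriteRouter('primary', ['replica-1', 'replica-2'], 2.0);
echo $router->route('SELECT * FROM readings', 0.0), "\n";          // a replica
echo $router->route('INSERT INTO readings VALUES (1)', 1.0), "\n"; // primary
echo $router->route('SELECT * FROM readings', 1.5), "\n";          // primary (pinned)
echo $router->route('SELECT * FROM readings', 5.0), "\n";          // a replica again
```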

Galera Replication

aka sync replication

Problems?

Write performance

What is IoT?

IoT is like

Canvas

  • REST API
  • Weather Data
  • Avg. request size: 325 bytes

Data Sample

{
   "deviceId": 231,
   "lat":"45.22838254",
   "lng":"-114.23725403",
   "timestamp":1459509524,
   "temperature":2.99,
   "precipProbability":0.386,
   "humidity":0.055,
   "pressure":1035.617
}
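On the PHP side, a payload like this might be decoded and checked as follows. The `parseReading` function and the choice of required fields are assumptions for illustration; the talk does not show the API's validation rules:

```php
<?php
// The sample payload from above, as it would arrive in a request body.
$body = '{"deviceId":231,"lat":"45.22838254","lng":"-114.23725403",'
      . '"timestamp":1459509524,"temperature":2.99,"precipProbability":0.386,'
      . '"humidity":0.055,"pressure":1035.617}';

/** Hypothetical validator: decode a reading and check the fields it must have. */
function parseReading(string $json): array
{
    $data = json_decode($json, true);
    if (!is_array($data)) {
        throw new InvalidArgumentException('Body is not a JSON object');
    }
    foreach (['deviceId', 'lat', 'lng', 'timestamp'] as $field) {
        if (!array_key_exists($field, $data)) {
            throw new InvalidArgumentException("Missing field: $field");
        }
    }
    // Coordinates arrive as strings in the sample; convert them once here.
    $data['lat'] = (float) $data['lat'];
    $data['lng'] = (float) $data['lng'];
    return $data;
}

$reading = parseReading($body);
printf(
    "device %d at %s UTC, body %d bytes\n",
    $reading['deviceId'],
    gmdate('Y-m-d H:i:s', $reading['timestamp']),
    strlen($body)
);
```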

Load Generator

Yandex Tank

Tasks

  • Create platform setup
  • Gather performance metrics
  • Calculate budgets

Plan 1: 86.4k req/day (1 rps)

Plan 2: 8.64M req/day (100 rps)
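The daily totals follow directly from the request rate, assuming a constant load over 86,400 seconds. A small sketch (the function name is illustrative):

```php
<?php
const SECONDS_PER_DAY = 86400;

/** Requests per day produced by a constant rate of $rps requests per second. */
function requestsPerDay(int $rps): int
{
    return $rps * SECONDS_PER_DAY;
}

foreach ([1, 100] as $rps) {
    printf("%d rps -> %s req/day\n", $rps, number_format(requestsPerDay($rps)));
}
```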

Setup: 20$

Response Time

Limitation

Same Setup: 200rps

Why 3 database servers?

Probability?

Heads: 50% | Tails: 50%

Probability of 2 consecutive Heads?

25%

Probability of 3 consecutive Heads?

12.5% 

0.5^3

Servers reliability

1 Server

99%

10 Servers

90.4%

100 Servers

36.6%
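These figures follow the same rule as the coin flips: if each server is up 99% of the time and failures are independent (an assumption), the chance that all n are up at once is 0.99^n. A short check:

```php
<?php
/** Probability that all $servers are up, assuming independent failures. */
function allUp(float $perServer, int $servers): float
{
    return $perServer ** $servers;
}

foreach ([1, 10, 100] as $n) {
    printf("%3d servers: %.1f%%\n", $n, allUp(0.99, $n) * 100);
}
```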

Any distributed system should consider high availability

Plan 4: 43.2M req/day (500rps)

Setup Variant 1: 380$

Quantiles

Setup Variant 2: 150$

Quantiles

Microservices

230 Reasons to choose

380$ - 150$ = 230$

Plan 5: 86.4M req/day (1k rps)

Plan 6: 432M req/day (5k rps)

Started to think about bulk inserts?
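A bulk insert sends many rows in one statement instead of one round trip per row. A minimal sketch of building such a statement with positional placeholders; `buildBulkInsert`, the `readings` table and its columns are hypothetical:

```php
<?php
/**
 * Hypothetical helper: build one multi-row INSERT for PDO-style placeholders.
 * Table and column names must come from trusted code, never from user input.
 */
function buildBulkInsert(string $table, array $columns, array $rows): array
{
    if ($rows === []) {
        throw new InvalidArgumentException('No rows to insert');
    }
    $rowPlaceholder = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, count($rows), $rowPlaceholder))
    );
    $params = [];
    foreach ($rows as $row) {
        foreach ($columns as $column) {
            $params[] = $row[$column]; // keep values in column order
        }
    }
    return [$sql, $params];
}

list($sql, $params) = buildBulkInsert('readings', ['device_id', 'temperature'], [
    ['device_id' => 231, 'temperature' => 2.99],
    ['device_id' => 232, 'temperature' => 3.10],
]);
echo $sql, "\n";
```

With a PDO connection, the pair can be passed to `$pdo->prepare($sql)->execute($params)`.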

Plan 7: 1.296B req/day (15k rps)

Quantiles

TCP/IP

TCP Flow

Packets Rate

1 Request 

~10 Packets 

15k Requests 

~150k Packets 

Load Balancer Trick

Why?

75k pack.

75k pack.

LB Received: 150k packets

Plan 7: 50000 req/sec

Commodity Hardware

$5 Server

1 CPU

20 CPU

$100 Server

$640 Server

c4.8xlarge: $1222.75/month

DNS Load Distribution

DNS Cache

Rule of Thumb: Add LB

What if?

Floating IP

Sharding?

Last Resort

MySQL: Manual Sharding
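In its simplest form, manual sharding means the application maps each key to a database itself. A sketch using modulo on the device id; the shard list, DSNs and `shardFor` are placeholders, and ids are assumed to be non-negative:

```php
<?php
// Hypothetical shard map; the DSNs are placeholders, not real hosts.
$shards = [
    'mysql:host=shard0.example;dbname=iot',
    'mysql:host=shard1.example;dbname=iot',
    'mysql:host=shard2.example;dbname=iot',
];

/** Pick the shard index for a device: same device id, same shard, every time. */
function shardFor(int $deviceId, int $shardCount): int
{
    return $deviceId % $shardCount;
}

$deviceId = 231;
$index = shardFor($deviceId, count($shards));
echo "device $deviceId -> shard $index ({$shards[$index]})\n";
```

Plain modulo is easy to reason about, but adding a shard changes the mapping for most keys, so resharding means moving data.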

MySQL: Fabric (pre-alpha)

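<?php

// "myapp" is the mysqlnd_ms config section name, not a real hostname.
$mysqli = new mysqli("myapp", "user", "password", "database");

// Route the connection to the shard holding key 10, then insert the row.
mysqlnd_ms_fabric_select_shard($mysqli, "test.fabrictest", 10);
$mysqli->query("INSERT INTO fabrictest(id) VALUES (10)");

// Select the shard again before reading the row back.
mysqlnd_ms_fabric_select_shard($mysqli, "test.fabrictest", 10);
$mysqli->query("SELECT id FROM fabrictest WHERE id = 10");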

http://php.net/manual/ru/mysqlnd-ms.quickstart.mysql_fabric.php

MariaDB: Spider

MySQL Cluster (NDB)

MongoDB Sharding

Apache Cassandra

RabbitMQ

Hits 1 million messages per second on 32 machines

https://blog.pivotal.io/pivotal/products/rabbitmq-hits-one-million-messages-per-second-on-google-compute-engine

Apache Kafka

2 million writes per second on 3 machines

https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

Instead of a Summary

  • Scalability as an afterthought doesn't work
  • A database is not a queue
  • A database is not a lock system
  • Redis is not a queue
  • Load balancers are a must
  • Linear disk writes win
  • Performance in a PHP system is determined by everything except PHP
  • Being distributed is freaking hard

Questions?

Find me on

@alexhelkar

alexhelkar

https://github.com/alexhelkar