What is PlayON?

PlayON is a sports entertainment company with customers in over 100 countries.

Official partner of Formula 1 and the NBA.

The challenge

Build the Official Formula 1 Fantasy game

Start building before the contract was signed

Two-month deadline, before the start of the season

150K users expected in the first season

Leverage existing daily game structure

Needs to be scalable to other partners

Ruby on Rails on the backend

Ember.js on the frontend

Formula 1 to be the first partner

Planning

How we planned the development

Eight chunks of work

Spread over eight weeks

Design, Frontend and Backend in parallel

Workshop with Formula 1 team to define the game mechanics

Architecture

High-level blueprint of the architecture used

Daily Backend

F1 Backend

New frontend app

Database

Building the backend

Namespace on the API for White Label operations

Feed processing engine is the same as the Daily game

API built leveraging Daily game structure

We decided to go with a Rails Engine on the Daily game

Ruby 2.3.7 + Rails 4.1.6

Users, teams, pricing, game periods

Built leagues and boosters on top of existing structure

3 interesting problems:

  - Leaderboards

  - Transitioning from Race to Race

  - Live Scores page

All points are valid for Ruby 2.3.7 and Rails 4.1.6

Building the backend - Leaderboards

Leaderboards are calculated in real time for the Daily game

for queries

Query takes some time to run. Joins on 6 really large tables

One instance of Sidekiq with concurrency of 1

7 minutes per race

29K leagues x (12 races so far + Overall leaderboard) = 377K leaderboards

1 hour 40 minutes total

Calculate leaderboards every 6 hours

To deal with people joining/leaving leagues

All positions are now a simple lookup in Redis
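
The precompute-then-lookup approach described above can be sketched roughly as follows. This is a minimal illustration, not the production code: a plain Hash stands in for Redis, and the key format, method names, and the tie-breaking rule (by team id) are all assumptions made for the example.

```ruby
# A plain Hash standing in for Redis (assumption: the real store is Redis).
STORE = {}

# 29K leagues, each with 12 race leaderboards plus one overall leaderboard.
LEADERBOARD_COUNT = 29_000 * (12 + 1)

# Hypothetical key format for one team's position in one leaderboard.
def leaderboard_key(league_id, race_id, team_id)
  "leaderboard:#{league_id}:#{race_id}:#{team_id}"
end

# Rank teams by points, highest first, and store every position under its
# own key. Ties are broken by team id, which is an assumption of this sketch.
def precompute_leaderboard(league_id, race_id, points_by_team)
  ranked = points_by_team.sort_by { |team_id, points| [-points, team_id] }
  ranked.each_with_index do |(team_id, _points), index|
    STORE[leaderboard_key(league_id, race_id, team_id)] = index + 1
  end
end

# Reading a position is a single key lookup, with no joins at request time.
def position(league_id, race_id, team_id)
  STORE[leaderboard_key(league_id, race_id, team_id)]
end

# Illustrative data only: team id => points.
precompute_leaderboard(1, 3, { 10 => 250, 11 => 410, 12 => 250 })
puts position(1, 3, 11)
puts LEADERBOARD_COUNT
```

In a scheduled job, a method like `precompute_leaderboard` would run for every league and race; requests then only call the lookup.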

Building the backend - Race to Race

182K+ teams are copied over from one race to the next, and this needs to happen as fast as possible

Eight instances of Sidekiq with a concurrency of 5

Rails case-insensitive uniqueness validation caused some slowness

Building the backend - Race to Race

Lock Job: Went from 500 TPM to 2,880 TPM

Unlock Job: Went from 500 TPM to 3,300 TPM

Use the update_column AR method instead of update to bypass validations

SQL Query cache is actually disabled for background jobs
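
To put the throughput numbers above in perspective, here is a rough calculation of how long a full race-to-race transition takes at each rate. It assumes TPM means teams processed per minute, which the slides do not spell out, and it rounds "182k+" down to 182,000. The method name is made up for the example.

```ruby
# "182k+ teams" from the slides, rounded down for the estimate.
TEAMS = 182_000

# Minutes needed to process every team at a given rate.
# Assumption: the rate is expressed in teams per minute.
def minutes_to_process(teams, per_minute)
  (teams.to_f / per_minute).round(1)
end

rates = { "before" => 500, "lock job" => 2_880, "unlock job" => 3_300 }
rates.each do |label, per_minute|
  puts "#{label}: #{minutes_to_process(TEAMS, per_minute)} minutes"
end
```

Under that reading, the same job drops from roughly six hours to about one hour.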

Building the backend - Live Scores Page

Chose frontend polling (Ember friendly, not on Rails 5)

And frontend pulls as needed

5K users sitting on the live page, polling at a 15-second interval

We cache driver points in Redis, with a 15-second expiry

Turbo driver score and team score calculated on the frontend

Plans to use a CDN as a reverse proxy to improve throughput

Bypass rails completely

20,000 RPM
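
The polling load and the 15-second cache described above can be sketched like this. The class below is an in-memory stand-in for a Redis key written with an expiry, not the actual implementation; `ExpiringCache`, `requests_per_minute`, and the sample driver data are hypothetical names and values used only for illustration.

```ruby
# In-memory stand-in for a Redis key with a time-to-live (assumption:
# production uses Redis with a 15-second expiry, as stated on the slide).
class ExpiringCache
  def initialize(ttl_seconds, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl_seconds
    @clock = clock
    @entries = {}
  end

  # Return the cached value, or run the block and cache its result when
  # the entry is missing or at least as old as the TTL.
  def fetch(key)
    now = @clock.call
    entry = @entries[key]
    return entry[:value] if entry && now - entry[:stored_at] < @ttl

    value = yield
    @entries[key] = { value: value, stored_at: now }
    value
  end
end

# Requests per minute generated by users polling at a fixed interval.
def requests_per_minute(users, interval_seconds)
  users * (60 / interval_seconds)
end

cache = ExpiringCache.new(15)
# Illustrative payload; the expensive points calculation would go in the block.
driver_points = cache.fetch("driver_points") { { "driver_1" => 25 } }
puts driver_points["driver_1"]
puts requests_per_minute(5_000, 15)
```

With the expiry matching the polling interval, the expensive calculation runs at most about once per interval, however many users are polling.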

Building the backend - Miscellanea

We are running on Nginx and Passenger

Formula to work out how many Passenger instances we needed in total:

  - Wanted to achieve 30K RPM

  - 800ms average response time

  - Passenger in single-threaded mode

  - Total processing time for 30K requests: (800 * 30,000) ms = 24,000 seconds

We need to fit this into 60 seconds. To do this, we use 400 instances of Passenger

24,000 seconds / 400 Passenger instances = 60 seconds

20 instances of Passenger per server, on a total of 20 servers

4 cores, 8GB RAM

Used bullet gem to fix N+1 problems

Displaying league name for a league_entrant record

Used deprecated active loaders gem

find_or_create_by is not your friend.

Use the frontend to offload some calculations whenever possible

Large tables - maybe use a non-relational database? Elasticsearch?
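
The Passenger sizing arithmetic from this section can be written as a small helper. This is only a sketch of the back-of-the-envelope formula on the slide: it assumes single-threaded workers that each handle one request at a time, and it ignores headroom and response-time variance. The method name is made up for the example.

```ruby
# Number of single-threaded workers needed to serve `rpm` requests per minute
# when each request keeps a worker busy for `avg_response_ms` on average.
def passenger_instances_needed(rpm:, avg_response_ms:)
  total_busy_seconds = rpm * avg_response_ms / 1000.0 # 30,000 * 800 ms = 24,000 s
  (total_busy_seconds / 60).ceil                      # must fit into 60 seconds
end

total = passenger_instances_needed(rpm: 30_000, avg_response_ms: 800)
puts total      # total Passenger instances
puts total / 20 # instances per server when spread over 20 servers
```

The same helper shows how sensitive the fleet size is to response time: halving the average response time halves the number of instances.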

Building the Frontend

Why Ember.js?

  - Daily game is Ember.js

  - Team already familiar

Fresh Ember.js application started

Functionality built in parallel with the API

New design from scratch

Responsive from the ground up

Manual deployment

DevOps

One load balancer for the API instances

Auto scaling for the API instances

4 baseline instances for the API, 20 on auto-scale

Same database as the Daily game

Theoretically could have used Passenger Enterprise to reduce server numbers

Postgres tuning was performed

Single Nginx server for the frontend

MVP built, time to load test

Ready on the last day, before the first race of the season

Contract still not signed, launch postponed

Load testing started and...

EXPECTATION

REALITY

All API endpoints optimized

Server configuration tuning - Nginx and Passenger

New frontend features and fine-tuning also in parallel

After a week of load testing, we were ready to support 6K concurrent users

Load testing - Flood.io

Ready to support 6K users before performance degrades

Serving around 25K RPM

Ready to launch!

Game launched!

Game launched on 25th of April

36K signups over the first 24 hours

Great coverage in the media and on social media

Trust me, I'm an engineer

The first race weekend arrives and I'm away in Brazil

Race time approaching, traffic building up

first holiday weekend

database stops responding

With not even 3K concurrent users

Restarting the database server brings it up for a few minutes, goes down again

Unavoidable conclusion:

Our Postgres setup won't be able to handle it

Second unavoidable conclusion:

Our load testing strategy is not precise enough

Database migration - Saga of a weekend

Saturday 8:00 AM - Database offline

both Daily and Formula 1 games are down

We didn't have the know-how to tune our Postgres setup

We were already planning a migration to AWS RDS so...

decided to migrate to RDS

First approach: Use the database migration service to avoid even more downtime

Then come the problems:

Postgres 9.3 to 9.4

data type issues

dozens of other issues

Did the upgrade process on a testing database

only to be stopped by more issues

24 hours later, no success

Gave up on DMS

and started a dump and restore

Worked on the first go

data integrity checked and staging site running

Finished testing and ensuring all servers had the DB details updated

Sunday 9:30 PM - All systems up and running on AWS

Post-apocalypse

Healthy load balancers

Healthy API under stress

Database far faster than our self-managed instance

Remaining migration

The full migration was done afterwards, with practically no downtime and with enough time to do it properly

Load balancers

Auto scaling

Redis server

Sidekiq server

Production and staging environments

Ember deploy migrated to a CDN structure

Today

182K+ users

3K to 6K concurrent users at peak times

right before and right after the races

100K+ unique monthly visitors

82M+ requests served monthly across Daily and White Label

Let's play!

bit.ly/playon-ruby-ireland

Questions?

Thank you!

PlayON - Formula1 Fantasy

By Rafael Dalprá


The story of how F1 fantasy was built
