The Fluid Architecture

Your Host Tonight

Image source: my 6 yo daughter

Alex Fernández

Developer with 15+ years of experience

@pinchito

What We Will Cover

How to Flow


Flow with Requirements


Flow with Operational Constraints


Some Relevant Case Studies


Migration Strategies


Don't Stop Flowing

How to Flow


Turbulent Flow is Irreversible


It can be fun, but we don't want that kind of fun

Laminar Flow is Reversible


And we like when things flow smoothly!

Migrations are hard?

Thermo to the rescue!


Reversible processes are optimal:

  • no turbulence,
  • minimal entropy,
  • less complexity,
  • reduced headaches!

Change without pain


Go from A to B in a reversible way

(mostly)


Find your cruise velocity


Prepare a reversal strategy

Flow with Requirements

Circumstances Change

And you should adapt to them

Fashion in Architecture


80s: minicomputers & terminals


90s: client-server


00s: three tiers


10s: NoSQL

The Perfect Architecture

Does not exist

Modern Architecture


Or so we thought

Slightly More Modern Architecture

Spot the seven differences

Flow With Constraints


Flow with capacity planning


From 4 to 150+ krps in 2.5 years

Flow for Operational Stability

From 38 to 112 krps in one day

Flow to Lower Costs

How Fast Can You Migrate


to a new cloud provider?


to a new hosting company?


to your own datacenter?


It matters because costs escalate quickly

Database migrations

are painful efforts

but shouldn't be

How to Migrate Your Database


Build a compatibility layer


Avoid downtime if at all possible


Treat access and data separately


Have a reverse migration strategy


but try not to use it

Compatibility Layer


Adapter pattern (remember those?)


Reduced feature set


Don't use new features


Fake missing features


Adapter

Redis to Memcached driver:

exports.RedisAdapter = function(name, address) {
    // self-reference
    var self = this;

    // attributes
    // driver, runCommand, parse and getExpiration live elsewhere in the module (not shown)
    var client = driver.getClient(address);

    // read a key and parse the stored value
    self.get = function(key, callback) {
        runCommand('get', key, function(error, result) {
            if (error) return callback('Could not get ' + key + ':' + error);
            return callback(null, parse(key, result));
        });
    };

    // store a value as JSON, with an optional expiration
    self.set = function(key, value, expiration, callback) {
        if (expiration) {
            return runCommand('set', key, JSON.stringify(value), 'EX', getExpiration(expiration), callback);
        }
        return runCommand('set', key, JSON.stringify(value), callback);
    };
};
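
The same get/set interface can sit on top of any backend, which is what makes a compatibility layer useful. A minimal sketch, assuming a hypothetical in-memory `MemoryAdapter` (not part of the talk) that fakes a missing feature, expiration, by storing a deadline next to each value. Expiration is assumed to be in seconds, and the injectable clock is only there to make the behavior easy to check.

```javascript
// Hypothetical in-memory adapter with the same get/set interface.
// Expiration is faked: a deadline is stored and checked on every read.
function MemoryAdapter(now) {
    var self = this;
    // injectable clock in milliseconds (assumption, defaults to Date.now)
    var clock = now || Date.now;
    var store = Object.create(null);

    self.get = function(key, callback) {
        var entry = store[key];
        if (!entry) return callback(null, null);
        if (entry.expires && entry.expires <= clock()) {
            // expired: drop the entry and report a miss
            delete store[key];
            return callback(null, null);
        }
        return callback(null, JSON.parse(entry.value));
    };

    // expiration is assumed to be in seconds; falsy means "never expires"
    self.set = function(key, value, expiration, callback) {
        store[key] = {
            value: JSON.stringify(value),
            expires: expiration ? clock() + expiration * 1000 : null
        };
        return callback(null);
    };
}
```

Client code that only uses `get` and `set` cannot tell this adapter from a real one, so it can also stand in for the database in tests.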

Migrate and migrate again


We have used the following databases:
  • Couchbase
  • Memcached
  • Redis
  • DynamoDB
  • PostgreSQL
  • RedShift


Different systems have different trade-offs

and show different failure modes


Case Studies

Warning: may not apply to your situation

MediaSmart Mobile


Migration to Amazon virtual private cloud


Tried on 2015-03-03

Reversed on 2015-03-05

Due to an unrelated failure (!)


Tried again on 2015-03-11

Migrated EU on Friday the 13th

because who's afraid of superstitions?

Instagram


Migrated from AWS to Facebook datacenters


Year-long effort, from 2013-03 to ~2014-03


Had to go through AWS VPC first

Neti — a dynamic iptables manipulation daemon in Python

Three weeks into VPC, two weeks to FB


Bare minimum approach (!)

Big Adtech Company X


Also migrating datacenters


Will start sending 10% of traffic

from new datacenter


No published materials


Inter-datacenter latencies

Migration Strategies

Server: Stop and Migrate




  • Stop the system
  • Make a cold copy
  • Point clients to new database
  • Start again

Server: Stop and Migrate


settings.js:

module.exports = {
    redisAddress: 'redis.mydomain.com',
};

db.js:

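var settings = require('./settings.js');
// the path to the adapter module is assumed
var RedisAdapter = require('./adapter.js').RedisAdapter;
exports.db = {
    // first argument is the adapter name, second the address
    current: new RedisAdapter('current', settings.redisAddress)
};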

user:

var db = require('./db.js').db;
db.current.get(key, function(error, result) {
    ...
});

Server: Stop and Migrate


Most basic migration


Requires downtime


Reversal:

  • Just point your settings to the old address
  • Stop again and migrate back


Not really reversible

Server: Read-only Version



  • Switch to read-only
  • Make a hot copy
  • Change to new database
  • Switch back to read and write

Server: Read-only Version


Read-only is not always admissible


A hot copy takes longer than a cold one


Reverse migration: switch to read-only again, migrate
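
The read-only switch can live in the compatibility layer. A minimal sketch, assuming a hypothetical `ReadOnlySwitch` wrapper (not from the talk) around any adapter with the get/set interface shown earlier; the error message is made up.

```javascript
// Hypothetical wrapper: delegates to any adapter with get/set,
// and rejects writes while the read-only flag is on.
function ReadOnlySwitch(adapter) {
    var self = this;
    var readOnly = false;

    // turn read-only mode on before the hot copy, off after the switch
    self.setReadOnly = function(value) {
        readOnly = !!value;
    };

    self.get = function(key, callback) {
        return adapter.get(key, callback);
    };

    self.set = function(key, value, expiration, callback) {
        if (readOnly) return callback('Database is read-only during migration');
        return adapter.set(key, value, expiration, callback);
    };
}
```

Callers must be ready to handle the write error, which is one reason why read-only is not always admissible.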

Server: synchronize



  • Make a hot copy
  • Synchronize all writes
  • Switch to new copy when ready

Server: synchronize


Depends on server mechanism


No downtime: cool!


Reversal strategy: synchronize back


Full synchronization is hard!

Server: Double Copy



  • Make a hot copy
  • Switch to new database
  • Make another hot copy

Server: Double Copy


Some data loss is admissible

e.g.: 500M profiles


A timestamp is very valuable


Some data loss is inevitable


Reversal: prepare a reverse copy
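
Why a timestamp is valuable: the second copy only needs records changed since the first copy started, and it must not overwrite anything newer on the target. A minimal sketch, assuming both databases are plain objects mapping key to `{value, updated}`, with `updated` in milliseconds; the record shape and the `copyNewer` name are assumptions for the example.

```javascript
// Copies records modified at or after 'since' from source to target,
// skipping any record the target already holds in a newer version.
// Returns the number of records copied.
function copyNewer(source, target, since) {
    var copied = 0;
    Object.keys(source).forEach(function(key) {
        var record = source[key];
        // unchanged since the previous copy: nothing to do
        if (record.updated < since) return;
        var existing = target[key];
        // the target was written after the switch: keep its version
        if (existing && existing.updated >= record.updated) return;
        target[key] = {value: record.value, updated: record.updated};
        copied += 1;
    });
    return copied;
}
```

The first pass would run with `since` set to 0; the second pass with the time the first pass started. Swapping source and target gives the reverse copy.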

Decorator

Just a clever adapter:

var Memcached = require('memcached');

exports.CleverAdapter = function(name, address) {
    // self-reference
    var self = this;

    // attributes
    var oldAdapter = new Memcached(address + ':11211');
    var newAdapter = new RedisAdapter(name, address);

    // badWeather() is not shown; presumably true when reads should fall back to the old database
    self.get = function(key, callback) {
        if (badWeather()) {
            return oldAdapter.get(key, callback);
        }
        return newAdapter.get(key, callback);
    };
};

Downside: a few µs more per query

Client: Dual Lookup



  • Read from one database
  • If not present, try to read from second database

Client: Dual Lookup


db.js:

exports.db = {
    // first argument is the adapter name, second the address
    v1: new RedisAdapter('v1', settings.oldRedis),
    v2: new RedisAdapter('v2', settings.newRedis),
};


client:
function get(key, callback) {
    db.v1.get(key, function(error, result) {
        // answer from the first database on error or hit; otherwise try the second
        if (error || result) return callback(error, result);
        db.v2.get(key, callback);
    });
}

Client: Dual Lookup


Migrate your servers at your leisure


Reversible by design


Now you're talking!


Bad latency issues

from old + new databases

Client: Dual Write



Similar to dual lookup


Latency may not be important
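
A minimal sketch of a dual write, assuming two adapters with the `set` signature shown earlier. The `dualSet` helper is hypothetical: it issues both writes at once, so the caller waits for the slower database rather than for the sum of both, and it reports the first error seen.

```javascript
// Hypothetical dual-write helper: writes to both adapters
// and calls back exactly once, after both have answered.
function dualSet(primary, secondary, key, value, expiration, callback) {
    var pending = 2;
    var firstError = null;

    function done(error) {
        if (error && !firstError) firstError = error;
        pending -= 1;
        if (pending === 0) return callback(firstError);
    }

    primary.set(key, value, expiration, done);
    secondary.set(key, value, expiration, done);
}
```

Whether a failure on only one side should fail the whole write is a policy decision; this sketch simply surfaces it to the caller.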

Client: Timed Rollover



  • If date > cutoff, use one database; otherwise, use the other


Client: Timed Rollover


client:

var CUTOFF_DATE = '2015-05-13';

function get(key, callback) {
    // the part of the key after '#' is assumed to hold an ISO date
    if (key.substring(key.indexOf('#') + 1) > CUTOFF_DATE) {
        db.v1.get(key, callback);
    } else {
        db.v2.get(key, callback);
    }
}

function set(key, value, expiration, callback) {
    // ISO date strings compare correctly as plain strings
    if (new Date().toISOString() > CUTOFF_DATE) {
        db.v1.set(key, value, expiration, callback);
    } else {
        db.v2.set(key, value, expiration, callback);
    }
}

Client: Timed Rollover


Useful for sequential data

E.g. statistics, counters


No manual intervention is required


Reversal strategy:

  • change time limit,
  • possibly migrate data,
  • redeploy

Client: In-Place Conversion



  • Read value
  • If in old format, convert and write


Client: In-Place Conversion


Degenerate case: old database == new database


Can change driver, structure, format


Not concurrent


Reversal strategy:

  • Read value
  • If in new format, convert and write
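
A minimal sketch of the read path, assuming an adapter with the get/set interface shown earlier. The two formats are made up for the example: old values are plain strings, new values are objects of the form `{version: 2, data: ...}`; `isOldFormat`, `convert` and `getConverted` are hypothetical names.

```javascript
// Assumed formats: old values are plain strings,
// new values are objects of the form {version: 2, data: ...}.
function isOldFormat(value) {
    return typeof value === 'string';
}

function convert(value) {
    return {version: 2, data: value};
}

// Reads a value; if it is still in the old format, converts it,
// writes it back and returns the converted value.
function getConverted(adapter, key, callback) {
    adapter.get(key, function(error, value) {
        if (error) return callback(error);
        if (value == null || !isOldFormat(value)) return callback(null, value);
        var converted = convert(value);
        // expiration 0 means "no expiration" here
        adapter.set(key, converted, 0, function(error) {
            if (error) return callback(error);
            return callback(null, converted);
        });
    });
}
```

The read and the write are separate steps, so two clients converting the same key at once can race; that is the "not concurrent" caveat.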

Combined: Proxied Access


  • Read from or write to a proxy
  • Proxy decides where to access each time

Combined: Proxied Access


Can be used with other migration strategies


Another piece to maintain


Increased latency


Use with care
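
The proxy in the talk is presumably a separate piece of infrastructure, but the routing decision itself can be sketched in-process. A minimal sketch, assuming adapters with the get/set interface shown earlier; `ProxyAdapter`, `byPrefix` and the prefix policy are hypothetical.

```javascript
// Hypothetical proxy: 'choose' is a function (operation, key) -> adapter
// that decides where each access goes.
function ProxyAdapter(choose) {
    var self = this;

    self.get = function(key, callback) {
        return choose('get', key).get(key, callback);
    };

    self.set = function(key, value, expiration, callback) {
        return choose('set', key).set(key, value, expiration, callback);
    };
}

// Example policy (assumption): keys with an already migrated prefix
// go to the new database, everything else to the old one.
function byPrefix(migratedPrefix, oldAdapter, newAdapter) {
    return function(operation, key) {
        return key.indexOf(migratedPrefix) === 0 ? newAdapter : oldAdapter;
    };
}
```

Changing the policy function is enough to move traffic forward or back, which keeps the migration reversible.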

Combined: Queued Write



  • Read from first database
  • Write to queue
  • Write to both databases

Combined: Queued Write


Can be used with other migration strategies


Avoids high write latencies


Helps ease migrations
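
A minimal sketch of the three steps, assuming adapters with the get/set interface shown earlier. `QueuedWriter` is hypothetical and keeps the queue in process memory; a real deployment would probably use an external queue so that pending writes survive a restart.

```javascript
// Hypothetical queued writer: reads come from the first database,
// writes are queued and later applied to both databases.
function QueuedWriter(first, second) {
    var self = this;
    var queue = [];

    self.get = function(key, callback) {
        return first.get(key, callback);
    };

    // enqueue and return at once: the caller does not wait for the databases
    self.set = function(key, value, expiration, callback) {
        queue.push({key: key, value: value, expiration: expiration});
        return callback(null);
    };

    // apply one queued write to both databases; call repeatedly to drain.
    // Calls back with (error, applied); a failed item goes back to the front.
    self.drainOne = function(callback) {
        var item = queue.shift();
        if (!item) return callback(null, false);
        first.set(item.key, item.value, item.expiration, function(error) {
            if (error) { queue.unshift(item); return callback(error); }
            second.set(item.key, item.value, item.expiration, function(error) {
                if (error) { queue.unshift(item); return callback(error); }
                return callback(null, true);
            });
        });
    };
}
```

The price of the low write latency is that a read may not see a value that is still sitting in the queue.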

Don't Stop Flowing

Strategies work for other things


Adapt them to your situation

How to Migrate Anything


Build one or more compatibility layers


Downtime is just bad engineering


Have a reverse migration strategy


and try not to use it

Unstable Equilibrium

The only way to fly supersonic fighters

Living with Instabilities


Use safe defaults


Fail safely


Use a canary


Always monitor

Canary Example


Adtech: Real Time Bidding

Statistics are processed in a queue

Queue writes a canary
that expires in a few minutes

If no canary, stop bidding
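
A minimal sketch of the canary, assuming an adapter with the get/set interface shown earlier and an expiration given in seconds. The key name, the five-minute lifetime and the function names are assumptions for the example.

```javascript
var CANARY_KEY = 'canary';      // hypothetical key name
var CANARY_SECONDS = 5 * 60;    // "a few minutes"; the exact value is assumed

// called by the statistics queue every time it processes a batch
function writeCanary(adapter, callback) {
    adapter.set(CANARY_KEY, Date.now(), CANARY_SECONDS, callback);
}

// called by the bidder before bidding; fails safely:
// any error or a missing canary means "do not bid"
function canBid(adapter, callback) {
    adapter.get(CANARY_KEY, function(error, value) {
        if (error || value === null || value === undefined) return callback(false);
        return callback(true);
    });
}
```

If the queue stalls, the canary expires on its own and bidding stops without anyone having to notice first.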

There Will Be Mistakes


Get over it

Move Fast, Break Things

Or stay put and never break anything

Or anything in between

Thanks!

@pinchito

The Fluid Architecture

By Alex Fernández

Talk for JSDay Verona, 2015-05-13