The Fluid


Your Host Tonight

Image source: my 6 yo daughter

Alex Fernández

Developer with 15+ years of experience


What We Will Cover

How to Flow

Flow with Requirements

Flow with Operational Constraints

Some Relevant Case Studies

Migration Strategies

Don't Stop Flowing

How to Flow

Turbulent Flow is Irreversible

It can be fun, but we don't want that kind of fun

Laminar Flow is Reversible

And we like when things flow smoothly!

Migrations are hard?

Thermo to the rescue!

Reversible processes are optimal:

  • no turbulence,
  • minimal entropy,
  • less complexity,
  • reduced headaches!

Change without pain

Go from A to B in a reversible way


Find your cruise velocity

Prepare a reversal strategy

Flow with Requirements

Circumstances Change

And you should adapt to them

Fashion in Architecture

80s: minicomputers & terminals

90s: client-server

00s: three tiers

10s: NoSQL

The Perfect Architecture

Does not exist

Modern Architecture

Or so we thought

Slightly More Modern Architecture

Spot the seven differences

Flow with Constraints

Flow with capacity planning

From 4 to 150+ krps in 2.5 years

Flow for Operational Stability

From 38 to 112 krps in one day

Flow to Lower Costs

How Fast Can You Migrate

to a new cloud provider?

to a new hosting company?

to your own datacenter?

It matters because costs escalate quickly

Database migrations

are painful efforts

but shouldn't be

How to Migrate Your Database

Build a compatibility layer

Avoid downtime if at all possible

Treat access and data separately

Have a reverse migration strategy

but try not to use it

Compatibility Layer

Adapter pattern (remember those?)

Reduced feature set

Don't use new features

Fake missing features


Redis to Memcached driver:

// driver, runCommand, parse and getExpiration are defined elsewhere in the module
exports.RedisAdapter = function(name, address) {
    // self-reference
    var self = this;
    // attributes
    var client = driver.getClient(address);

    // read a key and parse the stored value
    self.get = function(key, callback) {
        runCommand('get', key, function(error, result) {
            if (error) return callback('Could not get ' + key + ': ' + error);
            return callback(null, parse(key, result));
        });
    };

    // store the value as JSON, with an optional expiration
    self.set = function(key, value, expiration, callback) {
        if (expiration) {
            return runCommand('set', key, JSON.stringify(value), 'EX', getExpiration(expiration), callback);
        }
        return runCommand('set', key, JSON.stringify(value), callback);
    };
};

Migrate and migrate again

We have used the following databases:
  • Couchbase
  • Memcached
  • Redis
  • DynamoDB
  • PostgreSQL
  • RedShift

Different systems have different trade-offs

and show different failure modes

Case Studies

Warning: may not apply to your situation

MediaSmart Mobile

Migration to Amazon virtual private cloud

Tried on 2015-03-03

Reversed on 2015-03-05

Due to an unrelated failure (!)

Tried again on 2015-03-11

Migrated EU on Friday the 13th

because who's afraid of superstitions?


Migrated from AWS to Facebook datacenters

Year-long effort, from 2013-03 to ~2014-03

Had to go through AWS VPC first

Neti — a dynamic iptables manipulation daemon in Python

Three weeks into VPC, two weeks to FB

Bare minimum approach (!)

Big Adtech Company X

Also migrating datacenters

Will start sending 10% of traffic

from new datacenter

No published materials

Inter-datacenter latencies

Migration Strategies

Server: Stop and Migrate

  • Stop the system
  • Make a cold copy
  • Point clients to new database
  • Start again

Server: Stop and Migrate


// settings.js
module.exports = {
    redisAddress: '',
};

// db.js
var settings = require('./settings.js');
exports.db = {
    current: new RedisAdapter(settings.redisAddress),
};

// client code
var db = require('./db.js').db;
db.current.get(key, function(error, result) {
    // use the result here
});

Server: Stop and Migrate

Most basic migration

Requires downtime

Reversal strategy:


  • Just point your settings to the old address
  • Stop again and migrate back

Not really reversible

Server: Read-only Version

  • Switch to read-only
  • Make a hot copy
  • Change to new database
  • Switch back to read and write

Server: Read-only Version

Read-only is not always admissible

A hot copy takes longer than a cold one

Reverse migration: switch to read-only again, migrate
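
The read-only switch can live in the adapter layer. The following is a minimal sketch, not the approach used in the talk: `ReadOnlySwitch` and `MemoryAdapter` are hypothetical names, and the in-memory store only stands in for a real database.

```javascript
// Hypothetical in-memory stand-in for a real database adapter.
function MemoryAdapter() {
    var self = this;
    var data = {};
    self.get = function(key, callback) {
        return callback(null, data[key]);
    };
    self.set = function(key, value, expiration, callback) {
        data[key] = value;
        return callback(null);
    };
}

// Wraps any adapter and rejects writes while read-only mode is on.
function ReadOnlySwitch(adapter) {
    var self = this;
    var readOnly = false;
    self.setReadOnly = function(value) {
        readOnly = value;
    };
    self.get = function(key, callback) {
        return adapter.get(key, callback);
    };
    self.set = function(key, value, expiration, callback) {
        if (readOnly) return callback('Database is read-only');
        return adapter.set(key, value, expiration, callback);
    };
}

// Usage: writes work, then fail once the switch is flipped.
var db = new ReadOnlySwitch(new MemoryAdapter());
db.set('a', 1, null, function() {});
db.setReadOnly(true);
db.set('a', 2, null, function(error) {
    console.log(error);
});
```

Flipping the switch back after the hot copy restores writes; reads are never interrupted.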

Server: Synchronize

  • Make a hot copy
  • Synchronize all writes
  • Switch to new copy when ready

Server: Synchronize

Depends on server mechanism

No downtime: cool!

Reversal strategy: synchronize back

Full synchronization is hard!

Server: Double Copy

  • Make a hot copy
  • Switch to new database
  • Make another hot copy

Server: Double Copy

Some data loss is admissible

e.g.: 500M profiles

A timestamp is very valuable

Some data loss is inevitable

Reversal: prepare a reverse copy
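
Why the timestamp is so valuable: the second copy only needs the records that changed after the first copy started. A rough sketch, assuming each record carries a hypothetical `updated` field in milliseconds and using plain objects in place of real databases:

```javascript
// Copies records changed at or after `since` from source to target.
// Records are assumed to look like {value: ..., updated: <ms timestamp>}.
function copySince(source, target, since) {
    var copied = 0;
    Object.keys(source).forEach(function(key) {
        var record = source[key];
        if (record.updated < since) return;
        // keep whichever version is newer
        if (target[key] && target[key].updated > record.updated) return;
        target[key] = record;
        copied += 1;
    });
    return copied;
}

var oldDb = {
    a: {value: 1, updated: 100},
    b: {value: 2, updated: 200},
};
var newDb = {};
// first hot copy: everything
copySince(oldDb, newDb, 0);
// a write reaches the old database while the first copy is running
oldDb.b = {value: 3, updated: 300};
// after the switch, second hot copy: only what changed since the first copy
var copied = copySince(oldDb, newDb, 250);
console.log(copied, newDb.b.value);
```

The reverse copy is the same function with source and target swapped.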


Just a clever adapter:

var Memcached = require('memcached');

exports.CleverAdapter = function(name, address) {
    // self-reference
    var self = this;
    // attributes
    var oldAdapter = new Memcached(address + ':11211');
    var newAdapter = new RedisAdapter(address);

    // fall back to the old database when the new one misbehaves
    self.get = function(key, callback) {
        if (badWeather()) {
            return oldAdapter.get(key, callback);
        }
        return newAdapter.get(key, callback);
    };
};

Downside: a few µs more per query

Client: Dual Lookup

  • Read from one database
  • If not present, try to read from second database

Client: Dual Lookup


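exports.db = {
    v1: new RedisAdapter(settings.oldRedis),
    v2: new RedisAdapter(settings.newRedis),
};

function get(key, callback) {
    db.v1.get(key, function(error, result) {
        // stop on error or when the first database has the value
        if (error || result) return callback(error, result);
        // otherwise fall back to the second database
        return db.v2.get(key, callback);
    });
}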

Client: Dual Lookup

Migrate your servers at your leisure

Reversible by design

Now you're talking!

Bad latency issues

from old + new databases

Client: Dual Write

Similar to dual lookup

Latency may not be important
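
A dual write can be sketched as one more adapter. This is an illustration under assumptions, not code from the talk: `DualWriteAdapter` and `MemoryAdapter` are hypothetical names, and the writes are sequential, so write latency is the sum of both databases.

```javascript
// Hypothetical in-memory stand-in for a real database adapter.
function MemoryAdapter() {
    var self = this;
    var data = {};
    self.get = function(key, callback) {
        return callback(null, data[key]);
    };
    self.set = function(key, value, expiration, callback) {
        data[key] = value;
        return callback(null);
    };
}

// Writes go to both databases; reads stay on the old one until the switch.
function DualWriteAdapter(oldAdapter, newAdapter) {
    var self = this;
    self.get = function(key, callback) {
        return oldAdapter.get(key, callback);
    };
    self.set = function(key, value, expiration, callback) {
        oldAdapter.set(key, value, expiration, function(error) {
            if (error) return callback(error);
            return newAdapter.set(key, value, expiration, callback);
        });
    };
}

// Usage: after a write, both databases hold the value.
var oldDatabase = new MemoryAdapter();
var newDatabase = new MemoryAdapter();
var dual = new DualWriteAdapter(oldDatabase, newDatabase);
dual.set('profile', 'data', null, function(error) {
    newDatabase.get('profile', function(error, result) {
        console.log(result);
    });
});
```

Reversal is cheap here: the old database never stopped receiving writes.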

Client: Timed Rollover

  • If date >

Client: Timed Rollover


var CUTOFF_DATE = '2015-05-13';

// keys are assumed to end in '#' followed by an ISO date
function get(key, callback) {
    if (key.substring(key.indexOf('#') + 1) > CUTOFF_DATE) {
        db.v1.get(key, function(error, result) {
            return callback(error, result);
        });
    } else {
        db.v2.get(key, callback);
    }
}

function set(key, value, expiration, callback) {
    if (new Date().toISOString() > CUTOFF_DATE) {
        db.v1.set(key, value, expiration, function(error, result) {
            return callback(error, result);
        });
    } else {
        db.v2.set(key, value, expiration, callback);
    }
}

Client: Timed Rollover

Useful for sequential data

E.g. statistics, counters

No manual intervention is required

Reversal strategy:

  • change time limit,
  • possibly migrate data,
  • redeploy

Client: In-Place Conversion

  • Read value
  • If in old format, convert and write
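
The two steps above can be sketched as follows. The formats are invented for illustration: old records are assumed to be plain numbers, new records objects with a `version` field, and a plain object stands in for the database.

```javascript
// Hypothetical formats: old records are plain numbers,
// new records are objects like {version: 2, count: <number>}.
function isOldFormat(record) {
    return typeof record === 'number';
}

function convert(record) {
    return {version: 2, count: record};
}

// Reads a record; if it is in the old format, converts it and writes it back.
function getConverted(store, key) {
    var record = store[key];
    if (record === undefined) return undefined;
    if (isOldFormat(record)) {
        record = convert(record);
        store[key] = record;
    }
    return record;
}

var store = {hits: 42, misses: {version: 2, count: 7}};
console.log(getConverted(store, 'hits'));
console.log(getConverted(store, 'misses'));
```

Records migrate lazily as they are read, so rarely used keys may stay in the old format for a long time.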

Client: In-Place Conversion

Degenerate case: old database == new database

Can change driver, structure, format

Not concurrent

Reversal strategy:

  • Read value
  • If in new format, convert and write

Combined: Proxied Access

  • Read from or write to a proxy
  • Proxy decides where to access each time
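
In its simplest form the proxy is an adapter that asks a routing function where to go on every call. A sketch under assumptions: `ProxyAdapter`, `MemoryAdapter` and the `user:` prefix rule are all hypothetical, and a real proxy would usually be a separate process.

```javascript
// Hypothetical in-memory stand-in for a real database adapter.
function MemoryAdapter() {
    var self = this;
    var data = {};
    self.get = function(key, callback) {
        return callback(null, data[key]);
    };
    self.set = function(key, value, expiration, callback) {
        data[key] = value;
        return callback(null);
    };
}

// Same interface as an adapter; the route function picks a backend per call.
function ProxyAdapter(backends, route) {
    var self = this;
    self.get = function(key, callback) {
        return backends[route('get', key)].get(key, callback);
    };
    self.set = function(key, value, expiration, callback) {
        return backends[route('set', key)].set(key, value, expiration, callback);
    };
}

// Example rule (an assumption): keys starting with 'user:' go to the new database.
var backends = {old: new MemoryAdapter(), fresh: new MemoryAdapter()};
var proxy = new ProxyAdapter(backends, function(operation, key) {
    return key.indexOf('user:') === 0 ? 'fresh' : 'old';
});
proxy.set('user:1', 'Alex', null, function() {});
proxy.get('user:1', function(error, result) {
    console.log(result);
});
```

Changing the routing function changes the migration strategy without touching any client.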

Combined: Proxied Access

Can be used with other migration strategies

Another piece to maintain

Increased latency

Use with care

Combined: Queued Write

  • Read from first database
  • Write to queue
  • Write to both databases
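
The three steps above can be sketched with an in-process array as the queue. This is only an illustration: `QueuedWriter` and `MemoryAdapter` are hypothetical names, and a production setup would use a durable queue with a separate worker.

```javascript
// Hypothetical in-memory stand-in for a real database adapter.
function MemoryAdapter() {
    var self = this;
    var data = {};
    self.get = function(key, callback) {
        return callback(null, data[key]);
    };
    self.set = function(key, value, expiration, callback) {
        data[key] = value;
        return callback(null);
    };
}

function QueuedWriter(first, second) {
    var self = this;
    var queue = [];
    // reads go straight to the first database
    self.get = function(key, callback) {
        return first.get(key, callback);
    };
    // writes return as soon as they are queued
    self.set = function(key, value, expiration, callback) {
        queue.push({key: key, value: value, expiration: expiration});
        return callback(null);
    };
    self.pending = function() {
        return queue.length;
    };
    // a background worker would call this: write each item to both databases
    self.drain = function(callback) {
        var item = queue.shift();
        if (!item) return callback(null);
        first.set(item.key, item.value, item.expiration, function(error) {
            if (error) return callback(error);
            second.set(item.key, item.value, item.expiration, function(error) {
                if (error) return callback(error);
                return self.drain(callback);
            });
        });
    };
}

var first = new MemoryAdapter();
var second = new MemoryAdapter();
var writer = new QueuedWriter(first, second);
writer.set('counter', 10, null, function() {});
writer.drain(function(error) {
    console.log(error, writer.pending());
});
```

The price of the low write latency is that a read may not see a value until the queue has been drained.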

Combined: Queued Write

Can be used with other migration strategies

Avoids high write latencies

Helps ease migrations

Don't Stop Flowing

Strategies work for other things

Adapt them to your situation

How to Migrate Anything

Build one or more compatibility layers

Downtime is just bad engineering

Have a reverse migration strategy

and try not to use it

Unstable Equilibrium

The only way to fly supersonic fighters

Living with Instabilities

Use safe defaults

Fail safely

Use a canary

Always monitor

Canary Example

Adtech: Real Time Bidding

Statistics are processed in a queue

Queue writes a canary
that expires in a few minutes

If no canary, stop bidding
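
A toy version of the canary, to show the safe default at work. Everything here is an assumption for illustration: the five-minute TTL, the `ExpiringStore` stand-in (a real system would use a key with expiration in the database), and the injected clock.

```javascript
var CANARY_TTL_MS = 5 * 60 * 1000; // "a few minutes"; the exact value is an assumption

// Hypothetical in-memory store with expiring keys; `now` is injected for testing.
function ExpiringStore(now) {
    var self = this;
    var data = {};
    self.set = function(key, value, ttlMs) {
        data[key] = {value: value, expires: now() + ttlMs};
    };
    self.get = function(key) {
        var entry = data[key];
        if (!entry || entry.expires <= now()) return undefined;
        return entry.value;
    };
}

// Called by the statistics queue every time it processes a batch.
function writeCanary(store) {
    store.set('canary', 'alive', CANARY_TTL_MS);
}

// Called by the bidder before each bid: no canary, no bidding.
function shouldBid(store) {
    return store.get('canary') !== undefined;
}

var clock = 0;
var canaryStore = new ExpiringStore(function() { return clock; });
console.log(shouldBid(canaryStore)); // false: nothing written yet
writeCanary(canaryStore);
console.log(shouldBid(canaryStore)); // true: the queue is alive
clock += CANARY_TTL_MS + 1;
console.log(shouldBid(canaryStore)); // false: the queue stopped, the canary expired
```

If the queue dies, nobody has to notice or react: the canary expires on its own and bidding stops.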

There Will Be Mistakes

Get over it

Move Fast, Break Things

Or stay put and never break anything

Or anything in between