Image source: my 6 yo daughter

What We Will Cover

How to Flow

Flow with Requirements

Flow with Operational Constraints

Some Migration Strategies

Don't Stop Flowing

How to Flow

Turbulent Flow is Irreversible

It can be fun, but we don't want that kind of fun

Laminar Flow is Reversible

And we like when things flow smoothly!

Migrations are hard?

Thermo to the rescue!

Reversible processes are optimal:

  • no turbulence,
  • minimal entropy,
  • less complexity,
  • reduced headaches!

Change without pain

Go from A to B in a reversible way


Find your cruise velocity

Prepare a reversal strategy

Flow with Requirements

Circumstances Change

And you should adapt to them

Modern Architecture

Or so we thought

Slightly More Modern Architecture

Spot the seven differences

Fashion in Architecture

80s: minicomputers & terminals

90s: client-server

00s: three tiers

10s: NoSQL

The Perfect Architecture

Does not exist

Flow With Constraints

MediaSmart Mobile

Serve mobile ads

Performance and branding campaigns

6 MM bid offers / day

15M+ impressions / day

50+ servers

30+ countries

700M+ profiles


We help pay for your entertainment

Flow with capacity planning

From 4 to 175+ krps in 2.5 years

Flow for Operational Stability

From 38 to 112 krps in one day

Flow to Lower Costs

How Fast Can You Migrate

to a new cloud provider?

to a new hosting company?

to your own datacenter?

It matters because costs escalate quickly

Database migrations

are painful efforts

but shouldn't be!

How to Migrate Your Database

Build a compatibility layer

Avoid downtime if at all possible

Treat access and data separately

Have a reverse migration strategy

but try not to use it

Compatibility Layer

Adapter pattern (remember those?)

Reduced feature set

Don't use new features

Fake missing features
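
One way to fake a missing feature, as a minimal sketch: the backend below has no expiration support, so the adapter stores a deadline next to each value and checks it on read. `PlainStore`, `ExpiringAdapter` and the injectable `now` clock are hypothetical names for illustration, not part of the original code.

```javascript
// Hypothetical backend without expiration support (stand-in for a real driver).
function PlainStore() {
    var data = {};
    this.get = function(key, callback) {
        callback(null, data.hasOwnProperty(key) ? data[key] : null);
    };
    this.set = function(key, value, callback) {
        data[key] = value;
        callback(null);
    };
}

// Adapter that fakes expiration by wrapping each value with its deadline.
// `now` is an injectable clock returning milliseconds (assumption, eases testing).
function ExpiringAdapter(store, now) {
    var self = this;
    now = now || Date.now;

    self.set = function(key, value, expirationSeconds, callback) {
        var record = {
            value: value,
            expiresAt: expirationSeconds ? now() + expirationSeconds * 1000 : null
        };
        store.set(key, JSON.stringify(record), callback);
    };

    self.get = function(key, callback) {
        store.get(key, function(error, raw) {
            if (error) return callback(error);
            if (raw === null) return callback(null, null);
            var record = JSON.parse(raw);
            if (record.expiresAt !== null && record.expiresAt <= now()) {
                // expired: behave as if the key was never there
                return callback(null, null);
            }
            return callback(null, record.value);
        });
    };
}
```

Expired values are only hidden, never deleted, so a real adapter would probably also need some cleanup.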


Redis to Memcached driver:

// driver, runCommand, parse and getExpiration are defined elsewhere (not shown)
exports.RedisAdapter = function(name, address) {
    // self-reference
    var self = this;
    // attributes
    var client = driver.getClient(address);

    // read a key and parse the stored value
    self.get = function(key, callback) {
        runCommand('get', key, function(error, result) {
            if (error) return callback('Could not get ' + key + ':' + error);
            return callback(null, parse(key, result));
        });
    };

    // store a value as JSON, with an optional expiration
    self.set = function(key, value, expiration, callback) {
        if (expiration) {
            return runCommand('set', key, JSON.stringify(value), 'EX', getExpiration(expiration), callback);
        }
        return runCommand('set', key, JSON.stringify(value), callback);
    };
};

Adapter in Use

var MemcachedAdapter = require('./memcached.js').MemcachedAdapter;
var RedisAdapter = require('./redis.js').RedisAdapter;
var settings = require('./settings.js');

var db = {
    main: getAdapter('main', settings.MAIN_DB_ADDRESS),
    // other databases go here
};

db.main.get('hi', function(error, result) {
    // use the result here
});

// choose the adapter from the address scheme
function getAdapter(name, address) {
    if (address.indexOf('redis:') === 0) {
        return new RedisAdapter(name, address);
    } else {
        return new MemcachedAdapter(name, address);
    }
}

Each database configured to point at Redis or Memcached

Case Studies

Surprisingly hard to find

Warning: may not apply to your situation

Migrate and migrate again

We have used the following databases:
  • Couchbase
  • Memcached
  • Redis
  • DynamoDB
  • PostgreSQL
  • Redshift

Different systems have different trade-offs

and show different failure modes

Migration Strategies

Strategies or Patterns?

Battle-tested strategies

Not an exhaustive collection

Just some ideas for migrations

Several options for the same requirements

Different reversibility behavior

Server: Stop and Migrate

  • Stop the system
  • Make a cold copy
  • Point clients to new database
  • Start again

Server: Stop and Migrate


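// settings.js
module.exports = {
    redisAddress: '',
};

// db.js
var RedisAdapter = require('./redis.js').RedisAdapter;
var settings = require('./settings.js');
exports.db = {
    current: new RedisAdapter(settings.redisAddress),
};

// client code
var db = require('./db.js').db;
db.current.get(key, function(error, result) {
    // use the result here
});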

Server: Stop and Migrate

Most basic migration

Requires downtime


Reversal strategy:

  • Just point your settings to the old address
  • Stop again and migrate back

Not really reversible

Case Study: MediaSmart VPC

Migration to Amazon virtual private cloud

Tried on 2015-03-03

Reversed on 2015-03-05

Due to an unrelated failure (!)

Tried again on 2015-03-11

Migrated EU on Friday the 13th

because who's afraid of superstitions?

Server: Read-only Version

  • Switch to read-only
  • Make a hot copy
  • Change to new database
  • Switch back to read and write
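
The steps above could be driven from the client with a small switch, sketched here under assumptions: `MemoryAdapter` is a hypothetical in-memory stand-in for the real adapters, and `ReadOnlySwitch` is an illustrative name. While read-only, writes are rejected and reads pass through.

```javascript
// Hypothetical in-memory adapter with the same get/set shape as the adapters above.
function MemoryAdapter() {
    var data = {};
    this.get = function(key, callback) {
        callback(null, data.hasOwnProperty(key) ? data[key] : null);
    };
    this.set = function(key, value, expiration, callback) {
        data[key] = value;
        callback(null);
    };
}

// Wraps an adapter; while read-only, writes are rejected and reads pass through.
function ReadOnlySwitch(adapter) {
    var self = this;
    var readOnly = false;

    self.setReadOnly = function(value) {
        readOnly = Boolean(value);
    };
    // Point at the new database once the hot copy has finished.
    self.changeBackend = function(newAdapter) {
        adapter = newAdapter;
    };
    self.get = function(key, callback) {
        return adapter.get(key, callback);
    };
    self.set = function(key, value, expiration, callback) {
        if (readOnly) return callback('Database is read-only, could not set ' + key);
        return adapter.set(key, value, expiration, callback);
    };
}
```

The reverse migration uses the same switch: set read-only, copy back, change the backend to the old adapter.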

Server: Read-only Version

Read-only is not always admissible

A hot copy takes longer than a cold one

Reverse migration: switch to read-only again, migrate

Server: synchronize

  • Make a hot copy
  • Synchronize all writes
  • Switch to new copy when ready

Server: synchronize

Depends on server mechanism

No downtime: cool!

Reversal strategy: synchronize back

Full synchronization is hard!

Case Study: MediaSmart Daystats

Migration from Redis to Amazon's Redshift

Set up daily migration of customer stats

Moved old data at our leisure

Query data from one or the other
depending on the date range queried
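
Choosing a store by date range might look like the sketch below. It assumes that days on or after a boundary date have already been moved, and that the old store still holds everything. `StatsStore`, `MIGRATED_FROM` and `queryStats` are hypothetical names, and the in-memory store stands in for the real databases.

```javascript
// Hypothetical stats store: holds {date: value} and answers inclusive date-range queries.
function StatsStore(name, rows) {
    this.name = name;
    this.query = function(from, to, callback) {
        var result = Object.keys(rows).filter(function(date) {
            return date >= from && date <= to;
        }).sort().map(function(date) {
            return {date: date, value: rows[date], source: name};
        });
        callback(null, result);
    };
}

// Days on or after this ISO date have already been moved (assumed setting).
var MIGRATED_FROM = '2015-01-01';

function selectStore(from, oldStore, newStore) {
    // ISO dates compare correctly as plain strings
    return from >= MIGRATED_FROM ? newStore : oldStore;
}

function queryStats(from, to, oldStore, newStore, callback) {
    return selectStore(from, oldStore, newStore).query(from, to, callback);
}
```

Moving older data then only means lowering `MIGRATED_FROM`, and reversal means raising it again.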

Trivial reversal

Server: Double Copy

  • Make a hot copy
  • Switch to new database
  • Make another hot copy

Server: Double Copy

Some data loss is admissible

A timestamp is very valuable

Some data loss is inevitable

Reversal: prepare a reverse copy
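
Why the timestamp is valuable, as a minimal sketch: if every record carries a modification time, the second copy only needs to move what changed since the first copy started, and it never overwrites a newer record. The record shape `{value, updatedAt}` and the name `hotCopy` are assumptions for illustration, and plain objects stand in for the databases.

```javascript
// Copies records from source to target (plain objects mapping key -> {value, updatedAt}).
// Only records modified at or after `since` are considered, and a record is never
// overwritten by a version that is not newer. Returns the number of records copied.
function hotCopy(source, target, since) {
    var copied = 0;
    Object.keys(source).forEach(function(key) {
        var record = source[key];
        if (record.updatedAt < since) return;
        var existing = target[key];
        if (existing && existing.updatedAt >= record.updatedAt) return;
        target[key] = {value: record.value, updatedAt: record.updatedAt};
        copied += 1;
    });
    return copied;
}

var oldDb = {a: {value: 1, updatedAt: 100}, b: {value: 2, updatedAt: 100}};
var newDb = {};
var firstCopyStart = 150;

hotCopy(oldDb, newDb, 0);                  // first hot copy: everything
oldDb.a = {value: 10, updatedAt: 160};     // written to the old database during the copy
// clients switch to newDb here
newDb.b = {value: 20, updatedAt: 170};     // written to the new database after the switch
var stragglers = hotCopy(oldDb, newDb, firstCopyStart); // second hot copy
```

Writes that reach the old database after the second copy are still lost, which is the data loss this strategy accepts.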

Case Study: MediaSmart Profiles

Migration of ~120M profiles on 2015-02-16

Moved data from Redis to Amazon's DynamoDB

Lower cost, reasonable memory footprint

Some data loss is admissible

Trivial reversal: use old profiles

Client: Decorator

Pass all queries through an intermediary

Use any condition to select backend

Can be used to balance load

Client: Decorator

Just a clever adapter:

var Memcached = require('memcached');
var RedisAdapter = require('./redis.js').RedisAdapter;

exports.CleverAdapter = function(name, address) {
    // self-reference
    var self = this;
    // attributes
    var oldAdapter = new Memcached(address + ':11211');
    var newAdapter = new RedisAdapter(address);

    // badWeather() stands for any condition used to select the backend
    self.get = function(key, callback) {
        if (badWeather()) {
            return oldAdapter.get(key, callback);
        }
        return newAdapter.get(key, callback);
    };
};

Downside: a few µs more per query

Client: Dual Lookup

  • Read from one database
  • If not present, try to read from second database

Client: Dual Lookup


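var db = exports.db = {
    v1: new RedisAdapter(settings.oldRedis),
    v2: new RedisAdapter(settings.newRedis),
};

function get(key, callback) {
    db.v1.get(key, function(error, result) {
        // stop on error or when the first database has the value
        if (error || result) return callback(error, result);
        // otherwise fall back to the second database
        return db.v2.get(key, callback);
    });
}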

Client: Dual Lookup

Migrate your servers at your leisure

Reversible by design

Now you're talking!

Bad latency issues

from old + new databases

Client: Dual Write

Similar to dual lookup

Latency may not be important
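
A dual write could be sketched as follows, assuming reads stay on the old database and every write goes to both. `MemoryAdapter` is a hypothetical in-memory stand-in and `DualWriter` is an illustrative name.

```javascript
// Hypothetical in-memory adapter with the same get/set shape as the adapters above.
function MemoryAdapter() {
    var data = {};
    this.get = function(key, callback) {
        callback(null, data.hasOwnProperty(key) ? data[key] : null);
    };
    this.set = function(key, value, expiration, callback) {
        data[key] = value;
        callback(null);
    };
}

function DualWriter(oldAdapter, newAdapter) {
    var self = this;

    // Reads stay on the old database until the switch is made.
    self.get = function(key, callback) {
        return oldAdapter.get(key, callback);
    };

    // Writes go to both; the callback fires once, after both have answered.
    self.set = function(key, value, expiration, callback) {
        var pending = 2;
        var firstError = null;
        function done(error) {
            if (error && !firstError) firstError = error;
            pending -= 1;
            if (pending === 0) callback(firstError);
        }
        oldAdapter.set(key, value, expiration, done);
        newAdapter.set(key, value, expiration, done);
    };
}
```

The write takes as long as the slower of the two databases, which is why this only fits when latency is not important.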

Client: Timed Rollover

  • Date < cutoff: go to the old database
  • Date > cutoff: go to the new database
  • May need some kind of copy

Client: Timed Rollover


var CUTOFF_DATE = '2015-05-13';

function get(key, callback) {
    // the date is the part of the key after '#'
    if (key.substring(key.indexOf('#') + 1) > CUTOFF_DATE) {
        return db.v1.get(key, callback);
    } else {
        return db.v2.get(key, callback);
    }
}

function set(key, value, expiration, callback) {
    if (new Date().toISOString() > CUTOFF_DATE) {
        return db.v1.set(key, value, expiration, callback);
    } else {
        return db.v2.set(key, value, expiration, callback);
    }
}

Client: Timed Rollover

Useful for sequential data

E.g. statistics, counters

No manual intervention is required

Reversal strategy:

  • change time limit,
  • possibly migrate data,
  • redeploy

Case Study: MediaSmart Mobile

Adding aggregates to daily stats

Improved common queries ~20x

Started adding aggregates in March 2015
If date > 2015-03-25: use aggregates
If date < 2015-03-25: do not use aggregates

Trivial reversal: change setting

Client: In-Place Conversion

  • Read value
  • If in old format, convert and write
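
The two steps above, sketched under assumed formats: old records are arrays `[impressions, clicks]`, new records are objects tagged with `version: 2`. `MemoryAdapter`, `isOldFormat`, `convert` and `getConverted` are hypothetical names for illustration.

```javascript
// Hypothetical in-memory adapter with the same get/set shape as the adapters above.
function MemoryAdapter() {
    var data = {};
    this.get = function(key, callback) {
        callback(null, data.hasOwnProperty(key) ? data[key] : null);
    };
    this.set = function(key, value, expiration, callback) {
        data[key] = value;
        callback(null);
    };
}

// Assumed formats: old records are arrays, new records are versioned objects.
function isOldFormat(value) {
    return Array.isArray(value);
}

function convert(value) {
    return {version: 2, impressions: value[0], clicks: value[1]};
}

// Read a value; if it is still in the old format, convert it and write it back.
function getConverted(db, key, callback) {
    db.get(key, function(error, value) {
        if (error) return callback(error);
        if (value === null || !isOldFormat(value)) return callback(null, value);
        var converted = convert(value);
        db.set(key, converted, 0, function(error) {
            if (error) return callback(error);
            return callback(null, converted);
        });
    });
}
```

Records that are never read are never converted, so a background pass may still be needed before dropping support for the old format.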

Client: In-Place Conversion

Degenerate case: old database == new database

Can change driver, structure, format

Not concurrent

Reversal strategy:

  • Read value
  • If in new format, convert and write

Broker: Proxied Access

  • Read from or write to a proxy
  • Proxy decides where to access each time

Broker: Proxied Access

Can be used with other migration strategies

Typical case: access a RESTful API

Another piece to maintain

Increased latency

Use with care

Case Study: Instagram

Migrated from AWS to Facebook datacenters

Year-long effort, from 2013-03 to ~2014-03

Had to go through AWS VPC first

Neti — a dynamic iptables manipulation daemon in Python

Three weeks into VPC, two weeks to FB

Bare minimum approach (!)

Broker: Queued Write

  • Read from first database
  • Write to queue
  • Write to both databases
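
A queued write could be sketched like this, with an in-memory array standing in for a real queue. `MemoryAdapter` and `QueuedWriter` are hypothetical names, and there are no retries in this sketch.

```javascript
// Hypothetical in-memory adapter with the same get/set shape as the adapters above.
function MemoryAdapter() {
    var data = {};
    this.get = function(key, callback) {
        callback(null, data.hasOwnProperty(key) ? data[key] : null);
    };
    this.set = function(key, value, expiration, callback) {
        data[key] = value;
        callback(null);
    };
}

function QueuedWriter(firstDb, secondDb) {
    var self = this;
    var queue = [];

    // Reads always go to the first database.
    self.get = function(key, callback) {
        return firstDb.get(key, callback);
    };

    // Writing only enqueues, so the caller does not wait for either database.
    self.set = function(key, value, expiration, callback) {
        queue.push({key: key, value: value, expiration: expiration});
        callback(null);
    };

    // Drain the queue, writing each entry to both databases in order.
    // Errors are collected and handed to the callback at the end.
    self.flush = function(callback) {
        var errors = [];
        function next() {
            var item = queue.shift();
            if (!item) return callback(errors.length ? errors : null);
            firstDb.set(item.key, item.value, item.expiration, function(error) {
                if (error) errors.push(error);
                secondDb.set(item.key, item.value, item.expiration, function(error) {
                    if (error) errors.push(error);
                    next();
                });
            });
        }
        next();
    };
}
```

Until `flush` runs, a read does not see the queued value. That delay is the price of the lower write latency.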

Broker: Queued Write

Can be used with other migration strategies

Again, typically a RESTful API

Avoids high write latencies

Helps ease migrations

Don't Stop Flowing

Strategies work for other things

Adapt them to your situation

How to Migrate Anything

Build one or more compatibility layers

Downtime is just bad engineering

Have a reverse migration strategy

and try not to use it

There Will Be Mistakes

Get over it

Unstable Equilibrium

The only way to fly supersonic fighters

Living with Instabilities

Use safe defaults

Fail safely

Use a canary

Always monitor

Canary Example

Adtech: Real Time Bidding

Statistics are processed in a queue

Queue writes a canary
that expires in a few minutes

If no canary, stop bidding
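
The canary could be sketched as follows. The key name, the five-minute expiration and the in-memory `ExpiringStore` are assumptions for illustration. A real setup would use a database with native expiration.

```javascript
// Hypothetical in-memory store with expiring keys; `now` is an injectable clock in ms.
function ExpiringStore(now) {
    var data = {};
    this.set = function(key, value, ttlSeconds, callback) {
        data[key] = {value: value, expiresAt: now() + ttlSeconds * 1000};
        callback(null);
    };
    this.get = function(key, callback) {
        var entry = data[key];
        if (!entry || entry.expiresAt <= now()) return callback(null, null);
        callback(null, entry.value);
    };
}

var CANARY_KEY = 'stats-canary';   // assumed key name
var CANARY_TTL_SECONDS = 300;      // "a few minutes" (assumed value)

// Called by the stats queue every time it processes a batch.
function writeCanary(store, callback) {
    store.set(CANARY_KEY, 'alive', CANARY_TTL_SECONDS, callback);
}

// Called by the bidder before bidding; fails safely on errors.
function shouldBid(store, callback) {
    store.get(CANARY_KEY, function(error, value) {
        if (error) return callback(false);
        callback(value !== null);
    });
}
```

If the queue stalls, the canary expires on its own and bidding stops without anyone having to notice the failure first.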

Move Fast, Break Things

Or stay put and never break anything

Or anything in between

