Image source: my 6 yo daughter
Alex Fernández, Senior Developer @ MediaSmart Mobile
How to Flow
Flow with Requirements
Flow with Operational Constraints
Some Migration Strategies
Don't Stop Flowing
And we like when things flow smoothly!
Reversible processes are optimal:
Go from A to B in a reversible way
(mostly)
Find your cruise velocity
Prepare a reversal strategy
And you should adapt to them
Spot the seven differences
80s: minicomputers & terminals
90s: client-server
00s: three tiers
10s: NoSQL
Does not exist
Serve mobile ads
Performance and branding campaigns
6 MM bid offers / day
15M+ impressions / day
50+ servers
30+ countries
700M+ profiles
We help pay for your entertainment
From 38 to 112 krps in one day
… to a new cloud provider?
… to a new hosting company?
… to your own datacenter?
It matters because costs escalate quickly
are painful efforts
but shouldn't be!
Build a compatibility layer
Avoid downtime if at all possible
Treat access and data separately
Have a reverse migration strategy…
… but try not to use it
Adapter pattern (remember those?)
Reduced feature set
Don't use new features
Fake missing features
Redis to
Memcached driver:
exports.RedisAdapter = function(name, address) {
  // self-reference
  var self = this;

  // attributes
  var client = driver.getClient(address);

  // driver, runCommand, parse and getExpiration are not shown here
  self.get = function(key, callback) {
    runCommand('get', key, function(error, result) {
      if (error) return callback('Could not get ' + key + ':' + error);
      return callback(null, parse(key, result));
    });
  };

  self.set = function(key, value, expiration, callback) {
    if (expiration) {
      return runCommand('set', key, JSON.stringify(value), 'EX', getExpiration(expiration), callback);
    }
    return runCommand('set', key, JSON.stringify(value), callback);
  };
};
var MemcachedAdapter = require('./memcached.js').MemcachedAdapter;
var RedisAdapter = require('./redis.js').RedisAdapter;
var settings = require('./settings.js');

var db = {
  main: getAdapter('main', settings.MAIN_DB_ADDRESS),
};

db.main.get('hi', function(error, result) {
});

// pick the adapter from the address scheme
function getAdapter(name, address) {
  if (address.indexOf('redis:') === 0) {
    return new RedisAdapter(name, address);
  } else {
    return new MemcachedAdapter(name, address);
  }
}
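To illustrate "Fake missing features", here is a minimal sketch of a third adapter with the same get/set interface. `MemoryAdapter` and its injectable `now` clock are hypothetical names, not part of the original code, and the expiration is assumed to be in seconds. A plain object has no native expiration, so the adapter fakes it by checking a stored deadline on every read.

```javascript
// Hypothetical in-memory adapter exposing the same interface as the others.
// `now` is an injectable clock (an assumption, handy for tests).
function MemoryAdapter(name, now) {
  // self-reference
  var self = this;

  // attributes
  var store = {};
  now = now || Date.now;

  self.get = function(key, callback) {
    var entry = store[key];
    if (!entry) return callback(null, null);
    if (entry.expires && entry.expires <= now()) {
      // fake the missing feature: drop expired entries lazily on read
      delete store[key];
      return callback(null, null);
    }
    return callback(null, JSON.parse(entry.json));
  };

  self.set = function(key, value, expiration, callback) {
    store[key] = {
      json: JSON.stringify(value),
      // expiration assumed to be in seconds; falsy means "never expires"
      expires: expiration ? now() + expiration * 1000 : null,
    };
    return callback(null);
  };
}

// usage: callers cannot tell which backend sits behind the interface
var cache = new MemoryAdapter('cache');
cache.set('hi', {greeting: 'hello'}, 60, function(error) {
  cache.get('hi', function(error, result) {
    console.log(result);
  });
});
```

Since callers only see get and set, swapping this adapter for the Redis or Memcached one should need no changes on their side.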
Surprisingly hard to find
Warning: may not apply to your situation
Different systems have different trade-offs
and show different failure modes
Battle tested strategies
Not an exhaustive collection
Just some ideas for migrations
Several options for the same requirements
Different reversibility behavior
settings.js:
module.exports = {
  redisAddress: 'redis.mydomain.com',
};
db.js:
var RedisAdapter = require('./redis.js').RedisAdapter;
var settings = require('./settings.js');
exports.db = {
  current: new RedisAdapter(settings.redisAddress),
};
user:
var db = require('./db.js').db;
db.current.get(key, function(error, result) {
  ...
});
Most basic migration
Requires downtime
Reversal:
Migration to Amazon virtual private cloud
Tried on 2015-03-03
Reversed on 2015-03-05
Due to an unrelated failure (!)
Tried again on 2015-03-11
Migrated EU on Friday the 13th
because who's afraid of superstitions?
Read-only is not always admissible
A hot copy takes longer than a cold one
Reverse migration: switch to read-only again, migrate
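One way to switch to read-only without touching the server is to do it in the access layer. The sketch below is an assumption, not the approach used in the talk: `ReadOnlyAdapter` is a hypothetical wrapper around any adapter with the get/set interface shown earlier, and it rejects writes while its flag is set.

```javascript
// Hypothetical wrapper: reads always pass through, writes are rejected
// while readOnly is true (for instance, during the copy).
function ReadOnlyAdapter(adapter) {
  // self-reference
  var self = this;

  // attributes
  self.readOnly = false;

  self.get = function(key, callback) {
    return adapter.get(key, callback);
  };

  self.set = function(key, value, expiration, callback) {
    if (self.readOnly) {
      return callback('Database is read-only during migration');
    }
    return adapter.set(key, value, expiration, callback);
  };
}
```

The same flag would serve for the reverse migration: set it again, copy back, then clear it.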
Depends on server mechanism
No downtime: cool!
Reversal strategy: synchronize back
Full synchronization is hard!
Migration from Redis to Amazon's Redshift
Set up daily migration of customer stats
Some data loss is admissible
A timestamp is very valuable
Some data loss is inevitable
Reversal: prepare a reverse copy
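A rough sketch of why a timestamp is so valuable: with a last-modified field, each run only has to copy what changed since the previous one. Everything here is hypothetical: `copyModifiedSince`, the `updated` field, and the plain objects standing in for the two databases.

```javascript
// Copy every record modified after `since` from source to target.
// Plain objects keyed by id stand in for the real databases (assumption).
// Returns the newest timestamp seen, to be used as the next checkpoint.
function copyModifiedSince(source, target, since) {
  var latest = since;
  Object.keys(source).forEach(function(key) {
    var record = source[key];
    if (record.updated > since) {
      target[key] = record;
      if (record.updated > latest) latest = record.updated;
    }
  });
  return latest;
}

// usage: a first full pass, then an incremental one
var oldDb = {
  alice: {clicks: 3, updated: 100},
  bob: {clicks: 7, updated: 200},
};
var newDb = {};
var checkpoint = copyModifiedSince(oldDb, newDb, 0);
oldDb.alice = {clicks: 4, updated: 300};
checkpoint = copyModifiedSince(oldDb, newDb, checkpoint);
```

Records written with a timestamp at or before the checkpoint are skipped, which is one way some data loss creeps in. The reverse copy can reuse the same function with source and target swapped.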
Migration of ~120M profiles on 2015-02-16
Moved data from Redis to Amazon's DynamoDB
Lower cost, reasonable memory footprint
Some data loss is admissible
Pass all queries through an intermediary
Use any condition to select backend
Can be used to balance load
Just a clever adapter:
var Memcached = require('memcached');
var RedisAdapter = require('./redis.js').RedisAdapter;

exports.CleverAdapter = function(name, address) {
  // self-reference
  var self = this;

  // attributes
  var oldAdapter = new Memcached(address + ':11211');
  var newAdapter = new RedisAdapter(address);

  // badWeather() stands for any condition; it is not shown here
  self.get = function(key, callback) {
    if (badWeather()) {
      return oldAdapter.get(key, callback);
    }
    return newAdapter.get(key, callback);
  };
};
Downside: a few µs more per query
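As one possible condition for selecting the backend, the sketch below sends a fixed share of keys to the new adapter, which also covers the load-balancing use. `PercentageAdapter`, `choose`, and the simple string hash are hypothetical, chosen for illustration only.

```javascript
// Hypothetical routing adapter: sends roughly percentNew % of keys
// to the new backend and the rest to the old one.
function PercentageAdapter(oldAdapter, newAdapter, percentNew) {
  // self-reference
  var self = this;

  // deterministic hash into 0..99, so a key always lands on the same backend
  function bucket(key) {
    var hash = 0;
    for (var i = 0; i < key.length; i++) {
      hash = (hash * 31 + key.charCodeAt(i)) % 100;
    }
    return hash;
  }

  self.choose = function(key) {
    return bucket(key) < percentNew ? newAdapter : oldAdapter;
  };

  self.get = function(key, callback) {
    return self.choose(key).get(key, callback);
  };

  self.set = function(key, value, expiration, callback) {
    return self.choose(key).set(key, value, expiration, callback);
  };
}
```

Raising `percentNew` moves keys from the old backend to the new one, so this presumably needs the data to be present on both sides, or some fallback lookup, while the share changes.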
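db.js:
var RedisAdapter = require('./redis.js').RedisAdapter;
var settings = require('./settings.js');
var db = exports.db = {
  v1: new RedisAdapter(settings.oldRedis),
  v2: new RedisAdapter(settings.newRedis),
};
// try v1 first; only if it has no result (and no error) ask v2
function get(key, callback) {
  db.v1.get(key, function(error, result) {
    if (error || result) return callback(error, result);
    db.v2.get(key, callback);
  });
}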
Migrate your servers at your leisure
Reversible by design
Now you're talking!
Bad latency issues
from old + new databases
Similar to dual lookup
Latency may not be important
client:
var CUTOFF_DATE = '2015-05-13';
// keys are assumed to end in '#' plus an ISO date, e.g. 'stats#2015-05-14'
function get(key, callback) {
  var keyDate = key.slice(key.indexOf('#') + 1);
  if (keyDate > CUTOFF_DATE) {
    return db.v1.get(key, callback);
  }
  return db.v2.get(key, callback);
}
function set(key, value, expiration, callback) {
  if (new Date().toISOString() > CUTOFF_DATE) {
    return db.v1.set(key, value, expiration, callback);
  }
  return db.v2.set(key, value, expiration, callback);
}
Useful for sequential data
E.g. statistics, counters
No manual intervention is required
Reversal strategy:
Adding aggregates to daily stats
Improved common queries ~20x
Degenerate case: old database == new database
Can change driver, structure, format
Not concurrent
Reversal strategy:
Can be used with other migration strategies
Typical case: access a RESTful API
Another piece to maintain
Increased latency
Use with care
Migrated from AWS to Facebook datacenters
Year-long effort, from 2013-03 to ~2014-03
Had to go through AWS VPC first
Neti — a dynamic iptables manipulation daemon in Python
Three weeks into VPC, two weeks to FB
Bare minimum approach (!)
Can be used with other migration strategies
Again, typically a RESTful API
Avoids high write latencies
Helps ease migrations
Build one or more compatibility layers
Downtime is just bad engineering
Have a reverse migration strategy…
The only way to fly supersonic fighters
Use safe defaults
Fail safely
Use a canary
Or stay put and never break anything
Or anything in between
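"Fail safely" can be sketched in the same adapter style used throughout: try the new backend first and answer from the old one if it returns an error. `FailSafeAdapter`, `primary`, and `fallback` are hypothetical names, and this is only one of many ways to do it.

```javascript
// Hypothetical wrapper: reads go to the primary (new) backend and
// fall back to the old one whenever the primary reports an error.
function FailSafeAdapter(primary, fallback) {
  // self-reference
  var self = this;

  // attributes
  self.failures = 0;

  self.get = function(key, callback) {
    primary.get(key, function(error, result) {
      if (!error) return callback(null, result);
      // fail safely: count the failure and answer from the old backend
      self.failures += 1;
      return fallback.get(key, callback);
    });
  };
}
```

The `failures` counter is there so a canary server running this wrapper can be watched before the rest of the fleet follows.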