The Fluid Architecture
Your Host Tonight
Image source: my 6 yo daughter
Alex Fernández
Senior Developer @ MediaSmart Mobile
What We Will Cover
How to Flow
Flow with Requirements
Flow with Operational Constraints
Some Migration Strategies
Don't Stop Flowing
How to Flow
Turbulent Flow is Irreversible
It can be fun, but we don't want that kind of fun
Laminar Flow is Reversible
And we like when things flow smoothly!
Migrations are hard?
Thermo to the rescue!
Reversible processes are optimal:
- no turbulence,
- minimal entropy,
- less complexity,
- reduced headaches!
Change without pain
Go from A to B in a reversible way
(mostly)
Find your cruise velocity
Prepare a reversal strategy
Flow with Requirements
Circumstances Change
And you should adapt to them
Modern Architecture
Or so we thought
Slightly More modern architecture
Spot the seven differences
Fashion in Architecture
80s: minicomputers & terminals
90s: client-server
00s: three tiers
10s: NoSQL
The Perfect Architecture
Does not exist
Flow With Constraints
MediaSmart Mobile
Serve mobile ads
Performance and branding campaigns
6 MM bid offers / day
15M+ impressions / day
50+ servers
30+ countries
700M+ profiles
Guilty!
We help pay for your entertainment
Flow with capacity planning
From 4 to 175+ krps in 2.5 years
Flow for Operational Stability
From 38 to 112 krps in one day
Flow to Lower Costs
How Fast Can You Migrate…
… to a new cloud provider?
… to a new hosting company?
… to your own datacenter?
It matters because costs escalate quickly
Database migrations
are painful efforts
but shouldn't be!
How to Migrate Your Database
Build a compatibility layer
Avoid downtime if at all possible
Treat access and data separately
Have a reverse migration strategy…
… but try not to use it
Compatibility Layer
Adapter pattern (remember those?)
Reduced feature set
Don't use new features
Fake missing features
Adapter
Redis to Memcached driver:
exports.RedisAdapter = function(name, address) {
  // self-reference
  var self = this;
  // attributes
  var client = driver.getClient(address);
  // runCommand(), parse() and getExpiration() are helpers not shown on the slide
  self.get = function(key, callback) {
    runCommand('get', key, function(error, result) {
      if (error) return callback('Could not get ' + key + ':' + error);
      return callback(null, parse(key, result));
    });
  };
  self.set = function(key, value, expiration, callback) {
    if (expiration) {
      return runCommand('set', key, JSON.stringify(value), 'EX', getExpiration(expiration), callback);
    }
    return runCommand('set', key, JSON.stringify(value), callback);
  };
};
Adapter in Use
var MemcachedAdapter = require('./memcached.js').MemcachedAdapter;
var RedisAdapter = require('./redis.js').RedisAdapter;
var settings = require('./settings.js');
var db = {
  main: getAdapter('main', settings.MAIN_DB_ADDRESS),
};
db.main.get('hi', function(error, result) {
  // handle the error or use the result
});
// choose the adapter from the address scheme
function getAdapter(name, address) {
  if (address.indexOf('redis:') === 0) {
    return new RedisAdapter(name, address);
  } else {
    return new MemcachedAdapter(name, address);
  }
}
Each database configured to point at Redis or Memcached
Case Studies
Surprisingly hard to find
Warning: may not apply to your situation
Migrate and migrate again
- Couchbase
- Memcached
- Redis
- DynamoDB
- PostgreSQL
- RedShift
Different systems have different trade-offs
and show different failure modes
Migration Strategies
Strategies or Patterns?
Battle tested strategies
Not an exhaustive collection
Just some ideas for migrations
Several options for the same requirements
Different reversibility behavior
Server: Stop and Migrate
- Stop the system
- Make a cold copy
- Point clients to new database
- Start again
Server: Stop and Migrate
settings.js:
module.exports = {
  redisAddress: 'redis.mydomain.com',
};
db.js:
var RedisAdapter = require('./redis.js').RedisAdapter;
var settings = require('./settings.js');
exports.db = {
  current: new RedisAdapter('current', settings.redisAddress),
};
user:
var db = require('./db.js').db;
db.current.get(key, function(error, result) {
  ...
});
Server: Stop and Migrate
Most basic migration
Requires downtime
Reversal:
- Just point your settings to the old address
- Stop again and migrate back
Not really reversible
Case Study: MediaSmart VPC
Migration to Amazon virtual private cloud
Tried on 2015-03-03
Reversed on 2015-03-05
Due to an unrelated failure (!)
Tried again on 2015-03-11
Migrated EU on Friday the 13th
because who's afraid of superstitions?
Server: Read-only Version
- Switch to read-only
- Make a hot copy
- Change to new database
- Switch back to read and write
Server: Read-only Version
Read-only is not always admissible
A hot copy takes longer than a cold one
Reverse migration: switch to read-only again, migrate
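A minimal sketch of the read-only switch, assuming an adapter with the same get/set interface as the adapters above. ReadOnlyGuard and MemoryAdapter are hypothetical names; the in-memory adapter only stands in for a real database driver.

```javascript
// Hypothetical in-memory adapter standing in for a real database driver.
function MemoryAdapter() {
  var self = this;
  self.data = {};
  self.get = function(key, callback) {
    return callback(null, self.data[key]);
  };
  self.set = function(key, value, expiration, callback) {
    self.data[key] = value;
    return callback(null);
  };
}

// Hypothetical wrapper: reads always pass, writes are rejected while readOnly is set.
function ReadOnlyGuard(adapter) {
  var self = this;
  self.readOnly = false;
  self.get = function(key, callback) {
    return adapter.get(key, callback);
  };
  self.set = function(key, value, expiration, callback) {
    if (self.readOnly) {
      return callback('Database is read-only during migration');
    }
    return adapter.set(key, value, expiration, callback);
  };
}

var guarded = new ReadOnlyGuard(new MemoryAdapter());
guarded.readOnly = true;   // step 1: switch to read-only
// steps 2 and 3: make the hot copy, change to the new database
guarded.readOnly = false;  // step 4: switch back to read and write
```

Clients must be able to cope with a rejected write for as long as the hot copy takes, which is why read-only is not always admissible.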
Server: synchronize
- Make a hot copy
- Synchronize all writes
- Switch to new copy when ready
Server: synchronize
Depends on server mechanism
No downtime: cool!
Reversal strategy: synchronize back
Full synchronization is hard!
Case Study: MediaSmart Daystats
Migration from Redis to Amazon's Redshift
Set up daily migration of customer stats
Query data from one or the other
depending on the date range queried
Trivial reversal
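One way to route a query by date range, as a sketch only. It assumes that days up to a known boundary have already been migrated and that more recent days are still in the original store; splitRange, nextDay and the lastMigrated parameter are hypothetical names, not taken from the MediaSmart code.

```javascript
// Return the day after an ISO date ('YYYY-MM-DD'), computed in UTC.
function nextDay(isoDate) {
  var date = new Date(isoDate + 'T00:00:00Z');
  date.setUTCDate(date.getUTCDate() + 1);
  return date.toISOString().substring(0, 10);
}

// Split an inclusive date range at the migration boundary.
// `migrated` is the part to query in the new store, `recent` the part
// still held by the old one; either can be null.
function splitRange(from, to, lastMigrated) {
  if (to <= lastMigrated) return {migrated: [from, to], recent: null};
  if (from > lastMigrated) return {migrated: null, recent: [from, to]};
  return {
    migrated: [from, lastMigrated],
    recent: [nextDay(lastMigrated), to],
  };
}

var parts = splitRange('2015-03-01', '2015-03-10', '2015-03-05');
// parts.migrated covers 03-01 to 03-05, parts.recent covers 03-06 to 03-10
```

ISO dates compare correctly as plain strings, so no date parsing is needed for the comparisons themselves.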
Server: Double Copy
- Make a hot copy
- Switch to new database
- Make another hot copy
Server: Double Copy
Some data loss is admissible
A timestamp is very valuable
Some data loss is inevitable
Reversal: prepare a reverse copy
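A sketch of why the timestamp is valuable: the second hot copy only needs the records modified after the first copy started. Plain objects stand in for the two databases, and the record shape with an `updated` field is an assumption for illustration.

```javascript
// Assumed record shape: { value: ..., updated: <timestamp> }.
// Copy records modified at or after `since`; return how many were copied.
function copyModifiedSince(source, target, since) {
  var copied = 0;
  Object.keys(source).forEach(function(key) {
    var record = source[key];
    if (record.updated < since) return;
    var existing = target[key];
    // never overwrite a record that is as new or newer in the target
    if (existing && existing.updated >= record.updated) return;
    target[key] = record;
    copied += 1;
  });
  return copied;
}

var oldDb = {a: {value: 1, updated: 100}};
var newDb = {};
var firstCopyStart = 150;
// first hot copy: everything
copyModifiedSince(oldDb, newDb, 0);
// a write reaches the old database while the first copy is running
oldDb.b = {value: 2, updated: 160};
// clients switch to newDb here; the second copy picks up the stragglers
copyModifiedSince(oldDb, newDb, firstCopyStart);
```

Anything written to the old database after the second copy is lost, which is the inevitable data loss mentioned above.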
Case Study: MediaSmart Profiles
Migration of ~120M profiles on 2015-02-16
Moved data from Redis to Amazon's DynamoDB
Lower cost, reasonable memory footprint
Some data loss is admissible
Trivial reversal: use old profiles
Client: Decorator
Pass all queries through an intermediary
Use any condition to select backend
Can be used to balance load
Client: Decorator
Just a clever adapter:
var Memcached = require('memcached');
var RedisAdapter = require('./redis.js').RedisAdapter;
exports.CleverAdapter = function(name, address) {
  // self-reference
  var self = this;
  // attributes
  var oldAdapter = new Memcached(address + ':11211');
  var newAdapter = new RedisAdapter(name, address);
  self.get = function(key, callback) {
    // badWeather() stands for any condition used to select the backend
    if (badWeather()) {
      return oldAdapter.get(key, callback);
    }
    return newAdapter.get(key, callback);
  };
};
Downside: a few µs more per query
Client: Dual Lookup
- Read from one database
- If not present, try to read from second database
Client: Dual Lookup
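db.js:
var RedisAdapter = require('./redis.js').RedisAdapter;
var settings = require('./settings.js');
var db = exports.db = {
  v1: new RedisAdapter('v1', settings.oldRedis),
  v2: new RedisAdapter('v2', settings.newRedis),
};
function get(key, callback) {
  db.v1.get(key, function(error, result) {
    // return here, or the callback would be invoked a second time by v2
    if (error || result) return callback(error, result);
    return db.v2.get(key, callback);
  });
}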
Client: Dual Lookup
Migrate your servers at your leisure
Reversible by design
Now you're talking!
Bad latency issues
from old + new databases
Client: Dual Write
Similar to dual lookup
Latency may not be important
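A possible dual write, sketched with an in-memory stand-in for the real adapters. dualSet and MemoryAdapter are hypothetical names; the write is reported as finished only when both databases have answered, which is where the extra latency comes from.

```javascript
// Hypothetical in-memory adapter standing in for a real database driver.
function MemoryAdapter() {
  var self = this;
  self.data = {};
  self.get = function(key, callback) {
    return callback(null, self.data[key]);
  };
  self.set = function(key, value, expiration, callback) {
    self.data[key] = value;
    return callback(null);
  };
}

// Write to both databases; call back exactly once, with the first error if any.
function dualSet(oldDb, newDb, key, value, expiration, callback) {
  var pending = 2;
  var firstError = null;
  function done(error) {
    if (error && !firstError) firstError = error;
    pending -= 1;
    if (pending === 0) return callback(firstError);
  }
  oldDb.set(key, value, expiration, done);
  newDb.set(key, value, expiration, done);
}
```

Since both databases receive every write, reversal is a matter of reading from the old one again.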
Client: Timed Rollover
- Date < cutoff: go to the old database
- Date > cutoff: go to the new database
- May need some kind of copy
Client: Timed Rollover
client:
var CUTOFF_DATE = '2015-05-13';
function get(key, callback) {
  // assumes keys end in '#' plus an ISO date, e.g. 'stats#2015-05-14'
  var date = key.substring(key.indexOf('#') + 1);
  if (date > CUTOFF_DATE) {
    return db.v2.get(key, callback);
  }
  return db.v1.get(key, callback);
}
function set(key, value, expiration, callback) {
  // compare only the date part, so reads and writes agree on the cutoff day
  if (new Date().toISOString().substring(0, 10) > CUTOFF_DATE) {
    return db.v2.set(key, value, expiration, callback);
  }
  return db.v1.set(key, value, expiration, callback);
}
Client: Timed Rollover
Useful for sequential data
E.g. statistics, counters
No manual intervention is required
Reversal strategy:
- change time limit,
- possibly migrate data,
- redeploy
Case Study: MediaSmart Mobile
Adding aggregates to daily stats
Improved common queries ~20x
If date > 2015-03-25: use aggregates
If date < 2015-03-25: do not use aggregates
Trivial reversal: change setting
Client: In-Place Conversion
- Read value
- If in old format, convert and write
Client: In-Place Conversion
Degenerate case: old database == new database
Can change driver, structure, format
Not concurrent
Reversal strategy:
- Read value
- If in new format, convert and write
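The read, check and convert steps can be sketched as follows. The two formats are invented for illustration (old records are bare strings, new records are objects tagged with a version), and MemoryAdapter is a hypothetical in-memory stand-in for the real driver.

```javascript
// Hypothetical in-memory adapter standing in for a real database driver.
function MemoryAdapter() {
  var self = this;
  self.data = {};
  self.get = function(key, callback) {
    return callback(null, self.data[key]);
  };
  self.set = function(key, value, expiration, callback) {
    self.data[key] = value;
    return callback(null);
  };
}

// Assumed formats: old = bare string, new = {version: 2, name: ...}.
function isOldFormat(stored) {
  return typeof stored === 'string';
}
function convert(stored) {
  return {version: 2, name: stored};
}

// Read a value; if it is still in the old format, convert it and write it back.
function getConverting(db, key, callback) {
  db.get(key, function(error, stored) {
    if (error) return callback(error);
    if (!isOldFormat(stored)) return callback(null, stored);
    var converted = convert(stored);
    db.set(key, converted, 0, function(error) {
      if (error) return callback(error);
      return callback(null, converted);
    });
  });
}
```

The read and the write are two separate operations, so two clients converting the same key can interleave: this is the "not concurrent" caveat above.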
Broker: Proxied Access
- Read from or write to a proxy
- Proxy decides where to access each time
Broker: Proxied Access
Can be used with other migration strategies
Typical case: access a RESTful API
Another piece to maintain
Increased latency
Use with care
Case Study: Instagram
Migrated from AWS to Facebook datacenters
Year-long effort, from 2013-03 to ~2014-03
Had to go through AWS VPC first
Neti — a dynamic iptables manipulation daemon in Python
Three weeks into VPC, two weeks to FB
Bare minimum approach (!)
Broker: Queued Write
- Read from first database
- Write to queue
- Write to both databases
Broker: Queued Write
Can be used with other migration strategies
Again, typically a RESTful API
Avoids high write latencies
Helps ease migrations
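A toy version of a queued write, with an in-memory array as the queue. QueuedWriter and MemoryAdapter are hypothetical names; a real broker would presumably use a persistent queue and retry failed writes, which this sketch does not attempt.

```javascript
// Hypothetical in-memory adapter standing in for a real database driver.
function MemoryAdapter() {
  var self = this;
  self.data = {};
  self.get = function(key, callback) {
    return callback(null, self.data[key]);
  };
  self.set = function(key, value, expiration, callback) {
    self.data[key] = value;
    return callback(null);
  };
}

// Hypothetical broker: reads go to the first database,
// writes are queued and later applied to both.
function QueuedWriter(first, second) {
  var self = this;
  var queue = [];
  self.get = function(key, callback) {
    return first.get(key, callback);
  };
  // answers at once: the caller does not wait for either database
  self.set = function(key, value, expiration, callback) {
    queue.push({key: key, value: value, expiration: expiration});
    return callback(null);
  };
  // apply queued writes in order; stops at the first error
  self.drain = function(callback) {
    if (!queue.length) return callback(null);
    var item = queue.shift();
    first.set(item.key, item.value, item.expiration, function(error) {
      if (error) return callback(error);
      second.set(item.key, item.value, item.expiration, function(error) {
        if (error) return callback(error);
        return self.drain(callback);
      });
    });
  };
}
```

The caller pays only for the push to the queue, at the price of reads that may briefly lag behind writes.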
Don't Stop Flowing
Strategies work for other things
Adapt them to your situation
How to Migrate Anything
Build one or more compatibility layers
Downtime is just bad engineering
Have a reverse migration strategy…
There Will Be Mistakes
Get over it
Unstable Equilibrium
The only way to fly supersonic fighters
Living with Instabilities
Use safe defaults
Fail safely
Use a canary
Always monitor
Canary Example
Statistics are processed in a queue
Queue writes a canary
that expires in a few minutes
If no canary, stop bidding
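The canary logic, sketched in memory. In production the canary would more likely be a key stored with an expiration; here Canary, shouldBid and the five-minute lifetime are assumptions, and the current time is passed in explicitly so the behavior can be checked without waiting.

```javascript
// Assumption: "a few minutes" is taken as five.
var CANARY_TTL_MS = 5 * 60 * 1000;

// Hypothetical canary: the stats queue refreshes it, the bidder checks it.
function Canary() {
  var self = this;
  var expires = 0;
  // called by the queue every time it processes statistics
  self.refresh = function(now) {
    expires = now + CANARY_TTL_MS;
  };
  // true only while the last refresh is recent enough
  self.isAlive = function(now) {
    return now < expires;
  };
}

// Safe default: without a live canary, stop bidding.
function shouldBid(canary, now) {
  return canary.isAlive(now);
}

var canary = new Canary();
canary.refresh(Date.now());
shouldBid(canary, Date.now()); // true while the queue keeps refreshing
```

If the queue stalls, nobody has to detect the failure: the canary simply expires and bidding stops on its own.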
Move Fast, Break Things
Or stay put and never break anything
Or anything in between
Thanks!
The Fluid Architecture
By Alex Fernández
Talk for MediterráneaJS Barcelona, 2015-06-23.