An ode to 17 databases in 33 minutes
Featuring a variety of fairly mangled and variously mixed metaphors.
And animated gifs.
And probably star wars.
And some D&D. For no real reason.
Because I am the walking embodiment of a stereotype.
Dramatis Personae
- BigTable
- Cassandra
- Couch
- DynamoDb
- ElasticSearch
- Hyperdex
- Hadoop
- Datomic
- PostgreSQL
- Memcache
- MongoDb
- MySQL
- Neo4j
- Redis
- RethinkDB
- Riak
- UnQLite
Hi
I'm Toby
@tobyhede
@ninefold
High Performance Rails Hosting
In a cloud. And stuff.
Straya
Actual FACTS
Technically this slide is about star wars
Distributed Systems are Hard
Databases are Fun
Distributed Systems are Fun
Databases are Hard
NoSQL
NewSQL
PostSQL
Post-Rock Ambient Punk SQL
ACID
Atomic
Consistent
Isolated
Durable
CAPtain 'Murica
CAP Theorem
Brewer's Conjecture
CAP
Consistency
Availability
Partition Tolerance
Conjecture
and
Formal Proof
Almost, but not quite, entirely unalike
Theorem 1: It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties:
- Availability;
- Atomic consistency in all fair executions (including those in which messages are lost).
Seth Gilbert & Nancy Lynch, Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services, ACM SIGACT News, Volume 33, Issue 2 (2002), pp. 51-59
BASE
Basically Available
Soft State
Eventually Consistent
Everything is Distributed
Almost, but not quite, entirely untrue
P ? C : A
(when Partitioned, choose Consistency or Availability)
Cluster of Nodes
SCIENCE!
OMG
Note to self:
Probably
don't do
the next bit
Links
What's the deal with Owl Bears?
PostgreSQL
MySQL for hipsters
jk, it's totes awesome
Name | PostgreSQL
---|---
Class | Relational
Alignment | Consistent
API | SQL over TCP/IP
Feats | Hstore, JSON & PL/V8, Foreign Data Wrappers
Saving Throws |
Hstore
CREATE TABLE characters (
id serial PRIMARY KEY,
name TEXT,
attributes hstore
);
INSERT INTO characters (name, attributes) VALUES (
  'postgresql',
  'class => "relational", skills => "joins", api => "SQL over TCP/IP"'
);
activerecord-postgres-hstore
class Programmer < ActiveRecord::Base
  serialize :languages, ActiveRecord::Coders::Hstore
end

# Contains key 'ruby'
Programmer.where("languages ? 'ruby'")

# Key 'ruby' has value '2.0'
Programmer.where("languages -> 'ruby' = '2.0'")

# And slightly different for no extra charge
Programmer.where("languages @> 'ruby=>2.0'")

# And likes
Programmer.where("languages -> 'ruby' LIKE '%1.9%'")
JSON
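Same trick with the json column type; a minimal sketch (the ->> operator arrives in PostgreSQL 9.3, and the events table here is made up):

# Assumes: CREATE TABLE events (id serial PRIMARY KEY, payload json);
class Event < ActiveRecord::Base
end

Event.create(payload: { type: "page_view", user: "toby" })

# ->> pulls a field out of the JSON document as text
Event.where("payload ->> 'type' = ?", "page_view")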
MySQL
The same as PostgreSQL
but controlled by an evil overlord
“One thing is for certain: there is no stopping them; the enterprise will soon be here. And I for one welcome our new oracular overlords. I’d like to remind them that as a trusted software engineer, I can be helpful in rounding up others to toil in their enterprise big data caves.”
Name | MySQL
---|---
Class | Relational
Alignment | Consistent
API | SQL over TCP/IP*
Feats | Storage Engines, Everywhere!, HandlerSocket
Saving Throws |
See also
MariaDB
Percona Server (with XtraDB)
TokuDB
Dynamo/Db
The paper that launched ~~a thousand ships~~ at least 3 databases
Name | DynamoDb
---|---
Class | Key/Value
Alignment | Available, "Strongly" Consistent
API | JSON over HTTP
Feats | Predictable Latency, Pricing, Scale
Saving Throws |
Links
Riak
engineering ftw
Artist's impression
Actual results may vary
Name | Riak
---|---
Class | Key/Value
Alignment | Available, Tunable Consistency
API | REST / Protocol Buffers / Erlang
Feats | Clustering, Tunable, Cloud Storage
Saving Throws |
Consistent Hashing
and
Virtual Nodes
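Not Riak's actual implementation, just a toy ring to show the idea: every physical node claims several vnode points on a circle, and a key lives at the first vnode clockwise from its hash.

require 'digest/sha1'

class Ring
  def initialize(nodes, vnodes: 8)
    # Each node gets several points on the ring, sorted by position.
    @ring = nodes.flat_map { |node|
      (0...vnodes).map { |i| [hash_of("#{node}-#{i}"), node] }
    }.sort_by(&:first)
  end

  # Walk clockwise from the key's hash to the first vnode (wrapping).
  def node_for(key)
    h = hash_of(key)
    point = @ring.find { |(p, _)| p >= h } || @ring.first
    point.last
  end

  private

  def hash_of(s)
    Digest::SHA1.hexdigest(s).to_i(16)
  end
end

ring = Ring.new(%w[riak1 riak2 riak3])
ring.node_for("emails/37458abc752f8413e") # => "riak2", say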
Simple API
# Read an Object
GET /buckets/{bucket}/keys/{key}
# Store an Object with Generated Key
POST /buckets/{bucket}/keys
# Store an Object with Specified Key
PUT /buckets/{bucket}/keys/{key}
# Counters (CRDT)
POST /buckets/{bucket}/counters/{key}
Ripple
require 'ripple'
# Documents are stored as JSON objects in Riak but have rich
# semantics, including validations and associations.
class Email
include Ripple::Document
property :from, String, :presence => true
property :to, String, :presence => true
property :sent, Time, :default => proc { Time.now }
property :body, String
end
email = Email.find("37458abc752f8413e") # GET /riak/emails/37458abc752f8413e
email.from = "toby@ninefold.com"
email.save # PUT /riak/emails/37458abc752f8413e
Vector Clocks
How do they even?
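Roughly like this (a toy, not Riak's encoding): each actor bumps only its own counter, a clock that is >= on every counter descends from the other, and anything else is a pair of siblings: a conflict the client gets to resolve.

class VectorClock
  attr_reader :counters

  def initialize(counters = {})
    @counters = counters
  end

  # Each actor bumps only its own counter.
  def increment(actor)
    VectorClock.new(counters.merge(actor => counters.fetch(actor, 0) + 1))
  end

  # self descends from other if it has seen everything other has.
  def descends?(other)
    other.counters.all? { |actor, n| counters.fetch(actor, 0) >= n }
  end

  # Neither descends from the other: siblings, a conflict to resolve.
  def concurrent?(other)
    !descends?(other) && !other.descends?(self)
  end
end

a = VectorClock.new.increment(:alice)
b = a.increment(:bob)
c = a.increment(:carol)
b.descends?(a)   # => true
b.concurrent?(c) # => true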
Links
BigTable
something something dark side
Name | BigTable
---|---
Class | Column
Alignment | Available
API | Proprietary
Feats | App Engine, Massive
Saving Throws |
Largest BigTable Cluster
- 100s of PB of data
- sustained: 30M ops/sec
- 100+ GB/s I/O
Spanner is Google's scalable, multi-version, globally-distributed, and synchronously-replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. This paper describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a novel time API that exposes clock uncertainty. This API and its implementation are critical to supporting external consistency and a variety of powerful features: non-blocking reads in the past, lock-free read-only transactions, and atomic schema changes, across all of Spanner.
GPS
and an
Atomic Clock
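TrueTime in one idea (a sketch, not Google's API): now() returns an interval that definitely contains real time, and a commit waits until its timestamp is safely in the past everywhere.

# TrueTime-style interval: real time is inside [earliest, latest].
TTInterval = Struct.new(:earliest, :latest)

# epsilon is the clock uncertainty; the paper reports single-digit
# milliseconds, courtesy of GPS and atomic clocks. 7ms is illustrative.
def tt_now(epsilon = 0.007)
  t = Time.now
  TTInterval.new(t - epsilon, t + epsilon)
end

# Commit wait: hold the commit at timestamp ts until every clock in
# the system must agree ts is in the past. That's external consistency.
def commit_wait(ts)
  sleep(0.001) until tt_now.earliest > ts
end

commit_wait(tt_now.latest) # blocks for roughly 2 * epsilon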
Cassandra
more columns than Greek architecture
eventually, it's awesome
Eventually Persistent
Name | Cassandra
---|---
Class | Column
Alignment | Available, Tunable Consistency
API | CQL / Thrift
Feats | Distributed, Write Performance, CQL
Saving Throws |
Column
The smallest increment of data, which contains a name, a value and a timestamp.
Row
Columns that have the same primary key.
Primary Key
The partition key: one or more columns that uniquely identify a row in a table.
Table Column Family
A collection of ordered (by name) columns fetched by row. A row consists of columns and has a primary key. The first part of the key is a column name; subsequent parts of a compound key are other column names that define the order of columns in the table.
CREATE TABLE playlists (
id uuid,
song_order int,
song_id uuid,
title text,
album text,
artist text,
PRIMARY KEY (id, song_order )
);
INSERT INTO playlists (id, song_order, title, artist, album)
VALUES (8a172618-b121-4136-bb10-f665cfc469eb, 1, 'Elevator of the Machine', 'Mount Fuji Doomjazz Corporation', 'Egor');
INSERT INTO playlists (id, song_order, title, artist, album)
VALUES (8a172618-b121-4136-bb10-f665cfc469eb, 2, 'Knock by the Stairs', 'Mount Fuji Doomjazz Corporation', 'Egor');
INSERT INTO playlists (id, song_order, title, artist, album)
VALUES (8a172618-b121-4136-bb10-f665cfc469eb, 3, 'Oktober', 'Bersarin Quartet', 'Bersarin Quartet');
Links
Memcache
The simplest thing that can possibly work
I'm a cache, not a database!
Name | Memcache
---|---
Class | Key/Value
Alignment | Cache
API | "Memcache" over TCP
Feats | Ubiquity, Simplicity, Workableness
Saving Throws | +1 Memcache as Standard
Simple
- set
- add
- replace
- append
- prepend
- cas
- get
- gets
- delete
- incr/decr
Client Server
Client controls hash
Server handles data
No cluster state
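In Ruby-land that's the Dalli gem: the client hashes each key to pick a server, and the servers never talk to each other. A sketch (the hostnames are placeholders):

require 'dalli'

# Two independent memcached servers; Dalli decides which one owns
# each key by hashing the key on the client side.
cache = Dalli::Client.new(["1.cache.com:11211", "2.cache.com:11211"])

cache.set("character:postgresql", "lawful consistent")
cache.get("character:postgresql") # => "lawful consistent"
cache.incr("page_views", 1, nil, 0) # atomic counter, starts at 0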
Rails Fragment Cache
config.action_controller.perform_caching = true
config.cache_store = :mem_cache_store, "1.cache.com", "2.cache.com"
<% cache(cache_key_for_products) do %>
All available products:
<% Product.all.each do |p| %>
<% cache(p) do %>
<%= link_to p.name, product_url(p) %>
<% end %>
<% end %>
<% end %>
Rails with IdentityCache
class Product < ActiveRecord::Base
include IdentityCache
has_many :images
cache_has_many :images, :embed => true
end
@product = Product.fetch(id)
@images = @product.fetch_images
Redis
the queue you have when you willfully refuse to admit you need a queue
Name | Redis
---|---
Class | Key/Value
Alignment | Consistent, mostly
API | Plain-Text over TCP
Feats | Versatile, Fast, Queueing
Saving Throws |
Data Structures
# strings
SET akey "Hello"
GET akey
# hashes
HSET anhash greeting "Hello"
HGET anhash greeting
# lists
RPUSH alist "hello"
RPUSH alist "world"
LRANGE alist 0 -1
# sets
SADD aset "hello"
SMEMBERS aset

# sorted sets
ZADD azset 1 "hello"
ZRANGE azset 0 -1 WITHSCORES

# pub/sub
SUBSCRIBE achannel
PUBLISH achannel "hello"
Optimistic Locking
MULTI
INCR foo
INCR bar
EXEC
MULTI, EXEC, DISCARD and WATCH
Check and Set
WATCH mykey
val = GET mykey
val = val + 1
MULTI
SET mykey $val
EXEC
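The same check-and-set from Ruby with the redis-rb gem; WATCH turns the EXEC into a no-op if somebody else touched the key first:

require 'redis'

redis = Redis.new

redis.watch("mykey") do
  val = redis.get("mykey").to_i + 1
  redis.multi do |tx|
    tx.set("mykey", val)
  end # EXEC returns nil if mykey changed since WATCH: retry
end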
Resque
class ImageConversionJob
def work
# convert some kind of image here
end
end
resque = Resque.new
resque << ImageConversionJob.new
Gratuitous Aside
Lua!
OpenResty!
Lua Scripting
EVAL "local result = redis.call('ZREVRANGE', 'myset', 0, 10, 'WITHSCORES');
redis.call('ZREMRANGEBYRANK','myset',0, 9);
return result;" 0
I've just reinvented stored procedures!
Also
Maybe use a queue
RabbitMQ
Kafka
Neo4j
All your graphs are belong to us
Name | Neo4j
---|---
Class | Graph
Alignment | Consistent
API | REST, Java
Feats | ACID, O/R Impedance, Cypher
Saving Throws |
Almost, but not quite, entirely unlike an RDBMS
Optimized for highly connected data
(rather than aggregated data)
MATCH (m:Movie {title:"Star Wars"})<-[:ACTS_IN]-(actor)
RETURN actor;
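From Ruby you can run the same Cypher over the REST API; a sketch using the Neography gem (server URL assumed):

require 'neography'

neo = Neography::Rest.new("http://localhost:7474")
neo.execute_query(
  'MATCH (m:Movie {title: "Star Wars"})<-[:ACTS_IN]-(actor) RETURN actor'
)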
Links
(Totes fun)
Couch
Awesome for data you don't want to easily get at ever again, but want replicated everywhere
Name | CouchDb
---|---
Class | Document
Alignment | Available
API | REST/JSON
Feats | MVCC, Master-Master, WebApps
Saving Throws |
MongoDB
Webscale
If you think of it as Redis for JSON, you're probably ok
Name | MongoDB
---|---
Class | Document (BSON)
Alignment | Consistent-ish
API | BSON JS Query Language
Feats | Minimal Impedance, WebScale, Sharding
Saving Throws |
To recap: MongoDB is neither AP nor CP. The defaults can cause significant loss of acknowledged writes. The strongest consistency offered has bugs which cause false acknowledgements, and even if they're fixed, doesn't prevent false failures.
Consistency and fault-tolerance are hard. Genuinely hard. The devil is always in the details, and it takes a concentrated mind to navigate through these waters correctly.
Mongoid
class Artist
include Mongoid::Document
field :name, type: String
embeds_many :instruments
end
class Instrument
include Mongoid::Document
field :name, type: String
embedded_in :artist
end
syd = Artist.where(name: "Syd Vicious").between(age: 18..25).first
syd.instruments.create(name: "Bass")
syd.with(database: "bands", session: "backup").save!
Totes from the docs
RethinkDB
MongoDB 2.0
Name | RethinkDB
---|---
Class | Document
Alignment | Consistent
API | JSON
Feats | Minimal Impedance, Admin, Sharding
Saving Throws |
The commercial fringe
Couchbase
Aerospike
FoundationDB
MemSQL
Hyperdex
You had me at HyperSpace Hashing
Name | Hyperdex
---|---
Class | Key/Value
Alignment | Consistent
API | C
Feats | HyperSpace, Transactions, Scale
Saving Throws |
Hyperspace hashing deterministically maps objects to servers to enable efficient object insertion and retrieval. But it differs from these techniques because it takes into account the secondary attributes of an object when determining the mapping for an object. Specifically, it maps objects to coordinates in a multi-dimensional Euclidean space - a hyperspace - which has axes defined by the objects' attributes. Each server in the system is mapped onto a region of the same hyperspace, and owns the objects that fall within its region. Clients use this mapping to deterministically insert, remove, and search for objects.
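A toy illustration of the mapping (nothing like HyperDex's real internals): hash each attribute onto its own axis and the object lands at a point; a search that fixes some attributes only has to visit servers whose regions intersect that hyperplane.

require 'digest/sha1'

# Hash one attribute onto an axis of (say) 16 cells.
def axis_hash(value, cells = 16)
  Digest::SHA1.hexdigest(value.to_s).to_i(16) % cells
end

# An object's coordinates in the hyperspace: one axis per attribute.
def coordinate(object, axes)
  axes.map { |attr| axis_hash(object[attr]) }
end

axes = [:first, :last, :phone]
coordinate({ first: "John", last: "Smith", phone: "555-1234" }, axes)
# => a point like [11, 3, 7]; the server owning that region stores it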
require "hyperdex"
client = HyperDex::Client::Client.new("127.0.0.1", 1982)
client.put(:space, 'hello', {:v => "Hello World!"})
puts client.get(:space, 'hello')
client.atomic_add(:space, 'votes', {upvotes: 1})
current_balance = client.get('accounts', 'account-number')['balance']
raise InsufficientFunds if current_balance < amount
balance = current_balance - amount
# cond_put(spacename, key, predicates, attributes)
success = client.cond_put('accounts', 'account-number', {balance: current_balance}, {balance: balance})
HBase
Hadoop
for the data problem you wish you had
Name | HBase
---|---
Class | Column
Alignment | Consistent
API | ******
Feats | BIG, REALLY BIG
Saving Throws | -1 Gartner Magic Quadrant
- Hadoop
- Hadoop Distributed File System
- Zookeeper
- Hive
- HiveQL
- Mahout
- Map/Reduce
Small-to-medium Data
In 2010 Facebook claimed that they had the largest Hadoop cluster in the world with 21 PB of storage.
On July 27, 2011 they announced the data had grown to 30 PB.
On June 13, 2012 they announced the data had grown to 100 PB.
On November 8, 2012 they announced the warehouse grows by roughly half a PB per day.
Skynet begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug.
Berkeley Data Analytics Stack
But if you wanna appear really cool and underground then I reckon the next big thing is the Berkeley Data Analytics Stack. They are making really awesome shit.
UnQLite
NoSQLite
Name | UnQLite
---|---
Class | Document & Key/Value
Alignment | N/A
API | C/C++
Feats | Embedded, Single File, Jx9
Saving Throws |
Elasticsearch
Oh why can't I make you my datastore?
Name | ElasticSearch
---|---
Class | Document
Alignment | Consistent
API | RESTful JSON
Feats | Facets, River, Percolators
Saving Throws |
(Diagram: clients [C] behind a load balancer [LB] feed AMQP queues; log shippers [LS] consume them into Elasticsearch [ES])
Tire
s = Tire.search(search_index) do
filter :range, {
"@timestamp" => {
"from" => from.iso8601,
"to" => to.iso8601
}
}
query {
string "@tags:unicorn"
}
end
s.results
{
"filter": {
"and": [
{
"range": {
"@timestamp": {
"from": "2013-09-01T00:00:00+00:00",
"to": "2013-09-26T23:59:59+00:00"
}
}
},
{
"terms": {
"@tags": ["error", "migration"],
"execution": "and"
}
},
{
"not": {
"filter": {
"term": { "@source": "client.log" }
}
}
}
]
},
"sort": [
{
"@timestamp": "desc"
}
]
}
curl -XPOST 'http://localhost:9200/events/event' -d '{
"event": {
"user" : "toby",
"post_date" : "2014-02-18T20:00:00",
"event" : "page_view"
}
}'
GET _search
{
"query" : {
"term" : { "user" : "toby" }
},
"facets" : {
"event" : {
"terms" : {"field" : "event"},
"facet_filter" : {
"term" : { "user" : "toby" }
},
"global": true
}
}
}
{
"took": 8,
"timed_out": false,
"_shards": {
"total": 10,
"successful": 10,
"failed": 0
},
"facets": {
"event": {
"_type": "terms",
"missing": 0,
"total": 5,
"other": 0,
"terms": [
{
"term": "page_view",
"count": 5
}
]
}
}
}
Links
Nailed it
The End
@tobyhede
Come say hello
Bohconf Room