The XML Summer School
eXist-db - Core Developer (13 years!)
Native XML Database
Implemented in Java
Open Source: LGPL v2.1
RocksDB - Developer (4 years)
Key / Value Database
Implemented in C++ (and Java API)
Open Source: GPL v2 / Apache 2.0
Granite - Developer (4 years)
Polystore: XML, Key/Value (JSON, Markdown... DOM)
Implemented in C++ and Java
Will be Open Source: likely AGPL v3
General Electric - IDS (Integrated Data Store)
Possibly the first DBMS
IBM - IMS (Information Management System)
Developed for the Apollo moon mission (purchasing)
Programmer defined physical storage (Hash / Tree / etc.)
Determines the API you can use to query
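To make the last two points concrete, here is a minimal sketch, not taken from IDS or IMS, of how the chosen physical organisation dictates which queries the programmer can ask. The class names `HashStore` and `SortedStore`, the `load` helper, and the part numbers are all hypothetical; a sorted list stands in for a tree.

```python
import bisect


class HashStore:
    """Hash-organised records: supports point lookups only."""

    def __init__(self):
        self._buckets = {}

    def put(self, key, value):
        self._buckets[key] = value

    def get(self, key):
        return self._buckets.get(key)


class SortedStore:
    """Key-ordered records (a stand-in for a tree): point lookups and range scans."""

    def __init__(self):
        self._keys = []      # keys kept in sorted order
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)

    def range(self, low, high):
        """Return (key, value) pairs with low <= key <= high, in key order."""
        start = bisect.bisect_left(self._keys, low)
        end = bisect.bisect_right(self._keys, high)
        return [(k, self._values[k]) for k in self._keys[start:end]]


def load(store):
    # Hypothetical sample data: part number -> part name.
    for part_no, name in [(30, "valve"), (10, "bolt"), (20, "gasket")]:
        store.put(part_no, name)
    return store


hash_store = load(HashStore())
sorted_store = load(SortedStore())

print(hash_store.get(10))
print(sorted_store.range(10, 20))
```

An application written against `HashStore` has no `range` call available, so switching to a range query means changing the storage structure and rewriting the application code that talks to it.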
Ted Codd (IBM)
Avoid rewriting applications for every schema change
Need more abstraction
Logical vs. Physical
Let the database engine worry about physical storage
Let the user query their logical model
Query through a high-level language
The Relational Data Model is born
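The logical/physical split can be illustrated with a small sketch using Python's built-in `sqlite3` module. SQLite is used here only as a convenient relational engine; the `parts` table, its columns, and the index name are hypothetical. The same high-level query is run before and after a purely physical change (adding an index), and the application code is untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_no INTEGER, name TEXT, supplier TEXT)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [(10, "bolt", "Acme"), (20, "gasket", "Bolton"), (30, "valve", "Acme")],
)

# The user's query is written against the logical model only.
QUERY = "SELECT name FROM parts WHERE supplier = ? ORDER BY part_no"


def acme_parts():
    return [row[0] for row in conn.execute(QUERY, ("Acme",))]


before = acme_parts()

# A physical storage change: the engine may now use an index to answer the query.
conn.execute("CREATE INDEX idx_parts_supplier ON parts (supplier)")

after = acme_parts()

print(before, after)
```

The query text and its result stay the same; only the engine's access path is free to change.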
System R (Jim Gray - IBM)
INGRES (Michael Stonebraker - U.C. Berkeley)
Oracle (Larry Ellison)
Mostly Improvements to the Relational Model
SQL is the standard - ANSI, 1986
Further notable implementations:
Informix (1981 / SQL 1985)
IBM DB2 (1983)
Partnership created Microsoft SQL Server (1989)
Stonebraker - Post-Ingres
New: Object Database (1985) / Object-Relational hybrids
1994 - Berkeley shutters Postgres
Released as Open Source under MIT
Forked as Postgres95 (later PostgreSQL)
Open Source rewrite of mSQL
The Web Takes off!
The rise of the LAMP stack!
1995 - 16 million users (0.4% world pop.)
1999 - 248 million users (4.1% world pop.)
CAP Theorem (Eric Brewer - 1999)
The Web (2009)
Reaches 1,802 million users (26.6% world pop.)
Big Web Companies:
Commercial databases are too expensive and don't scale
Open Source databases lack features
Each building middle-ware to distribute load, e.g.:
eBay and Amazon - Oracle
Facebook - MySQL
Start building their own DBMS:
Google - BigTable/LevelDB (2004), Spanner (2012)
Amazon - DynamoDB (2012)
Facebook - 6 billion photos a month / 100 petabytes (2012)
Google - 40,000 searches per second (2014)
The NoSQL "Movement"
Not SQL => Not (only) SQL
Rejects classic DBMS in favour of lighter, faster storage
Compromises - Consistency, Availability, Durability vs. Performance
Full-circle. The SQL vendors fought back! - NewSQL
RAM is cheap
SSD / NVMe / RDMA
GPU and FPGA
Facebook's Open Source LevelDB fork... for SSD/NVMe etc.
Powers almost everything at Facebook (and others)
Used in: ArangoDB, Cassandra, CockroachDB, MongoRocks (MongoDB), MyRocks (MySQL), many more...
Database core is in-memory and GPU optimized
Optimized for data analytics
Open Source. Distributed database.
Cassandra compatible implementation in C++
2x - 10x faster than Cassandra
Optimised for multi-threaded machines. Clustering also.
Previously closed source, now Open Source (under Apple)
Designed for performance (after durability)
Compromises - Transaction lifetime
Microsoft Open Source
Embedded key/value store
Impressive performance "claims"
The line between memory and persistent disk is now blurred (NVMe etc.).
Custom Hardware - ASIC, FPGA, RDMA Network etc.
Consistency is back in vogue.
Likely SQL (or similar) for the user.
Distributed. Sharding. Clustered.
Node failure happens! Data centre failure happens!
Common Core, e.g.: RocksDB.
Polystore vs. Multiple databases
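As an illustration of the "Sharding" point in the list above, here is a minimal sketch of hash-based shard routing. It is not how any particular database named in these slides does it; `shard_for`, `ShardedStore`, the key format, and the shard count of 4 are hypothetical, and plain dictionaries stand in for the nodes.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Routes each key to one of several in-memory shards."""

    def __init__(self, shard_count: int):
        self._shards = [dict() for _ in range(shard_count)]

    def put(self, key: str, value):
        self._shards[shard_for(key, len(self._shards))][key] = value

    def get(self, key: str):
        return self._shards[shard_for(key, len(self._shards))].get(key)

    def shard_sizes(self):
        return [len(shard) for shard in self._shards]


store = ShardedStore(4)
for i in range(1000):
    store.put(f"user:{i}", i)

print(store.get("user:42"))
print(store.shard_sizes())
```

Simple modulo routing like this remaps most keys whenever the shard count changes, which is one reason production systems tend to use schemes such as consistent hashing or range partitioning, together with replication to survive node failure.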
CMU Advanced Database Systems
...The YouTube Videos are excellent!