Let's talk about how to design it: the principles, architecture and decisions
Introduce the core technologies for TiKV
Consensus Algorithm: Raft
New languages: Go & Rust
Hmm... let's go back to our real life
Why another database?
Relational databases are hard to scale and hard to squeeze more performance out of. There are many sharding solutions (e.g. MySQL proxies), but they do not support distributed transactions or cross-node joins
NoSQL is easy to scale, but doesn't support SQL well and lacks consistent transactions
Papers about Google Spanner and F1 describe a new kind of database: it scales like NoSQL and still maintains ACID transactions
TiDB and CockroachDB are NewSQL implementations
What kind of database do we want to use?
It should support SQL of course
It must be easy to scale, and support failover and load balancing
ACID transactions: we want a strong consistency guarantee that makes our code easier to write
We want schema changes, secondary indexes and migrations to be easy to do
High availability is always important to us
What is TiDB?
An open source NewSQL distributed database from PingCAP
BTW I love this name
Supports the MySQL protocol and different persistence engines
Inspired by Google Spanner & F1
Based on K-V, and supports etcd (a CoreOS project, not Facebook's)
Written in Go & Rust
Distributed, consistent, scalable, SQL database
The principles or the philosophy
No data loss, and the system can recover automatically
Easy to use, easy to migrate data
Runs on premises, in the cloud or in containers
Easy to develop and maintain - loose coupling
SQL layer
K-V layer
The logical architecture
Transactions
Inspired by Google's Percolator
Two-phase commit protocol with timestamp allocator
3 columns:
lock: an uncommitted transaction is writing this cell; contains the location of the primary lock
write: marks committed data; stores the timestamp of the committed version
data: stores the data itself
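To make the three columns more concrete, here is a minimal single-process sketch of a Percolator-style two-phase commit. It is not TiDB's or TiKV's actual code: the names (Store, Cell, TimestampAllocator, transact) are hypothetical, the timestamp allocator is just a counter, and crash recovery and lock cleanup for failed clients are left out.

```python
import itertools


class ConflictError(Exception):
    """Raised when a transaction hits a lock or a newer committed write."""


class TimestampAllocator:
    """Hypothetical stand-in for the timestamp allocator: strictly increasing integers."""

    def __init__(self):
        self._counter = itertools.count(1)

    def next(self):
        return next(self._counter)


class Cell:
    """One key with the three columns from the slide."""

    def __init__(self):
        self.lock = None  # (primary_key, start_ts) while a transaction is uncommitted
        self.write = {}   # commit_ts -> start_ts of the committed version
        self.data = {}    # start_ts -> value


class Store:
    def __init__(self):
        self.cells = {}
        self.tso = TimestampAllocator()

    def cell(self, key):
        return self.cells.setdefault(key, Cell())

    def prewrite(self, key, value, primary, start_ts):
        c = self.cell(key)
        if c.lock is not None:
            raise ConflictError(f"{key} is locked")
        if any(commit_ts >= start_ts for commit_ts in c.write):
            raise ConflictError(f"{key} has a newer committed write")
        c.lock = (primary, start_ts)
        c.data[start_ts] = value

    def rollback(self, key, start_ts):
        c = self.cell(key)
        if c.lock is not None and c.lock[1] == start_ts:
            c.lock = None
        c.data.pop(start_ts, None)

    def commit_key(self, key, start_ts, commit_ts):
        c = self.cell(key)
        if c.lock is None or c.lock[1] != start_ts:
            raise ConflictError(f"lock on {key} is missing")
        c.write[commit_ts] = start_ts  # make the version visible
        c.lock = None                  # release the lock

    def read(self, key, ts):
        c = self.cell(key)
        if c.lock is not None and c.lock[1] <= ts:
            raise ConflictError(f"{key} has a pending lock")
        committed = [commit_ts for commit_ts in c.write if commit_ts <= ts]
        if not committed:
            return None
        return c.data[c.write[max(committed)]]


def transact(store, writes):
    """Write all keys atomically; the first key acts as the primary."""
    start_ts = store.tso.next()
    keys = list(writes)
    primary = keys[0]
    prewritten = []
    try:
        # Phase 1: lock every key and stage its data.
        for key in keys:
            store.prewrite(key, writes[key], primary, start_ts)
            prewritten.append(key)
    except ConflictError:
        for key in prewritten:
            store.rollback(key, start_ts)
        raise
    # Phase 2: commit the primary first, then the secondaries.
    commit_ts = store.tso.next()
    for key in keys:
        store.commit_key(key, start_ts, commit_ts)
    return start_ts, commit_ts


store = Store()
start_ts, commit_ts = transact(store, {"alice": 10, "bob": 2})
print(store.read("alice", commit_ts))  # committed value is visible at commit_ts
print(store.read("alice", start_ts))   # nothing was committed yet at start_ts
```

Committing the primary key first is the point at which the whole transaction counts as committed; in the real protocol, a reader that finds a leftover secondary lock checks the primary to decide whether to roll forward or back. This sketch skips that step.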
MVCC - Multiversion concurrency control
Each transaction sees a snapshot of the database as of the transaction's start time. Any changes made by this transaction will not be seen by other transactions until the transaction is committed.
Data is tagged with versions in the following format: Key_version: value.
MVCC also ensures lock-free snapshot reads.
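The versioned-key idea can be sketched in a few lines. This is a toy in-memory model, not the real storage engine: the MVCCStore class and its put/get methods are hypothetical names, and versions are plain integers standing in for timestamps.

```python
import bisect


class MVCCStore:
    """Toy multi-version store: every write is kept under (key, version)."""

    def __init__(self):
        self._versions = {}  # key -> sorted list of versions
        self._values = {}    # (key, version) -> value, i.e. "Key_version: value"

    def put(self, key, version, value):
        versions = self._versions.setdefault(key, [])
        if version not in versions:
            bisect.insort(versions, version)
        self._values[(key, version)] = value

    def get(self, key, snapshot_ts):
        """Return the newest value whose version is <= snapshot_ts, or None."""
        versions = self._versions.get(key, [])
        idx = bisect.bisect_right(versions, snapshot_ts)
        if idx == 0:
            return None
        return self._values[(key, versions[idx - 1])]


store = MVCCStore()
store.put("a", 5, "old")
store.put("a", 10, "new")
print(store.get("a", 7))   # a reader at snapshot 7 still sees "old"
print(store.get("a", 10))  # a reader at snapshot 10 sees "new"
```

Because a reader only looks at versions at or below its own snapshot, a later write never changes what it sees, which is why snapshot reads need no locks in this model.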
Raft
Byzantine Generals Problem
We send data to server1, server2 and server3 ...
If we make server1 the leader, server2 and server3 just follow it