Banyan: Coordination-free Distributed Transactions over Mergeable Types
Synopsis by
Shashank Shekhar Dubey
Guide: Dr. KC Sivaramakrishnan
RISE Lab
Dept. of Computer Science & Engineering
IIT Madras


- Brief discussion on Banyan: the problem at hand and the proposed solution
- Operational semantics
Why build distributed systems?
- Scalability
- Fault-tolerance
- High availability
- High throughput
- Low latency

Latency
Problem with eventual consistency
- Conflicts while merging replicas
Mergeable Replicated Data Types (MRDT)
- Distributed variant of ordinary data types
- Inbuilt ability to reconcile conflicts
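
As an illustration of a data type with an inbuilt merge, here is a minimal sketch of a mergeable counter with a three-way merge. The `Counter` class and its methods are hypothetical names chosen for this example and are not taken from Banyan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counter:
    """Hypothetical mergeable counter (illustrative, not Banyan's API)."""
    value: int = 0

    def add(self, n: int) -> "Counter":
        # Ordinary, purely local operation.
        return Counter(self.value + n)

    @staticmethod
    def merge(lca: "Counter", a: "Counter", b: "Counter") -> "Counter":
        # Three-way merge: start from the lowest common ancestor (lca)
        # and apply the change made on each side.
        delta_a = a.value - lca.value
        delta_b = b.value - lca.value
        return Counter(lca.value + delta_a + delta_b)


lca = Counter(10)
left = lca.add(5)     # one replica adds 5
right = lca.add(-3)   # another replica subtracts 3
merged = Counter.merge(lca, left, right)
```

Both concurrent updates survive the merge, and swapping `left` and `right` gives the same result.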

Challenges:
- Data distribution
- Recursive merge
- Storage requirement
Operational Semantics
- Core language => Operations
- Mathematical rules => Transactions
- Understanding of system
- Verification
Programming Model
Public Branch
Private Branch


- Isolated R/W operations at each private branch
- Remote Refresh
- Publish
- Refresh
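
The branch operations listed above can be sketched on a single replica as follows. This is a minimal sketch under stated assumptions: publish is taken to merge a private branch into the public branch, refresh is taken to merge the public branch into a private branch, and the stored value is a plain integer counter. The class and method names are hypothetical. Remote refresh is left out because finding the common ancestor of two replicas needs the commit history, which this sketch does not model.

```python
def merge(lca: int, a: int, b: int) -> int:
    """Three-way merge for a counter: apply both sides' deltas to the ancestor."""
    return a + b - lca


class PublicBranch:
    def __init__(self, value: int = 0) -> None:
        self.value = value

    def fork(self) -> "PrivateBranch":
        return PrivateBranch(self)


class PrivateBranch:
    def __init__(self, public: PublicBranch) -> None:
        self.public = public
        self.value = public.value
        self.base = public.value  # last state shared with the public branch

    def add(self, n: int) -> None:
        self.value += n  # isolated write, invisible to other branches

    def publish(self) -> None:
        # Push local changes into the public branch.
        self.public.value = merge(self.base, self.public.value, self.value)
        self.base = self.value

    def refresh(self) -> None:
        # Pull changes from the public branch into this private branch.
        self.value = merge(self.base, self.value, self.public.value)
        self.base = self.public.value


public = PublicBranch()
p, q = public.fork(), public.fork()
p.add(5)
q.add(2)
p.publish()
isolated = q.value  # q has not refreshed, so it does not see p's write
q.publish()
q.refresh()
p.refresh()
```

After the final refreshes, both private branches and the public branch agree on the merged value.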

Banyan objects:
- Blob
- Tree
- Commit
Key-Value : [a;b;c] -> v

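[Figure sequence: contents of the Tag Store and Block Store for branch B1, starting from the initial state (node labels include a, V1, C1 and B1) and after each of the following writes]
- write B1 [a] v1
- write B1 [b;c] v2
- write B1 [b;d] v3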
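
The writes above can be replayed with a small content-addressed store. This is a sketch under assumptions, not Banyan's implementation: objects are JSON-encoded dictionaries keyed by their SHA-1 hash, the tag store maps a branch name to its latest commit, and the store starts empty rather than from an existing commit. The helpers `put`, `set_path`, `write` and `read` are hypothetical.

```python
import hashlib
import json

block_store = {}  # content hash -> immutable object (blob, tree or commit)
tag_store = {}    # branch name -> hash of the branch's latest commit


def put(obj):
    """Store an object under the hash of its content and return that hash."""
    data = json.dumps(obj, sort_keys=True).encode()
    h = hashlib.sha1(data).hexdigest()
    block_store[h] = obj
    return h


def set_path(tree_hash, path, blob_hash):
    """Return the hash of a new tree equal to the old one except at `path`."""
    entries = dict(block_store[tree_hash]["entries"]) if tree_hash else {}
    head, rest = path[0], path[1:]
    if rest:
        child = entries.get(head)
        # Descend only into an existing subtree; a blob here is replaced.
        if child is not None and block_store[child]["kind"] != "tree":
            child = None
        entries[head] = set_path(child, rest, blob_hash)
    else:
        entries[head] = blob_hash
    return put({"kind": "tree", "entries": entries})


def write(branch, path, value):
    parent = tag_store.get(branch)
    root = block_store[parent]["tree"] if parent else None
    blob = put({"kind": "blob", "value": value})
    tree = set_path(root, path, blob)
    commit = {"kind": "commit", "tree": tree, "parents": [parent] if parent else []}
    tag_store[branch] = put(commit)  # only the tag store is updated in place


def read(branch, path):
    node = block_store[block_store[tag_store[branch]]["tree"]]
    for step in path:
        node = block_store[node["entries"][step]]
    return node["value"]


write("B1", ["a"], "v1")
write("B1", ["b", "c"], "v2")
write("B1", ["b", "d"], "v3")
```

Each write adds new blob, tree and commit objects and never overwrites old ones, so earlier versions of the tree stay in the block store.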


Garbage Collection
Space usage: Cassandra vs Banyan
Cassandra : 4.9 MB
Banyan : 1.8 GB (376 x Cassandra)
- Usual approach : Node reachability
- Our approach : Node accessibility
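
For reference, the usual reachability-based approach can be sketched as a mark-and-sweep pass over the block store. This sketch covers only node reachability; the slides do not define the accessibility criterion, so it is not modelled here. The store layout (each object is a list of the hashes it refers to) and the function names are assumptions made for the example.

```python
def reachable(block_store, roots):
    """Mark phase: collect every object reachable from the given roots."""
    seen = set()
    stack = list(roots)
    while stack:
        h = stack.pop()
        if h in seen or h not in block_store:
            continue
        seen.add(h)
        stack.extend(block_store[h])  # follow outgoing references
    return seen


def collect(block_store, tag_store):
    """Sweep phase: delete every object not reachable from any tag."""
    live = reachable(block_store, tag_store.values())
    for h in list(block_store):
        if h not in live:
            del block_store[h]
    return live


# Toy store: commit c1 -> tree t1 -> blob v1, plus an unreferenced tree and blob.
block_store = {
    "c1": ["t1"],
    "t1": ["v1"],
    "v1": [],
    "t_tmp": ["v_tmp"],
    "v_tmp": [],
}
tag_store = {"B1": "c1"}
live = collect(block_store, tag_store)
```

Only objects that no tag can reach are deleted. If commits keep references to their parents, the whole history stays reachable under this scheme.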


Bugs found in Irmin
- Preferring shorter path when merging conflicting updates
- Non-commutative merge in case of modify/delete conflict
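
The second bug class can be illustrated with a property check: a merge is commutative if swapping its two non-ancestor arguments does not change the result. The sketch below is not Irmin's code. `merge_biased` is a hypothetical policy that shows the symptom, and `merge_commutative` is one possible fix that assumes a modification should win over a deletion.

```python
DELETED = None  # marker for "the key was deleted on this branch"


def merge_biased(lca, a, b):
    """Hypothetical buggy policy: on conflict, keep what the first argument did."""
    if a == lca:
        return b
    if b == lca:
        return a
    return a


def merge_commutative(lca, a, b):
    """One possible fix (an assumption): a modification wins over a deletion."""
    if a == lca:
        return b
    if b == lca:
        return a
    if a is DELETED:
        return b
    if b is DELETED:
        return a
    return max(a, b)  # deterministic tie-break for modify/modify conflicts


def is_commutative(merge, lca, a, b):
    return merge(lca, a, b) == merge(lca, b, a)


lca, modified, deleted = "v0", "v1", DELETED
biased_ok = is_commutative(merge_biased, lca, modified, deleted)
fixed_ok = is_commutative(merge_commutative, lca, modified, deleted)
```

With the biased policy the outcome depends on which replica initiates the merge, so two replicas can end up with different states. The second policy gives the same answer in either order.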
Thank you