Banyan: Coordination-free Distributed Transactions over Mergeable Types

Synopsis by

Shashank Shekhar Dubey

Guide: Dr. KC Sivaramakrishnan

RISE Lab

Dept. of Computer Science & Engineering

IIT Madras

  • Brief discussion on Banyan: The problem in hand and the proposed solution
  • Operational semantics

Why build distributed systems?

  • Scalability
  • Fault-tolerance
  • High availability
  • High throughput
  • Low latency

Latency

Problem with eventual consistency
  • Conflicts while merging replicas
Mergeable Replicated Data Types (MRDT)
  • Distributed variant of ordinary data types
  • Inbuilt ability to reconcile conflicts
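As a small illustration of "inbuilt ability to reconcile conflicts", here is a minimal sketch of a three-way merge for a counter, a common textbook example of a mergeable type. The function name `merge_counter` and the choice of a counter are assumptions made for illustration; the slides do not name a specific data type.

```python
def merge_counter(lca: int, a: int, b: int) -> int:
    """Three-way merge for a counter.

    lca is the last state both replicas agreed on; a and b are the
    diverged replica states. Each replica's change relative to lca is
    replayed on top of lca, so no increment is lost or double-counted.
    """
    return lca + (a - lca) + (b - lca)


# Two replicas fork from 10; one adds 5, the other adds 2.
merged = merge_counter(10, 15, 12)
print(merged)
```

Because the merge only needs the two replica states and their common ancestor, replicas can accept updates without coordinating and reconcile later.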
Challenges:
  • Data distribution
  • Recursive merge
  • Storage requirement
Operational Semantics
Banyan
Operational Semantics
  • Core language => Operations
  • Mathematical rules => Transactions
Operational Semantics
  • Understanding of system
  • Verification
Programming Model

Public Branch

Private Branch

  • Isolated R/W operations at each private branch
    
  • Remote Refresh
  • Publish
  • Refresh
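The branch operations above can be sketched roughly as follows, assuming that publish merges a private branch into its replica's public branch, refresh merges the public branch into a private branch, and remote refresh merges a peer replica's public branch into the local public branch. The classes `Replica` and `PrivateBranch`, the counter state, and the pairwise ancestor bookkeeping are all hypothetical simplifications (the latter is only adequate for two replicas), not Banyan's implementation.

```python
def merge(lca, a, b):
    # Three-way merge of a counter: replay both sides' deltas on the ancestor.
    return lca + (a - lca) + (b - lca)


class Replica:
    """One replica with a single public branch (state: a counter at 0)."""

    def __init__(self):
        self.public = 0
        self.peer_lca = {}  # id(peer) -> last public state shared with that peer

    def begin(self):
        # Fork a private branch from the current public branch.
        return PrivateBranch(self)

    def remote_refresh(self, peer):
        # Pull the peer's public branch into the local public branch.
        lca = self.peer_lca.get(id(peer), 0)
        self.public = merge(lca, self.public, peer.public)
        # The peer's current state is now an ancestor of both public branches.
        self.peer_lca[id(peer)] = peer.public
        peer.peer_lca[id(self)] = peer.public


class PrivateBranch:
    """Isolated reads and writes until publish or refresh is called."""

    def __init__(self, replica):
        self.replica = replica
        self.value = replica.public
        self.lca = replica.public  # last state shared with the public branch

    def add(self, n):
        self.value += n

    def read(self):
        return self.value

    def publish(self):
        # Merge private changes into the public branch.
        self.replica.public = merge(self.lca, self.replica.public, self.value)
        self.lca = self.value

    def refresh(self):
        # Merge public changes into the private branch.
        self.value = merge(self.lca, self.value, self.replica.public)
        self.lca = self.replica.public


r1 = Replica()
t1, t2 = r1.begin(), r1.begin()
t1.add(5)
t2.add(2)
t1.publish()
print(t2.read())   # t2 does not see t1's update until it refreshes
t2.publish()
print(r1.public)
```

In this sketch no operation blocks on another branch: every interaction between branches is a merge, which is what makes the model coordination-free.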
    
Banyan objects:
  • Blob
  • Tree
  • Commit
Key-Value : [a;b;c] -> v
Initial state:
(figure; labels: a, lb, V1, C1)


write B1 [a] v1:
(figure; labels: B1, Tag Store, Block Store)
write B1 [b;c] v2:
write B1 [b;d] v3:
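The three writes above can be traced with a minimal sketch of a content-addressed store, assuming the block store maps content hashes to immutable blobs, trees, and commits, and the tag store maps a branch name to its latest commit. The helper names (`put`, `set_path`, `write`, `read`), the JSON encoding, and the empty starting state are assumptions for illustration, not Banyan's actual layout.

```python
import hashlib
import json

block_store = {}  # content hash -> immutable object (blob, tree, or commit)
tag_store = {}    # branch name -> hash of that branch's latest commit


def put(obj):
    # Store an object under the hash of its canonical JSON encoding.
    data = json.dumps(obj, sort_keys=True).encode()
    h = hashlib.sha1(data).hexdigest()
    block_store[h] = obj
    return h


def set_path(tree_hash, path, value):
    # Build a new tree with `value` at `path`; untouched subtrees are shared.
    tree = dict(block_store[tree_hash]["children"]) if tree_hash else {}
    key, rest = path[0], path[1:]
    if rest:
        child = tree.get(key)
        is_tree = child is not None and block_store[child]["kind"] == "tree"
        tree[key] = set_path(child if is_tree else None, rest, value)
    else:
        tree[key] = put({"kind": "blob", "value": value})
    return put({"kind": "tree", "children": tree})


def write(branch, path, value):
    # Each write adds a new commit and moves the branch tag to it.
    parent = tag_store.get(branch)
    root = block_store[parent]["tree"] if parent else None
    new_root = set_path(root, path, value)
    commit = {"kind": "commit", "tree": new_root,
              "parents": [parent] if parent else []}
    tag_store[branch] = put(commit)


def read(branch, path):
    # Walk from the branch's latest commit down to the blob at `path`.
    node = block_store[block_store[tag_store[branch]]["tree"]]
    for key in path:
        node = block_store[node["children"][key]]
    return node["value"]


write("B1", ["a"], "v1")
write("B1", ["b", "c"], "v2")
write("B1", ["b", "d"], "v3")
print(read("B1", ["b", "d"]))
print(len(block_store))  # superseded trees and commits are still stored
```

Nothing is ever overwritten in the block store in this sketch, so every write leaves older trees and commits behind. That retention is what makes three-way merges possible, and it is also why storage grows with history.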
Garbage Collection
Space usage: Cassandra vs Banyan
  • Cassandra : 4.9 MB
  • Banyan : 1.8 GB (376x)
Garbage Collection
  • Usual approach : Node reachability
  • Our approach : Node accessibility
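For reference, the usual reachability-based approach can be sketched as a mark-and-sweep over the block store, starting from the commits named in the tag store. The store layout (each object lists the hashes it points to under a hypothetical `refs` field) is an assumption for illustration; the accessibility-based approach is not sketched here because these slides do not define it.

```python
def reachable(block_store, roots):
    # Mark phase: follow references from the roots and record what is visited.
    seen = set()
    stack = list(roots)
    while stack:
        h = stack.pop()
        if h in seen or h not in block_store:
            continue
        seen.add(h)
        stack.extend(block_store[h]["refs"])
    return seen


def collect(block_store, roots):
    # Sweep phase: keep only the objects that were marked.
    live = reachable(block_store, roots)
    return {h: obj for h, obj in block_store.items() if h in live}


# Commit c2 points to its parent c1 and its tree t2; x is unreferenced.
store = {
    "c2": {"refs": ["c1", "t2"]},
    "c1": {"refs": ["t1"]},
    "t2": {"refs": ["v3"]},
    "t1": {"refs": ["v1"]},
    "v3": {"refs": []},
    "v1": {"refs": []},
    "x": {"refs": []},
}
kept = collect(store, roots=["c2"])
print(sorted(kept))
```

In this example only `x` is reclaimed: the old commit `c1` and everything under it stay live because `c2` still references its parent, so reachability alone frees little space in a store that keeps history.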
Bugs found in Irmin
  • Preferring shorter path when merging conflicting updates
  • Non-commutative merge in case of modify/delete conflict
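To make "non-commutative merge" concrete, here is a minimal sketch of a three-way merge on a single key, where `None` stands for a deleted key. Both policies (`merge_left_biased`, `merge_modify_wins`) are hypothetical and written only for illustration; neither is Irmin's code.

```python
def merge_left_biased(lca, a, b):
    # Resolves a conflict by keeping the first argument, so the result
    # depends on argument order.
    if a == b:
        return a
    if a == lca:
        return b
    if b == lca:
        return a
    return a


def merge_modify_wins(lca, a, b):
    # Resolves modify/delete in favour of the modification regardless of
    # which side made it, so the result is independent of argument order.
    if a == b:
        return a
    if a == lca:
        return b
    if b == lca:
        return a
    if a is None:
        return b
    if b is None:
        return a
    return max(a, b)  # modify/modify: deterministic tie-break


def is_commutative(merge, lca, a, b):
    return merge(lca, a, b) == merge(lca, b, a)


# One replica modifies the key ("v1"), the other deletes it (None).
print(is_commutative(merge_left_biased, "v0", "v1", None))
print(is_commutative(merge_modify_wins, "v0", "v1", None))
```

With an order-dependent merge, two replicas that merge each other's state can end up with different values for the same key, so they never converge.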
Thank you