Banyan: Coordination-free Distributed Transactions over Mergeable Types

MS Seminar by

Shashank Shekhar Dubey

Guide: Dr. KC Sivaramakrishnan

RISE Lab

Dept. of Computer Science & Engineering

IIT Madras

Index
  • Course work details
  • Banyan
    • Motivation
    • Design and implementation
    • Performance
    • Irmin Data Model
    • Garbage Collection
    • Recursive Merges
    • Operational Semantics
  • Future Work
  • Work Timeline
Course work details:

S.No.  Course No.  Course Title                                  Core/Elective  Credits  Grade
1      CS5800      Advanced Data Structures and Algorithms       Core           12       B
2      CS6440      Distributed Computing                         Core           12       B
3      CS6847      Cloud Computing                               Core           12       A
4      ID6020      Introduction to Research (Institute model)    Core           0        P
5      CS6021      Introduction to Research                      Core           0        P
6      CS6570      Secure System Engineering                     Elective       12       C
7      CS6530      Applied Cryptography                          Elective       12       C

Banyan: Coordination-free Distributed Transactions over Mergeable Types


 

Balance between availability and coordination
 

Eventually Consistent Database
Resolving conflicts to converge
Data types with support for conflict resolution

Context based convergence

Mergeable Replicated Data Types

Kaki, Gowtham, et al. "Mergeable replicated data types." Proceedings of the ACM on Programming Languages 3.OOPSLA (2019): 1-29.
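[Figure: a counter replicated on Replica1 and Replica2. Replica2 is cloned from Replica1 at 0. Replica1 applies +2 and +4; Replica2 applies +1, +3 and +2. The replicas reconcile with three-way merges: Counter.merge 0 2 1 gives 3 and Counter.merge 1 6 7 gives 12.]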
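The two Counter.merge calls above can be reproduced with a small sketch. It assumes the three-way counter merge described by Kaki et al., where the first argument is the lowest common ancestor and the result applies both replicas' increments to it. The module below is an illustration, not the library's actual code.

```ocaml
(* A mergeable counter: merge takes the common ancestor first,
   then the two divergent versions. *)
module Counter = struct
  type t = int

  (* Apply the delta of each version, relative to the ancestor. *)
  let merge (lca : t) (v1 : t) (v2 : t) : t = lca + (v1 - lca) + (v2 - lca)
end

let () =
  (* The two merges that appear in the figure. *)
  Printf.printf "%d\n" (Counter.merge 0 2 1);
  Printf.printf "%d\n" (Counter.merge 1 6 7)
```

The merge never discards an increment from either replica, which is why no coordination is needed at the time of the update.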
Eventual Consistency

[Figure: two key-value pairs, Key 1 with Value 1 and Key 2 with Value 2]


[Figure: Key 12 shown with both Value 1 and Value 2]

Banyan
  • Programming model
  • Requirement: Eventually Consistent DBs
  • Our implementation: OCaml, Cassandra
Programming Model

Public Branch

Private Branch

  • Isolated transactions at each private branch
    
  • Remote Refresh
  • Publish
  • Refresh
    
  • Loss of connection does not affect local R/W operations
    
  • Always available
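The branch operations listed above can be sketched with a counter as the mergeable type. This is a simplified model under stated assumptions, not Banyan's implementation: publish is taken to merge the private branch into the public branch, refresh to merge the public branch into the private branch, and remote refresh to merge a remote public branch into the local one. The record fields and the `link` reference are hypothetical names that stand in for the commit history used to find the common ancestor.

```ocaml
(* Three-way merge for a counter: apply both deltas to the ancestor. *)
let merge lca a b = lca + (a - lca) + (b - lca)

(* One replica: a private branch, a public branch, and [base], the value
   at which the two branches last synchronised (their common ancestor). *)
type replica = { mutable priv : int; mutable pub : int; mutable base : int }

let make () = { priv = 0; pub = 0; base = 0 }

(* A local update touches only the private branch, so it works offline. *)
let add r n = r.priv <- r.priv + n

(* publish: merge the private branch into the public branch. *)
let publish r =
  r.pub <- merge r.base r.priv r.pub;
  r.base <- r.priv

(* refresh: merge the public branch into the private branch. *)
let refresh r =
  r.priv <- merge r.base r.priv r.pub;
  r.base <- r.pub

(* remote refresh: merge a remote public branch into the local public
   branch. [link] holds the last public value both replicas had in common. *)
let remote_refresh ~link ~from ~into =
  into.pub <- merge !link into.pub from.pub;
  link := from.pub

let d1 = make ()
let d2 = make ()
let link = ref 0

let () =
  add d1 2; publish d1;                 (* D1 works and publishes *)
  add d2 1; publish d2;                 (* D2 works independently *)
  remote_refresh ~link ~from:d1 ~into:d2;
  refresh d2;                           (* D2 now sees D1's update *)
  remote_refresh ~link ~from:d2 ~into:d1;
  refresh d1;                           (* D1 now sees D2's update *)
  Printf.printf "d1 = %d, d2 = %d\n" d1.priv d2.priv
```

Reads and writes go through `add` and never wait on another replica; only the explicit merge steps move data between branches.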
    
Realizing the Programming Model
Banyan

Eventually Consistent Distributed Database

Implementation details...

So, how does it work?
  • D1 runs opam install git: builds the library and produces the artifact
  • D2 runs opam install git: builds the same library and produces the artifact again
Banyan build system
  • D1 runs opam install git
  • read: the key-value store on the private branch of D1 has no entry for the artifact (N/A)
  • D1 builds the library and produces the artifact
  • write: the entry (/git/1.0/git.cmx, 0xAB12...) is stored on the private branch of D1
  • publish: the entry moves from D1_private to public
  • connect, remote refresh, refresh: the entry reaches D2_private
  • D2 runs opam install git
  • read: the build system finds the artifact in D2_private, so D2 does not build it again
Library requested   Dependent libraries
Git                 OCaml, dune, uri, lwt
Irmin               OCaml, dune, uri, jsonm

1. Library found in cache
2. Some dependencies found in cache
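The partial-hit case can be worked through with the two dependency lists above. The sketch assumes a hypothetical scenario in which Git and its dependencies were built first and are already in the shared cache when Irmin is requested; the cache contents are an illustration, not measured data.

```ocaml
module SS = Set.Make (String)

(* Dependency lists taken from the table. *)
let git_deps = SS.of_list [ "OCaml"; "dune"; "uri"; "lwt" ]
let irmin_deps = SS.of_list [ "OCaml"; "dune"; "uri"; "jsonm" ]

(* Assumption: Git and all of its dependencies are already cached. *)
let cache = SS.add "Git" git_deps

(* Dependencies of Irmin that can be reused, and those still to build. *)
let hits = SS.inter irmin_deps cache
let misses = SS.diff irmin_deps cache

let () =
  Printf.printf "found in cache: %s\n" (String.concat ", " (SS.elements hits));
  Printf.printf "must be built: %s\n" (String.concat ", " (SS.elements misses))
```

Under this assumption three of Irmin's four dependencies are served from the cache and only jsonm, plus Irmin itself, has to be built.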

*Cassandra cluster with four replicas within the same data center; inter-node latency ~0.5 ms

Irmin Data Model
  • Tag Store (Read/Write Store): maps sessions S0 and S1 to their commits c0 and c1
  • Block Store (Append-Only Store): holds commit nodes, tree nodes and blob nodes
  • S0 : Write foo v0 creates commit c0 over the tree / with entry foo and blob v0
  • S1 : Write foo v0 creates commit c1 over the same tree / and blob v0
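The two stores can be sketched as hash tables. This is a minimal model, not Irmin's API: it assumes content addressing (an object's key is a digest of its contents), uses a hypothetical `author` field so that commits from different sessions differ, and keeps a single-level tree.

```ocaml
type hash = string

(* The three kinds of node kept in the block store. *)
type obj =
  | Blob of string
  | Tree of (string * hash) list
  | Commit of { tree : hash; parents : hash list; author : string }

(* Block store: append-only, keyed by content hash. *)
let block_store : (hash, obj) Hashtbl.t = Hashtbl.create 16

(* Tag store: read/write, maps a session name to its latest commit. *)
let tag_store : (string, hash) Hashtbl.t = Hashtbl.create 16

let serialize = function
  | Blob s -> "blob:" ^ s
  | Tree entries ->
    "tree:" ^ String.concat ";" (List.map (fun (k, h) -> k ^ "=" ^ h) entries)
  | Commit { tree; parents; author } ->
    "commit:" ^ tree ^ ":" ^ String.concat "," parents ^ ":" ^ author

(* Store an object under its digest; identical objects are stored once. *)
let put (o : obj) : hash =
  let h = Digest.to_hex (Digest.string (serialize o)) in
  if not (Hashtbl.mem block_store h) then Hashtbl.add block_store h o;
  h

(* Write key/value in a session: new blob, tree and commit, then move the tag. *)
let write ~session key value =
  let blob = put (Blob value) in
  let tree = put (Tree [ (key, blob) ]) in
  let parents =
    match Hashtbl.find_opt tag_store session with Some h -> [ h ] | None -> []
  in
  let commit = put (Commit { tree; parents; author = session }) in
  Hashtbl.replace tag_store session commit

(* Tree hash of the commit a session currently points to. *)
let tree_of session =
  match Hashtbl.find block_store (Hashtbl.find tag_store session) with
  | Commit c -> c.tree
  | _ -> invalid_arg "tag does not point to a commit"

let () =
  write ~session:"S0" "foo" "v0";
  write ~session:"S1" "foo" "v0";
  Printf.printf "objects in block store: %d\n" (Hashtbl.length block_store)
```

Both sessions end up with distinct commits that share one tree node and one blob node, which is the sharing the slide illustrates.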
Garbage Collection
Space usage: Cassandra vs Banyan
  • Cassandra: 4.9 MB
  • Banyan: 1.8 GB (376x)
GC Candidates
  • Git: unreachable objects
  • Banyan: commit nodes not required for LCA computation
    • s0-c0
    • p0-c0
    • s1-c0
Recursive Merges
Concurrent criss-cross merges

[Figure: criss-cross merge example with nodes labelled 1, 2, 3 and 4 and updates +3 and +1]
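A criss-cross merge leaves two branches with more than one lowest common ancestor, so a plain three-way merge has no single base to use. The sketch below shows one common way to handle this, borrowed from Git's recursive strategy: merge the common ancestors first into a virtual ancestor, then use that as the base. The commit type, the ids and the counter values are hypothetical and are not taken from the slide's figure.

```ocaml
(* Hypothetical commit DAG over a counter. *)
type commit = { id : int; value : int; parents : commit list }

let merge ~lca a b = lca + (a - lca) + (b - lca)

(* A commit together with everything reachable through its parents. *)
let rec ancestors c = c :: List.concat_map ancestors c.parents

let ids cs = List.sort_uniq compare (List.map (fun c -> c.id) cs)

(* Lowest common ancestors: common ancestors that are not themselves
   an ancestor of another common ancestor. *)
let lcas a b =
  let anc_b = ids (ancestors b) in
  let common =
    ancestors a
    |> List.filter (fun c -> List.mem c.id anc_b)
    |> List.sort_uniq (fun x y -> compare x.id y.id)
  in
  List.filter
    (fun c ->
      not
        (List.exists
           (fun d -> d.id <> c.id && List.mem c.id (ids (ancestors d)))
           common))
    common

(* Virtual commits get fresh negative ids. *)
let next_virtual = ref 0
let fresh () = decr next_virtual; !next_virtual

(* Fold all LCAs into one virtual ancestor by merging them recursively. *)
let rec virtual_base a b =
  match lcas a b with
  | [] -> invalid_arg "no common ancestor"
  | first :: rest ->
    List.fold_left
      (fun acc l ->
        { id = fresh (); value = merge_values acc l; parents = [ acc; l ] })
      first rest

and merge_values a b =
  let base = virtual_base a b in
  merge ~lca:base.value a.value b.value

(* Criss-cross history: a1 and b1 diverge from c0, each side merges the
   other (a2, b2), then both continue with one more update. *)
let c0 = { id = 0; value = 0; parents = [] }
let a1 = { id = 1; value = 3; parents = [ c0 ] }
let b1 = { id = 2; value = 1; parents = [ c0 ] }
let a2 = { id = 3; value = 4; parents = [ a1; b1 ] }
let b2 = { id = 4; value = 4; parents = [ b1; a1 ] }
let a3 = { id = 5; value = 7; parents = [ a2 ] }
let b3 = { id = 6; value = 5; parents = [ b2 ] }

let () = Printf.printf "merged value: %d\n" (merge_values a3 b3)
```

Here a3 and b3 have two lowest common ancestors, a1 and b1. Picking only a1 as the base would count b1's update twice; merging a1 and b1 into a virtual ancestor first gives a result equal to the sum of all four increments.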
Operational Semantics

Commands

Local branch

Public branch

Blockstore

Tagstore

Future Work
  • Complete typesetting the operational semantics
  • Implement garbage collection
Work Timeline
  • Conference paper acceptance: Aug 2020
  • Extension of conference work for journal: Aug 2020 - Apr 2021
  • GTC meeting and seminar: Apr 2021
  • Finishing tasks for journal submission: May 2021
  • Synopsis: May 2021
  • Thesis: Jun 2021
Thank you