The Merkle Tree

Building Block of the Decentralized Web

  • I work at Carbon Five in Los Angeles
  • We will program your computer for you
  • I'm working on a P2P project for PL

All About Me

All About Ralph

Who is Ralph Merkle?

  • A computer scientist
  • Invented cryptographic hashing
  • Co-invented public-key cryptography
  • Invented and patented Merkle "hash" tree in 1979
  • Patent expired in 2002

HODL GANG

  • A one-way, collision-resistant function which maps an arbitrary number of bytes to a fixed-length bit string
    • collision-resistant:
      • difficult to find two inputs which map to the same output
    • one-way:
      • difficult to compute the input from the output

Cryptographic Hash

  • A one-way function which maps an arbitrary number of bytes to a fixed-length bit array

What's a Hash?

2.4.1 :004 > Digest::SHA1.hexdigest 'out with the californians'
 => "63392a6bb1c162dc79c3fc62f7860f61505f262e"

2.4.1 :006 > Digest::MD5.hexdigest 'they are driving up the rents'
 => "036d059f4ec024327107ff5976419714"
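
The properties on the hash slides can be checked directly in Ruby. This is a small sketch, not part of the original deck. It uses SHA-256 rather than the SHA1 and MD5 calls shown above, since practical collisions are known for both of those. The second input string is made up for the comparison.

```ruby
require 'digest'

# Hash two inputs that differ by a single character.
a = Digest::SHA256.hexdigest('out with the californians')
b = Digest::SHA256.hexdigest('out with the californianz')

# Fixed length: SHA-256 always yields 256 bits (64 hex characters),
# no matter how long the input is.
puts a.length                                            # 64
puts Digest::SHA256.hexdigest('x' * 1_000_000).length    # 64

# Deterministic: the same input always maps to the same digest.
puts a == Digest::SHA256.hexdigest('out with the californians')   # true

# A one-character change produces a different digest.
puts a == b                                              # false
```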
you already use this stuff
you are a scientist
  • A tree in which all leaves are the hash of some data (a "block")
  • All non-leaves are hashes of their children's hashes

What's a Merkle Tree?
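
The definition above translates almost line for line into code. The sketch below is one possible way to build the tree. The helper names `h` and `merkle_levels` are hypothetical, SHA-256 is an arbitrary choice, and duplicating the last hash of an odd level is an assumption about how to keep the tree balanced.

```ruby
require 'digest'

# Hypothetical helper: hex SHA-256 of a string.
def h(data)
  Digest::SHA256.hexdigest(data)
end

# Returns every level of the tree, leaves first and root last.
def merkle_levels(blocks)
  raise ArgumentError, 'need at least one block' if blocks.empty?

  # Leaves are the hashes of the data blocks.
  levels = [blocks.map { |block| h(block) }]
  while levels.last.size > 1
    level = levels.last
    # Assumption: an odd level is balanced by duplicating its last hash.
    level += [level.last] if level.size.odd?
    # Each non-leaf is the hash of its two children's hashes.
    levels << level.each_slice(2).map { |left, right| h(left + right) }
  end
  levels
end

levels = merkle_levels(%w[block0 block1 block2 block3])
puts levels.map(&:size).inspect   # [4, 2, 1]
puts levels.last.first            # the Merkle root
```

With three blocks instead of four, the third leaf is paired with a copy of itself, so the level sizes become 3, 2, 1.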

Learning By Building

duvall.txt (8 bytes)
1010101100101010010010010101001010101010101101010111010101101010

blocks:   0110101001101010  1010101010110101  1010101100101010  0100100101010010
leaves:   1b79              2e79              fff6              7a9e
parents:  0896 = H(1b79 | 2e79)               3bc7 = H(fff6 | 7a9e)
root:     63d5 = H(0896 | 3bc7)

Maintaining Balance

secret.txt (6 bytes)
101010110010101001001001010100101010101010110101

  • Three blocks leave the last leaf without a sibling
  • The last leaf is duplicated so every parent has two children

blocks:   0110101001101010  1010101010110101  0100100101010010
leaves:   1b79              2e79              7a9e              7a9e (duplicate)
parents:  0896                                bcc7
root:     63d5

Why Use Them?

Immutable

xenu.txt (8 bytes), original
1010101100101010010010010101001010101010101101010111010101101010

blocks:   0110101001101010  1010101010110101  1010101100101010  0100100101010010
leaves:   1b79              2e79              fff6              7a9e
parents:  0896                                3bc7
root:     63d5

xenu.txt (8 bytes), first block changed
1010101100101010111111111111111110101010101101010111010101101010

blocks:   1111111111111111  1010101010110101  1010101100101010  0100100101010010
leaves:   4402              2e79              fff6              7a9e
parents:  6988                                3bc7
root:     c6ff

  • Changing one block changes its leaf hash, its parent, and the root
  • Hashes outside that path (2e79, fff6, 7a9e, 3bc7) stay the same
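
The tamper-evidence shown in this section can be reproduced in a few lines. This is a sketch under assumptions: `merkle_root` is a hypothetical helper and SHA-256 stands in for whatever hash produced the four-character values on the slides, so the printed roots will not match them.

```ruby
require 'digest'

# Hypothetical helper: root of a binary Merkle tree over the given blocks.
def merkle_root(blocks)
  level = blocks.map { |block| Digest::SHA256.hexdigest(block) }
  while level.size > 1
    level += [level.last] if level.size.odd?   # assumption: duplicate last hash
    level = level.each_slice(2).map { |l, r| Digest::SHA256.hexdigest(l + r) }
  end
  level.first
end

original = %w[0110101001101010 1010101010110101 1010101100101010 0100100101010010]
tampered = original.dup
tampered[0] = '1111111111111111'   # change only the first block

puts merkle_root(original) == merkle_root(original.dup)   # true
puts merkle_root(original) == merkle_root(tampered)       # false
```

Anyone holding only the original root can therefore detect the change without storing the file.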

Small Inclusion Proofs

magna.crta (32 bytes)
hashes in the diagram:
  efaf accc f8aa 4444 0101 a2fb cd23 ab00
  fda1 12a1 ccc9 ab33 0021 97e7 abcd 739d
  3918 22a7 77ac 7b8c 2b5d 3232 8923 e382
  f8ab aaa3 078d 9ac9 fd27 12dd f7c0

Challenger: "Prove to me that 77ac exists at offset 2 in the tree with hash efaf."
Prover: "Sure. 078d, f8aa, ab33, abcd."

tree size: 31 hashes
proof size: 4 hashes
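
The challenge and response above can be sketched as a pair of functions: the prover collects one sibling hash per level, and the challenger folds those siblings back up to the root. The names `inclusion_proof` and `verify` are hypothetical, SHA-256 is an arbitrary choice, and the sketch assumes a binary tree whose leaf count is a power of two. The sixteen sample blocks are made up to match the tree size on the slide.

```ruby
require 'digest'

def h(data)
  Digest::SHA256.hexdigest(data)
end

# All levels of a binary tree, leaves first. Assumes a power-of-two leaf count.
def merkle_levels(blocks)
  levels = [blocks.map { |block| h(block) }]
  while levels.last.size > 1
    levels << levels.last.each_slice(2).map { |l, r| h(l + r) }
  end
  levels
end

# Prover: the proof is the sibling hash at each level, from the leaves up.
def inclusion_proof(levels, index)
  levels[0...-1].map do |level|
    sibling = level[index ^ 1]   # flip the low bit to find the sibling
    index /= 2                   # move to the parent's position
    sibling
  end
end

# Challenger: recompute the path to the root from one block plus its proof.
def verify(block, index, proof, root)
  computed = proof.reduce(h(block)) do |acc, sibling|
    parent = index.even? ? h(acc + sibling) : h(sibling + acc)
    index /= 2
    parent
  end
  computed == root
end

blocks = (0...16).map { |i| "block #{i}" }
levels = merkle_levels(blocks)
root   = levels.last.first
proof  = inclusion_proof(levels, 2)

puts levels.sum(&:size)                   # 31 hashes in the tree
puts proof.size                           # 4 hashes in the proof
puts verify('block 2', 2, proof, root)    # true
puts verify('bogus', 2, proof, root)      # false
```

The challenger needs only the root, the offset, and the four proof hashes, never the other fifteen blocks.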

Tunable Properties

magna.crta (32 bytes), four children per node
hashes in the diagram:
  efaf accc f8aa cd23 fda1 739d 3918 22a7
  77ac 7b8c 2b5d 3232 8923 e382 f8ab aaa3
  078d 9ac9 fd27 12dd f7c0

Challenger: "Prove to me that 77ac exists at offset 2 in the tree with hash efaf."
Prover: "Sure. 3918, f8ab, 078d, accc, cd23, fda1."
tree size: 21 hashes
proof size: 6 hashes
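
The trade-off between the two slides comes down to the branching factor: more children per node means fewer hashes to store but more siblings in each proof. The sketch below works through the arithmetic for 16 leaves. The function names are hypothetical, and the proof-size formula assumes a complete tree in which every node has exactly `arity` children.

```ruby
# Total hashes stored in a tree with the given branching factor.
def tree_size(leaves, arity)
  total = leaves
  while leaves > 1
    leaves = (leaves + arity - 1) / arity   # nodes on the next level up
    total += leaves
  end
  total
end

# Hashes in one inclusion proof: (arity - 1) siblings per level below the root.
def proof_size(leaves, arity)
  depth = 0
  while leaves > 1
    leaves = (leaves + arity - 1) / arity
    depth += 1
  end
  depth * (arity - 1)
end

[2, 4].each do |arity|
  puts "arity #{arity}: tree #{tree_size(16, arity)}, proof #{proof_size(16, arity)}"
end
# arity 2: tree 31, proof 4
# arity 4: tree 21, proof 6
```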

Real-World Uses

Cassandra DB Repair Protocol

uses a Merkle tree to synchronize replicas

Cassandra Distributed DB

  • Five nodes, each owning a token range:
    • 0-40
    • 41-80
    • 81-120
    • 121-160
    • 161-200

Replication Factor N=2

  • One node owns tokens 0-40
  • Two other nodes hold replicated tokens 0-40

Repair Initiated

  • A repair coordinator takes charge of the repair for tokens 0-40

Merkleize Partitions

  • The repair coordinator asks each replica: "gimme tree!"
  • Each replica hashes its four partitions into leaves, then builds parents and a root

             partitions            leaves                parents     root
  Replica A  1111 1010 0001 0000   2c55 cbdd 0001 1b79   999b 3bc7   ef43
  Replica B  1100 1010 0001 0000   3d66 cbdd 0001 1b79   999b 4bbb   ff11
  Replica C  1111 1010 0001 0000   2c55 cbdd 0001 1b79   999b 3bc7   ef43

Consistency Check

  • The repair coordinator compares the three roots: ef43, ff11, ef43

Quorum

  • Replicas A and C agree on ef43
  • Replica B (ff11) is the outlier

Mismatch Detected

  • Compare Replica B with Replica C, starting at the roots: ff11 vs ef43
  • Parents: 999b matches on both, 4bbb vs 3bc7 does not
  • Leaves under the mismatched parent: 3d66 vs 2c55

Bad Block Found!

  • Replica B holds partition 1100 where Replica C holds 1111

Repair Complete

             partitions            leaves                parents     root
  Replica B  1111 1010 0001 0000   2c55 cbdd 0001 1b79   999b 3bc7   ef43
  Replica C  1111 1010 0001 0000   2c55 cbdd 0001 1b79   999b 3bc7   ef43
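
The mismatch search in this walkthrough can be sketched as a top-down comparison of two trees that only descends into subtrees whose hashes differ. This illustrates the idea and is not Cassandra's actual repair code. The function names are hypothetical, SHA-256 is an arbitrary choice, and the partition count is assumed to be a power of two.

```ruby
require 'digest'

def h(data)
  Digest::SHA256.hexdigest(data)
end

# Levels of a binary Merkle tree, root first. Assumes a power-of-two partition count.
def merkle_levels(partitions)
  levels = [partitions.map { |partition| h(partition) }]
  while levels.first.size > 1
    levels.unshift(levels.first.each_slice(2).map { |l, r| h(l + r) })
  end
  levels
end

# Walk both trees from the root, descending only where the hashes differ.
# Returns the indexes of the partitions that do not match.
def mismatched_leaves(a, b, depth = 0, index = 0)
  return [] if a[depth][index] == b[depth][index]   # identical subtree: skip it
  return [index] if depth == a.size - 1             # reached a differing leaf

  mismatched_leaves(a, b, depth + 1, 2 * index) +
    mismatched_leaves(a, b, depth + 1, 2 * index + 1)
end

replica_b = merkle_levels(%w[1100 1010 0001 0000])
replica_c = merkle_levels(%w[1111 1010 0001 0000])

puts mismatched_leaves(replica_b, replica_c).inspect   # [0]
puts mismatched_leaves(replica_c, replica_c).inspect   # []
```

Only the partitions named in the result need to be streamed between replicas.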

BitTorrent Content Verification

uses a Merkle tree to authenticate content from untrusted sources

BitTorrent (BEP-30)

  • Chester searches for: warez.exe
  • Chester receives (trusted!): hash: 63d5, pieces: 4
  • Chester broadcasts: want warez.exe[0]
  • Peers: Paula, Patricia, Pedro, Pat
  • Pat sends: warez[0]+proof

Chester Had

  • hash: 63d5
  • pieces: 4
  • none of the blocks or the hashes below the root

Pat Sent

  • block: [0x0, 0x0]
  • proof: [1b79, 3bc7]

Chester Verifies

  • Hashes the block to get its leaf: cbdd
  • Combines cbdd with 1b79 from the proof to get the parent: 999b
  • Combines 999b with 3bc7 from the proof to get a candidate root
  • Accepts the block only if the candidate root equals the trusted hash

Storj Proof-of-Retrievability

uses a Merkle tree to prove someone is storing a file

Shard File

  • xenu.txt: 1010101100101010010010010101001010101010101101010111010101101010
  • shards:
    • 0110101001101010
    • 1010101010110101
    • 1010101100101010
    • 0100100101010010

Chooses a Shard

  • Client picks xenu.txt[0]: 1010101010110101

Generate Salts

  • S0: 7bd73c0ded23ae5bfae3f3dccc744d54d11a5655d6167232b1a795e633a06a2b
  • S1: 4a892b94bc214c1c7228105638c78ba82ac22d8e457e8737012aba46673b429f
  • S2: 4e0947df601b0412815fb38110b4c027ed6f1da4da973b8b7e7f9fad9eb5af5d

Concatenate Salt + Data

  • Sₓ|data:
    • 011010101010110101
    • 111010101010110101
    • 000000000000000000
    • 001010101010110101

Hash

  • leaves = H(H(Sₓ|data)): 1b79, 2e79, fff6, 7a9e

Build Merkle Tree

  • parents: 0896, 3bc7
  • root: 63d5

Store Salts & Merkle Root

  • Client keeps the salts (S0, S1, S2) and the root (63d5)

Send Data + Leaves

  • Client sends xenu.txt shard 0 and the leaves to the storage vendor

Who Has What?

Client
  • Merkle root
  • Salts
Storage Vendor
  • Shard data
  • Merkle leaves

A Challenge!

  • Client sends one of its salts to the storage vendor

Receives Challenge

  • Storage vendor receives salt S0 for xenu.txt shard 0

Prepend Salt

  • 111010101010110101

Compute Leaf

  • 2e79

Find Offset in Leafset

  • 2e79 is the second of the leaves 1b79, 2e79, fff6, 7a9e

Transmit Proof

  • Storage vendor sends: 2e79, 1b79, 3bc7, 63d5

Receive Response

  • Client receives: 2e79, 1b79, 3bc7, 63d5
  • Client knows: salt: S0, root: 63d5

Validate Proof

  • H(1b79 | 2e79) = 0896
  • H(0896 | 3bc7) = 63d5

A Match!

  • The recomputed root equals the stored root 63d5
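
The whole exchange can be sketched end to end. This is an illustration under stated assumptions and not Storj's implementation. The function names are hypothetical and SHA-256 is an arbitrary choice. Four salts are generated so the tree is full, although the slides list three. One deliberate difference from the slides: the vendor returns the inner hash H(salt | data), not the finished leaf, and the client hashes it once more. That way a valid answer cannot be read straight out of the stored leaves.

```ruby
require 'digest'
require 'securerandom'

def h(data)
  Digest::SHA256.hexdigest(data)
end

# Root of a binary tree over the leaves. Assumes a power-of-two leaf count.
def root_of(leaves)
  level = leaves
  level = level.each_slice(2).map { |l, r| h(l + r) } while level.size > 1
  level.first
end

# Client, before upload: one leaf H(H(salt | data)) per salt.
shard  = '1010101010110101'
salts  = Array.new(4) { SecureRandom.hex(32) }
leaves = salts.map { |salt| h(h(salt + shard)) }
root   = root_of(leaves)   # the client keeps only the salts and this root

# Storage vendor: holds the shard and the leaves, answers a salt challenge.
def respond(salt, shard, leaves)
  inner    = h(salt + shard)           # requires the real shard data
  position = leaves.index(h(inner))    # find offset in leafset
  return nil if position.nil?

  path  = []
  level = leaves
  while level.size > 1
    path << level[position ^ 1]        # sibling at this level
    level = level.each_slice(2).map { |l, r| h(l + r) }
    position /= 2
  end
  { inner: inner, path: path }
end

# Client: rebuild the path for the challenged salt's position.
def valid?(response, position, root)
  return false if response.nil?

  computed = response[:path].reduce(h(response[:inner])) do |acc, sibling|
    parent = position.even? ? h(acc + sibling) : h(sibling + acc)
    position /= 2
    parent
  end
  computed == root
end

honest = respond(salts[1], shard, leaves)
cheat  = respond(salts[1], 'not the shard', leaves)
puts valid?(honest, 1, root)   # true
puts valid?(cheat, 1, root)    # false
```

Each salt is good for one challenge, so the number of salts bounds how many times the client can audit the vendor.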

In Summary

  • It's just a data structure
  • Tunable properties for space / proof size
  • Used by lots of P2P projects
  • Get out there and experiment!

Links

Thanks!

  • Check out a reference implementation:

    github.com/laser/go-merkle-tree

CHADEV: Merkle Tree

By laser
