The Merkle Tree
Building Block of the Decentralized Web


All About Me
- I work at Carbon Five in Los Angeles
- We will program your computer for you
- I'm working on a P2P project for PL


All About Ralph
Who is Ralph Merkle?
- A computer scientist
- A pioneer of cryptographic hashing (co-inventor of the Merkle–Damgård construction)
- Co-invented public-key cryptography
- Invented the Merkle "hash" tree; patent filed in 1979
- Patent expired in 2002
HODL GANG





Cryptographic Hash
- A one-way, collision-resistant function which maps an arbitrary number of bytes to a fixed-length bit string
- one-way: difficult to compute the input from the output
- collision-resistant: difficult to find two inputs which map to the same output
What's a Hash?
- A one-way function which maps an arbitrary number of bytes to a fixed-length bit array
2.4.1 :001 > require 'digest'
=> true
2.4.1 :004 > Digest::SHA1.hexdigest 'out with the californians'
=> "63392a6bb1c162dc79c3fc62f7860f61505f262e"
2.4.1 :006 > Digest::MD5.hexdigest 'they are driving up the rents'
=> "036d059f4ec024327107ff5976419714"
you already use this stuff
you are a scientist
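The two properties above (fixed-length output, unrelated digests for nearby inputs) can be checked in a few lines. This is only a sketch; it swaps in SHA-256, since SHA-1 and MD5 from the slide are no longer considered collision-resistant.

```ruby
require 'digest'

# SHA-256 is an assumption of this sketch; the slides use SHA-1 and MD5.
short = Digest::SHA256.hexdigest('a')
long  = Digest::SHA256.hexdigest('a' * 1_000_000)

# Inputs that differ by one character give unrelated-looking digests.
d1 = Digest::SHA256.hexdigest('out with the californians')
d2 = Digest::SHA256.hexdigest('out with the californians!')

puts short.length  # 64 hex characters = 256 bits
puts long.length   # 64 again, despite the much larger input
puts d1 == d2      # false
```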
What's a Merkle Tree?
- A tree in which all leaves are the hash of some data (a "block")
- All non-leaves are hashes of their children's hashes
Learning By Building

duvall.txt (8 bytes)
file:    1010101100101010010010010101001010101010101101010111010101101010
blocks:  0110101001101010  1010101010110101  1010101100101010  0100100101010010
leaves:  1b79  2e79  fff6  7a9e
parents: 0896  3bc7
root:    63d5
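The build-up above (blocks, then leaves, then parents, then root) can be sketched as follows. The helper names `h` and `merkle_levels`, the use of SHA-256, and the stand-in file contents are assumptions for illustration; the slide's 4-digit hashes are not reproduced.

```ruby
require 'digest'

# Hypothetical helper: hex SHA-256 of a string.
def h(data)
  Digest::SHA256.hexdigest(data)
end

# Returns every level of the tree, leaves first, root level last.
# Assumes a non-empty list whose length is a power of two.
def merkle_levels(blocks)
  raise ArgumentError, 'need at least one block' if blocks.empty?
  levels = [blocks.map { |b| h(b) }]
  until levels.last.length == 1
    levels << levels.last.each_slice(2).map { |left, right| h(left + right) }
  end
  levels
end

file   = 'duvall!!'            # stand-in 8-byte file, not the slide's bits
blocks = file.scan(/.{2}/m)    # four 2-byte blocks
levels = merkle_levels(blocks)
root   = levels.last.first

puts levels.map(&:length).inspect  # [4, 2, 1]
```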
Maintaining Balance

secret.txt (6 bytes)
file:    101010110010101001001001010100101010101010110101
blocks:  0110101001101010  1010101010110101  0100100101010010
leaves:  1b79  2e79  7a9e  ???
parents: 0896  ???

Three blocks give three leaves, so the last leaf has no sibling. Duplicating it balances the tree:

leaves:  1b79  2e79  7a9e  7a9e
parents: 0896  bcc7
root:    63d5
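A minimal sketch of the balancing trick, assuming SHA-256 and a hypothetical `merkle_root` helper: whenever a level has an odd number of nodes, its last node is repeated before pairing.

```ruby
require 'digest'

def h(data)
  Digest::SHA256.hexdigest(data)
end

# Merkle root in which a level with an odd node count repeats its last node.
def merkle_root(blocks)
  raise ArgumentError, 'need at least one block' if blocks.empty?
  level = blocks.map { |b| h(b) }
  while level.length > 1
    level += [level.last] if level.length.odd?
    level = level.each_slice(2).map { |left, right| h(left + right) }
  end
  level.first
end

three = %w[ab cd ef]
four  = %w[ab cd ef ef]
puts merkle_root(three) == merkle_root(four)  # true
```

The last line also shows a caveat of this padding: a three-block file and a four-block file whose last block is repeated share a root, so real systems usually record the block count alongside the root.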
Why Use Them?

Immutable

xenu.txt (8 bytes), original
file:    1010101100101010010010010101001010101010101101010111010101101010
blocks:  0110101001101010  1010101010110101  1010101100101010  0100100101010010
leaves:  1b79  2e79  fff6  7a9e
parents: 0896  3bc7
root:    63d5

xenu.txt (8 bytes), after changing one block
file:    1010101100101010111111111111111110101010101101010111010101101010
blocks:  1111111111111111  1010101010110101  1010101100101010  0100100101010010
leaves:  4402  2e79  fff6  7a9e
parents: 6988  3bc7
root:    c6ff

Changing one block changes its leaf (1b79 to 4402), its parent (0896 to 6988), and the root (63d5 to c6ff).
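The same effect in code, as a small sketch with assumed names (`h`, `merkle_root`) and SHA-256: altering any one block yields a different root.

```ruby
require 'digest'

def h(data)
  Digest::SHA256.hexdigest(data)
end

# Root of a tree over a power-of-two number of blocks.
def merkle_root(blocks)
  level = blocks.map { |b| h(b) }
  level = level.each_slice(2).map { |l, r| h(l + r) } while level.length > 1
  level.first
end

original = %w[ab cd ef gh]
tampered = %w[ab XX ef gh]   # one block changed

puts merkle_root(original) == merkle_root(tampered)  # false
```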
Small Inclusion Proofs

magna.crta (32 bytes)
[Tree diagram: binary Merkle tree with 16 leaves and root efaf, 31 hashes in total]
Challenger: "Prove to me that 77ac exists at offset 2 in the tree with hash efaf."
Prover: "Sure. 078d, f8aa, ab33, abcd."
tree size: 31 hashes
proof size: 4 hashes
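The challenger/prover exchange can be sketched like this. The function names, the SHA-256 choice, and the 16 generated stand-in blocks are assumptions; the proof is the list of sibling hashes on the path from the leaf to the root.

```ruby
require 'digest'

def h(data)
  Digest::SHA256.hexdigest(data)
end

# All levels of a binary tree, leaves first (leaf count is a power of two).
def build_levels(leaves)
  levels = [leaves]
  until levels.last.length == 1
    levels << levels.last.each_slice(2).map { |l, r| h(l + r) }
  end
  levels
end

# Prover: sibling hashes from the leaf at `index` up to, not including, the root.
def inclusion_proof(levels, index)
  levels[0...-1].map do |level|
    sibling = level[index ^ 1]   # flip the low bit to get the sibling
    index /= 2                   # move to the parent's position
    sibling
  end
end

# Challenger: fold the proof back up and compare with the known root.
def verify(leaf, index, proof, root)
  computed = proof.reduce(leaf) do |acc, sibling|
    node = index.even? ? h(acc + sibling) : h(sibling + acc)
    index /= 2
    node
  end
  computed == root
end

leaves = (0...16).map { |i| h([i].pack('n')) }   # 16 stand-in 2-byte blocks
levels = build_levels(leaves)
root   = levels.last.first
proof  = inclusion_proof(levels, 2)

puts levels.sum(&:length)                  # 31
puts proof.length                          # 4
puts verify(leaves[2], 2, proof, root)     # true
puts verify(leaves[3], 2, proof, root)     # false
```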
Tunable Properties

magna.crta (32 bytes)
[Tree diagram: the same 16 leaves under a tree with four children per node and root efaf, 21 hashes in total]
Challenger: "Prove to me that 77ac exists at offset 2 in the tree with hash efaf."
Prover: "Sure. 3918, f8ab, 078d, accc, cd23, fda1."
tree size: 21 hashes
proof size: 6 hashes
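The trade-off between the two slides is plain arithmetic. A rough sketch, with `tree_stats` as a made-up helper, assuming a full tree in which every node has `k` children and a proof needs the `k - 1` siblings at each level below the root:

```ruby
# Hash count of a full k-ary tree over `leaves` leaves, and the number of
# sibling hashes in one inclusion proof. Assumes `leaves` is a power of k.
def tree_stats(leaves, k)
  depth = 0
  tree_size = 0
  width = leaves
  loop do
    tree_size += width
    break if width == 1
    width /= k
    depth += 1
  end
  { tree_size: tree_size, proof_size: depth * (k - 1) }
end

[2, 4].each do |k|
  s = tree_stats(16, k)
  puts "k=#{k}: #{s[:tree_size]} hashes, proof of #{s[:proof_size]}"
end
# k=2: 31 hashes, proof of 4
# k=4: 21 hashes, proof of 6
```

A wider tree stores fewer hashes but needs a longer proof.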
Real-World Uses
Cassandra DB Repair Protocol
uses a Merkle tree to synchronize replicas





Cassandra Distributed DB
[Ring diagram: five nodes, which own tokens 0-40, 41-80, 81-120, 121-160, and 161-200]
Replication Factor N=2
[Ring diagram: one node owns tokens 0-40, and two other nodes hold replicated tokens 0-40]
Repair Initiated
[Ring diagram: the same owner and replicas, with a repair coordinator labeled]
Merkleize Partitions
repair coordinator to Replicas A, B, and C: "gimme tree!"
Each replica hashes its partitions into leaves, then hashes the leaves into parents:

Replica    partitions            leaves                parents
Replica A  1111 1010 0001 0000   2c55 cbdd 0001 1b79   999b 3bc7
Replica B  1100 1010 0001 0000   3d66 cbdd 0001 1b79   999b 4bbb
Replica C  1111 1010 0001 0000   2c55 cbdd 0001 1b79   999b 3bc7
Consistency Check
The repair coordinator compares the roots of the three trees:

Replica    partitions            leaves                parents     root
Replica A  1111 1010 0001 0000   2c55 cbdd 0001 1b79   999b 3bc7   ef43
Replica B  1100 1010 0001 0000   3d66 cbdd 0001 1b79   999b 4bbb   ff11
Replica C  1111 1010 0001 0000   2c55 cbdd 0001 1b79   999b 3bc7   ef43

Quorum
Replicas A and C agree on root ef43; Replica B, with root ff11, is the odd one out.
Mismatch Detected
The trees of Replica B and Replica C are compared from the root down, following only the hashes that differ:

            Replica B             Replica C
root        ff11                  ef43
parents     999b 4bbb             999b 3bc7
leaves      3d66 cbdd 0001 1b79   2c55 cbdd 0001 1b79
partitions  1100 1010 0001 0000   1111 1010 0001 0000

Bad Block Found!
Replica B's partition 1100 (leaf 3d66) differs from Replica C's 1111 (leaf 2c55).
Repair Complete
After the repair, Replica B matches Replica C:

Replica    partitions            leaves                parents     root
Replica B  1111 1010 0001 0000   2c55 cbdd 0001 1b79   999b 3bc7   ef43
Replica C  1111 1010 0001 0000   2c55 cbdd 0001 1b79   999b 3bc7   ef43
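The top-down comparison can be sketched as a recursive walk that descends only into subtrees whose hashes differ. This is an illustration of the idea on the slides, not Cassandra's actual repair code; `build_levels` and `diff_leaves` are hypothetical names and SHA-256 is assumed.

```ruby
require 'digest'

def h(data)
  Digest::SHA256.hexdigest(data)
end

# All levels of a binary tree over the blocks, leaves first.
def build_levels(blocks)
  levels = [blocks.map { |b| h(b) }]
  until levels.last.length == 1
    levels << levels.last.each_slice(2).map { |l, r| h(l + r) }
  end
  levels
end

# Indices of differing leaves; equal subtrees are skipped without a visit.
def diff_leaves(a, b, level = a.length - 1, index = 0)
  return [] if a[level][index] == b[level][index]
  return [index] if level.zero?
  diff_leaves(a, b, level - 1, 2 * index) +
    diff_leaves(a, b, level - 1, 2 * index + 1)
end

replica_b = build_levels(%w[1100 1010 0001 0000])
replica_c = build_levels(%w[1111 1010 0001 0000])

puts diff_leaves(replica_b, replica_c).inspect  # [0]
```

Only the partitions at the returned indices need to be streamed between replicas.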
BitTorrent Content Verification
uses a Merkle tree to authenticate content from untrusted sources

BitTorrent (BEP-30)
Chester searches for: warez.exe
A trusted source sends:
hash: 63d5
pieces: 4
Chester broadcasts to peers Paula, Patricia, Pedro, and Pat: want warez.exe[0]
Pat sends: warez[0]+proof

Chester Had (doesn't have these blocks!)
hash: 63d5
pieces: 4
Pat Sent
block: [0x0, 0x0]
proof: [1b79, 3bc7]
[Tree diagram, built up over several slides: Chester hashes the block Pat sent, combines that hash with each proof hash in turn to rebuild the path to the root, and compares the result with the trusted hash]
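A sketch of Chester's check, assuming SHA-256 and made-up piece contents. It does not follow BEP-30's actual message format or hash function; it only shows that a trusted root plus a short proof is enough to accept or reject a piece from an untrusted peer.

```ruby
require 'digest'

def h(data)
  Digest::SHA256.hexdigest(data)
end

# Downloader: recompute the root from an untrusted piece and its proof.
def root_from_proof(piece, index, proof)
  proof.reduce(h(piece)) do |acc, sibling|
    node = index.even? ? h(acc + sibling) : h(sibling + acc)
    index /= 2
    node
  end
end

# Seeder side, for illustration only: four stand-in pieces.
pieces = %w[p0 p1 p2 p3]
leaves = pieces.map { |p| h(p) }
left   = h(leaves[0] + leaves[1])
right  = h(leaves[2] + leaves[3])
trusted_root = h(left + right)        # what the trusted source hands out
proof_for_0  = [leaves[1], right]     # what a peer sends along with piece 0

genuine = root_from_proof('p0', 0, proof_for_0) == trusted_root
forged  = root_from_proof('evil', 0, proof_for_0) == trusted_root

puts genuine  # true
puts forged   # false
```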

Storj Proof-of-Retrievability
uses a Merkle tree to prove someone is storing a file


Shard File
Client splits xenu.txt into shards:
file:   1010101100101010010010010101001010101010101101010111010101101010
shards: 0110101001101010  1010101010110101  1010101100101010  0100100101010010

Chooses a Shard
Client picks xenu.txt[0]: 1010101010110101

Generate Salts
S0: 7bd73c0ded23ae5bfae3f3dccc744d54d11a5655d6167232b1a795e633a06a2b
S1: 4a892b94bc214c1c7228105638c78ba82ac22d8e457e8737012aba46673b429f
S2: 4e0947df601b0412815fb38110b4c027ed6f1da4da973b8b7e7f9fad9eb5af5d

Concatenate Salt + Data
Sₓ|data for xenu.txt shard 0:
011010101010110101
111010101010110101
000000000000000000
001010101010110101

Hash
leaves, H(H(Sₓ|data)): 1b79  2e79  fff6  7a9e

Build Merkle Tree
parents: 0896  3bc7
root:    63d5

Store Salts & Merkle Root
Client keeps salts S0, S1, S2 and root 63d5.

Send Data + Leaves
Client sends the shard data and the Merkle leaves to the storage vendor.
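The client-side preparation can be sketched as below. The shard contents, the choice of four salts (so the tree is balanced), and SHA-256 are assumptions of this sketch rather than details of Storj's implementation.

```ruby
require 'digest'
require 'securerandom'

def h(data)
  Digest::SHA256.hexdigest(data)
end

shard  = 'stand-in bytes for xenu.txt shard 0'
salts  = Array.new(4) { SecureRandom.hex(32) }     # four random 256-bit salts
leaves = salts.map { |salt| h(h(salt + shard)) }   # H(H(Sx|data))

# Pair hashes up level by level until only the root is left.
level = leaves
level = level.each_slice(2).map { |l, r| h(l + r) } while level.length > 1
root = level.first

client_keeps = { salts: salts, root: root }        # small and secret
vendor_gets  = { shard: shard, leaves: leaves }    # bulk data plus leaves
```

Each salt is good for one future challenge, because the vendor learns it when challenged.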
Who Has What?
Client
- Merkle root
- Salts
Storage Vendor
- Shard data
- Merkle leaves
A Challenge!
Client sends the Storage Vendor a challenge.
Receives Challenge
Storage Vendor is challenged on xenu.txt shard 0 with salt: S0

Prepend Salt
S0|data: 111010101010110101

Compute Leaf
H(H(S0|data)): 2e79

Find Offset in Leafset
leaves:  1b79  2e79  fff6  7a9e
parents: 0896  3bc7
root:    63d5

Transmit Proof
Storage Vendor sends: 2e79, 1b79, 3bc7, 63d5

Receive Response
Client receives: 2e79, 1b79, 3bc7, 63d5
Client knows: salt: S0, root: 63d5

Validate Proof
Client hashes 2e79 together with 1b79, hashes the result together with 3bc7, and compares the outcome with its stored root.

A Match!
The recomputed root equals the stored root 63d5.
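The challenge round can be sketched end to end. `respond` and `valid?` are hypothetical names, the salts are fixed strings so the run is repeatable, and the proof here carries only sibling hashes (the slides also send the root along).

```ruby
require 'digest'

def h(data)
  Digest::SHA256.hexdigest(data)
end

def build_levels(leaves)
  levels = [leaves]
  until levels.last.length == 1
    levels << levels.last.each_slice(2).map { |l, r| h(l + r) }
  end
  levels
end

# Vendor: recompute the leaf from the challenge salt and the stored shard,
# find its offset among the stored leaves, and collect the sibling hashes.
def respond(salt, shard, leaves)
  leaf  = h(h(salt + shard))
  index = leaves.index(leaf)
  return nil if index.nil?   # no answer without the right data
  i = index
  proof = build_levels(leaves)[0...-1].map do |level|
    sibling = level[i ^ 1]
    i /= 2
    sibling
  end
  { leaf: leaf, index: index, proof: proof }
end

# Client: fold the proof back up and compare with the stored root.
def valid?(response, root)
  i = response[:index]
  computed = response[:proof].reduce(response[:leaf]) do |acc, sibling|
    node = i.even? ? h(acc + sibling) : h(sibling + acc)
    i /= 2
    node
  end
  computed == root
end

shard  = 'stand-in shard bytes'
salts  = %w[s0 s1 s2 s3]
leaves = salts.map { |s| h(h(s + shard)) }
root   = build_levels(leaves).last.first

honest = respond('s1', shard, leaves)
puts valid?(honest, root)                            # true
puts respond('s1', 'some other data', leaves).nil?   # true
```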
In Summary
- It's just a data structure
- Tunable properties for space / proof size
- Used by lots of P2P projects
- Get out there and experiment!
Links
Understanding Merkle Trees: Why use them, who uses them, and how to use them
https://www.codeproject.com/Articles/1176140/Understanding-Merkle-Trees-Why-use-them-who-uses-t-
The Magic of the Merkle Tree
https://paulbellamy.com/2017/07/the-magic-of-the-merkle-tree
BitTorrent BEP30 (Merkle hash torrent extension)
http://bittorrent.org/beps/bep_0030.html
Bitcoin Proof of Space
https://bitcointalk.org/index.php?topic=310323.0
Storj Proof of Retrievability
https://storj.io/storj.pdf
Thanks!
- Check out a reference implementation:
github.com/laser/go-merkle-tree
CHADEV: Merkle Tree
By laser