Merkle Bundles
Make Downloading JavaScript Cool Again
@liamzebedee | liamz.co
v2 - evolved from presenting at TBD Hackathon + Web3 Summit Berlin
Context
- Distributing apps on the web is like sending the Bible for every single church reading
	- Bundles > 2MB
- Recent innovations:
	- Webpack
	- Yarn / NPM united packages
	- React
	- Styled-components
	- PWAs
- The future:
	- Edge computing
	- ?Blockchains?
Current things
- Bundle identifiers (2abd79e7d9fg.js) and c-hash-ing
- SplitChunks
	- Static analysis of dependencies into a graph
	- Split out into many <script src=> and load async
- Minimization (FooBar -> a.b)
- Prefetch/preload
- HTTP2 Push
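The bundle identifiers in the first bullet come from hashing the file's contents, so the name changes exactly when the bytes change. A minimal sketch of that idea, assuming a toy FNV-1a-style hash instead of whatever hash a real bundler uses; `toyHash` and `bundleFileName` are hypothetical names, not a bundler API:

```typescript
// Toy 32-bit FNV-1a-style string hash (assumption: illustration only,
// real bundlers use their own hash functions).
function toyHash(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, modulo 2^32.
    h = (h + (h << 1) + (h << 4) + (h << 7) + (h << 8) + (h << 24)) >>> 0;
  }
  return ("00000000" + h.toString(16)).slice(-8);
}

// Hypothetical helper: derive a cache-busting file name from the contents.
function bundleFileName(base: string, source: string): string {
  return base + "." + toyHash(source) + ".js";
}

const v1 = bundleFileName("bundle", "console.log('hello');");
const v1Again = bundleFileName("bundle", "console.log('hello');");
const v2 = bundleFileName("bundle", "console.log('hello!');");

// Same bytes give the same name, so a cached copy stays valid.
console.log(v1 === v1Again);
// One changed character gives a new name, so the whole file is fetched again.
console.log(v1 === v2);
```

The hash covers the entire file, so even a one-character edit invalidates the whole bundle.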
Problems/Opportunities
Problems:
- Small changes => more download => less battery life (mobile)
- Download a whole lot more than I need to
- We don’t really cache code on webpages, so reloads are always slow
Opportunities:
- XKCD (web is instant apps) - AR etc.
- WebAssembly structured stack machine
- P2P CDNs?
Task
Deliver code for execution
- Minimize bandwidth
- Minimize communications overhead
- Same/negligible difference in processing power
Ideation
- What if we downloaded only what changed?
	- How?
- Naive approach: delta
	- Requires knowledge of \( C_t \) for every \( 0..t \)
	- Problem: How can we synchronise state without high bandwidth usage?
- We already compare the hashes of assets, but this is for the entire file. Can we do it on a more granular scale?
Merkle Trees and AST’s
- Merkle tree (popularised in blockchain world)
	- A tree data structure
	- Where every non-leaf node is the hash of its children
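The definition above can be sketched in a few lines. This is an illustration under stated assumptions: `toyHash` is a toy stand-in for a cryptographic hash (a real system would use something like SHA-256), and `leaf` / `branch` are hypothetical helper names.

```typescript
// Toy stand-in for a cryptographic hash (assumption: illustration only).
function toyHash(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = (h + (h << 1) + (h << 4) + (h << 7) + (h << 8) + (h << 24)) >>> 0;
  }
  return ("00000000" + h.toString(16)).slice(-8);
}

interface MerkleNode {
  hash: string;
  children: MerkleNode[];
}

// A leaf hashes its own data.
function leaf(data: string): MerkleNode {
  return { hash: toyHash("leaf:" + data), children: [] };
}

// A non-leaf node hashes the hashes of its children, in order.
function branch(children: MerkleNode[]): MerkleNode {
  const joined = children.map((c) => c.hash).join(",");
  return { hash: toyHash("node:" + joined), children: children };
}

const treeA = branch([branch([leaf("a"), leaf("b")]), leaf("c")]);
const treeB = branch([branch([leaf("a"), leaf("b")]), leaf("c")]);
const treeC = branch([branch([leaf("a"), leaf("B")]), leaf("c")]);

// Equal content gives equal root hashes.
console.log(treeA.hash === treeB.hash);
// A changed leaf changes every hash on its path to the root.
console.log(treeA.hash === treeC.hash);
// The untouched leaf "c" keeps its hash.
console.log(treeA.children[1].hash === treeC.children[1].hash);
```

Comparing two root hashes is enough to tell whether anything below them differs.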
		

Merkle Trees and AST’s
Compilers turn code into lower-level code. Example:
// Compile JSX (React syntax) to JS (vanilla JavaScript)
//
const fs = require('fs')
const source = fs.readFileSync('something.jsx', 'utf8')
// const blah = () => {
//    return <div>Hello world</div>;
// }
// export default blah;
const compiled = runLoaders(source) // runLoaders: placeholder for the bundler's loader pipeline
// module.exports = function blah() {
//    return React.createElement('div', null, "Hello world")
// }
Merkle Trees and AST’s
Compilation:
- lexed tokens (BRACE_START BRACE_END)
- abstract syntax tree
- bytecode
An AST

Merkle AST's
- Take the AST of code
- Generate a Merkle tree as a compact representation of our AST
What does that look like?
Merkle AST's - Example
function shout() {
    console.log("BLOCKCHAIN!");
}
// AST
Function [name=shout]
    Call [name=console.log]
        Literal ["BLOCKCHAIN!"]
// Merkle AST
42
    123
        543
Merkle AST's - Example
function shout() {
    console.log("BLOCKCHAIN!", "IOT!");
}
// AST
Function [name=shout]
    Call [name=console.log]
        Literal ["BLOCKCHAIN!"]
        Literal ["IOT!"]
// Merkle AST
86*
    99*
        543
        264*
* changed
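One way the hashes in this example could be derived: hash each AST node together with the hashes of its children. A minimal sketch, assuming a simplified AST shape (`type`, `label`, `children`) and a toy hash; `merkleize` and `shoutAst` are hypothetical names, and the real project may encode nodes differently.

```typescript
// Toy stand-in for a cryptographic hash (assumption: illustration only).
function toyHash(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = (h + (h << 1) + (h << 4) + (h << 7) + (h << 8) + (h << 24)) >>> 0;
  }
  return ("00000000" + h.toString(16)).slice(-8);
}

// Simplified AST node (assumption: real ASTs carry far more detail).
interface AstNode {
  type: string;
  label: string;
  children: AstNode[];
}

interface MerkleAstNode {
  hash: string;
  children: MerkleAstNode[];
}

// Hash a node from its own content plus the hashes of its children.
function merkleize(node: AstNode): MerkleAstNode {
  const children = node.children.map((child) => merkleize(child));
  const childHashes = children.map((c) => c.hash).join(",");
  return {
    hash: toyHash(node.type + "|" + node.label + "|" + childHashes),
    children: children,
  };
}

// Builds the AST of shout() with the given console.log arguments.
function shoutAst(args: string[]): AstNode {
  const literals: AstNode[] = args.map((a) => ({ type: "Literal", label: a, children: [] }));
  const call: AstNode = { type: "Call", label: "console.log", children: literals };
  return { type: "Function", label: "shout", children: [call] };
}

const before = merkleize(shoutAst(["BLOCKCHAIN!"]));
const after = merkleize(shoutAst(["BLOCKCHAIN!", "IOT!"]));

// Function and Call hashes change; the first Literal keeps its hash.
console.log(before.hash !== after.hash);
console.log(before.children[0].hash !== after.children[0].hash);
console.log(before.children[0].children[0].hash === after.children[0].children[0].hash);
```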
Merkle AST's - Example
Send client tree to server, compare the Merkle AST's
Client
42
    123
        543
Server
86
    99
        543
        264
What's changed?
86*
    99*
        543
        264*
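The comparison can skip any subtree whose hash already matches. A minimal sketch using the toy hashes from this example; `changedPaths` and `nodeAt` are hypothetical names, and matching children by position is an assumption, not necessarily the project's diff strategy.

```typescript
interface MerkleNode {
  hash: string;
  children: MerkleNode[];
}

function node(hash: string, children: MerkleNode[]): MerkleNode {
  return { hash: hash, children: children };
}

// Returns the paths (child indexes from the root) of server nodes
// that differ from, or are missing in, the client tree.
function changedPaths(
  client: MerkleNode | undefined,
  server: MerkleNode,
  path: number[] = []
): number[][] {
  // Identical hash means the whole subtree is identical: nothing to send.
  if (client !== undefined && client.hash === server.hash) return [];
  let changed: number[][] = [path];
  server.children.forEach((child, i) => {
    const clientChild = client !== undefined ? client.children[i] : undefined;
    changed = changed.concat(changedPaths(clientChild, child, path.concat([i])));
  });
  return changed;
}

// Follows a path of child indexes down from the root.
function nodeAt(root: MerkleNode, path: number[]): MerkleNode {
  return path.reduce((current, i) => current.children[i], root);
}

const client = node("42", [node("123", [node("543", [])])]);
const server = node("86", [node("99", [node("543", []), node("264", [])])]);

const changedHashes = changedPaths(client, server).map((p) => nodeAt(server, p).hash);
console.log(changedHashes.join(", ")); // 86, 99, 264
```

Positional matching is the simplest choice; inserting a node at the front of a child list would shift every sibling and mark them all as changed.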
Merkle AST's - Example
We send only the diffs to the client

They patch their source
And run the bundle
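A sketch of the client side, under stated assumptions: the patch format (`path` plus replacement `subtree`), the simplified AST shape, and the names `applyPatch` and `render` are all hypothetical, chosen only to make the patch step concrete.

```typescript
// Simplified AST node (assumption: real ASTs carry far more detail).
interface AstNode {
  type: string;
  label: string;
  children: AstNode[];
}

// Hypothetical patch format: put `subtree` at `path` (child indexes from the root).
interface Patch {
  path: number[];
  subtree: AstNode;
}

// Returns a new tree with the patch applied; the original tree is not mutated.
function applyPatch(root: AstNode, patch: Patch): AstNode {
  if (patch.path.length === 0) return patch.subtree;
  const head = patch.path[0];
  const rest = patch.path.slice(1);
  const children = root.children.slice();
  if (head > children.length || (head === children.length && rest.length > 0)) {
    throw new Error("patch path does not exist in the client tree");
  }
  // head === children.length appends a new child; otherwise recurse into it.
  children[head] =
    head < children.length
      ? applyPatch(children[head], { path: rest, subtree: patch.subtree })
      : patch.subtree;
  return { type: root.type, label: root.label, children: children };
}

// Turns the toy AST back into source text.
function render(ast: AstNode): string {
  const parts = ast.children.map((child) => render(child));
  if (ast.type === "Function") {
    return "function " + ast.label + "() {\n    " + parts.join("\n    ") + "\n}";
  }
  if (ast.type === "Call") return ast.label + "(" + parts.join(", ") + ");";
  return JSON.stringify(ast.label); // Literal
}

const oldAst: AstNode = {
  type: "Function",
  label: "shout",
  children: [
    {
      type: "Call",
      label: "console.log",
      children: [{ type: "Literal", label: "BLOCKCHAIN!", children: [] }],
    },
  ],
};

// The only node the client was missing: the second argument.
const patched = applyPatch(oldAst, {
  path: [0, 1],
  subtree: { type: "Literal", label: "IOT!", children: [] },
});

console.log(render(patched));
```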
The Smart CDN
- Client only has to send root hash of every <script> / bundle it includes for server to perform AST diff
- Server only has to keep:
	- 1) the merkle trees of all previous versions
	- 2) the most recent source code in memory
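The request flow this implies can be sketched as follows. Everything here is an assumption for illustration: the `SmartCdn` class, the `CdnReply` shape, and the fallback to a full download for unknown root hashes are not taken from the actual implementation.

```typescript
interface MerkleNode {
  hash: string;
  children: MerkleNode[];
}

// Hypothetical reply shapes for the three cases the server can hit.
type CdnReply =
  | { kind: "up-to-date" }
  | { kind: "diff"; base: MerkleNode; target: MerkleNode }
  | { kind: "full"; source: string };

class SmartCdn {
  // 1) Merkle trees of all previous versions, keyed by root hash.
  private trees: Record<string, MerkleNode> = {};
  // 2) Only the most recent source is kept.
  private latest: MerkleNode | undefined = undefined;
  private latestSource = "";

  publish(tree: MerkleNode, source: string): void {
    this.trees[tree.hash] = tree;
    this.latest = tree;
    this.latestSource = source;
  }

  // The client sends nothing but the root hash of the bundle it holds.
  handle(clientRootHash: string): CdnReply {
    if (this.latest === undefined) throw new Error("nothing published yet");
    if (clientRootHash === this.latest.hash) return { kind: "up-to-date" };
    if (Object.prototype.hasOwnProperty.call(this.trees, clientRootHash)) {
      // Known old version: the server can walk both trees and send a diff.
      return { kind: "diff", base: this.trees[clientRootHash], target: this.latest };
    }
    // Unknown version: fall back to sending everything.
    return { kind: "full", source: this.latestSource };
  }
}

const cdn = new SmartCdn();
cdn.publish(
  { hash: "42", children: [{ hash: "123", children: [{ hash: "543", children: [] }] }] },
  'function shout() { console.log("BLOCKCHAIN!"); }'
);
cdn.publish(
  {
    hash: "86",
    children: [
      { hash: "99", children: [{ hash: "543", children: [] }, { hash: "264", children: [] }] },
    ],
  },
  'function shout() { console.log("BLOCKCHAIN!", "IOT!"); }'
);

console.log(cdn.handle("86").kind); // up-to-date
console.log(cdn.handle("42").kind); // diff
console.log(cdn.handle("7").kind); // full
```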
 
Wishlist: deliver binary AST's

Some unintended benefits
- Code integrity (request by hash)
- Code deduplication:
	- you include:
		- 1) bootstrap.js, which includes jQuery
		- 2) bundle.js, which includes jQuery
	- any chunks with IDs common to the jQuery trees won't be sent... only transmit code once
- Code privacy - merkle AST can be simply a state sync primitive for proprietary code
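The deduplication point can be illustrated with a small filter: the server skips any chunk whose ID the client already holds. A minimal sketch, assuming chunk IDs are hashes of the chunk's code; `Chunk`, `makeChunk` and `chunksToSend` are hypothetical names, and the hash is a toy stand-in.

```typescript
// Toy stand-in for a cryptographic hash (assumption: illustration only).
function toyHash(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = (h + (h << 1) + (h << 4) + (h << 7) + (h << 8) + (h << 24)) >>> 0;
  }
  return ("00000000" + h.toString(16)).slice(-8);
}

interface Chunk {
  id: string; // hash of the chunk's code
  code: string;
}

function makeChunk(code: string): Chunk {
  return { id: toyHash(code), code: code };
}

// Keep only chunks the client does not hold yet, and never send one twice.
function chunksToSend(clientHas: string[], wanted: Chunk[]): Chunk[] {
  const seen: Record<string, boolean> = {};
  clientHas.forEach((id) => {
    seen[id] = true;
  });
  return wanted.filter((chunk) => {
    if (seen[chunk.id] === true) return false;
    seen[chunk.id] = true;
    return true;
  });
}

const jquery = "/* jQuery source */";
const bootstrapJs = [makeChunk(jquery), makeChunk("/* bootstrap code */")];
const bundleJs = [makeChunk(jquery), makeChunk("/* app code */")];

// First script: the client holds nothing, so both chunks are sent.
const firstSend = chunksToSend([], bootstrapJs);
// Second script: the jQuery chunk is already on the client and is skipped.
const secondSend = chunksToSend(firstSend.map((c) => c.id), bundleJs);

console.log(firstSend.length); // 2
console.log(secondSend.map((c) => c.code).join()); // /* app code */
```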
Vision: A truly evolving platform
Let's evict things from the cache at a rate inversely proportional to how often they're required
e.g. React is loaded almost every day by FB/Netflix, so it never escapes the cache
Why? The web is the biggest distributed consensus-driven open platform we have.
Let's make mobile web apps INSTANT to load.
Crazier ideas...
- BitTorrent NPM that never goes down
- Download dApps from others without Internet
	- ENS => hash of app root code
	- imagine distributing revolutionary software via AirDrop
- Build a Token Curated Registry adblocker based on the ID of code chunks
	- Market coordinated force against shitty web advertising
 
Current Status (WIP)
Feature-complete MVP for client/server in TypeScript. E2E testing/stats with Puppeteer.


Let's make dApps better and the web a competitive pressure on mobile

TODO:
Need to start tracking bundles over time to collect stats
Questions?
Merkle Bundles
By Liam Zebedee
V2 of this idea, previous slides somewhere else