Merkle Bundles

Make Downloading JavaScript Cool Again

@liamzebedee | liamz.co

v2 - evolved from presenting at TBD Hackathon + Web3 Summit Berlin

Context

  • Distributing apps on the web is like sending the bible for every single church reading

    • Bundles > 2MB

  • Recent innovations:

    • Webpack

    • Yarn / NPM united packages

    • React

    • Styled-components

    • PWAs

  • The future

    • Edge computing

    • ?Blockchains?

Current things

  • Bundle identifiers (2abd79e7d9fg.js) and c-hash-ing

  • SplitChunks

    • Static analysis of dependencies into a graph

    • Split out into many <script src=> and load async

  • Minification (FooBar -> a.b)

  • Prefetch/preload

  • HTTP/2 Push
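
A minimal sketch of content-hashed bundle identifiers, assuming SHA-256 and a 12-character prefix. The helper name `hashedFilename` is hypothetical and not taken from any bundler.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: name a bundle after a hash of its contents,
// so the URL changes only when the bytes change.
function hashedFilename(contents, length = 12) {
  const digest = createHash('sha256').update(contents).digest('hex');
  return `${digest.slice(0, length)}.js`;
}

const v1 = hashedFilename('console.log("hello");');
const v2 = hashedFilename('console.log("hello!");');

// A one-character edit yields a new name, so the whole file is downloaded again.
console.log(v1, v2);
```

The hash covers the entire file, which is why a small change invalidates the whole bundle.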

Problems/Opportunities

Problems:

  • Small changes => more download => less battery life (mobile)

  • Download a whole lot more than I need to

  • We don’t really cache code on webpages, so reloads are always slow

Opportunities:

  • XKCD (web is instant apps) - AR etc.

  • WebAssembly structured stack machine

  • P2P CDNs?

 

Task

Deliver code for execution

  • Minimize bandwidth
  • Minimize communications overhead
  • Same/negligible difference in processing power

Ideation

  • What if we downloaded only what changed?

    • How?

  • Naive approach: delta

    • Requires knowledge of the code \( C_i \) at every version \( i = 0..t \)

    • Problem: How can we synchronise state without high bandwidth usage?

  • We already compare the hashes of assets, but only for the entire file. Can we do it at a more granular scale?

Merkle Trees and AST’s

  • Merkle tree (popularised in blockchain world)

    • A tree data structure

    • Where every non-leaf node is the hash of its children
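
A minimal sketch of that definition, assuming SHA-256 and a toy tree where a leaf is a string and a non-leaf is an array of children. The names `sha256` and `merkleHash` are illustrative.

```javascript
const { createHash } = require('node:crypto');

const sha256 = (text) => createHash('sha256').update(text).digest('hex');

// A leaf is a string; a non-leaf is an array of child nodes.
// Every non-leaf hash is derived only from the hashes of its children.
function merkleHash(node) {
  if (typeof node === 'string') return sha256(`leaf:${node}`);
  return sha256(`node:${node.map(merkleHash).join(',')}`);
}

const before = [['a', 'b'], ['c', 'd']];
const after = [['a', 'b'], ['c', 'D']];

// The root changes, but the untouched left subtree keeps its hash.
console.log(merkleHash(before) === merkleHash(after));       // false
console.log(merkleHash(before[0]) === merkleHash(after[0])); // true
```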

Merkle Trees and AST’s

Compilers turn code into lower-level code. Example:

 

// Compile JSX (React syntax) to JS (vanilla JavaScript)
//
// runLoaders is a stand-in for the bundler's loader pipeline

const fs = require('fs')
const source = fs.readFileSync('something.jsx', 'utf8')

// export default function blah() {
//    return <div>Hello world</div>;
// }


const compiled = runLoaders(source)

// module.exports = function blah() {
//    return React.createElement('div', null, "Hello world")
// }

Merkle Trees and AST’s

Compilation:

  • lexed tokens (BRACE_START BRACE_END)
  • abstract syntax tree
  • bytecode

An AST

Merkle AST's

  1. Take the AST of code
  2. Generate a Merkle tree as a compact representation of our AST
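
The two steps above can be sketched as follows. The AST node shape `{ type, value, children }` and the function name `merkleize` are assumptions for illustration, not the format used by merkle-bundles.

```javascript
const { createHash } = require('node:crypto');

const sha256 = (text) => createHash('sha256').update(text).digest('hex');

// Hypothetical AST node shape: { type, value, children }.
// Each node's hash covers its own label plus the hashes of its children.
function merkleize(node) {
  const children = (node.children || []).map(merkleize);
  const label = JSON.stringify([node.type, node.value ?? null]);
  const hash = sha256(label + '|' + children.map((c) => c.hash).join(','));
  return { hash, children };
}

const ast = {
  type: 'Function', value: 'shout',
  children: [{
    type: 'Call', value: 'console.log',
    children: [{ type: 'Literal', value: 'BLOCKCHAIN!' }],
  }],
};

const tree = merkleize(ast);
console.log(tree.hash); // root hash identifying this exact version of the code
```

The result mirrors the shape of the AST but holds only hashes, so it stays compact and reveals no source.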

 

What does that look like?

 

Merkle AST's - Example

function shout() {
    console.log("BLOCKCHAIN!");
}


// AST
Function [name=shout]
    Call [name=console.log]
        Literal ["BLOCKCHAIN!"]


// Merkle AST
42
    123
        543

Merkle AST's - Example

Now add a second argument:

function shout() {
    console.log("BLOCKCHAIN!", "IOT!");
}


// AST
Function [name=shout]
    Call [name=console.log]
        Literal ["BLOCKCHAIN!"]
        Literal ["IOT!"]


// Merkle AST (* = changed)
86*
    99*
        543
        264*

Merkle AST's - Example

Send client tree to server, compare the Merkle AST's

Client

42
    123
        543

Server

86
    99
        543
        264

What's changed?

86*
    99*
        543
        264*
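
A minimal sketch of the comparison, reusing the toy hashes from the slide. The node shape `{ hash, children }` and the positional matching of children are simplifying assumptions. Deleted nodes are not handled.

```javascript
// Hypothetical node shape: { hash, children }.
// Walk both trees together and skip any subtree whose hashes already match.
function diff(client, server, path = 'root', changed = []) {
  if (client && client.hash === server.hash) return changed; // identical subtree
  changed.push({ path, hash: server.hash });
  server.children.forEach((child, i) => {
    diff(client ? client.children[i] : undefined, child, `${path}/${i}`, changed);
  });
  return changed;
}

const client = { hash: 42, children: [{ hash: 123, children: [
  { hash: 543, children: [] },
] }] };
const server = { hash: 86, children: [{ hash: 99, children: [
  { hash: 543, children: [] },
  { hash: 264, children: [] },
] }] };

console.log(diff(client, server).map((c) => c.hash)); // [ 86, 99, 264 ]
```

Node 543 never appears in the result because its hash is the same on both sides.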

Merkle AST's - Example

We send only the diffs to the client

They patch their source

And run the bundle
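
A sketch of the patch step only, under an assumed patch format: each entry carries a non-empty `path` of child indices and the `node` to place there. Both the format and the name `applyPatch` are hypothetical.

```javascript
// Hypothetical patch format: [{ path: [childIndex, ...], node }], where
// `node` replaces (or appends) whatever sits at that path in the local tree.
// Paths are assumed to be non-empty, so the root itself is never replaced.
function applyPatch(root, patch) {
  const next = JSON.parse(JSON.stringify(root)); // leave the cached copy untouched
  for (const { path, node } of patch) {
    let parent = next;
    for (const index of path.slice(0, -1)) parent = parent.children[index];
    parent.children[path[path.length - 1]] = node;
  }
  return next;
}

const cached = {
  type: 'Call', value: 'console.log',
  children: [{ type: 'Literal', value: 'BLOCKCHAIN!', children: [] }],
};
const patch = [
  { path: [1], node: { type: 'Literal', value: 'IOT!', children: [] } },
];

const patched = applyPatch(cached, patch);
console.log(patched.children.map((c) => c.value)); // [ 'BLOCKCHAIN!', 'IOT!' ]
```

Turning the patched tree back into runnable source is a separate step that this sketch leaves out.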

The Smart CDN

  • Client only has to send root hash of every <script> / bundle it includes for server to perform AST diff
  • Server only has to keep:
    • 1) the merkle trees of all previous versions
    • 2) the most recent source code in memory
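
A rough sketch of that server state, assuming an in-memory store keyed by root hash. The names `versions`, `latest`, `publish` and `missingChunks` are illustrative, and the hashes and source strings are toy values from the earlier example.

```javascript
// Hypothetical in-memory server state.
const versions = new Map(); // root hash -> Set of every node hash in that version
let latest = { root: null, chunks: new Map() }; // node hash -> source, newest version only

function publish(root, chunks) {
  versions.set(root, new Set(chunks.keys()));
  latest = { root, chunks };
}

// The client sends only its root hash; the server answers with what it is missing.
function missingChunks(clientRoot) {
  const known = versions.get(clientRoot) ?? new Set(); // unknown root: send everything
  return [...latest.chunks].filter(([hash]) => !known.has(hash));
}

publish('42', new Map([
  ['42', 'Function shout'], ['123', 'Call console.log'], ['543', 'Literal "BLOCKCHAIN!"'],
]));
publish('86', new Map([
  ['86', 'Function shout'], ['99', 'Call console.log'],
  ['543', 'Literal "BLOCKCHAIN!"'], ['264', 'Literal "IOT!"'],
]));

console.log(missingChunks('42').map(([hash]) => hash)); // [ '86', '99', '264' ]
```

Old versions are stored as hash sets only, so the server never needs to keep their source around.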

 

Wishlist: deliver binary AST's

Some unintended benefits

  • Code integrity (request by hash)
  • Code deduplication:
    • you include:
      • 1) bootstrap.js, which includes jQuery
      • 2) bundle.js, which includes jQuery
    • chunks whose IDs are shared between the two jQuery trees are sent only once
  • Code privacy - merkle AST can be simply a state sync primitive for proprietary code
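
A small sketch of the deduplication idea, assuming chunks are identified by the SHA-256 of their source. The shared `cache` and the function `fetchBundle` are hypothetical. A real client would compare IDs before downloading, which the local hashing here only stands in for.

```javascript
const { createHash } = require('node:crypto');

const sha256 = (text) => createHash('sha256').update(text).digest('hex');

const cache = new Map(); // chunk ID -> source, shared across every bundle on the page

// Returns the chunks that actually need to be transferred for this bundle.
function fetchBundle(chunks) {
  const transferred = [];
  for (const source of chunks) {
    const id = sha256(source);
    if (cache.has(id)) continue; // already delivered by an earlier bundle
    cache.set(id, source);
    transferred.push(source);
  }
  return transferred;
}

const jquery = '/* jQuery source */';
const first = fetchBundle([jquery, '/* bootstrap code */']);
const second = fetchBundle([jquery, '/* app code */']);

console.log(first.length, second.length); // 2 1
```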

Vision: A truly evolving platform

Let's keep things in the cache in proportion to how often they're required

 

e.g. React is loaded almost every day by FB/Netflix, so it never escapes the cache

 

Why? Web is the biggest distributed consensus-driven open platform we have.

 

Let's make mobile web apps INSTANT to load.

Crazier ideas...

  • BitTorrent NPM that never goes down 
  • Download dApps from others without Internet
    • ENS => hash of app root code
    • imagine distributing revolutionary software via AirDrop
  • Build a Token Curated Registry adblocker based on the ID of code chunks
    • Market coordinated force against shitty web advertising

Current Status (WIP)

liamzebedee/merkle-bundles: feature-complete MVP of the client and server in TypeScript, with E2E testing/stats via Puppeteer.

Let's make dApps better and the web a competitive pressure on mobile

TODO:

Need to start tracking bundles over time to collect stats

Questions?

Liam Zebedee

Follow me on Twitter

I love blogging // also coding R&D

liamz.co
