IPFS is the new HTTP 

Presented by: Hamid Salehian

How the Internet works (HTTP)

Development of HTTP was initiated by Tim Berners-Lee at CERN in 1989.

wikipedia

What's wrong with HTTP

  • The web's centralization limits opportunity
  • HTTP is inefficient and expensive
  • Humanity's history is deleted daily
  • Our apps are addicted to the backbone

 All trivial compared to interplanetary networking.

  • Developing world
  • Offline
  • Natural disasters
  • Intermittent connections 

What's wrong with HTTP

Me: 1 × 6 × 200 MB = 1.2 GB

200 MB × 7.1 billion views ≈ 1,400 petabytes

What's wrong with HTTP

Iran Internet Lockdown

What's wrong with HTTP

What's the solution

IPFS: InterPlanetary File System

Distributed

Permanent

Merkle

THE

WEB

IPFS (Juan Benet) =

BitTorrent (Bram Cohen) + Git (Linus Torvalds) + Kademlia (Petar Maymounkov, David Mazières)

IPFS was launched in an alpha version in February 2015.

What is IPFS?

IPFS is a protocol:

  • Defines a content-addressed file system
  • Coordinates content delivery
  • Combines Kademlia + BitTorrent + Git

IPFS is a file system:

  • Has directories and files
  • Is a mountable filesystem (via FUSE)

IPFS is a web:

  • Can be used to view documents like the conventional web
  • Files are accessible via HTTP at https://ipfs.io/<path>
  • Browsers and extensions can learn to use the ipfs:// URL or dweb:/ipfs/ URI schemes directly
  • Hash-addressed content guarantees authenticity

IPFS is modular:

  • Connection layer over any network protocol
  • Routing layer
  • Uses a routing layer DHT (Kademlia/Coral)
  • Uses a path-based naming service
  • Uses a BitTorrent-inspired block exchange

IPFS uses crypto:

  • Cryptographic-hash content addressing
  • Block-level deduplication
  • File integrity plus versioning
  • File-system-level encryption plus signing support

IPFS is p2p:

  • Worldwide peer-to-peer file transfers
  • Completely decentralized architecture
  • No central point of failure

IPFS is a CDN:

  • Add a file to the file system locally, and it's now available to the world
  • Caching-friendly (content-hash naming)
  • BitTorrent-based bandwidth distribution

IPFS has a name service:

  • IPNS, an SFS-inspired name system
  • Global namespace based on PKI
  • It serves to build trust chains
  • It's compatible with other NSes
  • Can map DNS, .onion, .bit, etc to IPNS

How does it work?

IPFS Stack

 

  1. Identities - manages node identity generation and verification.
  2. Network - manages connections to other peers, uses various underlying network protocols. Configurable.
  3. Routing - maintains information to locate specific peers and objects. Responds to both local and remote queries. Defaults to a DHT, but is swappable.
  4. Exchange - a novel block exchange protocol (BitSwap) that governs efficient block distribution. Modeled as a market, weakly incentivizes data replication. Trade Strategies swappable.
  5. Objects - a Merkle DAG of content-addressed immutable objects with links. Used to represent arbitrary data structures, e.g. file hierarchies and communication systems.
  6. Files - versioned file system hierarchy inspired by Git.
  7. Naming - A self-certifying mutable name system.

Location Addressing

where is the data

https://apod.nasa.gov/apod/image/2101/Chandrafirstlight_0.jpg

                        https://129.164.179.22/apod/image/2101/Chandrafirstlight_0.jpg

129.164.179.22

who has the data?

Content Addressing


7PLGEr7+dNEnooD9P8hQEAD+11KN8bkNhfbTTqtDT8Skh1mvuSzOHxL3edynRBR2qOmzMdVakVEa13ZiOO5vBQxXPtKHEF2dXfAoQe53eaKgmTDTXhexwA9mBBL1ZFeSzekZ5wWRJDbv0IwY4Lt1ijn0F619W/zrjcmfCWsIBss=
 

Qmbnu2zADtcd3zfV2mEDc66i39g6zJG9d4i7jEgstxhgCs

image

binary

digest

CID : content identifier

ipfs://Qmbnu2zADtcd3zfV2mEDc66i39g6zJG9d4i7jEgstxhgCs

https://ipfs.io/ipfs/Qmbnu2zADtcd3zfV2mEDc66i39g6zJG9d4i7jEgstxhgCs
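
The image → binary → digest → CID chain above can be sketched in a few lines. This is a simplified illustration, not what an IPFS node actually runs: it hashes the raw bytes with SHA-256, prefixes the digest with the multihash header (0x12 for sha2-256, 0x20 for a 32-byte length), and base58-encodes the result, which gives the familiar "Qm..." shape. A real node first chunks and wraps the file in its own object format before hashing, so the identifier printed here will not match what IPFS reports for the same file. The function names are made up for this example.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes using the Bitcoin-style base58 alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    # Each leading zero byte is written as a leading '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def sha256_multihash_b58(content: bytes) -> str:
    """Return base58(0x12 0x20 || sha256(content)): the CIDv0 text shape."""
    digest = hashlib.sha256(content).digest()
    multihash = bytes([0x12, 0x20]) + digest  # hash function code + digest length
    return base58_encode(multihash)


identifier = sha256_multihash_b58(b"hello interplanetary world")
print(identifier)  # a 46-character string starting with "Qm"
```

The same bytes always produce the same identifier, and changing a single byte produces a different one. That is what lets anyone who fetches the content check that it matches the address they asked for.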

Content Addressing

Identities: name those nodes

each node:

  • generate a PKI key pair
  • hash the public key
  • the resulting hash is the NodeId

2 nodes start communicating:

  • exchange public keys
  • check if: hash(other.PublicKey) == other.NodeId
  • if so, we have identified the other node and can, for example, request data objects
  • if not, we disconnect from the "fake" node
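
The handshake check above, sketched under simplifying assumptions: the "public keys" are plain byte strings standing in for real key material, and SHA-256 in hex stands in for whatever hash encoding a real node uses. The function names are hypothetical.

```python
import hashlib


def node_id_from_public_key(public_key: bytes) -> str:
    """Derive a NodeId as the hash of the public key (hex SHA-256 here)."""
    return hashlib.sha256(public_key).hexdigest()


def verify_peer(claimed_node_id: str, presented_public_key: bytes) -> bool:
    """Check hash(other.PublicKey) == other.NodeId."""
    return node_id_from_public_key(presented_public_key) == claimed_node_id


# Placeholder bytes; a real node would use an actual key pair.
honest_key = b"honest-node-public-key-bytes"
honest_id = node_id_from_public_key(honest_key)
impostor_key = b"impostor-public-key-bytes"

print(verify_peer(honest_id, honest_key))    # True: keep talking
print(verify_peer(honest_id, impostor_key))  # False: disconnect
```

This check only proves that the key matches the NodeId. Proving that the peer also holds the matching private key takes a signature or encrypted exchange, which this sketch leaves out.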

Network: talk to other clients

  • TCP
  • CJDNS
  • uTP
  • WebRTC
  • QUIC
  • I2P
  • Tor
  • WebSocket

libp2p is the product of a long, and arduous quest of understanding -- a deep dive into the internet's network stack, and plentiful peer-to-peer protocols from the past. Building large-scale peer-to-peer systems has been complex and difficult in the last 15 years, and libp2p is a way to fix that. It is a "network stack" -- a protocol suite -- that cleanly separates concerns, and enables sophisticated applications to only use the protocols they absolutely need, without giving up interoperability and upgradeability. libp2p grew out of IPFS, but it is built so that lots of people can use it, for lots of different projects.

 

Routing: announce and find stuff

  • announce that this node has some data (a block, covered under Exchange), or
  • find which nodes have some specific data (by referring to the multihash of a block), and
  • if the data is small enough (≤ 1 KB), the DHT stores the data itself as the value.

The routing layer is based on a DHT

DHT: Distributed Hash Table

  • distributed system that provides a lookup service similar to a hash table: key-value pairs are stored in a DHT

  • any participating node can efficiently retrieve the value associated with a given key.

  • nodes can be added or removed with minimal work to redistribute keys.

  • keys are unique identifiers that map to particular values, which in turn can be anything from addresses to documents, to arbitrary data.

When searching for some value, the algorithm needs to know the associated key and explores the network in several steps. Each step will find nodes that are closer to the key until the contacted node returns the value or no more closer nodes are found. This is very efficient: Like many other DHTs, Kademlia contacts only O(log(n)) nodes during the search out of a total of n nodes in the system.
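
A toy version of the "each step finds nodes closer to the key" idea, using Kademlia's XOR distance. The 4-bit node IDs and the hand-written routing tables are invented for illustration. Real Kademlia keeps k-buckets and queries several peers in parallel; this sketch follows a single greedy path.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia measures closeness as the XOR of two IDs, read as an integer."""
    return a ^ b


def iterative_lookup(start: int, target: int, routing_tables: dict) -> list:
    """Hop to the known peer closest to the target until no peer is closer."""
    current = start
    path = [current]
    while True:
        candidates = routing_tables.get(current, [])
        if not candidates:
            break
        best = min(candidates, key=lambda peer: xor_distance(peer, target))
        if xor_distance(best, target) >= xor_distance(current, target):
            break  # nobody we know is closer: stop here
        current = best
        path.append(current)
    return path


# Hypothetical network: each node only knows a couple of other nodes.
routing_tables = {
    0b0001: [0b0100, 0b1000],
    0b1000: [0b1100, 0b1010],
    0b1100: [0b1110, 0b1111],
}

path = iterative_lookup(start=0b0001, target=0b1111, routing_tables=routing_tables)
print([format(node, "04b") for node in path])  # ['0001', '1000', '1100', '1111']
```

Each hop in this example fixes at least one more leading bit of the target ID. That is the intuition behind the O(log(n)) bound.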

 

wikipedia

Kademlia

Used by: Gnutella and BitTorrent, Anycast

Exchange: give and take

Similarity:

  • exchange of data (blocks) in IPFS is inspired by BitTorrent
  • tit-for-tat strategy (if you don’t share, you won’t get)
  • get rare pieces first

Difference:

  • BitTorrent uses a separate swarm for each file; BitSwap in IPFS uses one swarm for all blocks

IPFS: BitSwap

BitTorrent

Objects and Files: organize the data

  • Immutable objects represent Files (blob), Directories (tree), and Changes (commit).
  • Objects are content-addressed, by the cryptographic hash of their contents.
  • Links to other objects are embedded, forming a Merkle DAG. This provides many useful integrity and workflow properties.
  • Most versioning metadata (branches, tags, etc.) are simply pointer references, and thus inexpensive to create and update.
  • Version changes only update references or add objects.
  • Distributing version changes to other users is simply transferring objects and updating remote references.

Merkle tree

A Merkle tree is a binary tree where the parent contains the hash of the concatenation of the hashes of the two children. This explains the integrity property: any change in a data block results in a change of the root node. With just a little bit of meta-data (uncles and parents, which can be untrusted) and a trusted root node, we can verify the validity of the block.
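
A minimal sketch of the integrity property described above, assuming SHA-256 and a simple rule of duplicating the last hash when a level has an odd number of entries (real systems differ on that detail). The proof is the list of sibling hashes on the way up to the root.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: list) -> list:
    """Hash adjacent pairs; duplicate the last hash if the count is odd."""
    if len(level) % 2 == 1:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(blocks: list) -> bytes:
    """Root hash of a non-empty list of data blocks."""
    level = [h(block) for block in blocks]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(blocks: list, index: int) -> list:
    """Sibling hashes from leaf to root as (hash, sibling_is_left) pairs."""
    level = [h(block) for block in blocks]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1  # the neighbour in the same pair
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(block: bytes, proof: list, trusted_root: bytes) -> bool:
    """Recompute the path to the root using only the block and the proof."""
    current = h(block)
    for sibling_hash, sibling_is_left in proof:
        if sibling_is_left:
            current = h(sibling_hash + current)
        else:
            current = h(current + sibling_hash)
    return current == trusted_root


blocks = [b"block-0", b"block-1", b"block-2", b"block-3"]
root = merkle_root(blocks)
proof = merkle_proof(blocks, 2)
print(verify(b"block-2", proof, root))   # True
print(verify(b"tampered", proof, root))  # False
```

Only the root has to come from a trusted source. The proof hashes can come from anyone, because a wrong hash anywhere on the path changes the recomputed root.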

 

IPLD objects consist of 2 components:

  • Data — blob of unstructured binary data of size < 256 kB.

  • Links — array of Link structures. These are links to other IPFS objects.

Every IPLD Link has 3 parts:

 

  • Name — name of the Link
  • Hash — hash of the linked IPFS object

  • Size — the cumulative size of the linked IPFS object, including everything reachable through its links
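
The Data/Links layout can be modelled roughly like this. The class names, the JSON serialization, and the hex SHA-256 hash are stand-ins chosen for readability; IPFS itself uses a binary encoding and multihashes.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class Link:
    name: str          # name of the link
    target_hash: str   # hash of the linked object
    size: int          # cumulative size of the linked object


@dataclass
class Node:
    data: bytes = b""
    links: list = field(default_factory=list)

    def serialize(self) -> bytes:
        """Illustrative canonical form (JSON), not the real wire format."""
        payload = {
            "data": self.data.hex(),
            "links": [[l.name, l.target_hash, l.size] for l in self.links],
        }
        return json.dumps(payload, sort_keys=True).encode()

    def content_hash(self) -> str:
        return hashlib.sha256(self.serialize()).hexdigest()

    def cumulative_size(self) -> int:
        """Own serialized size plus the cumulative size behind every link."""
        return len(self.serialize()) + sum(l.size for l in self.links)


def link_to(name: str, node: Node) -> Link:
    return Link(name, node.content_hash(), node.cumulative_size())


# A "directory" object linking to two "file" objects.
file_a = Node(data=b"hello")
file_b = Node(data=b"world")
directory = Node(links=[link_to("a.txt", file_a), link_to("b.txt", file_b)])

print(directory.content_hash())
print(directory.cumulative_size())
```

The directory embeds the hashes of its children, so editing either file changes the directory's hash too. Two objects with identical data and links hash to the same value, which is where deduplication comes from.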

IPLD: InterPlanetary Linked Data

Linked data is something Tim Berners-Lee has been working on for ages, and his Solid project is building on it (his company, Inrupt, is building a business around Solid).

SFS addresses remote filesystems using the following scheme:

/sfs/<Location>:<HostID>

where Location is the server network address, and:

HostID = hash(public_key || Location)
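
A sketch of why such a path is self-certifying: the HostID in the path commits to the server's public key, so a client can check the key a server presents against the path alone. SHA-256 in hex, the byte-string key, and the example location are assumptions made for illustration; they are not the encoding SFS actually uses.

```python
import hashlib


def host_id(public_key: bytes, location: str) -> str:
    """HostID = hash(public_key || Location)."""
    return hashlib.sha256(public_key + location.encode()).hexdigest()


def sfs_path(location: str, public_key: bytes) -> str:
    return f"/sfs/{location}:{host_id(public_key, location)}"


def server_matches_path(path: str, presented_key: bytes) -> bool:
    """Recompute the HostID from the presented key and compare with the path."""
    location, claimed_host_id = path[len("/sfs/"):].rsplit(":", 1)
    return host_id(presented_key, location) == claimed_host_id


server_key = b"example-server-public-key"  # placeholder key material
path = sfs_path("sfs.example.org", server_key)

print(path)
print(server_matches_path(path, server_key))        # True
print(server_matches_path(path, b"some-other-key")) # False
```

No certificate authority is involved: whoever hands you the path has already told you which key to expect.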

Naming: adding mutability

SFS: Self-certifying File System

 

  • The root address of a node is /ipns/<NodeId>
  • The content it points to can be changed by publishing an IPFS object to this address
  • By publishing, the owner of the node (the person who knows the secret key that was generated with ipfs init) cryptographically signs this "pointer".
  • This enables other users to verify the authenticity of the object published by the owner.
  • Just like IPFS paths, IPNS paths also start with a hash, followed by a Unix-like path.
  • IPNS records are announced and resolved via the DHT.

IPNS: InterPlanetary Name System

  • IPFS is content-addressed: data on IPFS is identified by CIDs.
  • Each CID is unique to the data it references.
  • IPFS relies on cryptographic hash functions for tamper evidence, which makes it a self-certifying file system.
  • IPFS uses Multihash, so the same data can have different CIDs under different hash functions. CIDs are still unique: the same hash function always yields the same CID for the same data.
  • IPFS uses IPLD to manage and link all the chunks of data.
  • IPLD uses a Merkle DAG (a directed acyclic graph of hash links) to link the chunks of data.
  • IPLD also gives IPFS its deduplication property.
  • IPFS uses IPNS to point a fixed IPNS name at changing CIDs, analogous to DNS on today's centralized Internet.
  • IPFS uses libp2p to transfer data and discover other peers (computers and smartphones) on the IPFS network, which can make content retrieval faster.

SUMMARY

Let's Get Our Hands Dirty

Questions?

References

  • https://github.com/ipld/ipld#implementations
  • https://decentralized.blog/understanding-the-ipfs-white-paper-part-1/2.html
  • https://hackernoon.com/understanding-ipfs-in-depth-1-5-a-beginner-to-advanced-guide-e937675a8c8a
  • https://medium.com/coinmonks/understanding-ipfs-in-depth-5-6-what-is-libp2p-f8bf7724d452

Thanks
