Decentralizing the Cloud
Who Am I?
+ David Vorick
+ Bitcoin Enthusiast since 2011
+ Bitcoin Researcher since 2013
+ Part of the Sia team since 2014
What is Sia?
+ Decentralized Data Store
+ Low latency, high throughput uploads and downloads
+ Functioning Prototype Since June 2015
+ 3 full time devs, active community
Modern Cloud Storage
+ Data owned by 1 company
+ Often unencrypted, unauthenticated
+ Often in 1 legal jurisdiction
+ Profit motives may not align with the consumer
+ Gated ecosystems can inflate prices
Goals of Decentralization
+ Give control to the owner - the owner of a Bitcoin controls its destiny.
+ Eliminate trust - except for 51% attacks, nobody can spend a Bitcoin but the owner, period. Fixed inflation schedule, etc.
+ Spread out power within the ecosystem - no party has control without consent from everyone else
Limitations of the Cloud
+ Cloud storage means someone else controls the data, by definition
+ The Host can unplug or delete data
+ Host can share data without permission
+ No way to guarantee data safety.
Decentralization with Sia
+ Data given to many hosts
+ Encrypted and authenticated
+ Many simultaneous legal jurisdictions
+ Incentives align with the consumer
+ Open marketplace drives prices down, encourages innovation
File Contracts
+ Renter and host both put money into a file contract
+ File contract contains a Merkle root of the file, along with the size of the file
+ File contract contains a duration. At the end of the duration, the host must provide a proof-of-storage to the blockchain to get the money in the contract
The Storage Proof
+ A random 64 byte segment is chosen
+ Host must upload segment to the blockchain, along with a Merkle tree proof.
+ Random number seeded by a block ID after the contract duration ends (block IDs are expensive/difficult to manipulate)
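The proof described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Sia's actual implementation: the hash function (SHA-256), the leaf/node prefixes, the handling of unpaired nodes, and the way the challenge index is derived from a block ID and a hypothetical `contract_id` are all choices made for this sketch. Only the 64 byte segment size and the idea of seeding the choice with a block ID come from the slides. The file is assumed to be non-empty.

```python
import hashlib

SEGMENT_SIZE = 64  # bytes per segment, as stated on the slide


def h(data: bytes) -> bytes:
    # SHA-256 is an assumption of this sketch.
    return hashlib.sha256(data).digest()


def split_segments(data: bytes) -> list:
    return [data[i:i + SEGMENT_SIZE] for i in range(0, len(data), SEGMENT_SIZE)]


def build_levels(segments: list) -> list:
    # Level 0 holds the leaf hashes; the last level holds only the root.
    level = [h(b"\x00" + s) for s in segments]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(h(b"\x01" + level[i] + level[i + 1]))
            else:
                nxt.append(level[i])  # unpaired node is promoted unchanged
        level = nxt
        levels.append(level)
    return levels


def merkle_root(segments: list) -> bytes:
    return build_levels(segments)[-1][0]


def make_proof(segments: list, index: int) -> list:
    # Each entry is (sibling_hash, sibling_is_left).
    proof = []
    for level in build_levels(segments)[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_proof(root: bytes, segment: bytes, proof: list) -> bool:
    node = h(b"\x00" + segment)
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            node = h(b"\x01" + sibling + node)
        else:
            node = h(b"\x01" + node + sibling)
    return node == root


def challenge_index(block_id: bytes, contract_id: bytes, num_segments: int) -> int:
    # Hypothetical derivation: hash the block ID together with a contract ID.
    seed = h(block_id + contract_id)
    return int.from_bytes(seed, "big") % num_segments
```

In this sketch the renter records `merkle_root(segments)` in the file contract. Once the chosen block exists, the host computes `challenge_index(...)`, then submits that segment together with `make_proof(...)`, and anyone can check it with `verify_proof(...)` against the root stored in the contract.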
Storage Proof 2
+ Host has proven storage for 1 random segment
+ No way to predict which segment
+ Cheating has negative expected value due to the host and renter both adding money to the file contract
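The expected-value argument can be made concrete with a toy model. Everything in it is an assumption for illustration: a host that keeps only a fraction of the file passes the single-segment challenge with probability equal to that fraction, gets the renter's payment on success, forfeits its own deposit on failure, and pays a storage cost proportional to what it keeps. The parameter names and numbers are hypothetical and are not taken from the Sia protocol.

```python
def expected_host_payoff(stored_fraction: float,
                         renter_payment: float,
                         host_collateral: float,
                         storage_cost: float) -> float:
    """Expected profit for a host that keeps only `stored_fraction` of the file."""
    p_pass = stored_fraction           # one uniformly random segment is challenged
    gain_if_pass = renter_payment      # deposit returned, renter's payment earned
    loss_if_fail = -host_collateral    # deposit forfeited
    cost = stored_fraction * storage_cost
    return p_pass * gain_if_pass + (1 - p_pass) * loss_if_fail - cost


honest = expected_host_payoff(1.0, renter_payment=10, host_collateral=10, storage_cost=2)
cheater = expected_host_payoff(0.5, renter_payment=10, host_collateral=10, storage_cost=2)
```

In this model the honest host's advantage over a cheater is `(1 - stored_fraction) * (renter_payment + host_collateral - storage_cost)`, which is positive whenever the money in the contract exceeds the cost of storing the file. With the example numbers, the honest host expects 8 and the half-storing host expects -1.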
Uploading Strategy
+ Reed-Solomon codes - M of N
+ Assuming 95% reliable hosts, '7 of 21' provides 99.999999999% reliability, 3x overhead
+ Redundancy also provides protection against withholding/hostage attacks - if some hosts are holding data hostage, use the ones that aren't.
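The reliability figure can be checked with a short binomial calculation. This sketch assumes each host is independently available with the stated probability, which is a simplification (real hosts can fail together), and that the file is recoverable whenever at least M of the N hosts are available. The function names are hypothetical.

```python
from math import comb


def loss_probability(m: int, n: int, host_reliability: float) -> float:
    """Probability that fewer than m of n independent hosts are available."""
    p = host_reliability
    # Sum the unavailable cases directly to avoid computing 1 - (almost 1).
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(m))


def overhead(m: int, n: int) -> float:
    """Storage overhead of an m-of-n code relative to the original file size."""
    return n / m


p_loss = loss_probability(7, 21, 0.95)
ratio = overhead(7, 21)
```

Under these assumptions `overhead(7, 21)` is 3.0, and `p_loss` comes out below 1e-11, so the reliability quoted on the slide is on the conservative side of what this model predicts.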
Host Selection Problem
+ A malicious actor could perform a Sybil attack, imitating millions of hosts
+ Attacker could provide very low prices, but only for a selected target (increasing chance of selection)
+ Other selection manipulations possible
Host Selection Solutions
+ Renter tracks host uptime, frequently challenges host to do off-chain storage proofs.
+ Proof-of-Burn is used to combat Sybil attacks - hosts burn a percentage of revenue to demonstrate legitimacy
+ Weakest part of Sia protocol.
Optional Centralization
+ Third party agencies can monitor and certify hosts.
+ Renters can choose to trust agencies when trying to pick hosts
+ If there are a large number of agencies, bad agencies can be easily ignored
Scalability
+ Use of File Contract Revisions (payment channels for files)
+ Despite revision channels, scale limited to 50 million file contracts per year.
+ Unlimited amount of data per file contract.
Scalability 2
+ Most users will need between 20 and 200 file contracts per year.
+ Scale is therefore between 250,000 and 2.5 million users.
+ Potential improvements on the horizon promise 5x - 100x gains
+ Still not great.
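The user estimate is simple division, sketched below. The 50 million figure and the 20 to 200 contracts per user come from the slides. The `improvement` multiplier is a hypothetical parameter for exploring the 5x to 100x range and assumes capacity scales linearly with it.

```python
CONTRACTS_PER_YEAR = 50_000_000  # network-wide limit stated on the slide


def supported_users(contracts_per_user_per_year: int, improvement: int = 1) -> int:
    """Users the network can serve if each needs the given number of contracts."""
    return (CONTRACTS_PER_YEAR * improvement) // contracts_per_user_per_year


low = supported_users(200)   # heavy users
high = supported_users(20)   # light users
```

Here `low` is 250,000 and `high` is 2,500,000, matching the range on the slide. With `improvement=5` the low end becomes 1,250,000, and with `improvement=100` the high end becomes 250,000,000.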
Ecosystem
+ Decentralized ecosystem encourages collaborative app development
+ Open market encourages aggressive competition - similar to Bitcoin mining
+ Cheap, secure data has utility even if you don't care about decentralization
Conclusion
Despite imperfections, we can do substantially better than centralized cloud storage, in a way that has moderate scalability, especially at the enterprise level.
Thanks
Q & A
Deck is safe for reuse - best effort has been made to include only images that are 'labeled for reuse'
Decentralizing the Cloud
By David Vorick