+ David Vorick
+ Bitcoin Enthusiast since 2011
+ Bitcoin Researcher since 2013
+ Blockchain Technical Expert
+ Part of the Sia team since 2014
+ Decentralized Data Store
+ Low latency, high throughput uploads and downloads
+ Functioning Prototype Since June 2015
+ 3 full-time devs, active community
+ Data owned by 1 company
+ Often unencrypted, unauthenticated
+ Often in 1 legal jurisdiction
+ Profit motives may not align with the consumer
+ Gated ecosystems can inflate prices, hurt compatibility
+ Give control to the owner - the owner should know that data is safe, available, and private
+ Eliminate trust - the owner of the data should not need to rely on anyone to guarantee the security of the data
+ Spread out power within the ecosystem - when a single party controls too much, they can dictate the industry on their terms, often to the disadvantage of everyone else
+ A database
+ A database with specific rules on updates
+ A database with a specific ordering for updates
+ A database where specific ordering is enforced without a central party
+ Some people have money
+ Money can be transferred from one person to another
+ Money cannot be duplicated - once money has been transferred from one person to another, the original owner no longer has the money
+ 'Alice' has $10
+ 'Alice' sends $10 to 'Bob'
+ 'Alice' sends $10 to 'Charlie'
+ Who has the money now, Alice, Bob, or Charlie? For the system to work, everybody must have the same answer to this question.
+ When looking at multiple histories, everyone agrees that the history with the most work is the valid history
+ Work is easy to verify, so it is easy to see which history has the most work
+ To alter history, more work needs to be placed on the alternate history than is available on the currently accepted history
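The "most work wins" rule above can be sketched in a few lines. This is only an illustration: the `Block` type and its integer `work` field are hypothetical stand-ins, not Bitcoin's or Sia's actual data structures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    # Hypothetical: an integer measure of the work spent producing this block.
    work: int


def total_work(history: List[Block]) -> int:
    """Sum the work of every block in a history."""
    return sum(block.work for block in history)


def select_history(histories: List[List[Block]]) -> List[Block]:
    """Return the history that every node would accept: the one with the most work."""
    return max(histories, key=total_work)


accepted = [Block(10), Block(10), Block(10)]   # 30 units of work
weak_rewrite = [Block(10), Block(12)]          # 22 units: rejected
strong_rewrite = [Block(10), Block(25)]        # 35 units: replaces the accepted history

winner_weak = select_history([accepted, weak_rewrite])
winner_strong = select_history([accepted, strong_rewrite])
```

An alternate history only displaces the accepted one when it carries more total work, which is why rewriting history requires outpacing everyone else.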
+ Miners get paid to extend the longest history
+ Miners will not get paid for work that extends an alternate history unless it becomes the longest history
+ Miners are incentivized to work on the longest known history because they know all other miners will be working on the longest known history
+ Anybody can verify the longest history
+ Anybody can verify that alternate histories don't have as much work
+ The only requirement for knowing the truth as accepted by everyone else is having the longest history
+ In Bitcoin, all full nodes keep a full copy of the longest history, and upload it freely to anyone requesting it
+ Sia would like to bring this same trustless verification to cloud storage
+ The ultimate goal is to know that your data is safe, and to know that there's nothing anybody can do to harm your data or prevent you from accessing it
+ Cloud storage means someone else controls the data, by definition
+ The host can unplug or delete data
+ Host can share data without permission
+ No way to guarantee data safety.
+ Despite limitations, Sia is powerful
+ Data given to many hosts
+ Encrypted and authenticated
+ Many simultaneous legal jurisdictions
+ Incentives align with the consumer
+ Open marketplace drives prices down, encourages innovation
+ Any host should be able to offer storage to the network
+ This includes potentially untrustworthy hosts
+ Cryptography, erasure coding, and smart contracts allow us to trust that data is safe, even if we don't trust the host
+ The core of the host ecosystem is the file contract
+ The file contract is essentially escrow for storage payments - host is guaranteed to get paid, but only if they can prove that they held the data
+ Bad hosts don't get paid
+ Renter and host both put money into a file contract
+ File contract contains a Merkle root of the file, along with the size of the file
+ File contract contains a duration. At the end of the duration, the host must provide a proof-of-storage to the blockchain to get the money in the contract
+ A random 64-byte segment is chosen
+ Host must upload segment to the blockchain, along with a Merkle tree proof.
+ Random number seeded by a block ID after the contract duration ends (block IDs are expensive/difficult to manipulate)
+ Host has proven storage for 1 random segment
+ No way to predict which segment
+ Cheating has negative expected value due to the host and renter both adding money to the file contract
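The storage proof described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Sia's actual implementation: SHA-256 stands in for whatever hash the network uses, odd-sized tree levels are padded by duplicating the last node, and the names `challenge_index`, `prove`, and `verify` are hypothetical.

```python
import hashlib
from typing import List

SEGMENT_SIZE = 64  # bytes per segment, as on the slide


def h(data: bytes) -> bytes:
    # Assumption: SHA-256 as a stand-in hash function.
    return hashlib.sha256(data).digest()


def split_segments(data: bytes) -> List[bytes]:
    """Cut the file into 64-byte segments (the last one may be shorter)."""
    return [data[i:i + SEGMENT_SIZE] for i in range(0, len(data), SEGMENT_SIZE)] or [b""]


def build_levels(segments: List[bytes]) -> List[List[bytes]]:
    """Return every level of the Merkle tree, leaves first, root last."""
    level = [h(b"\x00" + s) for s in segments]  # leaf prefix
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2 == 1:
            level = level + [level[-1]]  # simplification: duplicate the last node
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(segments: List[bytes]) -> bytes:
    return build_levels(segments)[-1][0]


def prove(segments: List[bytes], index: int) -> List[bytes]:
    """Host side: collect the sibling hashes on the path from a leaf to the root."""
    proof = []
    for level in build_levels(segments)[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def verify(root: bytes, segment: bytes, index: int, proof: List[bytes]) -> bool:
    """Verifier side: recompute the root from one segment and its proof."""
    node = h(b"\x00" + segment)
    for sibling in proof:
        if index % 2 == 0:
            node = h(b"\x01" + node + sibling)
        else:
            node = h(b"\x01" + sibling + node)
        index //= 2
    return node == root


def challenge_index(block_id: bytes, contract_id: bytes, num_segments: int) -> int:
    """Hypothetical: derive the challenged segment from a block ID unknown in advance."""
    return int.from_bytes(h(block_id + contract_id), "big") % num_segments


data = bytes(range(256)) * 2                     # a 512-byte example file
segments = split_segments(data)
root = merkle_root(segments)                     # stored in the file contract
idx = challenge_index(b"block-id", b"contract-id", len(segments))
proof = prove(segments, idx)
ok = verify(root, segments[idx], idx, proof)
```

The verifier only needs the Merkle root and file size from the contract, one segment, and a logarithmic number of hashes, so the proof stays small regardless of file size.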
+ There are a bunch of hosts competing for storage contracts
+ Storage contracts force hosts to keep data - there is financial penalty for any attempt to cheat
+ The renter can leverage many hosts to create a safe file upload
+ Use Reed-Solomon erasure coding to upload to many hosts
+ Assuming 95% reliable hosts, '7 of 21' provides 99.999999999% reliability, 3x overhead
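The '7 of 21' figure can be checked with a simple binomial model. The sketch below assumes each host is online independently with the same probability, an assumption this slide does not state explicitly; `failure_probability` is a hypothetical helper name.

```python
from math import comb


def failure_probability(k: int, n: int, host_uptime: float) -> float:
    """Probability that fewer than k of n hosts are online,
    assuming independent hosts with identical uptime."""
    down = 1.0 - host_uptime
    return sum(
        comb(n, up) * host_uptime ** up * down ** (n - up)
        for up in range(k)
    )


p_fail = failure_probability(7, 21, 0.95)  # file is lost only if 15+ hosts are down
overhead = 21 / 7                          # storage overhead of the scheme

print(f"failure probability: {p_fail:.3e}")
print(f"overhead: {overhead}x")
```

Under these assumptions the failure probability comes out well below 1e-11, so the reliability quoted on the slide is, if anything, conservative for this model. Correlated outages would make the real number worse.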
+ Using lots of hosts means high parallelism for downloads - lots of throughput, though latency may be affected
+ Increasing the number of hosts means increasing file reliability
+ With 100 hosts, 95% uptime per host, independent failures, and 1.2x redundancy, 99.9999999999% uptime can be achieved
+ The benefits from the network effects are huge!
+ Uploading scheme is fully customizable - any erasure scheme is allowed
+ Uploading can be set to optimize for cost, for uptime, for throughput
+ Sia's global network of hosts makes a great foundation for a CDN
+ Host is financially incentivized to keep the data. Encryption protects the data
+ Host is not necessarily required to upload the data upon request - data can be held hostage
+ Hosts may be very slow, even if honest
+ When high redundancy and parallelism are used, many slow hosts can collectively still provide high throughput
+ Hosts attempting to hold data hostage can be ignored as long as there's a full copy spread among the non-malicious hosts
+ Hosts that get ignored lose out on bandwidth revenue
+ Hosts can have lots of downtime
+ Hosts can have slow speeds
+ Hosts can have high latency
+ Hosts may be Glacier-like in retrieval time
+ Hosts may execute Sybil attacks
+ A Sybil attack is when one person pretends to be many
+ A single host can pretend to be 100,000 hosts, enabling them to get all of a target's data
+ Hostage attacks are now possible
+ Need some way to tell what type of storage / service a host can provide
+ Don't want to trust the host, or any network of unknowns
+ Reputation system needs to handle Sybil attacks as well
+ Biggest challenge in Sia is to determine reliability of unknown / untrusted hosts
+ Renter tracks host uptime, frequently challenges host to do off-chain storage proofs.
+ Renter measures latency, throughput, and other relevant metrics
+ Creates a fully trustless, direct observation based reputation system
+ Trust a third party to track all host metrics
+ Third party can be more diligent
+ Host may learn to cheat / prioritize the third party
+ Requires trusting the third party
+ Keep internal observations
+ Get observations from multiple third parties
+ Cross-reference third party observations against each other and against internal observations
+ More complex, but a stronger solution
+ Sybil attacks can fool third parties too
+ Sybil attacker may be the third party
+ Use Proof-of-Burn to make Sybil attacks expensive
+ Provably destroy wealth, in this case, coins
+ Linear relationship between credibility and volume destroyed
+ Hosts that do not destroy coins are not viewed as credible
+ Sybil attacks become very expensive
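One way to picture the linear relationship between credibility and coins burned is a weighting function. The sketch below is purely illustrative: `selection_weights`, the host names, and the coin amounts are hypothetical, and a real renter would combine burn with other reputation signals.

```python
from typing import Dict


def selection_weights(burned: Dict[str, int]) -> Dict[str, float]:
    """Give each host a weight proportional to the coins it provably burned."""
    total = sum(burned.values())
    if total == 0:
        return {host: 0.0 for host in burned}
    return {host: amount / total for host, amount in burned.items()}


honest = {"host-a": 500, "host-b": 300}

# 100,000 fake identities that burn nothing carry no weight at all.
free_sybils = {f"sybil-{i}": 0 for i in range(100_000)}
weights = selection_weights({**honest, **free_sybils})
free_share = sum(weights[name] for name in free_sybils)

# 200 fake identities burning 1 coin each get about 200/1000 of the weight.
# The attacker's share depends on coins destroyed, not on identity count.
paid_sybils = {f"sybil-{i}": 1 for i in range(200)}
weights = selection_weights({**honest, **paid_sybils})
paid_share = sum(weights[name] for name in paid_sybils)
```

Because weight scales with coins destroyed, splitting into more identities buys the attacker nothing; only burning more coins does.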
+ Proving credibility costs money
+ The wealthy have an advantage
+ However, the wealthy likely have more and better storage, so they should have an advantage
+ Long term reputation will correct for imbalances
+ Rely on other forms of identity
+ Can use government IDs, web-of-trust, or other identity systems that exist
+ Most non-burn methods of identity can be cheated, or are centralized
+ Decentralized identity is largely an unsolved problem
+ Cannot eliminate attacks, but can use multiple methods to make attacks expensive
+ At some point, attacks become not worthwhile
+ Finding the optimal set of defences is an ongoing effort
+ What we have now is already pretty good
+ One file contract per renter-host relationship
+ Unlimited amount of data per file contract
+ Sia currently supports about 50 million file contracts per year
+ Most users will need between 20 and 200 file contracts per year.
+ Scale is therefore between 250,000 and 2.5 million users.
+ Potential improvements on horizon promise 5x - 100x improvements
+ Still not great.
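The user estimate above is straightforward division, shown here as a worked example. The figures are the ones from the slide; applying the 5x to 100x improvements directly to contract capacity is an assumption made for illustration.

```python
CONTRACTS_PER_YEAR = 50_000_000  # current capacity, from the slide


def supported_users(contracts_per_user: int, capacity: int = CONTRACTS_PER_YEAR) -> int:
    """Users the blockchain can serve if each needs this many contracts per year."""
    return capacity // contracts_per_user


low = supported_users(200)   # heavy users: 250,000
high = supported_users(20)   # light users: 2,500,000

# Assumption: improvements multiply contract capacity directly.
improved_low = supported_users(200, CONTRACTS_PER_YEAR * 5)    # 1,250,000
improved_high = supported_users(20, CONTRACTS_PER_YEAR * 100)  # 250,000,000
```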
+ As scaling limits are reached, transaction fees will be used to determine who can access the blockchain
+ Relationships that use lots of blockchain space will have less overhead
+ More favorable to enterprises than to consumers - 2.5 million enterprises is a LOT!
+ Decentralized ecosystem encourages stability and collaboration - breaking compatibility is extremely difficult by design
+ Many developers already building apps on top of Sia, and Sia is still a prototype
+ A competitive marketplace of hosts keeps prices at their absolute minimum
+ Renters with a wide diversity of needs and geographic locations keep the ecosystem spread out
+ Reputation system means quality of service is emphasized over branding
Sia brings a revolutionary change to the cloud storage industry. Sia eliminates trust in a single, central source, spreads out data geographically, drives prices down, and successfully navigates adversarial conditions despite challenges.
Thanks
Deck is safe for reuse - best effort has been made to include only images that are 'labeled for reuse'