Blog: http://blog.codenova.pl/
Twitter: @Kamil_Mrzyglod
GitHub: https://github.com/kamil-mrzyglod
StackOverflow: https://stackoverflow.com/users/1874991/kamo
LinkedIn: www.linkedin.com/in/kamil-mrzygłód-31470376
Cloud storage system for storing limitless amounts of data for any duration of time
Data stored durably using both local and geographic replication
Blobs, tables, queues
In production since 2008
Strong consistency
Global and Scalable Namespace/Storage
Disaster recovery
Multi-tenancy and cost
Tables
Blobs
350 TB of data
40k transactions/sec
3B transactions/day
Queues
http(s)://AccountName.<service>.core.windows.net/PartitionName/ObjectName
DNS
Cluster
Individual object within partition
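A minimal sketch of how the URL pattern above breaks down into its parts. The `StorageName` type and `parse_storage_url` function are hypothetical names used only for illustration; they are not part of any Azure SDK.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class StorageName(NamedTuple):
    account: str    # resolved via DNS to a storage cluster (stamp)
    service: str    # blob, table or queue
    partition: str  # PartitionName
    obj: str        # individual object within the partition


def parse_storage_url(url: str) -> StorageName:
    """Split http(s)://AccountName.<service>.core.windows.net/PartitionName/ObjectName."""
    parts = urlparse(url)
    host = parts.hostname or ""
    labels = host.split(".")
    # Expect exactly AccountName.<service>.core.windows.net
    if len(labels) != 5 or labels[2:] != ["core", "windows", "net"]:
        raise ValueError(f"unexpected host: {host!r}")
    partition, _, obj = parts.path.lstrip("/").partition("/")
    return StorageName(labels[0], labels[1], partition, obj)
```

For example, `parse_storage_url("https://myaccount.blob.core.windows.net/pictures/cat.jpg")` yields account `myaccount`, service `blob`, partition `pictures` and object `cat.jpg`.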
Node management
Network configuration
Health monitoring
Starting/stopping services
Service deployment
Stamp 1
Stamp 2
Stamp 3
Data Center
A cluster of N racks
Each rack built out as a separate fault domain
Typically from 10 to 20 racks, 18 disk-heavy storage nodes per rack
Holds from 2 PB to 30 PB of data
Utilization ~70%
When a stamp reaches 70% utilization, inter-stamp replication starts moving accounts to other stamps
Location Service
Manages stamps
Manages account namespace
Chooses the primary stamp
Updates DNS to route from a URL to stamp's virtual IP
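The Location Service responsibilities above could be sketched roughly as follows. This is an illustration only: the `Stamp` and `LocationService` classes, the in-memory `dns` dictionary and the "least-utilized stamp" heuristic are all assumptions, not the actual placement logic.

```python
from dataclasses import dataclass, field

UTILIZATION_LIMIT = 0.70  # the ~70% figure quoted for a stamp


@dataclass
class Stamp:
    name: str
    virtual_ip: str
    capacity_pb: float
    used_pb: float = 0.0

    @property
    def utilization(self) -> float:
        return self.used_pb / self.capacity_pb


@dataclass
class LocationService:
    stamps: list[Stamp]
    dns: dict[str, str] = field(default_factory=dict)      # hostname -> virtual IP
    primary: dict[str, str] = field(default_factory=dict)  # account -> stamp name

    def create_account(self, account: str, service: str = "blob") -> Stamp:
        # Assumed heuristic: pick the least-utilized stamp under the limit.
        candidates = [s for s in self.stamps if s.utilization < UTILIZATION_LIMIT]
        if not candidates:
            raise RuntimeError("no stamp has spare capacity")
        stamp = min(candidates, key=lambda s: s.utilization)
        self.primary[account] = stamp.name
        # Route the account URL to the chosen stamp's virtual IP.
        self.dns[f"{account}.{service}.core.windows.net"] = stamp.virtual_ip
        return stamp
```

With two 30 PB stamps holding 25 PB and 6 PB, a new account lands on the second stamp, because the first is already above the limit.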
Block
Extent
Stream
Stream Manager
Extent Node
Object Table
Range Partition
Range Partition
Range Partition
Partition Server
Partition Server
Partition Server
Account Table
Stores metadata and configuration for each account assigned to a stamp
Blob Table
Stores all blobs for all accounts within a stamp
Entity Table
Stores all entity rows for all accounts within a stamp
Message Table
Stores all messages for all queues within a stamp
Schema Table
Keeps track of the schema of all OTs
Partition Table
Keeps track of the current Range Partitions for all OTs and what Partition Server is serving each Range Partition
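One way to picture the Partition Table is as a sorted map from the low key of each Range Partition to the Partition Server serving it. The sketch below assumes string keys and contiguous ranges; the `PartitionTable` class and the server names are hypothetical.

```python
import bisect


class PartitionTable:
    """Maps a key to the Partition Server serving its Range Partition."""

    def __init__(self, entries):
        # entries: iterable of (low_key_inclusive, server_name)
        entries = sorted(entries)
        self._lows = [low for low, _ in entries]
        self._servers = [server for _, server in entries]

    def server_for(self, key: str) -> str:
        # Find the last range whose low key is <= key.
        i = bisect.bisect_right(self._lows, key) - 1
        if i < 0:
            raise KeyError(key)
        return self._servers[i]


table = PartitionTable([("", "PS1"), ("h", "PS2"), ("q", "PS3")])
```

Here `table.server_for("apple")` resolves to `PS1`, `table.server_for("h")` to `PS2`, and `table.server_for("zebra")` to `PS3`.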
Partition Manager
Load Balance
Split
Merge
Too much traffic on a PS, re-assign RangePartitions to less loaded PSs
RangePartition has too much load, split it into smaller ones and load balance across different PSs
Merge cold or lightly loaded Range Partitions
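The three Partition Manager operations can be sketched as simple decisions over load figures. The thresholds, the `load` metric and the greedy strategies below are assumptions chosen for illustration; the real service uses its own metrics and policies.

```python
from dataclasses import dataclass

SPLIT_ABOVE = 1000.0  # hypothetical load thresholds
MERGE_BELOW = 50.0


@dataclass
class RangePartition:
    low: str
    high: str
    load: float  # hypothetical metric, e.g. requests per second


def plan_split_merge(partitions):
    """Return ("split", low, high) / ("merge", low, high) actions for key-ordered partitions."""
    actions = []
    i = 0
    while i < len(partitions):
        p = partitions[i]
        nxt = partitions[i + 1] if i + 1 < len(partitions) else None
        if p.load > SPLIT_ABOVE:
            actions.append(("split", p.low, p.high))
        elif nxt is not None and p.load < MERGE_BELOW and nxt.load < MERGE_BELOW:
            actions.append(("merge", p.low, nxt.high))
            i += 1  # the neighbour is consumed by the merge
        i += 1
    return actions


def rebalance_once(servers):
    """Move one RangePartition from the busiest PS to the idlest PS, if that helps."""
    def total(name):
        return sum(p.load for p in servers[name])

    busiest = max(servers, key=total)
    idlest = min(servers, key=total)
    gap = total(busiest) - total(idlest)
    # Only consider moves that shrink the gap instead of inverting it.
    movable = [p for p in servers[busiest] if p.load < gap]
    if not movable:
        return None
    moved = max(movable, key=lambda p: p.load)
    servers[busiest].remove(moved)
    servers[idlest].append(moved)
    return moved.low, busiest, idlest
```

A hot partition followed by two cold neighbours produces one split and one merge; a server carrying much more load than its peer hands over its largest partition that still narrows the gap.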
Partition Manager
Partition Server
Lock Service
Metadata Stream
Commit Log Stream
Row Data Stream
Blob Data Stream
Memory table
Index cache
Row Data cache
When a lookup occurs, both the row data cache and the memory table are checked, giving preference to the memory table
Bloom Filters
Indicate whether the row being accessed may be in the checkpoint
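The read path through these in-memory structures might look like the sketch below. It assumes that checkpoints are consulted only after the memory table and row data cache miss, and that newer checkpoints are searched first. The `BloomFilter`, `Checkpoint` and `lookup` names are hypothetical, and the filter is a toy implementation.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, hashes=3):
        self.size = size_bits
        self.hashes = hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class Checkpoint:
    def __init__(self, rows):
        self.rows = dict(rows)
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)


def lookup(key, memory_table, row_cache, checkpoints):
    # The memory table holds the most recent updates, so it wins.
    if key in memory_table:
        return memory_table[key]
    if key in row_cache:
        return row_cache[key]
    # Assumed order: newest checkpoint first; skip those the filter rules out.
    for checkpoint in reversed(checkpoints):
        if checkpoint.bloom.might_contain(key) and key in checkpoint.rows:
            return checkpoint.rows[key]
    return None
```

The bloom filter only saves work: a negative answer lets the lookup skip a checkpoint, while a positive answer still requires reading the checkpoint itself.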
http://sigops.org/sosp/sosp11/current/2011-Cascais/printable/11-calder.pdf
https://docs.microsoft.com/en-us/azure/storage/common/storage-introduction