Venti: a new approach to archival storage
Contents
- Introduction & background
- The Venti Archival Server
- Applications
- Implementation
- Performance
- Reliability & recovery
- Related & future work
- Conclusion & Critique
Introduction & background
Archival storage is a second-class function in current computing environments.
Storage capacity now exceeds the ability of many users to generate data, making it practical to archive data in perpetuity.
Write-once policy
A prevalent form of archival storage is magnetic tape.
Restoring data from a tape can be tedious and error-prone.
A trade-off exists between the performance of backup operations and that of restore operations.
Snapshots avoid the trade-off between full and incremental backups.
The Venti Archival Server
Venti is a block-level network storage system.
It identifies data blocks by a hash of their contents.
Blocks are write-once, so writing (or replicating) the same data again is idempotent.
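The two properties above can be sketched with a toy in-memory store. This is only an illustration, not Venti's code or protocol: the `BlockStore` class and its method names are hypothetical, and SHA-1 stands in for the fingerprint function.

```python
import hashlib


class BlockStore:
    """Toy write-once store that addresses blocks by a hash of their contents."""

    def __init__(self):
        self._blocks = {}  # fingerprint (hex string) -> block bytes

    def write(self, data: bytes) -> str:
        # The address is derived from the data alone; the client never picks it.
        fp = hashlib.sha1(data).hexdigest()
        # Writing a block that is already present changes nothing,
        # which is what makes repeated writes idempotent.
        self._blocks.setdefault(fp, data)
        return fp

    def read(self, fp: str) -> bytes:
        data = self._blocks[fp]
        # The reader can verify the block against its own address.
        if hashlib.sha1(data).hexdigest() != fp:
            raise ValueError("block does not match its fingerprint")
        return data

    def __len__(self):
        return len(self._blocks)


store = BlockStore()
first = store.write(b"hello venti")
second = store.write(b"hello venti")  # duplicate write of the same data
```

Both writes return the same fingerprint and the store still holds a single block, so a client can safely retry or replicate a write without creating a second copy.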
However, magnetic disk storage is not as stable or permanent as optical media.
Using magnetic disks for Venti has the benefit of reducing the disparity in performance between conventional and archival storage.
Applications
Applications use the block-level service provided by Venti to store more complex data structures.
Data is divided into blocks and written to the server. To enable this data to be retrieved, the application must record the fingerprints into additional blocks.
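One way to picture the "additional blocks" is as a tree of pointer blocks whose payload is a list of fingerprints, ending in a single root fingerprint. The sketch below assumes that layout; the block size, the fan-out, and the names `put`, `write_stream`, and `read_stream` are illustrative choices, not values or functions taken from Venti.

```python
import hashlib

FP_SIZE = 20        # bytes in a SHA-1 digest
BLOCK_SIZE = 8192   # assumed data block size for this sketch
FANOUT = 4          # assumed fingerprints per pointer block (tiny, for the demo)

blocks = {}         # stand-in for the server: fingerprint -> block bytes


def put(data: bytes) -> bytes:
    """Store one block and return its fingerprint."""
    fp = hashlib.sha1(data).digest()
    blocks.setdefault(fp, data)
    return fp


def write_stream(data: bytes) -> tuple:
    """Store data as a tree of blocks; return (root fingerprint, tree depth)."""
    # Level 0: the data blocks themselves.
    fps = [put(data[i:i + BLOCK_SIZE]) for i in range(0, len(data), BLOCK_SIZE)]
    if not fps:
        fps = [put(b"")]
    depth = 0
    # Pack fingerprints into pointer blocks until a single root remains.
    while len(fps) > 1:
        fps = [put(b"".join(fps[i:i + FANOUT])) for i in range(0, len(fps), FANOUT)]
        depth += 1
    return fps[0], depth


def read_stream(root: bytes, depth: int) -> bytes:
    """Walk the tree from the root fingerprint and reassemble the data."""
    block = blocks[root]
    if depth == 0:
        return block
    children = [block[i:i + FP_SIZE] for i in range(0, len(block), FP_SIZE)]
    return b"".join(read_stream(child, depth - 1) for child in children)


payload = bytes(range(256)) * 200
root, depth = write_stream(payload)
```

The application only has to remember `root` (and, in this sketch, `depth`) to retrieve everything later with `read_stream(root, depth)`.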
Vac is an application for storing a collection of files and directories as a single object.
VAC
An important attribute of vac is that it writes each file as a separate collection of Venti blocks, thus ensuring that duplicate copies of a file will be coalesced on the server.
Vac also implements an incremental option based on the file modification times.
In physical backup, the disk blocks that make up the file system are copied directly, without interpretation. This keeps the tool simple and potentially gives much higher throughput.
Physical backup
The simplest form of physical backup is to copy the raw contents of the disk drives to Venti. Main advantage: coalescing duplicate blocks.
The new version of Plan 9 uses Venti instead of an optical jukebox. This equalizes access to the active and archival views of the file system. It also allows the cache to be quite small.
Plan 9 file system
Implementation
The implementation uses an append-only log of data blocks and an index that maps fingerprints to locations in this log. One main goal of the prototype is robustness.
Storage of data blocks is separated from the indexes used to locate them. In particular, blocks are stored in an append-only log on a RAID array of disk drives.
To ease maintenance, the log is divided into self-contained structures called arenas. Each arena contains a large number of data blocks and is sized to facilitate operations such as copying to removable media.
Data blocks are variable-sized, up to a current limit of 52 KB. Each block is prefixed with a header that describes the contents of the block. The header also provides integrity checking.
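The split between an append-only log and a separate index can be sketched as follows. The header layout here (magic, fingerprint, type, size) is a simplified assumption, and `MAGIC`, `HEADER`, and the `Arena` class are hypothetical names for the illustration; this is not Venti's on-disk format.

```python
import hashlib
import struct

MAGIC = 0x56454E54      # hypothetical magic number for this sketch
# Assumed header layout: magic (4 bytes), fingerprint (20), type (1), size (2).
HEADER = struct.Struct(">I20sBH")
MAX_BLOCK = 52 * 1024   # block size limit quoted in the text


class Arena:
    def __init__(self):
        self.log = bytearray()  # append-only data log
        self.index = {}         # fingerprint -> offset of the block header in the log

    def write(self, data: bytes, btype: int = 0) -> bytes:
        if len(data) > MAX_BLOCK:
            raise ValueError("block too large")
        fp = hashlib.sha1(data).digest()
        if fp not in self.index:
            # New block: append header + data, then record its location.
            self.index[fp] = len(self.log)
            self.log += HEADER.pack(MAGIC, fp, btype, len(data)) + data
        return fp

    def read(self, fp: bytes) -> bytes:
        offset = self.index[fp]
        magic, stored_fp, _btype, size = HEADER.unpack_from(self.log, offset)
        start = offset + HEADER.size
        data = bytes(self.log[start:start + size])
        # Integrity check: the header and the data must both agree with the fingerprint.
        if magic != MAGIC or stored_fp != fp or hashlib.sha1(data).digest() != fp:
            raise ValueError("corrupt block")
        return data


arena = Arena()
fp = arena.write(b"abc")
size_after_first_write = len(arena.log)
arena.write(b"abc")  # duplicate: nothing is appended
```

Because each header carries the fingerprint, the log alone is enough to reconstruct the index, which is one reason to keep the two structures separate.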
[Figure: block diagram of the Venti server. Labels: clients, network, block cache, index cache, index.]
[Figure: format of the data log. Labels: data log, arena, header, data blocks, directory, trailer.]
Fields shown for each data block: magic, fingerprint, type, size, user, wtime, encoding, esize.
Performance
The uncached sequential read performance is particularly bad. The problem is that these sequential reads require a random read of the index.
One possible solution is a form of read-ahead. When reading a block from the data log, it is feasible to also read several following blocks. These extra blocks can be added to the caches without referencing the index.
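A minimal sketch of the read-ahead idea, assuming each log record carries its own fingerprint so that neighbouring blocks can be cached without an index lookup. The `ReadAheadStore` class, its unbounded cache, and the `index_lookups` counter are hypothetical, added only to make the effect visible.

```python
import hashlib


class ReadAheadStore:
    def __init__(self, read_ahead=4):
        self.log = []             # append-only list of (fingerprint, data) records
        self.index = {}           # fingerprint -> position in the log
        self.cache = {}           # block cache (unbounded in this sketch)
        self.read_ahead = read_ahead
        self.index_lookups = 0    # counts the expensive random index reads

    def write(self, data: bytes) -> bytes:
        fp = hashlib.sha1(data).digest()
        if fp not in self.index:
            self.index[fp] = len(self.log)
            self.log.append((fp, data))
        return fp

    def read(self, fp: bytes) -> bytes:
        if fp in self.cache:
            return self.cache[fp]
        # Cache miss: one index lookup to find the block in the log.
        self.index_lookups += 1
        pos = self.index[fp]
        # Cache the requested block plus the next few log records. Their
        # fingerprints come from the log itself, so the index is not consulted.
        for next_fp, next_data in self.log[pos:pos + 1 + self.read_ahead]:
            self.cache[next_fp] = next_data
        return self.cache[fp]


store = ReadAheadStore(read_ahead=3)
fps = [store.write(b"block %d" % i) for i in range(8)]
data = [store.read(fp) for fp in fps]
```

This only helps when blocks are read in roughly the order they were written, which is the sequential case the slide is concerned with.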
Reliability & Recovery
Integrity checking and error recovery are of fundamental importance. Several tools are implemented along with Venti to achieve this: verifying the structure of an arena, checking that there is a one-to-one correspondence between index entries and blocks in the data log, and copying an arena to removable media.
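The consistency check between the index and the data log might look roughly like the sketch below. It assumes the log is a sequence of (fingerprint, data) records with no duplicates and the index maps fingerprints to log positions; `rebuild_index` and `check` are hypothetical helpers, not the actual Venti tools.

```python
import hashlib


def rebuild_index(log):
    """Rebuild fingerprint -> position by scanning the log, treated as the source of truth."""
    return {fp: pos for pos, (fp, _data) in enumerate(log)}


def check(log, index):
    """Return a list of problems; an empty list means log and index agree."""
    problems = []
    for pos, (fp, data) in enumerate(log):
        # Every block must match its fingerprint.
        if hashlib.sha1(data).digest() != fp:
            problems.append(f"log entry {pos}: data does not match fingerprint")
        # Every block in the log must have the right index entry.
        if index.get(fp) != pos:
            problems.append(f"log entry {pos}: missing or wrong index entry")
    # Every index entry must point at a block that exists in the log.
    log_fps = {fp for fp, _data in log}
    for fp in index:
        if fp not in log_fps:
            problems.append(f"index entry {fp.hex()}: no such block in log")
    return problems


log = [(hashlib.sha1(d).digest(), d) for d in (b"a", b"b", b"c")]
index = rebuild_index(log)
```

A damaged or lost index can be replaced by the result of `rebuild_index(log)`; the reverse is not true, which is why the log is the structure to protect.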
There is also a type identifier associated with each block. This integer is included in every read and write operation and has the effect of partitioning the server into multiple independent domains.
Related & future work
There are several systems similar to Venti. One is the Stanford Archival Vault, which, unlike Venti, has no way to share data between objects that are partially the same. Another is the Read-Only Secure File System, although its focus is security rather than archival storage. Finally, the Elephant file system could incorporate Venti as the storage device for the permanent versions of files.
- Venti could be distributed across multiple machines
- Venti provides little security
- Detecting similarities between files (what happens if the data shifts relative to block boundaries?)
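The last point can be illustrated with fixed-size blocks: appending data leaves the existing blocks unchanged, while inserting a single byte at the front shifts every block boundary. The sketch below is a toy demonstration under assumed parameters (64-byte blocks, SHA-1 fingerprints); the `fingerprints` helper is hypothetical.

```python
import hashlib


def fingerprints(data: bytes, block_size: int = 64) -> set:
    """Fingerprints of the fixed-size blocks that make up data."""
    return {
        hashlib.sha1(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    }


# 4096 bytes of non-repeating content: 512 zero-padded 8-digit numbers.
original = b"".join(b"%08d" % i for i in range(512))
appended = original + b"tail"   # new data added at the end
shifted = b"X" + original       # one byte inserted at the front

shared_after_append = len(fingerprints(original) & fingerprints(appended))
shared_after_shift = len(fingerprints(original) & fingerprints(shifted))
```

Comparing `shared_after_append` with `shared_after_shift` shows how much coalescing survives each kind of change when block boundaries are fixed.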
Conclusion & Critique
- The use of disk technologies overly complicates random reads and writes; solid-state technologies may be worth exploring.
- The performance penalty of archiving new data blocks is too large for a system whose purpose is the perpetual archival of information.
- The authors say that performance is not as good as they expected but that the results look promising; however, they do not elaborate on why.
- It is not easy to adopt in an arbitrary application; the application must suit Venti's properties and behavior.
- The implementation of the full index check is frankly naïve.
Thank you.
By Luis Roman