Borg Backup
(a fork of Attic-Backup)
"I found the Holy Grail of backups."
(Stavros K. about Attic-Backup, 8/2013)
34c3 presentation by Thomas Waldmann
Borg - a fork of Attic
- Attic: 2010-2015, good design, proven code
  but:
  - development going slowly
  - some bugs and annoyances
  - not very open to new developers
- Borg Backup: forked from Attic in May 2015
  and:
  - a community project, bus_factor++
  - lots of fixes and good PRs merged
  - open and inviting to new contributors
  - faster paced, lots of activity
Feature Set (1)
- easy and fast
- content-defined chunking (*)
- chunk deduplication (*)
- lz4, zstd, zlib, lzma compression
- encryption with aes256-ctr
- authentication with hmac-sha256 or blake2b
- simple backend (k/v, fs, via ssh)
Feature Set (2)
- FOSS (BSD license)
- good docs
- good platform / arch support
- xattr / acl / bsdflags support
- mount a backup via FUSE
- Python 3.4+, a little Cython & C
- good test coverage, CI
Deduplication (1)
- No problem with:
  - VM images (sparse file support)
  - (physical) disks, LV snapshots
  - renamed huge directories
  - inner duplication of data set
  - historical duplication
  - duplication on different machines
Deduplication (2)
- Content defined chunking:
  - "buzhash" rolling hash
  - cut data when hash has specific bit pattern, yields chunks with ~ 2^n bytes target size
  - n + other chunker params configurable
  - seeded, to avoid fingerprinting chunk lengths
- Store chunks under id into store:
  - id = HASH(chunk), or
  - id = MAC(id_key, chunk)
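
The two steps on this slide can be sketched in a few lines of Python. This is a minimal illustration of a buzhash-style rolling hash and of the two id variants, not borg's actual chunker (which is written in C). The constants WINDOW, MASK_BITS, MIN_SIZE and MAX_SIZE, the way the seed is applied (seeding the table generator), and the function names are all assumptions made for the example.

```python
import hashlib
import hmac
import random

WINDOW = 48          # hypothetical rolling-window size in bytes
MASK_BITS = 12       # n: cut when the low n bits are zero, so roughly 2^n byte chunks
MIN_SIZE = 512       # hypothetical lower bound on chunk size (must be >= WINDOW)
MAX_SIZE = 1 << 16   # hypothetical upper bound on chunk size


def make_table(seed):
    """Build a seeded table of 256 random 32-bit values, one per byte value."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(256)]


def rotl32(value, count):
    """Rotate a 32-bit integer left by count bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF


def chunk(data, table):
    """Yield content-defined chunks of data (a bytes object)."""
    mask = (1 << MASK_BITS) - 1
    start = 0  # offset where the current chunk begins
    h = 0      # rolling hash over the last WINDOW bytes of the current chunk
    for i, byte in enumerate(data):
        # Shift the window by one byte: rotate, then mix in the new byte.
        h = rotl32(h, 1) ^ table[byte]
        if i - start >= WINDOW:
            # Cancel the contribution of the byte that just left the window.
            h ^= rotl32(table[data[i - WINDOW]], WINDOW)
        size = i - start + 1
        if (size >= MIN_SIZE and (h & mask) == 0) or size >= MAX_SIZE:
            yield data[start:i + 1]
            start = i + 1
            h = 0
    if start < len(data):
        yield data[start:]  # trailing chunk without a cut point


def chunk_id(chunk_data, id_key=None):
    """id = HASH(chunk) without a key, id = MAC(id_key, chunk) with one."""
    if id_key is None:
        return hashlib.sha256(chunk_data).digest()
    return hmac.new(id_key, chunk_data, hashlib.sha256).digest()
```

Because a cut depends only on the last WINDOW bytes, inserting a few bytes at the start of a file should shift the first chunk only; later cut points tend to land on the same content, so those chunks keep their ids.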
Now and Future
- 1.0 "oldstable", widely distributed, use 1.0.9+
- 1.1 "stable", recently released, use 1.1.4+ (new features, code cleanup)
- 1.2 Crypto Enhancements
  - AES-GCM (AES-OCB? chacha20-poly1305? keccak?)
  - Key Management
  - Ciphersuite Flexibility
- 1.2 Parallelization
  - "Serial Threaded Workers"? (avoids races)
  - zeromq?
How you can help
Python / Cython / C? Help us with coding.
do a security review
do real-world performance tests / comparisons
find bugs / issues, improve docs
spread the word, borg is not well-known yet
sponsor development via bountysource
help with the native Windows port
Borg Backup - Links
github.com/borgbackup
#borgbackup on chat.freenode.net
Questions / Feedback?
Find me at the Python assembly (sometimes).
Or use IRC, GitHub issues or the mailing list.
Bonus: Crypto
- OpenSSL (1.0 or 1.1), but only for the crypto primitives (currently: AES in CTR mode)
- uses hardware acceleration (AES-NI)
- authentication is not hw accelerated:
  - borg 1.0+: hmac-sha256
  - borg 1.1+: additionally faster blake2b
- borg 1.2 (future): fast AEAD modes
  - AES-OCB (HW accelerated)
  - chacha20-poly1305 (quite quick in SW)
- crypto hashes from python stdlib / OpenSSL / blake2b reference implementation
- we use random from /dev/urandom (via Py stdlib)
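
To make the authentication options concrete, here is a small sketch using only the Python standard library: a random key from os.urandom, an HMAC-SHA256 tag, and a keyed BLAKE2b tag. It does not reproduce borg's key derivation or on-disk format; the function names and the 32-byte key and tag sizes are assumptions for the example.

```python
import hashlib
import hmac
import os


def new_key(length=32):
    """Random key material from the OS (os.urandom reads the OS CSPRNG)."""
    return os.urandom(length)


def mac_hmac_sha256(key, data):
    """Authentication tag via HMAC-SHA256."""
    return hmac.new(key, data, hashlib.sha256).digest()


def mac_blake2b(key, data):
    """Authentication tag via BLAKE2b in keyed mode (32-byte tag)."""
    return hashlib.blake2b(data, key=key, digest_size=32).digest()


def verify(mac_func, key, data, tag):
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(mac_func(key, data), tag)
```

Both functions have the same shape (key and data in, 32-byte tag out), so a caller can switch between them without touching the verification step.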
Bonus: Compression
- lz4 is super fast - use it! often faster than without.
- zstd is also cool: offers a wide range from very fast to very good compression (borg 1.1.4+)
- there is also zlib or lzma.
- borg 1.1 can use lz4 to predict compressibility (and then either use none, lz4, zstd, zlib or lzma)
- don't use lzma > level 6, it is pointless: small chunks!
- you can use different compression in same repo.
- existing chunks won't get recompressed.
- 1.1 has "recreate" to recompress.
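
The "predict compressibility" idea can be sketched as follows. The Python standard library has no lz4, so this example uses zlib at level 1 as a stand-in for the cheap probe; that substitution, the threshold value and the function name are assumptions, and borg's real heuristic may differ.

```python
import lzma
import zlib


def choose_and_compress(chunk, threshold=0.97):
    """Return (name, payload) for one chunk.

    A fast, cheap compressor is run first as a probe. If it barely
    shrinks the chunk, the data is treated as incompressible and is
    stored as-is; otherwise the slower, stronger compressor is used.
    """
    probe = zlib.compress(chunk, 1)  # stand-in for the lz4 probe
    if len(probe) >= len(chunk) * threshold:
        return "none", chunk
    return "lzma", lzma.compress(chunk, preset=6)
```

The point of the probe is to avoid spending lzma time on data that is already compressed or encrypted, where it would gain nothing.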
Bonus: Chunking / Dedupe
- you can use different chunker params in same repo.
- existing chunks won't get re-chunked.
- 1.1 has "recreate" to re-chunk.
- differently cut chunks won't deduplicate.
- deduplication is based on (hmac-)sha256 of chunks' plaintext, before compression / encryption.
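
A toy in-memory store shows why hashing the plaintext matters: the id does not depend on how the chunk is compressed, and a chunk that is already present is left untouched. The class and method names are hypothetical, zlib stands in for whichever compressor is configured, and encryption is left out.

```python
import hashlib
import zlib


class ChunkStore:
    """Toy deduplicating store keyed by the hash of the chunk plaintext."""

    def __init__(self):
        self.chunks = {}  # id -> compressed payload

    def put(self, plaintext, level=6):
        # The id is computed before compression, so it is the same
        # regardless of the compression settings in use.
        cid = hashlib.sha256(plaintext).digest()
        if cid not in self.chunks:
            self.chunks[cid] = zlib.compress(plaintext, level)
        # If the id already exists, the stored payload is not recompressed.
        return cid

    def get(self, cid):
        return zlib.decompress(self.chunks[cid])
```

Storing the same chunk twice with different compression levels yields one entry, still holding the payload from the first call.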
Bonus: Hash Table
- own hashtable implementation in C
- lots of chunks to manage, use memory efficiently
- doing rather simple linear hashing
- HT perf determines speed for unchanged files:
  we check file mtime|ctime / size / inode number
  AND
  (via a HT lookup) that we have all chunks in the repo
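
The unchanged-file check can be sketched like this. A Python dict and set stand in for borg's C hash table, and the cache layout (path mapped to a stat key plus a list of chunk ids) is an assumption for the example, as are the function names.

```python
import os


def file_key(path):
    """Cheap change indicator: mtime, size and inode number."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size, st.st_ino)


def is_unchanged(path, files_cache, chunk_index):
    """True if the file can be skipped without reading its content.

    files_cache maps path -> (file_key, [chunk ids]) from the last backup.
    chunk_index is the set of chunk ids currently present in the repo.
    """
    entry = files_cache.get(path)
    if entry is None:
        return False  # never seen before
    key, chunk_ids = entry
    if key != file_key(path):
        return False  # metadata changed, so re-read and re-chunk
    # One hash table lookup per chunk: all chunks must still be in the repo.
    return all(cid in chunk_index for cid in chunk_ids)
```

For a large, mostly unchanged data set, nearly all of the work is these lookups, which is why the speed of the hash table dominates.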
BorgBackup LT 34c3 (updated 2017-12)
By Thomas Waldmann