(a fork of Attic)
"I found the Holy Grail of backups."
(Stavros K. about Attic-Backup, 8/2013)
Thomas Waldmann (PyCon DE, 2017-10-25)
It's a backup tool, one you might actually enjoy using.
ssh transport for remote repos
append-only mode repos
FOSS, you can see the code
$ borg info ssh://borg@myserver/repos/myrepo
                Original size   Compressed size   Dedup size
All archives:        22.76 TB          18.22 TB    486.20 GB

                Unique chunks      Total chunks
Chunk index:          6305006         272643223
Real stats from a real backup repository (shortened).
2 machines, 147 backup archives, 2.5 years.
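To put the numbers above in perspective, here is a small arithmetic sketch. It assumes the sizes are decimal units (1 TB = 1000 GB); the variable names are made up for illustration.

```python
# Figures copied from the `borg info` output above.
# Assumption: decimal units, i.e. 1 TB = 1000 GB.
original_gb = 22.76 * 1000
compressed_gb = 18.22 * 1000
dedup_gb = 486.20
unique_chunks = 6305006
total_chunks = 272643223

# How much compression alone saves.
compression_ratio = original_gb / compressed_gb
# How much compression and deduplication save together.
overall_ratio = original_gb / dedup_gb
# How often an average unique chunk is referenced across all archives.
avg_refs_per_chunk = total_chunks / unique_chunks

print(f"compression alone:          {compression_ratio:.2f}x")
print(f"compression + dedup:        {overall_ratio:.1f}x")
print(f"avg. references per chunk:  {avg_refs_per_chunk:.1f}")
```

Under that assumption the repository is roughly 47 times smaller than the sum of all archives, and most of that saving comes from deduplication rather than compression.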
borg does error (and even tampering) detection
but not (yet?) error correction
kinds of errors / threat model:
single/few bit errors
defect / unreadable blocks
media failure (defect disk, ssd)
see issue #225 for discussion
implement something in borg?
rely on other soft- or hardware solutions?
avoid futile attempts, borg is application level
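The difference between detection and correction can be illustrated with a minimal sketch using Python's standard library. This is not borg's code; the key, the data and the `verify` helper are hypothetical. A MAC tells you that a chunk was damaged or tampered with, but it carries no redundancy to repair it.

```python
import hashlib
import hmac

key = b"hypothetical-mac-key"   # illustrative only, not a real borg key
chunk = b"some chunk data"

# Tag computed when the chunk is stored.
tag = hmac.new(key, chunk, hashlib.sha256).digest()


def verify(key, data, tag):
    """Return True if data still matches the tag computed at store time."""
    expected = hmac.new(key, data, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


# Simulate a single bit error on the storage medium.
damaged = bytearray(chunk)
damaged[0] ^= 0x01
damaged = bytes(damaged)

print(verify(key, chunk, tag))     # intact chunk
print(verify(key, damaged, tag))   # damaged chunk is detected, not repaired
```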
sha256 / hmac-sha256 are slow
solved: borg 1.1 added blake2b
zlib crc32 is slow
solved: borg 1.1 added fast crc32 C code
AES-CTR + MAC 2-pass AE can be slow
todo: borg 1.2 will use OpenSSL 1.1 for:
AES-OCB (very fast, if hw accelerated)
chacha20-poly1305 (quite fast w/o hw accel.)
key / cipher agility
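The hash and checksum primitives named above are all available in Python's standard library, so they are easy to try out. The sketch below only shows the calls side by side; the all-zero key and the 1 MiB test buffer are placeholders, and this is not how borg itself wires them up.

```python
import hashlib
import hmac
import zlib

key = bytes(32)              # placeholder key, all zeros, for illustration only
data = b"x" * (1 << 20)      # 1 MiB of test data

# Keyed hash, the older and slower option: HMAC-SHA256.
mac_sha256 = hmac.new(key, data, hashlib.sha256).digest()

# Keyed hash, the newer option: BLAKE2b has a built-in key parameter,
# so no HMAC construction is needed. digest_size=32 gives a 256-bit result.
mac_blake2b = hashlib.blake2b(data, key=key, digest_size=32).digest()

# Plain (unkeyed) checksum for detecting accidental corruption only.
crc = zlib.crc32(data) & 0xFFFFFFFF

print(mac_sha256.hex())
print(mac_blake2b.hex())
print(f"{crc:08x}")
```

Wrapping each call in `timeit` gives a rough speed comparison on your own hardware; results depend heavily on the CPU and on how Python was built.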
currently:
1 AES key
1 MAC key
1 chunker seed
stored highest IV value for AES CTR mode
encrypted using key passphrase
borg >= 1.0 now has lower RAM consumption
uses bigger chunks (2MiB, was: 64kiB)
chunks, files and repo index kept in memory
fewer chunks to manage -> smaller chunks index
be careful on small machines (NAS, raspi, ...)
or with huge amounts of data / huge file counts
in the docs, there is a formula to estimate RAM usage
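A rough sketch of why bigger chunks shrink the index. The function names and the `bytes_per_entry` value are hypothetical placeholders; take the real per-entry sizes and the full formula from the borg docs. The sketch also ignores that chunk sizes vary and that every small file needs at least one chunk.

```python
KiB = 1024
MiB = 1024 ** 2
TiB = 1024 ** 4


def estimate_chunk_count(total_bytes, target_chunk_size):
    """Approximate number of chunks, assuming all chunks hit the target size."""
    return -(-total_bytes // target_chunk_size)   # ceiling division


def estimate_index_bytes(chunk_count, bytes_per_entry):
    """Approximate in-memory index size for a given per-entry cost."""
    return chunk_count * bytes_per_entry


total = 1 * TiB              # example: 1 TiB of unique data
bytes_per_entry = 40         # ASSUMPTION: placeholder, see the docs for real values

old_chunks = estimate_chunk_count(total, 64 * KiB)
new_chunks = estimate_chunk_count(total, 2 * MiB)

print(old_chunks, estimate_index_bytes(old_chunks, bytes_per_entry) // MiB, "MiB")
print(new_chunks, estimate_index_bytes(new_chunks, bytes_per_entry) // MiB, "MiB")
print("reduction factor:", old_chunks // new_chunks)
```

Whatever the real per-entry cost is, going from 64 kiB to 2 MiB target chunks cuts the entry count by a factor of 32 for large files.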
own hash table implementation in C
compact block of memory, no pyobj overhead
e.g. used for the chunks index, repo index
uses closed hashing (bucket array, no linked lists)
uses linear probing for collision handling
HT performance difficult to measure
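A toy sketch of closed hashing with linear probing, to show the collision handling named above. It is written in Python for readability and is not borg's C implementation: Python lists still hold object references, so it does not demonstrate the compact memory layout. The class name and the 75% load limit are illustrative choices.

```python
class LinearProbingTable:
    """Toy hash table: one bucket array, collisions resolved by linear probing."""

    _EMPTY = object()   # marker for an unused bucket

    def __init__(self, capacity=8):
        self._keys = [self._EMPTY] * capacity
        self._values = [None] * capacity
        self._used = 0

    def _probe(self, key):
        """Return the bucket holding key, or the first empty bucket on its path."""
        capacity = len(self._keys)
        i = hash(key) % capacity
        while self._keys[i] is not self._EMPTY and self._keys[i] != key:
            i = (i + 1) % capacity          # collision: try the next bucket
        return i

    def __setitem__(self, key, value):
        # Keep the load factor at or below 75% so a probe always finds a free bucket.
        if (self._used + 1) * 4 > len(self._keys) * 3:
            self._grow()
        i = self._probe(key)
        if self._keys[i] is self._EMPTY:
            self._used += 1
        self._keys[i] = key
        self._values[i] = value

    def __getitem__(self, key):
        i = self._probe(key)
        if self._keys[i] is self._EMPTY:
            raise KeyError(key)
        return self._values[i]

    def _grow(self):
        """Double the bucket array and re-insert all existing entries."""
        old = [(k, v) for k, v in zip(self._keys, self._values)
               if k is not self._EMPTY]
        self._keys = [self._EMPTY] * (len(self._keys) * 2)
        self._values = [None] * len(self._keys)
        self._used = 0
        for k, v in old:
            self[k] = v


table = LinearProbingTable()
for n in range(100):
    table[n] = n * n
print(table[7], table[99])
```

Deletion is left out on purpose: with linear probing it needs tombstones or re-insertion of the following buckets, which would distract from the basic idea.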
problem: multiple clients updating same repo
then: chunk index needs to get re-synced
slow, esp. if remote, many and/or big archives
local collection of single-archive chunk indexes
needs lots of space, merging still expensive
idea: "borgception"
backup chunks index into a secondary borg repo
fetch it from there when out of sync
idea: "build chunks index from repo index" (in 1.1)
repo index knows all chunk IDs
but: no size/csize info in repo index
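Why merging single-archive indexes is expensive can be seen in a minimal sketch. It assumes, for illustration only, that each index maps a chunk ID to a `(refcount, size, csize)` tuple; the function name and the sample IDs are made up. Every entry of every archive has to be visited once, so the cost grows with the number and size of archives.

```python
def merge_chunk_indexes(archive_indexes):
    """Combine per-archive chunk indexes into one index by summing refcounts."""
    merged = {}
    for index in archive_indexes:
        for chunk_id, (refcount, size, csize) in index.items():
            if chunk_id in merged:
                old_refcount, _, _ = merged[chunk_id]
                merged[chunk_id] = (old_refcount + refcount, size, csize)
            else:
                merged[chunk_id] = (refcount, size, csize)
    return merged


# Two hypothetical archives that share one chunk.
monday = {b"id-a": (1, 2000, 900), b"id-b": (2, 500, 300)}
tuesday = {b"id-b": (1, 500, 300), b"id-c": (1, 4000, 1500)}

merged = merge_chunk_indexes([monday, tuesday])
print(merged[b"id-b"])      # referenced by both archives
print(len(merged))          # number of unique chunks
```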
1.2.3 (tagged release code)
1.2.4.dev3+gdeadbee (3 commits later)
test scalability / reliability / security
find, file and fix bugs
file and implement feature requests
improve docs
contribute or review code
spread the word
create dist packages
care for misc. platforms (Windows)
donate funds via bountysource
Just grab me at the conference or at the sprints!
Thomas J Waldmann @ twitter