Borg Backup
(updates & plans, DiVOC online session)
Thomas Waldmann (@home, 2022-04-16)
Borg Versions
- Attic and Borg < 1.0: ancient & buggy,
hopefully nobody uses that any more
- Borg 1.0: not supported any more
- Borg 1.1: supported and very stable.
final release pending, after that only critical fixes.
- Borg 1.2: first release done, 1.2.1 coming soon. supported, but rather fresh still.
- Borg master branch (1.3? 2.0?):
major changes, 1.3.0a1 alpha just released.
bleeding edge for testers!
Borg Architecture
1.2+: fixed chunker
Borg deduplicates based on chunks (not: whole files).
buzhash chunker (content-defined chunking):
rolling hash computed over window
variable size chunks
CPU intensive, no sparse support
fixed chunker (borg 1.2+, fixed size chunks):
cutting a block device into blocks
cutting an LV (logical volume) into LEs (logical extents)
cutting a (fixed record size) DB into records
almost 0 CPU load, sparse file support
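
To make the difference between the two chunkers concrete, here is a minimal sketch. It is not borg's chunker: a simple polynomial rolling hash stands in for buzhash, and all parameters (WINDOW, MASK, MIN_SIZE, the 64 byte block size) are hypothetical toy values chosen to keep the example small.

```python
# Toy parameters (assumptions, not borg's real settings).
BASE = 257
MOD = (1 << 31) - 1   # prime modulus for the polynomial rolling hash
WINDOW = 16           # number of bytes covered by the rolling hash
MASK = 0x3F           # cut when the low 6 bits of the hash are zero
MIN_SIZE = 32         # never cut chunks shorter than this


def fixed_chunks(data, block_size=64):
    """Cut data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def cdc_chunks(data):
    """Content-defined chunking: cut where the rolling hash matches MASK."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the oldest byte in the window
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i - start >= WINDOW:
            # Window is full: remove the byte that slides out of it.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        if i + 1 - start >= MIN_SIZE and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Inserting a single byte at the front of the input shifts every fixed-size block, so none of them deduplicate. The content-defined cut points depend only on the bytes inside the window, so they resynchronize after the insertion and most chunks stay identical. That is why the fixed chunker suits data that is only modified in place (block devices, LVs, fixed-record DBs).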
borg repo concepts
A borg repo is a LOG.
(== stuff only gets appended at the end,
old stuff is never modified [only deleted])
A borg repo is a key/value store.
(key: chunk id = MAC(plaintext), value = ciphertext)
Low-level repo operations:
- PUT (append a new key/value pair)
- DELETE (register a delete for a previous put by key)
- COMMIT (finish a transaction, state is valid now)
Segment files
- contain a sequence of log entries created by repo ops
- a non-compact segment contains deleted PUT entries
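
The log and key/value view of a repo can be sketched in a few lines. This is an in-memory toy under stated assumptions, not borg's repository code: the class LogRepo and the helper chunk_id are hypothetical names, HMAC-SHA256 is used as the MAC, and the "ciphertext" values are just placeholder bytes.

```python
import hashlib
import hmac


def chunk_id(mac_key, plaintext):
    """Key of a chunk: MAC(plaintext), here HMAC-SHA256 (an assumption)."""
    return hmac.new(mac_key, plaintext, hashlib.sha256).digest()


class LogRepo:
    """Toy append-only log that behaves like a key/value store."""

    def __init__(self):
        self.log = []        # entries are only ever appended
        self.committed = 0   # number of entries covered by the last COMMIT

    def put(self, key, value):
        self.log.append(("PUT", key, value))

    def delete(self, key):
        # The old PUT stays in the log, we only register the delete.
        self.log.append(("DELETE", key, None))

    def commit(self):
        self.log.append(("COMMIT", None, None))
        self.committed = len(self.log)

    def index(self):
        """Replay the committed part of the log to get the live mapping."""
        live = {}
        for tag, key, value in self.log[:self.committed]:
            if tag == "PUT":
                live[key] = value
            elif tag == "DELETE":
                live.pop(key, None)
        return live
```

Anything appended after the last COMMIT is ignored when replaying, which is what makes the state valid only at transaction boundaries.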
1.2+: borg compact
borg < 1.2
implicit segment compaction within write commands
borg 1.2+
repo writing commands do not compact any more,
there is an explicit "borg compact" command.
use it to free repository space by shuffling entries from non-compact segment files into compact segment files.
does not need crypto key, can be run on repo server.
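
What compaction does can be sketched as follows. This is a simplified illustration, not the borg compact implementation: segments are plain Python lists of hypothetical (tag, key, value) tuples, COMMIT entries are left out, and all non-compact segments are rewritten in one pass, which is why the DELETE entries can simply be dropped here.

```python
def compact(segments):
    """Return a new segment list without deleted or superseded entries."""
    # Pass 1: replay the log to find the live PUT for every key.
    live = {}  # key -> (segment index, entry index)
    for s, seg in enumerate(segments):
        for e, (tag, key, _value) in enumerate(seg):
            if tag == "PUT":
                live[key] = (s, e)
            elif tag == "DELETE":
                live.pop(key, None)
    live_positions = set(live.values())

    # Pass 2: keep compact segments, move live PUTs out of the others.
    kept, moved = [], []
    for s, seg in enumerate(segments):
        is_live = [tag == "PUT" and (s, e) in live_positions
                   for e, (tag, _key, _value) in enumerate(seg)]
        if all(is_live):
            kept.append(seg)  # already compact, left untouched
        else:
            moved.extend(entry for entry, ok in zip(seg, is_live) if ok)
    return kept + ([moved] if moved else [])
```

The sketch only looks at tags and keys and never at the values, which matches the point above that compaction works without the crypto key.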
borg < 1.3: PUT
PUT log entry structure:
- crc32 = CRC32(header + content)
- header: size of entry
- header: tag (== PUT)
- header: 256bit key (chunk id)
- content: value (data)
Big design issue
One cannot check the header alone.
To check the crc32, one must read ALL: header+content
Slow if one is only interested in header values.
Correct size is important to seek to next entry.
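
A minimal sketch of the old PUT layout shows the problem. The field widths (4 byte crc32, 4 byte size, 1 byte tag), the byte order and the tag value are assumptions made for illustration; only the order of the fields and the CRC32 coverage are taken from the slide.

```python
import struct
import zlib

TAG_PUT = 0                        # hypothetical tag value
CRC = struct.Struct("<I")          # crc32 (assumed 4 bytes, little endian)
HEADER = struct.Struct("<IB32s")   # size, tag, 256bit key (assumed widths)


def pack_put(key, value):
    """Build a PUT entry: crc32 | header | content."""
    size = CRC.size + HEADER.size + len(value)
    header = HEADER.pack(size, TAG_PUT, key)
    crc = zlib.crc32(header + value)  # covers header AND content
    return CRC.pack(crc) + header + value


def check_put(entry):
    """Verifying anything, even just the size, needs the whole entry."""
    (crc,) = CRC.unpack_from(entry, 0)
    return zlib.crc32(entry[CRC.size:]) == crc
```

Because the single crc32 spans header and content, there is no way to trust the size field (and thus to seek to the next entry) without reading the full value first.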
borg 1.3+: PUT2
PUT2 log entry structure:
- crc32 = CRC32(header + digest)
- header: size of entry
- header: tag
- header: 256bit key (chunk id)
- digest: xxh64 = XXH64(header + content)
- content: value (data)
Notable
Can check the header without reading the content.
Better error detection by stronger and super fast xxh64.
crc32 covers header+digest, digest also covers header.
The (sometimes slow) CRC32 implementation is only used for a few bytes.
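
The PUT2 idea can be sketched like this. Assumptions: the field widths, byte order and tag value are made up for illustration, and because XXH64 is not in the Python standard library, an 8 byte BLAKE2b digest is used as a stand-in for it. The function names are hypothetical.

```python
import hashlib
import struct
import zlib

TAG_PUT2 = 3                       # hypothetical tag value
CRC = struct.Struct("<I")          # crc32 (assumed 4 bytes)
HEADER = struct.Struct("<IB32s")   # size, tag, 256bit key (assumed widths)
DIGEST_SIZE = 8                    # 64 bit digest
PREFIX_SIZE = CRC.size + HEADER.size + DIGEST_SIZE


def digest64(data):
    # Stand-in for XXH64 (NOT the real xxh64 algorithm).
    return hashlib.blake2b(data, digest_size=DIGEST_SIZE).digest()


def pack_put2(key, value):
    """Build a PUT2 entry: crc32 | header | digest | content."""
    size = PREFIX_SIZE + len(value)
    header = HEADER.pack(size, TAG_PUT2, key)
    digest = digest64(header + value)     # covers header + content
    crc = zlib.crc32(header + digest)     # covers header + digest only
    return CRC.pack(crc) + header + digest + value


def check_header(prefix):
    """Validate size/tag/key from the first PREFIX_SIZE bytes only."""
    (crc,) = CRC.unpack_from(prefix, 0)
    return zlib.crc32(prefix[CRC.size:PREFIX_SIZE]) == crc


def check_full(entry):
    """Validate header and content."""
    if not check_header(entry):
        return False
    header = entry[CRC.size:CRC.size + HEADER.size]
    digest = entry[CRC.size + HEADER.size:PREFIX_SIZE]
    return digest64(header + entry[PREFIX_SIZE:]) == digest
```

A reader that only wants keys and sizes calls check_header on a short prefix and seeks on; the content is only hashed when it is actually needed.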
Borg < 1.3 crypto
old crypto issues
- potential nonce reuse and countermeasures:
- AES-CTR mode with 1 AES key per repo
- counter values (IV / nonce) must never be re-used
- complex counter management needed
- limited trust in repository
- local counter knowledge can be lost (e.g. disk defect)
- multiple clients need to trust repo
- to avoid counter issues, repos must use different keys
- no easy replication of encrypted chunks to other repo
- self-made layering of
- AES256-CTR + HMAC-SHA256
- AES256-CTR + BLAKE2b
- there are faster ready-to-use AEAD ciphers now
1.3+: new crypto
new crypto features
- Fixes potential nonce reuse issue:
- random session id generated at start of a borg run
- session key derived from session id and master key
- counter (IV / Nonce) starts from 0 for each session
- no counter management needed, no risk of reuse
- OpenSSL >= 1.1.1 (including on OpenBSD), providing:
- super fast AES256-OCB (with AES hw acceleration).
patents first licensed to FOSS, now abandoned.
- very fast CHACHA20-POLY1305 (pure sw implementation)
- use AAD of AEAD cipher to protect header / chunkid
- Argon2 KDF used for the borg key (was: pbkdf2)
- undecided: maybe adopt BLAKE3 for the chunk id hash
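
The session key scheme from the first bullet group can be sketched with the standard library. This only illustrates the key and nonce handling, not the AEAD encryption itself. Assumptions: HMAC-SHA256 stands in for whatever KDF borg really uses, and the 24 byte session id, the 12 byte nonce and all names are hypothetical.

```python
import hashlib
import hmac
import os


def derive_session_key(master_key, session_id):
    """Derive a per-session key (HMAC-SHA256 as a stand-in KDF)."""
    return hmac.new(master_key, b"session key" + session_id,
                    hashlib.sha256).digest()


def new_session(master_key):
    """Start a borg run: fresh random session id, fresh session key."""
    session_id = os.urandom(24)  # assumed size
    return session_id, derive_session_key(master_key, session_id)


class NonceCounter:
    """Per-session counter (IV / nonce), always starting from 0."""

    def __init__(self):
        self.value = 0

    def next(self):
        nonce = self.value.to_bytes(12, "big")  # assumed 96 bit nonce
        self.value += 1
        return nonce
```

Since every run encrypts under its own session key, restarting the counter at 0 cannot repeat a (key, nonce) pair, so nothing about the counter has to be stored locally or in the repository. The session id would need to be stored with the ciphertext so a reader can derive the same key.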
new tar formats
Currently master branch only (backport to 1.2 possible):
--tar-format=PAX
ctime and atime support, all ts in ns resolution
could support more metadata, like xattrs, ACLs, ...,
but a lot of work to implement and test
--tar-format=BORG
Like PAX plus custom BORG.* PAX headers
for perfect round-tripping
of all borg supported fs item metadata.
Copy archive from repo1 to repo2:
borg export-tar ... repo1::A | borg import-tar ... repo2::A
borg 2.0?
release master as borg 1.3
- stay compatible
- write and test in-place complex upgrade code
- low space needs for upgrade(?)
release master as borg 2.0
- break compatibility, issue #6602
- but also maintain borg 1.2 for a while
- all is new: new repos, new server, new clients
- use export-tar/import-tar to transfer archives
- get rid of all legacy support, simplify code base
- no in-place upgrade, save developer time for progress
- less complex, fewer potential bugs, clean repos
Support the Borg
Contributions are welcome!
Code, documentation, review, testing, funding, ...
Just join us on GitHub and LiberaChat IRC #borgbackup.
Donations please via LiberaPay or BountySource:
https://www.borgbackup.org/support/fund.html
For more information:
borgbackup.org
Questions / Feedback?
- tw @ waldmann-edv . de
- Thomas J Waldmann @ twitter