(updates & plans, DiVOC online session)
Thomas Waldmann (@home, 2022-04-16)
Borg deduplicates based on chunks (not whole files).
buzhash chunker (content-defined chunking):
rolling hash computed over window
variable size chunks
CPU intensive, no sparse support
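A minimal sketch of content-defined chunking with a buzhash-style rolling hash. All parameters (WINDOW, MASK, MIN_SIZE, MAX_SIZE), the seeded table and the function name chunkify are illustrative assumptions, far smaller than anything a real backup tool would use, and this is not borg's actual chunker code. It only shows why cut points depend on content and not on absolute offsets.

```python
import random

# Illustrative parameters only (assumptions for this sketch).
WINDOW = 48                      # bytes covered by the rolling hash
MASK = (1 << 10) - 1             # cut when the low 10 bits are zero
MIN_SIZE, MAX_SIZE = 256, 8192   # bounds on the chunk size

_rng = random.Random(0)          # fixed seed: reproducible cut points
TABLE = [_rng.getrandbits(32) for _ in range(256)]


def _rotl32(x, n):
    """Rotate a 32-bit value left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def chunkify(data: bytes):
    """Yield variable-size chunks whose boundaries depend on content."""
    start = 0
    h = 0
    for i, b in enumerate(data):
        # Shift the hash and mix in the byte entering the window.
        h = _rotl32(h, 1) ^ TABLE[b]
        if i - start >= WINDOW:
            # Remove the byte that just left the window.
            h ^= _rotl32(TABLE[data[i - WINDOW]], WINDOW)
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            yield data[start:i + 1]
            start = i + 1
            h = 0
    if start < len(data):
        yield data[start:]
```

Because a cut decision only looks at the last WINDOW bytes, inserting a few bytes at the start of the input shifts the data but leaves most later chunks identical, which is what makes deduplication work. The price is a hash update for every input byte.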
fixed chunker (borg 1.2+, fixed size chunks):
cutting a block device into blocks
cutting a LV in LEs
cutting a (fixed record size) DB into records
almost 0 CPU load, sparse file support
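A minimal sketch of fixed-size chunking with simple sparse handling. The function name fixed_chunks, the 4096-byte default and the ("data", ...) / ("hole", ...) tuples are assumptions of this sketch. All-zero blocks are detected by a plain comparison here, which is not necessarily how borg finds holes.

```python
def fixed_chunks(data: bytes, block_size: int = 4096):
    """Cut data into fixed-size blocks; report all-zero blocks as holes."""
    zeros = bytes(block_size)
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        if block == zeros[:len(block)]:
            # Nothing worth storing: only remember the length of the hole.
            yield ("hole", len(block))
        else:
            yield ("data", block)
```

No hash is computed per byte; the cut points follow from the offsets alone. That is where the low CPU load comes from, and it fits inputs whose content does not shift (block devices, fixed-size records).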
A borg repo is a LOG.
(== stuff only gets appended at the end,
old stuff is never modified [only deleted])
A borg repo is a key/value store.
(key: chunk id = MAC(plaintext), value = ciphertext)
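A toy model of the two views above: an append-only log plus a key/value index on top of it. ToyRepo and fake_encrypt are hypothetical names; HMAC-SHA256 stands in for the MAC, and fake_encrypt is only a placeholder, NOT real encryption.

```python
import hashlib
import hmac


def fake_encrypt(plaintext: bytes) -> bytes:
    """Placeholder only (assumption of this sketch): NOT real encryption."""
    return b"ENC:" + plaintext


class ToyRepo:
    def __init__(self, mac_key: bytes):
        self.mac_key = mac_key
        self.log = []     # append-only: entries are never modified
        self.index = {}   # chunk id -> position of its PUT in the log

    def chunk_id(self, plaintext: bytes) -> bytes:
        # key = MAC(plaintext)
        return hmac.new(self.mac_key, plaintext, hashlib.sha256).digest()

    def put(self, plaintext: bytes) -> bytes:
        cid = self.chunk_id(plaintext)
        if cid not in self.index:
            # Unknown chunk: append it. Known chunk: deduplicated, no write.
            self.log.append(("PUT", cid, fake_encrypt(plaintext)))
            self.index[cid] = len(self.log) - 1
        return cid

    def get(self, cid: bytes) -> bytes:
        return self.log[self.index[cid]][2]

    def delete(self, cid: bytes) -> None:
        # Deleting also appends; the old PUT entry stays untouched.
        self.log.append(("DELETE", cid))
        del self.index[cid]
```

Storing the same plaintext twice yields the same id and a single log entry. A delete makes the log longer, not shorter; the space is only reclaimed later.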
Low-level repo operations:
borg < 1.2
implicit segment compaction within write commands
borg >= 1.2
repo writing commands do not compact any more,
there is an explicit "borg compact" command.
use it to free repository space by shuffling entries from non-compact segment files into compact segment files.
does not need crypto key, can be run on repo server.
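A rough sketch of what "shuffling entries from non-compact segment files into compact segment files" can look like. The data model (a dict of segment number to entry list, plus an index mapping each live key to its segment) and the function name compact are assumptions of this sketch; real compaction also has to deal with DELETE and commit entries, thresholds and crash safety.

```python
def compact(segments: dict, index: dict) -> None:
    """Move live entries out of non-compact segments, then drop those segments.

    segments: segment number -> list of (key, value) entries
    index:    key -> number of the segment holding the live copy
    """
    if not segments:
        return
    new_no = max(segments) + 1
    new_seg = []
    for no in sorted(segments):
        entries = segments[no]
        live = [(k, v) for k, v in entries if index.get(k) == no]
        if len(live) == len(entries):
            continue            # segment is already compact
        for k, v in live:
            new_seg.append((k, v))   # values are copied as opaque bytes
            index[k] = new_no
        del segments[no]             # this is what frees the space
    if new_seg:
        segments[new_no] = new_seg
```

The values are never interpreted, only copied. This matches the point that compaction does not need the crypto key and can run on the repo server.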
PUT log entry structure:
Big design issue
One cannot check the header only.
To check the crc32, one must read ALL: header+content
Slow if one is only interested in header values.
Correct size is important to seek to next entry.
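A sketch of the problem, assuming a PUT entry is laid out as crc32 | size | tag | key | data with the crc32 covering everything after the crc field. The field widths, the tag value and the function names are assumptions made for this illustration.

```python
import struct
import zlib

TAG_PUT = 0                          # assumed tag value
HEADER = struct.Struct("<IIB32s")    # crc32, size, tag, key (41 bytes)


def pack_put(key: bytes, data: bytes) -> bytes:
    size = HEADER.size + len(data)
    body = struct.pack("<IB32s", size, TAG_PUT, key) + data
    crc = zlib.crc32(body) & 0xFFFFFFFF   # one checksum over header + content
    return struct.pack("<I", crc) + body


def check_put(entry: bytes) -> bool:
    """Verify an entry; this needs the complete entry, not just the header."""
    (crc,) = struct.unpack_from("<I", entry)
    return zlib.crc32(entry[4:]) & 0xFFFFFFFF == crc
```

There is no way to write a check_header() here: until the content has been read and checksummed, the size field that tells us where the next entry starts is unverified.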
PUT2 log entry structure:
Can check the header without reading the content.
Better error detection by stronger and super fast xxh64.
crc32 covers header+digest, digest also covers header.
The (sometimes slow) CRC32 implementation is only used for a few bytes.
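A sketch of the PUT2 idea, assuming a layout of crc32 | size | tag | key | digest | data. The field order, widths, tag value and function names are assumptions of this sketch. Since xxh64 is not in the Python standard library, an 8-byte BLAKE2b digest is used as a stand-in; it is NOT xxh64.

```python
import hashlib
import struct
import zlib

TAG_PUT2 = 3                       # assumed tag value
HEADER_LEN = 4 + 4 + 1 + 32 + 8    # crc32, size, tag, key, digest = 49 bytes


def digest64(b: bytes) -> bytes:
    """Stand-in for xxh64 (assumption): any fast 64-bit digest would do."""
    return hashlib.blake2b(b, digest_size=8).digest()


def pack_put2(key: bytes, data: bytes) -> bytes:
    size = HEADER_LEN + len(data)
    header = struct.pack("<IB32s", size, TAG_PUT2, key)
    digest = digest64(header + data)                # digest: header + content
    crc = zlib.crc32(header + digest) & 0xFFFFFFFF  # crc32: header + digest
    return struct.pack("<I", crc) + header + digest + data


def check_header(first_bytes: bytes) -> bool:
    """Verify size, tag, key and digest without touching the content."""
    (crc,) = struct.unpack_from("<I", first_bytes)
    return zlib.crc32(first_bytes[4:HEADER_LEN]) & 0xFFFFFFFF == crc


def check_entry(entry: bytes) -> bool:
    """Full check: cheap header check first, then the digest over the content."""
    if not check_header(entry[:HEADER_LEN]):
        return False
    header, digest = entry[4:41], entry[41:HEADER_LEN]
    return digest64(header + entry[HEADER_LEN:]) == digest
```

check_header() only ever feeds 45 bytes into crc32, no matter how large the chunk is, so a scan over all headers can trust each size field and seek past the content.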
Currently master branch only (backport to 1.2 possible):
ctime and atime support, all timestamps in ns resolution
could support more metadata, like xattrs, ACLs, ...,
but a lot of work to implement and test
Like PAX plus custom BORG.* PAX headers
for perfect round-tripping
of all borg supported fs item metadata.
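A small sketch of round-tripping extra metadata through custom PAX headers with Python's tarfile module. The header key BORG.example.ctime_ns and the helper names are hypothetical; the real BORG.* keys are not given here.

```python
import io
import tarfile


def write_member(name: str, data: bytes, extra_headers: dict) -> bytes:
    """Write one file into an in-memory PAX tar, with extra PAX headers."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        info.pax_headers = dict(extra_headers)   # str keys, str values
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_headers(tar_bytes: bytes) -> dict:
    """Return the PAX headers of the first member."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        member = tar.next()
        return dict(member.pax_headers)


# Hypothetical key: a ctime with ns resolution, kept as a decimal string.
headers = {"BORG.example.ctime_ns": "1650103200123456789"}
tar_bytes = write_member("f.txt", b"hi", headers)
restored = read_headers(tar_bytes)
```

Other tar tools should ignore keys they do not know, so the archive stays a normal tar file, while a reader that knows the keys can restore the metadata without loss.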
Copy archive from repo1 to repo2:
borg export-tar ... repo1::A | borg import-tar ... repo2::A
release master as borg 1.3
release master as borg 2.0
Contributions are welcome!
Code, documentation, review, testing, funding, ...
Just join us on GitHub and Libera.Chat IRC #borgbackup.
Donations please via Liberapay or BountySource:
tw @ waldmann-edv . de
Thomas J Waldmann @ twitter