(updates & plans, DiVOC online session)
Thomas Waldmann (@home, 2022-04-16)
Borg deduplicates based on chunks (not: whole files).
buzhash chunker (content-defined chunking):
rolling hash computed over window
variable size chunks
CPU intensive, no sparse support
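Sketch of content-defined chunking with a buzhash-style rolling hash. This is a minimal illustration, not borg's chunker: the window size, size limits, cut mask and the random table are assumptions chosen for the example.

```python
import random

WINDOW = 48        # bytes covered by the rolling hash (assumed value)
MASK_BITS = 8      # cut when the low 8 bits of the hash are zero
MIN_SIZE = 64      # never cut before this many bytes (must be >= WINDOW)
MAX_SIZE = 4096    # always cut at this size

# Fixed pseudo-random table: one 32-bit value per possible byte value.
_rng = random.Random(0)
TABLE = [_rng.getrandbits(32) for _ in range(256)]


def _rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def chunk(data: bytes):
    """Split data into variable-size chunks at content-defined cut points."""
    chunks = []
    start = 0
    h = 0
    mask = (1 << MASK_BITS) - 1
    for i, byte in enumerate(data):
        pos = i - start                      # position inside the current chunk
        h = _rotl(h, 1) ^ TABLE[byte]        # roll the new byte in
        if pos >= WINDOW:                    # roll the oldest byte out
            h ^= _rotl(TABLE[data[i - WINDOW]], WINDOW)
        size = pos + 1
        if (size >= MIN_SIZE and (h & mask) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Because a cut decision only depends on the last WINDOW bytes, inserting a byte at the front of the input shifts the cut points along with the content, so most chunks (and thus most chunk ids) stay the same. The per-byte hash update is what makes this approach CPU intensive.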
fixed chunker (borg 1.2+, fixed size chunks):
cutting a block device into blocks
cutting an LV into LEs
cutting a (fixed record size) DB into records
almost 0 CPU load, sparse file support
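Sketch of a fixed-size chunker with simple hole detection. The function name and return shape are made up for this example, and it finds holes by comparing blocks against zeros; a real implementation may instead ask the filesystem where the holes are.

```python
import io


def fixed_chunks(f, block_size=4096):
    """Yield one item per block of the file object f.

    ('hole', length) for an all-zero block, ('data', bytes) otherwise.
    """
    zeros = bytes(block_size)
    while True:
        block = f.read(block_size)
        if not block:
            break
        if block == zeros[:len(block)]:
            yield ("hole", len(block))   # nothing to store, only the length
        else:
            yield ("data", block)
```

There is no hashing per byte here, only reads and one comparison per block, which is why the CPU load stays close to zero.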
A borg repo is a LOG.
(== stuff only gets appended at the end,
old stuff is never modified [only deleted])
A borg repo is a key/value store.
(key: chunk id = MAC(plaintext), value = ciphertext)
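Sketch of "log plus key/value store" as a toy in-memory model. The class and method names are hypothetical, HMAC-SHA256 is used as one possible MAC, and the encrypt argument is a placeholder supplied by the caller.

```python
import hashlib
import hmac


class LogStore:
    """Toy model: an append-only log with an index on top of it."""

    def __init__(self, id_key: bytes):
        self.id_key = id_key
        self.log = []     # entries are only ever appended: (tag, key, value)
        self.index = {}   # key -> position of the live PUT entry in the log

    def chunk_id(self, plaintext: bytes) -> bytes:
        # key = MAC(plaintext): same plaintext always gives the same id
        return hmac.new(self.id_key, plaintext, hashlib.sha256).digest()

    def put(self, plaintext: bytes, encrypt) -> bytes:
        key = self.chunk_id(plaintext)
        if key in self.index:
            return key                    # already stored: deduplicated
        self.log.append(("PUT", key, encrypt(plaintext)))
        self.index[key] = len(self.log) - 1
        return key

    def get(self, key: bytes) -> bytes:
        return self.log[self.index[key]][2]

    def delete(self, key: bytes) -> None:
        # Deleting appends a new entry; the old PUT is not modified.
        self.log.append(("DELETE", key, None))
        del self.index[key]
```

Storing the same chunk twice adds only one PUT entry, and a delete grows the log by one entry instead of rewriting old data. The space of the dead PUT is only reclaimed by a later compaction.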
Low-level repo operations:
Segment files
borg < 1.2
implicit segment compaction within write commands
borg 1.2+
repo writing commands do not compact any more,
there is an explicit "borg compact" command.
use it to free repository space: it moves the entries still in use from non-compact segment files into compact segment files.
does not need crypto key, can be run on repo server.
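Sketch of what segment compaction does, on a toy in-memory model. The data layout (dicts and tuples), the function name and the threshold are assumptions for this example, not borg's on-disk format or its actual heuristics.

```python
def compact(segments, index, threshold=0.9):
    """Move live entries out of sparse segments into one new segment.

    segments: dict segment_number -> list of (key, value) entries
    index:    dict key -> (segment_number, position) of the live entry
    A segment is left alone if at least `threshold` of its entries are live.
    """
    if not segments:
        return segments
    new_no = max(segments) + 1
    new_seg = []
    for seg_no in sorted(segments):
        entries = segments[seg_no]
        live = [(k, v) for pos, (k, v) in enumerate(entries)
                if index.get(k) == (seg_no, pos)]
        if len(live) >= threshold * len(entries):
            continue                      # compact enough, keep as is
        for k, v in live:
            index[k] = (new_no, len(new_seg))
            new_seg.append((k, v))        # value is copied as opaque bytes
        del segments[seg_no]              # this is what frees the space
    if new_seg:
        segments[new_no] = new_seg
    return segments
```

The values are moved without being looked at, which matches the point above: compaction works on keys and opaque ciphertext only, so no crypto key is needed.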
PUT log entry structure:
Big design issue
One cannot check the header alone.
To check the crc32, one must read ALL: header+content
Slow if one is only interested in header values.
A correct size is needed to seek to the next entry.
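Sketch of the PUT problem. The layout used here (crc32, size, tag, 32-byte key, then content, with the crc32 covering everything after itself) and the tag value are assumptions for illustration.

```python
import struct
import zlib

HEADER = struct.Struct("<IIB32s")   # crc32, size, tag, key (41 bytes)
TAG_PUT = 0                         # assumed tag value


def pack_put(key: bytes, data: bytes) -> bytes:
    size = HEADER.size + len(data)              # size of the whole entry
    rest = struct.pack("<IB32s", size, TAG_PUT, key)
    crc = zlib.crc32(rest + data) & 0xFFFFFFFF  # covers header AND content
    return struct.pack("<I", crc) + rest + data


def check_put(buf: bytes, offset: int = 0):
    """Return (ok, key, offset_of_next_entry)."""
    crc, size, tag, key = HEADER.unpack_from(buf, offset)
    entry = buf[offset:offset + size]           # must read the full content
    ok = (zlib.crc32(entry[4:]) & 0xFFFFFFFF) == crc
    return ok, key, offset + size
```

To trust the size field (and thus the offset of the next entry), check_put has to read and checksum the whole content, even if the caller only wants the key.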
PUT2 log entry structure:
Notable
Can check the header without reading the content.
Better error detection by stronger and super fast xxh64.
crc32 covers header+digest, digest also covers header.
The (sometimes slow) CRC32 implementation is only used for a few bytes.
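Sketch of the PUT2 idea. The field order, the tag value and the helper names are assumptions, and because xxh64 is not in the Python standard library, an 8-byte BLAKE2b digest is used here as a stand-in for it.

```python
import hashlib
import struct
import zlib

HEADER2 = struct.Struct("<IIB32s8s")  # crc32, size, tag, key, digest (49 bytes)
TAG_PUT2 = 3                          # assumed tag value


def _digest(data: bytes) -> bytes:
    # Stand-in for xxh64: any fast 64-bit digest shows the structure.
    return hashlib.blake2b(data, digest_size=8).digest()


def pack_put2(key: bytes, data: bytes) -> bytes:
    size = HEADER2.size + len(data)
    head = struct.pack("<IB32s", size, TAG_PUT2, key)
    digest = _digest(head + data)                    # covers header + content
    crc = zlib.crc32(head + digest) & 0xFFFFFFFF     # covers header + digest
    return struct.pack("<I", crc) + head + digest + data


def check_header(buf: bytes, offset: int = 0):
    """Validate the header alone; reads 49 bytes, never the content."""
    crc, size, tag, key, digest = HEADER2.unpack_from(buf, offset)
    covered = buf[offset + 4:offset + HEADER2.size]
    return (zlib.crc32(covered) & 0xFFFFFFFF) == crc, size


def check_content(buf: bytes, offset: int = 0) -> bool:
    """Validate the content against the digest stored in the header."""
    crc, size, tag, key, digest = HEADER2.unpack_from(buf, offset)
    head = buf[offset + 4:offset + 41]
    data = buf[offset + HEADER2.size:offset + size]
    return _digest(head + data) == digest
```

check_header only runs the CRC32 over 45 bytes, so a trusted size is available without touching the content; the large content is left to the faster digest.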
Currently master branch only (backport to 1.2 possible):
--tar-format=PAX
ctime and atime support, all ts in ns resolution
could support more metadata, like xattrs, ACLs, ...,
but a lot of work to implement and test
--tar-format=BORG
Like PAX plus custom BORG.* PAX headers
for perfect round-tripping
of all borg supported fs item metadata.
Copy archive from repo1 to repo2:
borg export-tar ... repo1::A | borg import-tar ... repo2::A
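Sketch of how PAX extended headers can carry extra metadata, using Python's tarfile module. The header name BORG.example.note and both function names are made up for this example; they are not the headers borg writes.

```python
import io
import tarfile


def make_pax_tar(name, payload, atime_ns, extra_headers):
    """Build a one-member PAX tar in memory with extended headers."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        # PAX timestamps are decimal seconds; keep all 9 fractional digits.
        headers = {"atime": f"{atime_ns // 10**9}.{atime_ns % 10**9:09d}"}
        headers.update(extra_headers)     # custom vendor-style keys
        info.pax_headers = headers
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_pax_headers(tar_bytes, name):
    """Return the extended headers stored for one member."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        return dict(tar.getmember(name).pax_headers)
```

Usage: read_pax_headers(make_pax_tar("file.txt", b"hi", 1650000000123456789, {"BORG.example.note": "demo"}), "file.txt") gives back both the nanosecond atime string and the custom key. Tools that do not know a custom key can ignore it, while a reader that does know it can restore the metadata.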
release master as borg 1.3
release master as borg 2.0
Contributions are welcome!
Code, documentation, review, testing, funding, ...
Just join us on GitHub and Libera.Chat IRC #borgbackup.
Donations please via LiberaPay or BountySource:
https://www.borgbackup.org/support/fund.html
tw @ waldmann-edv . de
Thomas J Waldmann @ twitter