Borg Backup

(updates & plans, DiVOC online session)

Thomas Waldmann (@home, 2022-04-16)

Borg Versions

  • Attic and Borg < 1.0: ancient and buggy,
    hopefully nobody uses them any more.

  • Borg 1.0: no longer supported.

  • Borg 1.1: supported and very stable.
    Final release pending, after that only critical fixes.

  • Borg 1.2: first release done, 1.2.1 coming soon.
    Supported, but still rather fresh.

  • Borg master branch (1.3? 2.0?):
    major changes, 1.3.0a1 alpha just released.
    Bleeding edge, for testers!

Borg Architecture

1.2+:  fixed chunker

Borg deduplicates based on chunks (not whole files).

buzhash chunker (content-defined chunking):

  • rolling hash computed over a window
  • variable-size chunks
  • CPU intensive, no sparse file support

fixed chunker (borg 1.2+, fixed-size chunks):

  • cutting a block device into blocks
  • cutting an LV into LEs
  • cutting a (fixed record size) DB into records
  • almost 0 CPU load, sparse file support
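
A minimal sketch of why the two chunkers deduplicate differently. It is a toy, not borg's code: the rolling hash is a simple polynomial hash rather than buzhash, and WINDOW, MASK, BASE, MOD and the 64 byte block size are made-up values. Real chunkers typically also enforce minimum and maximum chunk sizes, which is omitted here. Chunks are identified by plain SHA-256 only to compare them.

```python
import hashlib
import random

WINDOW = 16            # bytes covered by the rolling hash (toy value)
MASK = (1 << 6) - 1    # cut when the low 6 hash bits are zero
BASE = 257             # toy polynomial base
MOD = (1 << 31) - 1    # prime modulus for the toy polynomial hash


def fixed_chunks(data, block_size=64):
    """Cut data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def content_defined_chunks(data):
    """Cut wherever the hash of the last WINDOW bytes matches the mask."""
    chunks, start, h = [], 0, 0
    top = pow(BASE, WINDOW - 1, MOD)
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Slide the window: drop the contribution of the oldest byte.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        if i + 1 >= WINDOW and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def chunk_ids(chunks):
    """Set of chunk ids; equal chunks collapse into one id (deduplication)."""
    return {hashlib.sha256(c).digest() for c in chunks}


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(20000))
shifted = b"X" + original  # same content with one byte inserted at the front

# Chunks of the original that can NOT be reused for the shifted version.
lost_fixed = chunk_ids(fixed_chunks(original)) - chunk_ids(fixed_chunks(shifted))
lost_cdc = (chunk_ids(content_defined_chunks(original))
            - chunk_ids(content_defined_chunks(shifted)))
print(len(lost_fixed), len(lost_cdc))
```

With fixed blocks, the inserted byte shifts every block boundary, so nearly nothing deduplicates. With content-defined cut points, only the chunk around the insertion changes. That is why the fixed chunker fits block devices and fixed-record data, where content does not shift.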

borg repo concepts

A borg repo is a LOG.

(== stuff only gets appended at the end,
old stuff is never modified [only deleted])

A borg repo is a key/value store.

(key: chunk id = MAC(plaintext), value = ciphertext)

Low-level repo operations:

  • PUT (append a new key/value pair)
  • DELETE (register a delete for a previous put by key)
  • COMMIT (finish a transaction, state is valid now)

Segment files

  • contain a sequence of log entries created by repo ops
  • a non-compact segment contains deleted PUT entries
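
The log and key/value ideas above can be sketched in a few lines. This is a toy model under stated assumptions, not borg's repository code: the class `ToyRepo`, the in-memory list used as the log, and the HMAC-SHA256 used as the MAC for the chunk id are all illustrative choices, and the stored value is just a placeholder for ciphertext.

```python
import hashlib
import hmac

PUT, DELETE, COMMIT = "PUT", "DELETE", "COMMIT"


def chunk_id(id_key, plaintext):
    """key = MAC(plaintext); HMAC-SHA256 is an assumption for this sketch."""
    return hmac.new(id_key, plaintext, hashlib.sha256).digest()


class ToyRepo:
    """Append-only log of (tag, key, value) entries (hypothetical model)."""

    def __init__(self):
        self.log = []

    def put(self, key, value):
        self.log.append((PUT, key, value))

    def delete(self, key):
        # Old entries are never modified; a delete is just another entry.
        self.log.append((DELETE, key, None))

    def commit(self):
        self.log.append((COMMIT, None, None))

    def index(self):
        """Replay the log; only state up to the last COMMIT is valid."""
        committed, pending = {}, {}
        for tag, key, value in self.log:
            if tag == PUT:
                pending[key] = value
            elif tag == DELETE:
                pending.pop(key, None)
            else:  # COMMIT
                committed = dict(pending)
        return committed


repo = ToyRepo()
key = chunk_id(b"secret id key", b"some file content")
repo.put(key, b"<ciphertext placeholder>")
repo.commit()
repo.delete(key)            # not committed yet: the key is still visible
visible_before_commit = key in repo.index()
repo.commit()
visible_after_commit = key in repo.index()
```

Note that the deleted PUT entry is still physically present in the log after the second commit; this is what makes a segment non-compact.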

1.2+:  borg compact

borg < 1.2

Implicit segment compaction within write commands.

borg 1.2+

Repo-writing commands do not compact any more,
there is an explicit "borg compact" command.

Use it to free repository space by moving the still-needed entries from non-compact segment files into compact segment files.

It does not need the crypto key, so it can be run on the repo server.
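
A rough sketch of what compaction does, assuming a toy model where each segment is a list of (tag, key, value) entries. The function name `compact` and the data layout are hypothetical; borg's real segment handling is more involved. The point is that only keys and opaque values are touched, which is why no crypto key is needed.

```python
def compact(segments):
    """Keep compact segments as they are; move live PUTs out of the others."""
    # Pass 1: locate the live PUT entry (if any) for each key.
    live = {}  # key -> (segment number, entry number)
    for s, segment in enumerate(segments):
        for e, (tag, key, _value) in enumerate(segment):
            if tag == "PUT":
                live[key] = (s, e)
            elif tag == "DELETE":
                live.pop(key, None)
    live_locations = set(live.values())

    # Pass 2: a segment is compact if it holds only live PUTs and COMMITs.
    kept, moved = [], []
    for s, segment in enumerate(segments):
        is_compact = all(
            tag == "COMMIT" or (s, e) in live_locations
            for e, (tag, _key, _value) in enumerate(segment)
        )
        if is_compact:
            kept.append(segment)
        else:
            # Dead PUTs and DELETE entries are dropped, live PUTs move on.
            moved.extend(entry for e, entry in enumerate(segment)
                         if (s, e) in live_locations)
    if moved:
        kept.append(moved + [("COMMIT", None, None)])
    return kept


segments = [
    [("PUT", b"a", b"A"), ("PUT", b"b", b"B"), ("COMMIT", None, None)],
    [("PUT", b"c", b"C"), ("COMMIT", None, None)],
    [("DELETE", b"a", None), ("COMMIT", None, None)],
]
compacted = compact(segments)
```

Here the first and third segments are non-compact (a deleted PUT and the DELETE itself), so only the entry for `b"b"` survives from them; the second segment is left untouched.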

borg < 1.3:  PUT

PUT log entry structure:

  • crc32 = CRC32(header + content)
  • header: size of entry
  • header: tag (== PUT)
  • header: 256bit key (chunk id)
  • content: value (data)

 

Big design issue

One cannot check the header on its own.

To check the crc32, one must read ALL of the entry: header + content.

This is slow if one is only interested in header values.

A correct size is important for seeking to the next entry.

borg 1.3+:  PUT2

PUT2 log entry structure:

  • crc32 = CRC32(header + digest)
  • header: size of entry
  • header: tag
  • header: 256bit key (chunk id)
  • digest: xxh64 = XXH64(header + content)
  • content: value (data)

 

Notable

Can check the header without reading the content.

Better error detection by the stronger and super fast xxh64.

crc32 covers header + digest, the digest also covers the header.

The sometimes slow CRC32 implementation is only used for a few bytes.
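
The PUT2 layout can be sketched with the standard library. Several details here are assumptions for illustration only: the field widths and byte order, the tag value `TAG_PUT2`, and above all the digest, where an 8 byte BLAKE2b stands in for xxh64 because Python's standard library has no xxh64.

```python
import hashlib
import struct
import zlib

TAG_PUT2 = 3                        # hypothetical tag value
CRC = struct.Struct("<I")           # crc32
HEADER = struct.Struct("<IB32s")    # size of entry, tag, 256bit key
DIGEST_SIZE = 8                     # xxh64 is 64 bits wide
FIXED = CRC.size + HEADER.size + DIGEST_SIZE


def digest64(data):
    """Stand-in for XXH64 (assumption: 8 byte BLAKE2b, stdlib only)."""
    return hashlib.blake2b(data, digest_size=DIGEST_SIZE).digest()


def make_put2(key, value):
    header = HEADER.pack(FIXED + len(value), TAG_PUT2, key)
    digest = digest64(header + value)               # covers header + content
    crc = CRC.pack(zlib.crc32(header + digest))     # covers header + digest
    return crc + header + digest + value


def check_header(prefix):
    """Verify size/tag/key using only the first FIXED bytes of an entry."""
    (crc,) = CRC.unpack_from(prefix, 0)
    if zlib.crc32(prefix[CRC.size:FIXED]) != crc:
        raise ValueError("header or digest corrupted")
    return HEADER.unpack_from(prefix, CRC.size)     # (size, tag, key)


def check_content(entry):
    """Full check: recompute the digest over header + content."""
    header = entry[CRC.size:CRC.size + HEADER.size]
    digest = entry[CRC.size + HEADER.size:FIXED]
    return digest64(header + entry[FIXED:]) == digest


key = hashlib.sha256(b"chunk").digest()
entry = make_put2(key, b"x" * 1000)
size, tag, found_key = check_header(entry[:FIXED])  # content is not read
```

Since the size field is verified by the header check alone, a reader can seek from entry to entry without reading any content.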

Borg < 1.3 crypto

old crypto issues

  • potential nonce reuse and countermeasures:
    • AES-CTR mode with 1 AES key per repo
    • counter values (IV / nonce) must never be re-used
    • complex counter management needed
    • limited trust in repository
    • local counter knowledge can be lost (e.g. disk defect)
    • multiple clients need to trust repo
    • to avoid counter issues, repos must use different keys
    • no easy replication of encrypted chunks to other repo
  • self-made layering of
    • AES256-CTR + HMAC-SHA256
    • AES256-CTR + BLAKE2b
  • there are faster ready-to-use AEAD ciphers now

1.3+:  new crypto

new crypto features

  • Fixes potential nonce reuse issue:
    • random session id generated at start of a borg run
    • session key derived from session id and master key
    • counter (IV / Nonce) starts from 0 for each session
    • no counter management needed, no risk of reuse
  • OpenSSL >= 1.1.1 (including on OpenBSD), providing:
    • super fast AES256-OCB (with AES hw acceleration).
      patents first licensed to FOSS, now abandoned.
    • very fast CHACHA20-POLY1305 (pure sw implementation)
  • use AAD of AEAD cipher to protect header / chunkid
  • Argon2 KDF used for the borg key (was: pbkdf2)
  • undecided: maybe adopt BLAKE3 for the chunk id hash
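
A minimal sketch of the session key idea, covering key derivation and counter handling only (no actual encryption, as the standard library has no AEAD cipher). The derivation via HMAC-SHA256, the 24 byte session id, the 12 byte nonce and the names `derive_session_key` and `Session` are assumptions for this sketch, not borg's actual construction.

```python
import hashlib
import hmac
import os


def derive_session_key(master_key, session_id):
    """Assumed KDF for illustration: HMAC-SHA256 keyed with the master key."""
    return hmac.new(master_key, b"session key" + session_id,
                    hashlib.sha256).digest()


class Session:
    """One borg run: fresh random session id, counter starting from 0."""

    def __init__(self, master_key):
        self.session_id = os.urandom(24)   # stored next to the ciphertext
        self.key = derive_session_key(master_key, self.session_id)
        self.counter = 0

    def next_nonce(self):
        nonce = self.counter.to_bytes(12, "big")
        self.counter += 1
        return nonce


master_key = os.urandom(32)
run1, run2 = Session(master_key), Session(master_key)
nonce1, nonce2 = run1.next_nonce(), run2.next_nonce()

# A reader re-derives the session key from the stored session id.
reader_key = derive_session_key(master_key, run1.session_id)
```

Both runs use nonce 0, but under different session keys, so the (key, nonce) pair is never repeated and no counter state has to be persisted or trusted.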

new tar formats

Currently master branch only (backport to 1.2 possible):

--tar-format=PAX

ctime and atime support, all timestamps in ns resolution.

Could support more metadata, like xattrs, ACLs, ...,
but that is a lot of work to implement and test.

--tar-format=BORG

Like PAX plus custom BORG.* PAX headers
for perfect round-tripping
of all borg supported fs item metadata.
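
To illustrate how PAX extended headers can carry such metadata, here is a small sketch using Python's tarfile module (not borg itself). The header name `BORG.demo` and its value are made up for this example; the real BORG.* header names are not listed in these slides.

```python
import io
import tarfile


def write_member(buf, name, data, extra_headers):
    """Write one file into a PAX tar, attaching extended headers to it."""
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        info.pax_headers = dict(extra_headers)   # str -> str
        tar.addfile(info, io.BytesIO(data))


def read_headers(buf, name):
    """Read the extended headers of one member back."""
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return dict(tar.getmember(name).pax_headers)


extra = {
    "atime": "1650000000.123456789",   # ns resolution as a decimal string
    "ctime": "1650000001.000000042",
    "BORG.demo": "hypothetical custom value",
}
buf = io.BytesIO()
write_member(buf, "file.txt", b"hello", extra)
headers = read_headers(buf, "file.txt")
```

The classic tar header has no atime/ctime fields and only whole-second mtime; the extended headers are free-form key/value records, which is what makes vendor-prefixed keys possible.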

 

Copy archive from repo1 to repo2:

borg export-tar ... repo1::A | borg import-tar ... repo2::A

borg 2.0?

release master as borg 1.3

  • stay compatible
  • write and test complex in-place upgrade code
  • low space needs for upgrade(?)

 

release master as borg 2.0

  • break compatibility (issue #6602)
  • but also maintain borg 1.2 for a while
  • all is new: new repos, new server, new clients
  • use export-tar/import-tar to transfer archives
  • get rid of all legacy support, simplify code base
  • no in-place upgrade, save developer time for progress
  • less complex, fewer potential bugs, clean repos

Support the Borg

Contributions are welcome!

Code, documentation, review, testing, funding, ...

Just join us on GitHub and Libera.Chat IRC #borgbackup.

Donations please via Liberapay or BountySource:

https://www.borgbackup.org/support/fund.html

 

For more information:

borgbackup.org

Questions / Feedback?

  • tw @ waldmann-edv . de

  • Thomas J Waldmann @ twitter
