Borg Backup



(2.0 alpha, updates & plans)






Thomas Waldmann (@home, 2022-07)

Borg Versions

  • Borg 1.1:
    • supported and very stable.
    • final release 1.1.18, after that only critical fixes.
  • Borg 1.2:
    • first releases done: 1.2.0, 1.2.1
  • Borg 2.0:
    • borg2 branch, will eventually get into master
    • 2.0.0a2 alpha release is out.
    • breaking changes, bleeding edge for testers!

2.0 = breaking!

  • break compatibility (issue #6602)
  • do all breaking changes in one release
  • all is new: new repos, new server, new clients
  • cli syntax cleanup
  • get rid of all legacy
  • simplify code base
  • solve some fundamental issues (pending for 7 years)
  • get rid of troublemakers, stuff that blocks progress
  • no in-place repo upgrade:
    • saves developer time
    • less complex, fewer potential bugs, clean repos
  • but instead: new borg archive transfer command

CLI: repos + archives

  • no scp style repos any more:  user@host:path
    • no port possible
    • parser sometimes confused this with local path
  • ssh URL is better:  ssh://user@host:port/path
    • a port is possible here
    • easy to parse / disambiguate
  • archive name separate from repo, no "::" any more
    • borg -r REPO diff ARCH1 ARCH2
      • fixed number of archives: positional params
    • borg -r REPO delete -a ARCH_GLOB --first 3
      • some?:  -a 'crap-*'
      • want just one specific?:  -a only_this_one
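
Why the ssh:// form is easy to parse can be sketched with Python's standard URL splitter. This is only an illustration: parse_repo_location is a hypothetical helper, not borg's own location parser.

```python
from urllib.parse import urlsplit


def parse_repo_location(location):
    """Split an ssh:// repo URL into its parts (hypothetical helper, not borg code)."""
    parts = urlsplit(location)
    if parts.scheme != "ssh":
        # scp-style "user@host:path" has no scheme and ends up here
        raise ValueError(f"expected an ssh:// URL, got: {location!r}")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,  # None if the URL carries no port
        "path": parts.path,
    }


loc = parse_repo_location("ssh://user@host:2222/path/to/repo")
```

The scp-style form user@host:path carries no scheme, so a parser like this can reject it outright instead of guessing whether it is a local path.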

repo vs. arch cmds

  • commands work either on repo or on archive(s)
    • borg rcreate = "repo create"  (was: borg init)
    • borg create = "archive create"
    • borg rlist = "repo list"
    • borg list = "archive list"
    • borg delete = "archive delete"
    • borg rdelete = "repo delete"
  • exception: borg check syntax is unchanged
    • works on repo and archives by default
    • --repository-only
    • --archives-only

borg repo concepts

A borg repo is a LOG.

(== stuff only gets appended at the end,
old stuff is never modified [only deleted])

A borg repo is a key/value store.

(key: chunk id = MAC(plaintext), value = ciphertext)

Low-level repo operations:

  • PUT (append a new key/value pair)
  • DELETE (register a delete for a previous put by key)
  • COMMIT (finish a transaction, state is valid now)

Segment files

  • contain a sequence of log entries created by repo ops
  • a non-compact segment contains deleted PUT entries
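
The log idea can be shown with a toy in-memory model. ToyLogRepo and its methods are made up for this sketch; real borg writes the entries into segment files on disk.

```python
class ToyLogRepo:
    """Toy append-only log with PUT / DELETE / COMMIT (hypothetical, in-memory)."""

    def __init__(self):
        self.log = []        # entries are only ever appended
        self.committed = 0   # number of log entries covered by the last COMMIT

    def put(self, key, value):
        self.log.append(("PUT", key, value))

    def delete(self, key):
        # registers a delete; the earlier PUT entry stays in the log
        self.log.append(("DELETE", key, None))

    def commit(self):
        self.log.append(("COMMIT", None, None))
        self.committed = len(self.log)

    def index(self):
        """Replay the committed part of the log into a key -> value mapping."""
        state = {}
        for tag, key, value in self.log[: self.committed]:
            if tag == "PUT":
                state[key] = value
            elif tag == "DELETE":
                state.pop(key, None)
        return state


repo = ToyLogRepo()
repo.put(b"id1", b"ciphertext-1")
repo.put(b"id2", b"ciphertext-2")
repo.commit()
repo.delete(b"id1")      # not part of the valid state until the next COMMIT
before = repo.index()
repo.commit()
after = repo.index()
```

After the second commit the log still holds the PUT for id1; in the terms of the slide it is non-compact.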

borg < 2.0:  PUT

PUT log entry structure:

  • crc32 = CRC32(header + content)
  • header: size of entry
  • header: tag (== PUT)
  • header: 256bit key (chunk id)
  • content: value (data)


Big design issue

One cannot check the header on its own.

To check the crc32, one must read ALL: header+content

This is slow if one is only interested in header values.

A correct size is important for seeking to the next entry.

borg2:  PUT2

PUT2 log entry structure:

  • crc32 = CRC32(header + digest)
  • header: size of entry
  • header: tag
  • header: 256bit key (chunk id)
  • digest: xxh64 = XXH64(header + content)
  • content: value (data)



Can check the header without reading the content.

Better error detection by stronger and super fast xxh64.

crc32 covers header+digest, digest also covers header.

The sometimes-slow CRC32 implementation is only used for a few bytes.
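
A minimal sketch of the PUT2 layout and the header-only check. Assumptions: the field order and widths, the little-endian packing and the tag value are chosen for illustration, and a 64-bit BLAKE2b stands in for XXH64, which is not in the Python standard library.

```python
import hashlib
import struct
import zlib

TAG_PUT2 = 3                       # assumed tag value, for this sketch only
CRC = struct.Struct("<I")          # crc32
HEADER = struct.Struct("<IB32s")   # size of entry, tag, 256-bit key
DIGEST_SIZE = 8
FIXED = CRC.size + HEADER.size + DIGEST_SIZE


def digest64(data):
    # Stand-in for XXH64: BLAKE2b truncated to 64 bits.
    return hashlib.blake2b(data, digest_size=DIGEST_SIZE).digest()


def make_put2(key, content):
    header = HEADER.pack(FIXED + len(content), TAG_PUT2, key)
    digest = digest64(header + content)           # digest covers header + content
    crc = CRC.pack(zlib.crc32(header + digest))   # crc32 covers header + digest
    return crc + header + digest + content


def check_header(entry):
    """Verify header and digest via crc32 without reading the content."""
    (crc,) = CRC.unpack_from(entry, 0)
    if zlib.crc32(entry[CRC.size:FIXED]) != crc:
        raise ValueError("header or digest corrupted")
    return HEADER.unpack_from(entry, CRC.size)    # (size, tag, key)


def check_content(entry):
    """Full check: recompute the digest over header + content."""
    header = entry[CRC.size:CRC.size + HEADER.size]
    digest = entry[CRC.size + HEADER.size:FIXED]
    return digest64(header + entry[FIXED:]) == digest


key = hashlib.sha256(b"plaintext").digest()
entry = make_put2(key, b"some ciphertext")
size, tag, got_key = check_header(entry)
```

check_header only touches the first FIXED bytes, so a scan over a segment can trust the size field and seek to the next entry without reading the values.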

Borg < 2.0 crypto

old crypto issues

  • potential nonce reuse and countermeasures:
    • AES-CTR mode with 1 AES key per repo
    • counter values (IV / nonce) must never be re-used
    • complex counter management needed
    • limited trust in repository
    • local counter knowledge can be lost (e.g. disk defect)
    • multiple clients need to trust repo
    • to avoid counter issues, repos must use different keys
    • no easy replication of encrypted chunks to other repo
  • self-made layering of
    • AES256-CTR + HMAC-SHA256
    • AES256-CTR + BLAKE2b
  • there are faster ready-to-use AEAD ciphers now

borg2:  new crypto

new crypto features

  • Fixes potential nonce reuse issue:
    • random session id generated at start of a borg run
    • session key derived from session id and master key
    • counter (IV / Nonce) starts from 0 for each session
    • no counter management needed, no risk of reuse
  • OpenSSL >= 1.1.1 (including on OpenBSD), providing:
    • super fast AES256-OCB (with AES hw acceleration).
      patents first licensed to FOSS, now abandoned.
    • very fast CHACHA20-POLY1305 (pure sw implementation)
  • use AAD of AEAD cipher to protect header / chunkid
  • Argon2 KDF used for the borg key (was: pbkdf2)
  • undecided: maybe adopt BLAKE3 for the chunk id hash
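
The session key idea can be sketched as follows. Assumptions: HMAC-SHA256 stands in for the key derivation, and the 24-byte session id and 12-byte nonce are sizes picked for illustration; the Session class is hypothetical and does no encryption.

```python
import hashlib
import hmac
import os


class Session:
    """One borg run: fresh session id, derived key, counter from 0 (sketch)."""

    def __init__(self, master_key):
        self.session_id = os.urandom(24)   # random per run
        # Stand-in derivation: HMAC-SHA256 keyed with the master key.
        self.session_key = hmac.new(
            master_key, self.session_id, hashlib.sha256
        ).digest()
        self.counter = 0                   # always starts at 0

    def next_nonce(self):
        nonce = self.counter.to_bytes(12, "big")
        self.counter += 1
        return nonce


master_key = os.urandom(32)
s1 = Session(master_key)
s2 = Session(master_key)
first = s1.next_nonce()
second = s1.next_nonce()
```

Both sessions count from 0, but each pairs its counter with a different key, so no (key, nonce) pair repeats and nothing has to be remembered between runs.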


hardlinks

  • borg < 2.0 approach:
    • first hardlink archived as regular file, with chunkid list
    • second hardlink archived:
      • refers back to first one by-name
      • does not have own chunkid list
    • problematic partial extraction, messy code, special cases
  • borg 2.0 approach:
    • hardlinks archived like a normal item
    • regular files / HLs always have chunkid list (1st, 2nd, ...)
    • if st_nlink > 1:  item.hlid = H(st_dev, st_ino)
    • rule: hlid is same -> items point to same inode (are HLs).
    • symmetric:  1st HL archived the same way as 2nd, 3rd...
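
The hlid rule in a few lines. Assumptions: SHA-256 is used for H here, and hlid_for / group_hardlinks are hypothetical names; the tuples mimic the fields os.stat() would return.

```python
import hashlib
from collections import defaultdict


def hlid_for(st_dev, st_ino, st_nlink):
    """Return a hardlink id for items with more than one link, else None."""
    if st_nlink <= 1:
        return None
    # H is assumed to be SHA-256 over device and inode number in this sketch.
    return hashlib.sha256(f"{st_dev}:{st_ino}".encode()).digest()


def group_hardlinks(items):
    """items: iterable of (path, st_dev, st_ino, st_nlink) tuples."""
    groups = defaultdict(list)
    for path, dev, ino, nlink in items:
        hlid = hlid_for(dev, ino, nlink)
        if hlid is not None:
            groups[hlid].append(path)
    return dict(groups)


items = [
    ("a/file", 2049, 1001, 2),
    ("b/link", 2049, 1001, 2),   # same device + inode as a/file
    ("c/solo", 2049, 1002, 1),   # only one link: no hlid
]
groups = group_hardlinks(items)
```

Every item is handled the same way; whether a/file or b/link was archived first plays no role, which is what makes partial extraction simple.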


msgpack

  • msgpack old spec - type confusion:
    • did not differentiate between text and binary data
    • text could be encoded, but comes back as binary
    • if you get binary, it could have been binary or text
  • msgpack new spec - roundtripping done right:
    • text (str):
      • comes back as str
      • borg uses utf-8 with surrogate-escape handler
      • gets encoded/decoded automatically
    • binary (bytes):
      • comes back as bytes
      • gets stored "as is"
  • borg2 uses the new spec, borg < 2.0 uses old spec.
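
What the surrogate-escape handler buys can be shown with plain Python, without msgpack. The file name is a made-up example of bytes that are not valid UTF-8.

```python
raw_name = b"caf\xe9.txt"   # Latin-1 encoded name, not valid UTF-8

# Decoding never fails: the undecodable byte becomes a lone surrogate.
as_text = raw_name.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = as_text.encode("utf-8", errors="surrogateescape")
```

The name can be handled as str everywhere and still comes back byte-for-byte when it is written out again.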

new tar formats

These were initially intended for data migration.


ctime and atime support, all timestamps in ns resolution

could support more metadata, like xattrs, ACLs, ...,
but a lot of work to implement and test


Like PAX plus custom BORG.* PAX headers
for perfect round-tripping
of all borg supported fs item metadata.

Copy archive from repo1 to repo2:

borg export-tar ... repo1::A | borg import-tar ... repo2::A


Problem: no dedup, huge amount of data

borg2 transfer

Create a related new repo:

borg --repo NEWREPO rcreate --other-repo OLDREPO --encryption CIDH

Transfer archives:

borg --repo NEWREPO transfer --other-repo OLDREPO [--dry-run]


  • deduplication: transfer each chunk only once
  • no expensive re-compression / re-chunking
  • but: re-encryption is required (but fast!)
  • old chunks deduplicate with future chunks, requires:
    • related repo (key material)
    • CIDH = compat. ID hash
  • some (cheap) data conversions done on the fly:
    cleanups, type conversions, msgpack changes
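
The dedup part of the transfer can be modelled with dicts. This is a toy model under assumptions: transfer and reencrypt are hypothetical names, repos are plain chunk id -> data mappings, and chunk ids stay the same in the new repo because the repos are related.

```python
def transfer(old_chunks, old_archives, new_chunks, reencrypt):
    """Copy all chunks referenced by the archives, each chunk id only once."""
    transferred = 0
    for chunk_ids in old_archives.values():
        for chunk_id in chunk_ids:
            if chunk_id in new_chunks:
                continue   # already in the new repo: deduplicated
            # no re-chunking or re-compression, only re-encryption
            new_chunks[chunk_id] = reencrypt(old_chunks[chunk_id])
            transferred += 1
    return transferred


old_chunks = {"c1": b"old:aaa", "c2": b"old:bbb", "c3": b"old:ccc"}
old_archives = {
    "monday": ["c1", "c2"],
    "tuesday": ["c1", "c2", "c3"],   # shares c1 and c2 with monday
}
new_chunks = {}
count = transfer(
    old_chunks,
    old_archives,
    new_chunks,
    reencrypt=lambda data: data.replace(b"old:", b"new:"),   # stand-in
)
```

Five chunk references result in three transfers. A later backup into the new repo that produces c1 again would also deduplicate against it.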

release N+1 plans

  • in 2.0 we needed to keep some of the old stuff:
    • borg transfer needs to read old repos / archives
    • users need to transfer their archives to new repos
  • in N+1 (2.1?) we can remove:
    • AES-CTR mode, counter management code
    • old style keys (pbkdf2, Encrypt and MAC)
    • code for old repo index, old chunks index
    • PUT(1) repo code
    • zlib type bytes hack
    • bigint stuff (replaced by msgpack Timestamp)
    • msgpack-related "good that we know the type"
    • hardlink_master processing
    • old RPC protocols
    • borg transfer / item code that converts old to new

Support the Borg

Contributions are welcome!


Code, documentation, review, testing, funding, ...


Just join us on GitHub and LiberaChat IRC #borgbackup.



Donations please via LiberaPay or BountySource:


For more information:

Questions / Feedback?

  • tw @ waldmann-edv . de

  • Thomas J Waldmann @ twitter
