(diving deeper and FAQ)
Thomas Waldmann (@home, 2021-07)
Some insights into borg
architecture and data structures.
For more details, see our docs.
(borg 1.2)
previously: compact_segments()
Ideas:
log-like
(append, never modify in-place)
transactions
(completed on commit)
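A minimal sketch of the two ideas above, not borg's actual segment format. The entry tags and the `replay` helper are made up for illustration. Entries are only ever appended, and a replay honours only those entries that are followed by a COMMIT:

```python
def replay(log):
    """Rebuild the key -> value state from an append-only log.

    Entries are tuples: ("PUT", key, value), ("DELETE", key) or ("COMMIT",).
    Entries after the last COMMIT belong to an unfinished transaction
    and are ignored.
    """
    state = {}     # committed state
    pending = []   # entries of the currently open transaction
    for entry in log:
        if entry[0] == "COMMIT":
            # Transaction completed: apply everything it contained.
            for op in pending:
                if op[0] == "PUT":
                    state[op[1]] = op[2]
                elif op[0] == "DELETE":
                    state.pop(op[1], None)
            pending = []
        else:
            pending.append(entry)
    return state


log = []
log.append(("PUT", "a", b"1"))
log.append(("PUT", "b", b"2"))
log.append(("COMMIT",))
log.append(("DELETE", "a"))   # appended, but not committed yet
```

Until another COMMIT is appended, a replay still sees both keys: an interrupted transaction leaves the committed state untouched.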
Some stuff that comes up again and again.
More details about this are in our docs.
Also, check the github issue tracker
in case you run into a problem.
"One big or multiple smaller repos?"
Pro One
Pro Multiple
"rsync / rclone vs. borg to another place?"
"rsync / rclone"
"borg directly to multiple target repos"
Borg deduplicates based on chunks (not: whole files).
buzhash chunker (content-defined chunking):
rolling hash computed over window
window rolling over whole input file in 1 byte steps
if hash(window) & bitmask == 0: cut a chunk!
fixed chunker (borg 1.2+, fixed size chunks):
cutting a block device into blocks
cutting a LV in LEs
cutting a (fixed record size) DB into records
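A toy sketch of content-defined chunking with a buzhash-style rolling hash. This is not borg's implementation: the table, window size, mask and size limits below are small illustrative values, and all names are hypothetical.

```python
import random

WINDOW = 16            # toy window size (borg's default is 4095)
MASK = (1 << 6) - 1    # cut when the low 6 bits of the hash are zero

_rng = random.Random(0)
TABLE = [_rng.getrandbits(32) for _ in range(256)]  # one random word per byte value


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def chunkify(data, min_size=32, max_size=256):
    """Split data into chunks at content-defined cut points."""
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Roll the hash by one byte: rotate, then mix in the new byte ...
        h = rotl32(h, 1) ^ TABLE[byte]
        # ... and remove the byte that just left the window.
        if i - start >= WINDOW:
            h ^= rotl32(TABLE[data[i - WINDOW]], WINDOW)
        size = i - start + 1
        if (size >= min_size and (h & MASK) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])   # remainder becomes the last chunk
    return chunks
```

Because a cut depends only on the bytes inside the window, inserting a byte near the start of the input shifts the data but the cut points typically resynchronise, so most later chunks stay identical and deduplicate.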
borg create --chunker-params=PARAMS ...
buzhash,19,23,21,4095 (variable size, default)
min 2^19, max 2^23, target 2^21 bytes chunks
(produces chunks 0.5MiB <= target 2MiB <= 8MiB)
window size 4095 bytes
large chunks → low management overhead
buzhash,10,23,16,4095 (variable size)
produces chunks 1kiB <= target 64kiB <= 8MiB
small chunks → high management overhead
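The arithmetic behind these two parameter sets, as a small sketch. The helper names are made up, and the chunk count is only a rough estimate that assumes every chunk hits the target size:

```python
def describe_buzhash_params(params):
    """Turn e.g. 'buzhash,19,23,21,4095' into sizes in bytes."""
    algo, min_exp, max_exp, target_exp, window = params.split(",")
    if algo != "buzhash":
        raise ValueError("this sketch only handles buzhash params")
    return {
        "min_size": 2 ** int(min_exp),
        "max_size": 2 ** int(max_exp),
        "target_size": 2 ** int(target_exp),
        "window_size": int(window),
    }


def approx_chunk_count(total_bytes, target_size):
    """Rough number of chunks (and thus index entries) for a data set."""
    return total_bytes // target_size


TiB = 2 ** 40
default = describe_buzhash_params("buzhash,19,23,21,4095")
small = describe_buzhash_params("buzhash,10,23,16,4095")
print(approx_chunk_count(TiB, default["target_size"]))  # 524288
print(approx_chunk_count(TiB, small["target_size"]))    # 16777216
```

For 1 TiB of input, the small-chunk setting needs about 32 times as many chunks to manage as the default.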
borg create --chunker-params=PARAMS ...
fixed,4194304 (fixed chunk size)
fixed blocks of 4MiB size (e.g. LVM LEs)
fixed,65536,4096 (fixed size w/ header)
4kiB header followed by 64kiB blocks
Faster / way less CPU than buzhash, good if contents do not shift inside the input file (no insertions / deletions).
New in borg 1.2.
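A minimal sketch of fixed-size chunking with an optional header, mirroring the `fixed,65536,4096` example. The function name is hypothetical and this is not borg's code:

```python
def fixed_chunks(data, block_size, header_size=0):
    """Cut data into an optional header chunk plus fixed-size blocks."""
    chunks = []
    if header_size and data:
        chunks.append(data[:header_size])
    for offset in range(header_size, len(data), block_size):
        chunks.append(data[offset:offset + block_size])  # last block may be short
    return chunks
```

No hashing is needed to find the cut points, which is why this is cheap. An in-place change touches only the block it falls into, but an insertion would shift every following block.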
small chunks:
big chunks:
problems usually on unbalanced systems:
lots of data, little RAM - amplified by small chunks.
keep in mind: each file will be at least 1 chunk!
So, if you have a lot of small files, the typical chunk size will be smaller than the chunker target size.
chunks index (client): chunkid → (refcount, size, csize)
Chunk presence detection / reference counting /
garbage collection and statistics.
If lost, can be rebuilt from archives in repo.
repo index (server): chunkid → (segment, offset)
Find segment file and offset in there to read a chunk.
If lost, can be rebuilt from segment files.
Indexes implemented in C as in-memory hashtables.
Smaller chunks → more chunks → more memory usage!
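A sketch of how the chunks index drives deduplication, using a plain Python dict in place of the C hashtable. Plain SHA-256 and zlib are stand-ins chosen for this example; the `store` dict stands in for the repo.

```python
import hashlib
import zlib


def add_chunk(index, store, data):
    """Reference a chunk; store it only if it is not present yet.

    index: chunkid -> (refcount, size, csize)
    store: chunkid -> compressed bytes
    """
    chunk_id = hashlib.sha256(data).digest()
    if chunk_id in index:
        # Already present: only bump the reference count.
        refcount, size, csize = index[chunk_id]
        index[chunk_id] = (refcount + 1, size, csize)
    else:
        compressed = zlib.compress(data)
        store[chunk_id] = compressed
        index[chunk_id] = (1, len(data), len(compressed))
    return chunk_id
```

Every chunk costs one index entry, no matter how small it is, which is where the memory pressure from small chunks comes from.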
If the chunks index (client) gets out of sync
with the repo, it needs to get rebuilt.
Fast way / needs much space on client:
Use chunks.archive.d/* (cached per-archive chunks indexes) to avoid having to query remote repo.
Space requirement is O(#archives * #chunks).
Slow way / needs less space on client:
$ rm -rf chunks.archive.d ; touch chunks.archive.d
Fetches all archive metadata from remote repo
to rebuild master chunks index.
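Conceptually, either way ends with merging per-archive chunk indexes into one master index. A rough sketch, assuming each per-archive index maps chunkid → (refcount, size, csize); the function name is made up:

```python
def merge_indexes(per_archive_indexes):
    """Build a master chunks index by summing refcounts over all archives."""
    master = {}
    for archive_index in per_archive_indexes:
        for chunk_id, (refcount, size, csize) in archive_index.items():
            if chunk_id in master:
                master[chunk_id] = (master[chunk_id][0] + refcount, size, csize)
            else:
                master[chunk_id] = (refcount, size, csize)
    return master
```

The fast way reads these per-archive indexes from the local cache. The slow way has to derive them from archive metadata fetched from the repo.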
files cache (client): H(fullpath) → (size, ctime, inode, chunkids)
Processing a backup input file
stat(fullpath) and lookup H(fullpath) in files cache.
Miss → new or renamed file, read / chunk it and remember it in new cache entry.
Hit, but size, ctime, inode changed → file was changed, process like new file.
Hit and size, ctime, inode match → file is unchanged!
FAST: No need to read the file to add it to the archive, just use the cached chunkids!
Note: In any case, borg needs to read flags, xattrs, acls from the filesystem.
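The lookup logic above, sketched with a plain dict as the files cache. SHA-256 stands in for H, and both function names are hypothetical:

```python
import hashlib
import os


def _key_and_stat(path):
    """Return (H(fullpath), (size, ctime, inode)) for a file."""
    st = os.stat(path)
    key = hashlib.sha256(os.fsencode(os.path.abspath(path))).digest()
    return key, (st.st_size, st.st_ctime_ns, st.st_ino)


def cached_chunkids(files_cache, path):
    """Return the cached chunkids if the file looks unchanged, else None."""
    key, current = _key_and_stat(path)
    entry = files_cache.get(key)
    if entry is None:
        return None          # miss: new or renamed file
    if entry[:3] != current:
        return None          # hit, but size/ctime/inode changed
    return entry[3]          # hit and unchanged: no need to read the file


def remember(files_cache, path, chunkids):
    """Store a new cache entry after the file was read and chunked."""
    key, current = _key_and_stat(path)
    files_cache[key] = current + (chunkids,)
```

A `None` result means the file has to be read and chunked, followed by `remember()`.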
For fast backups, make sure that:
full paths stay stable (H(fullpath) is the cache lookup key).
ctime/mtime,inode,size match.
Tweaking via: --files-cache=ctime/mtime,inode,size
If you have a lot of files,
the files cache can grow rather large
(RAM and disk space).
env var BORG_FILES_CACHE_TTL [20]
files not seen for N backup runs are removed from the cache.
adjust to at least # of backup input data sets.
env var BORG_FILES_CACHE_SUFFIX [None]
use multiple files caches instead of a single one.
lower memory usage by keeping the files caches separate (e.g. per data set).
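A sketch of the TTL idea, assuming each cache entry carries an age counter that is reset when the file is seen and incremented otherwise. The exact bookkeeping in borg may differ, and the function name is made up.

```python
def age_files_cache(files_cache, seen_keys, ttl=20):
    """Return the cache as it would look after one backup run.

    files_cache: key -> (age, entry)
    seen_keys:   keys of the files encountered in this run
    """
    aged = {}
    for key, (age, entry) in files_cache.items():
        age = 0 if key in seen_keys else age + 1
        if age < ttl:
            aged[key] = (age, entry)   # still young enough, keep it
    return aged
```

If several data sets are backed up in turn into the same cache, a TTL smaller than the number of data sets would drop the entries of one set before its next backup comes around.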
File(s) with newest timestamp are not put into the FC.
A failure scenario:
- newest file changed at time T
- snapshot at time T (within ts granularity)
- file changed again at time T (within ts granularity)
- borg backs up the snapshot, fc knows file with ts T
- later borg does another backup, fs file has ts T
- borg would think file is unchanged, because
files cache file timestamp T == fs file ts T
Optimisation: touch /backupdata/dummyfile
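The rule and the dummy-file trick, sketched with made-up names and integer timestamps. Borg's real rule may differ in detail.

```python
def cacheable(timestamps):
    """Return the paths that are safe to put into the files cache.

    timestamps: path -> file timestamp
    Files carrying the newest timestamp are left out, because they might
    still change again within the same timestamp granularity.
    """
    if not timestamps:
        return set()
    newest = max(timestamps.values())
    return {path for path, ts in timestamps.items() if ts < newest}


files = {"a": 1, "b": 2, "c": 2}
print(sorted(cacheable(files)))        # ['a']

# A freshly touched dummy file takes over the "newest" role,
# so all real files become cacheable.
files["dummyfile"] = 3
print(sorted(cacheable(files)))        # ['a', 'b', 'c']
```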
Borg does a lot of checksumming,
and thus often detects issues before they would otherwise be noticed.
First, hw must work ok:
bad RAM (or CPU or mainboard): memtest86+
bad hdd / ssd: smartctl -t long / -a
replace any bad hardware
Then:
borg check [--repair] REPO
tw @ waldmann-edv . de
Thomas J Waldmann @ twitter