(diving deeper and FAQ)
Thomas Waldmann (@home, 2021-07)
Some insights into borg
architecture and data structures.
For more details, see our docs.
(borg 1.2)
previously: compact_segments()
(append, never modify in-place)
(completed on commit)
Some stuff that comes up again and again.
More details about this are in our docs.
Also, check the github issue tracker
in case you run into a problem.
"One big or multiple smaller repos?"
Pro One
Pro Multiple
"rsync / rclone vs. borg to another place?"
"rsync / rclone"
"borg directly to multiple target repos"
Borg deduplicates based on chunks (not: whole files).
buzhash chunker (content-defined chunking):
rolling hash computed over window
window rolling over whole input file in 1 byte steps
if hash(window) & bitmask == 0: cut a chunk!
fixed chunker (borg 1.2+, fixed size chunks):
cutting a block device into blocks
cutting a LV in LEs
cutting a (fixed record size) DB into records
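A minimal sketch of content-defined chunking as outlined above. It is not borg's actual chunker: the table seed, the tiny window and size parameters, and the names rotl32, make_table and cdc_chunks are all assumptions chosen to keep the demo small. It only illustrates the idea that cut points depend on the bytes inside the rolling window, not on absolute file offsets.

```python
import random

MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32

def make_table(seed=0):
    """256 pseudo-random 32-bit values, one per byte value (illustrative seed)."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(256)]

def cdc_chunks(data, window=16, mask_bits=6, min_size=32, max_size=1024, table=None):
    """Cut where the rolling hash of the last `window` bytes has its
    low `mask_bits` bits all zero, honouring min and max chunk size."""
    assert window <= min_size <= max_size
    table = table or make_table()
    mask = (1 << mask_bits) - 1
    chunks, start = [], 0
    while start < len(data):
        end = min(start + max_size, len(data))
        cut = end                  # fallback: forced cut at max size or end of input
        pos = start + min_size     # earliest allowed cut position
        if pos < end:
            # Hash of the window that ends just before `pos`.
            h = 0
            for b in data[pos - window:pos]:
                h = rotl32(h, 1) ^ table[b]
            while True:
                if h & mask == 0:  # "if hash(window) & bitmask == 0: cut a chunk!"
                    cut = pos
                    break
                if pos >= end:
                    break
                # Roll by one byte: drop the oldest byte, add the next one.
                h = (rotl32(h, 1)
                     ^ rotl32(table[data[pos - window]], window)
                     ^ table[data[pos]])
                pos += 1
        chunks.append(data[start:cut])
        start = cut
    return chunks
```

Inserting a byte at the start of the input shifts every offset, but because the cut decision only looks at window contents, the cut points usually line up again after the first chunk or two, so most chunks are still deduplicated.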
borg create --chunker-params=PARAMS ...
buzhash,19,23,21,4095 (variable size, default)
min 2^19, max 2^23, target 2^21 bytes chunks
(produces chunks 0.5MiB <= target 2MiB <= 8MiB)
window size 4095 bytes
large chunks → low management overhead
buzhash,10,23,16,4095 (variable size)
produces chunks 1kiB <= target 64kiB <= 8MiB
small chunks → high management overhead
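The arithmetic behind these parameter strings can be sketched as follows. The helper names parse_buzhash_params and estimate_chunk_count are hypothetical, and the chunk count is only a rough estimate (total size divided by target size); it ignores the effect of small files and of the min / max limits.

```python
def parse_buzhash_params(params):
    """Turn 'buzhash,MIN_EXP,MAX_EXP,MASK_BITS,WINDOW' into sizes in bytes."""
    algo, min_exp, max_exp, mask_bits, window = params.split(",")
    if algo != "buzhash":
        raise ValueError("expected a buzhash parameter string")
    return {
        "min_size": 2 ** int(min_exp),
        "max_size": 2 ** int(max_exp),
        "target_size": 2 ** int(mask_bits),
        "window": int(window),
    }

def estimate_chunk_count(total_bytes, params):
    """Rough estimate: one chunk per target-size worth of input."""
    return total_bytes // parse_buzhash_params(params)["target_size"]

TiB = 2 ** 40
for p in ("buzhash,19,23,21,4095", "buzhash,10,23,16,4095"):
    print(p, parse_buzhash_params(p), estimate_chunk_count(TiB, p))
```

For 1 TiB of input this gives roughly 2^19 chunks with the default parameters and roughly 2^24 chunks with the 64kiB target, i.e. 32 times as many chunks to manage.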
borg create --chunker-params=PARAMS ...
fixed,4194304 (fixed chunk size)
fixed blocks of 4MiB size (e.g. LVM LEs)
fixed,65536,4096 (fixed size w/ header)
4kiB header followed by 64kiB blocks
Faster / way less CPU than buzhash, good if contents do not shift inside the input file (no insertions / deletions).
New in borg 1.2.
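A small sketch of what fixed-size chunking with an optional header means. The function fixed_chunks is a hypothetical stand-in, not borg's implementation; it mirrors the "fixed,BLOCKSIZE,HEADERSIZE" idea on an in-memory bytes object.

```python
def fixed_chunks(data, block_size, header_size=0):
    """Split data into an optional header chunk followed by fixed-size blocks.
    The last block may be shorter than block_size."""
    chunks = []
    if header_size and data:
        chunks.append(data[:header_size])
    for offset in range(header_size, len(data), block_size):
        chunks.append(data[offset:offset + block_size])
    return chunks

# 4kiB header followed by 64kiB blocks, like "fixed,65536,4096".
image = bytes(4096) + b"a" * 65536 + b"b" * 65536 + b"c" * 100
before = fixed_chunks(image, 65536, 4096)

# Overwriting a byte in place only changes the chunk containing it.
patched = bytearray(image)
patched[4096 + 70000] = ord("X")
after_overwrite = fixed_chunks(bytes(patched), 65536, 4096)

# Inserting a byte shifts all following data, so all later chunks change.
after_insert = fixed_chunks(image[:4096] + b"X" + image[4096:], 65536, 4096)
```

This is why the fixed chunker suits block devices and fixed-record files, where contents are overwritten in place, and not inputs where insertions or deletions shift the data.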
small chunks:
big chunks:
problems usually on unbalanced systems:
lots of data, little RAM - amplified by small chunks.
keep in mind: each file will be at least 1 chunk!
So, if you have a lot of small files, the typical chunk size will be smaller than the chunker target size.
chunks index (client): chunkid → (refcount, size, csize)
Chunk presence detection / reference counting /
garbage collection and statistics.
If lost, can be rebuilt from archives in repo.
repo index (server): chunkid → (segment, offset)
Find segment file and offset in there to read a chunk.
If lost, can be rebuilt from segment files.
Indexes implemented in C as in-memory hashtables.
Smaller chunks → more chunks → more memory usage!
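To make the two mappings concrete, here is a toy model using Python dicts. Everything in it is an assumption for illustration: plain SHA-256 as the chunk id, zlib as the compressor, a bytearray with a 4-byte length prefix as the "segment file", and the function names store_chunk and read_chunk. The real indexes are C hashtables and the real segment format differs.

```python
import hashlib
import zlib

def store_chunk(data, chunks_index, repo_index, segment, segment_no=0):
    """Store a chunk once; later stores of the same data only bump the refcount."""
    chunk_id = hashlib.sha256(data).digest()
    if chunk_id in chunks_index:
        # Chunk presence detection: already in the repo, just reference it.
        refcount, size, csize = chunks_index[chunk_id]
        chunks_index[chunk_id] = (refcount + 1, size, csize)
        return chunk_id
    compressed = zlib.compress(data)
    # Repo index: chunkid -> (segment, offset). Append only, never modify in place.
    repo_index[chunk_id] = (segment_no, len(segment))
    segment.extend(len(compressed).to_bytes(4, "little"))
    segment.extend(compressed)
    # Chunks index: chunkid -> (refcount, size, csize).
    chunks_index[chunk_id] = (1, len(data), len(compressed))
    return chunk_id

def read_chunk(chunk_id, repo_index, segments):
    """Look up segment and offset, then read and decompress the chunk."""
    segment_no, offset = repo_index[chunk_id]
    segment = segments[segment_no]
    length = int.from_bytes(segment[offset:offset + 4], "little")
    return zlib.decompress(bytes(segment[offset + 4:offset + 4 + length]))
```

Both dicts hold one entry per unique chunk, so halving the chunk size roughly doubles the number of entries and the memory they need.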
If the chunks index (client) gets out of sync
with the repo, it needs to get rebuilt.
Fast way / needs much space on client:
Use chunks.archive.d/* (cached per-archive chunks indexes) to avoid having to query remote repo.
Space requirement is O(#archives * #chunks).
Slow way / needs less space on client:
$ rm -rf chunks.archive.d ; touch chunks.archive.d
Fetches all archive metadata from remote repo
to rebuild master chunks index.
H(fullpath) → (size, ctime, inode, chunkids)
Processing a backup input file
stat(fullpath) and lookup H(fullpath) in files cache.
Miss → new or renamed file, read / chunk it and remember it in new cache entry.
Hit, but size, ctime, inode changed → file was changed, process like new file.
Hit and size, ctime, inode match → file is unchanged!
FAST: No need to read the file to add it to the archive, just use the cached chunkids!
Note: In any case, borg needs to read flags, xattrs, acls from the filesystem.
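The miss / changed / unchanged decision above can be sketched like this. It is a simplified model, not borg's code: the cache is a plain dict, H is assumed to be SHA-256 over the encoded path, and read_and_chunk is a hypothetical callback that reads the file and returns its chunk ids.

```python
import hashlib
import os

def path_key(fullpath):
    """H(fullpath): the files cache lookup key (SHA-256 is an assumption here)."""
    return hashlib.sha256(os.fsencode(fullpath)).digest()

def process_file(fullpath, files_cache, read_and_chunk):
    """Return (chunk_ids, status), status being 'new', 'changed' or 'unchanged'."""
    st = os.stat(fullpath)
    current = (st.st_size, st.st_ctime_ns, st.st_ino)
    key = path_key(fullpath)
    entry = files_cache.get(key)
    if entry is not None and entry[:3] == current:
        # Hit and size, ctime, inode match: reuse cached chunk ids, no file read.
        return entry[3], "unchanged"
    # Miss, or hit with different metadata: read and chunk the file,
    # then remember the result in a new cache entry.
    chunk_ids = read_and_chunk(fullpath)
    files_cache[key] = current + (chunk_ids,)
    return chunk_ids, ("new" if entry is None else "changed")
```

A renamed file produces a different key and therefore a miss, which is why it is read and chunked again even though its chunks will all deduplicate.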
For fast backups, make sure that:
paths are stable: H(fullpath) is the cache lookup key.
file metadata is stable: ctime/mtime, inode, size must match.
Tweaking via: --files-cache=ctime/mtime,inode,size
If you have a lot of files,
the files cache can grow rather large
(RAM and disk space).
files not seen for N times are removed from cache.
adjust to at least # of backup input data sets.
use multiple files caches instead of a single one.
lower memory usage by keeping the files caches separate (e.g. per data set).
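Why N should be at least the number of backup input data sets when they share one files cache can be seen in a toy simulation. The ageing rule below (age reset to 0 when seen, incremented otherwise, entry dropped once age reaches ttl) is a simplified assumption, and finish_backup / hits_in_rotation are hypothetical names.

```python
def finish_backup(cache, seen_paths, ttl):
    """cache maps path -> age (consecutive backups in which the path was not seen)."""
    updated = {}
    for path, age in cache.items():
        age = 0 if path in seen_paths else age + 1
        if age < ttl:              # entries not seen for ttl backups are dropped
            updated[path] = age
    for path in seen_paths:
        updated.setdefault(path, 0)
    return updated

def hits_in_rotation(data_sets, ttl, rounds=3):
    """Back up the data sets in rotation with one shared cache; count cache hits."""
    cache, hits = {}, 0
    for _ in range(rounds):
        for paths in data_sets:
            hits += sum(1 for p in paths if p in cache)
            cache = finish_backup(cache, set(paths), ttl)
    return hits

data_sets = [["/a/1", "/a/2"], ["/b/1"], ["/c/1"]]
print(hits_in_rotation(data_sets, ttl=2))  # entries expire before their set comes round again
print(hits_in_rotation(data_sets, ttl=3))  # entries survive a full rotation
```

With three data sets and ttl=2 every entry is evicted before its data set is backed up again, so each backup re-reads all files. Separate files caches per data set avoid the problem without raising N.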
File(s) with the newest timestamp are not put into the files cache (FC).
A failure scenario:
- newest file changed at time T
- snapshot at time T (within ts granularity)
- file changed again at time T (within ts granularity)
- borg backs up the snapshot, fc knows file with ts T
- later borg does another backup, fs file has ts T
- borg would think file is unchanged, because
files cache file timestamp T == fs file ts T
Optimisation: touch /backupdata/dummyfile
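The rule and the dummy-file trick can be sketched as a filter applied when the files cache is saved. This is a simplification under stated assumptions: timestamps are plain integers, the cache is a dict from path to timestamp, and entries_to_cache is a hypothetical name.

```python
def entries_to_cache(seen):
    """seen maps path -> timestamp observed during this backup.
    Files carrying the newest timestamp are left out: they might still change
    within the same timestamp granularity without the timestamp moving."""
    if not seen:
        return {}
    newest = max(seen.values())
    return {path: ts for path, ts in seen.items() if ts < newest}

# Without a dummy file, the most recently changed real files are not cached.
plain = entries_to_cache({"/backupdata/a": 1, "/backupdata/b": 5, "/backupdata/c": 5})

# After touching a dummy file, it alone carries the newest timestamp,
# so all real files make it into the cache.
with_dummy = entries_to_cache({"/backupdata/a": 1, "/backupdata/b": 5,
                               "/backupdata/c": 5, "/backupdata/dummyfile": 9})
```

The cost of the rule is that the newest file(s) are read and chunked again on the next backup; touching the dummy file before the backup moves that cost onto a file that does not matter.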
Borg does a lot of checksumming,
so it often detects issues before they would otherwise be noticed.
First, hw must work ok:
bad RAM (or CPU or mainboard): memtest86+
bad hdd / ssd: smartctl -t long / -a
replace any bad hardware
borg check [--repair] REPO
tw @ waldmann-edv . de
Thomas J Waldmann @ twitter