Compression Carcinized
implementing zlib-rs
Folkert de Vries, RustNL 2024
🦀
🦞
carcinization
a form of convergent evolution in which non-crab crustaceans evolve a crab-like body plan
➡️
zlib: a library you've used today
> objdump -T /usr/lib/x86_64-linux-gnu/libz.so | grep "compress"
0000000000010370 g DF .text 0000000000000022 ZLIB_1.2.0 compressBound
0000000000010360 g DF .text 000000000000000f Base compress
0000000000010220 g DF .text 000000000000013d Base compress2
0000000000010560 g DF .text 000000000000001c Base uncompress
00000000000103a0 g DF .text 00000000000001c0 ZLIB_1.2.9 uncompress2
pub unsafe extern "C" fn compress2(
    dest: *mut Bytef,
    destLen: *mut c_ulong,
    source: *const Bytef,
    sourceLen: c_ulong,
    level: c_int,
) -> c_int
project goals
drop-in replacement for the zlib dynamic library
high-performance implementation for rust
📦
📚
Topics
- crash course compression
- the zlib ecosystem
- ponderings on porting
crash course compression
Why use many byte
when few do trick?
Why Compress?
cost
speed
🚀
💰
Lossless Compression
assert_eq!(decompress(compress(data)), data)
Recognizing patterns
foobarfoo
⬇️
foobar<offset = 6, len = 3>
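The back-reference idea on this slide can be sketched as a tiny decoder. This is an illustration only: the `Token` type and the `decode` function are hypothetical names, not part of zlib or zlib-rs, and real DEFLATE streams encode these tokens in a bit-packed, Huffman-coded form.

```rust
/// Hypothetical token type: either a raw byte or a reference to earlier output.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Literal(u8),
    /// Copy `len` bytes starting `offset` bytes back from the end of the output.
    Match { offset: usize, len: usize },
}

/// Expand tokens back into bytes. Returns `None` if a match points
/// before the start of the output (or has offset 0).
fn decode(tokens: &[Token]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for &token in tokens {
        match token {
            Token::Literal(byte) => out.push(byte),
            Token::Match { offset, len } => {
                if offset == 0 || offset > out.len() {
                    return None;
                }
                let start = out.len() - offset;
                // Copy byte by byte so a match may overlap the bytes it produces.
                for i in 0..len {
                    let byte = out[start + i];
                    out.push(byte);
                }
            }
        }
    }
    Some(out)
}
```

Feeding it the six literals of `foobar` followed by `Match { offset: 6, len: 3 }` reproduces `foobarfoo`, which is the round trip the `assert_eq!` on the previous slide asks for.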
Finding patterns
3.14159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223176
goal: find the (longest) <offset,len> insertions
Finding patterns
s e r i e u s p r o
⬆️
Finding patterns
The window size determines how far back the offset can go
s e r i e u s p r o
⬆️
Finding patterns
The compression level determines how hard we try to find the longest match
f o o ... f o o o f o o o
⬆️
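How window size and compression level interact can be sketched with a naive search. This is an assumption-laden illustration, not zlib's algorithm: `longest_match`, `window_size` and `max_attempts` are hypothetical names, and a real implementation follows hash chains instead of scanning every earlier position.

```rust
/// Shortest match worth encoding (DEFLATE uses 3).
const MIN_MATCH: usize = 3;

/// Naive search for the longest earlier occurrence of the bytes at `pos`.
/// `window_size` limits how far back we look; `max_attempts` (standing in
/// for the compression level) limits how many candidates we try.
/// Requires `pos <= data.len()`. Returns `(offset, len)` of the best match.
fn longest_match(
    data: &[u8],
    pos: usize,
    window_size: usize,
    max_attempts: usize,
) -> Option<(usize, usize)> {
    let lowest = pos.saturating_sub(window_size);
    let mut best: Option<(usize, usize)> = None;
    let mut attempts = 0;

    // Walk backwards so that nearer candidates are tried first.
    for start in (lowest..pos).rev() {
        if attempts == max_attempts {
            break;
        }
        let len = data[start..]
            .iter()
            .zip(&data[pos..])
            .take_while(|(a, b)| a == b)
            .count();
        if len < MIN_MATCH {
            continue; // too short to count as a candidate
        }
        attempts += 1;
        if best.map_or(true, |(_, best_len)| len > best_len) {
            best = Some((pos - start, len));
        }
    }
    best
}
```

For the input `fooo--foo-fooo` with the cursor on the last `f` (position 10), a single attempt settles for the nearby three-byte `foo`, while a larger attempt budget finds the four-byte `fooo` at the start. Shrinking the window to 6 bytes hides that longer match again.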
Finding patterns
f o o b a r f o o ...
⬆️
Finding patterns
f o o b a r f o o ...
⬆️
"foo" -> { 0 }
Finding patterns
f o o b a r f o o ...
⬆️
"foo" -> { 0 }
"oob" -> { 1 }
Finding patterns
f o o b a r f o o ...
⬆️
"foo" -> { 0 }
"oob" -> { 1 }
"oba" -> { 2 }
"bar" -> { 3 }
"arf" -> { 4 }
"rfo" -> { 5 }
Finding patterns
very effective for web data, even at low compression levels
🌐
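The table of three-byte sequences built up in the "Finding patterns" slides can be sketched with a `HashMap`. This is a simplification for illustration: `index_trigrams` and its `upto` parameter are hypothetical, and zlib-style implementations typically hash into fixed-size arrays instead of using a general-purpose map.

```rust
use std::collections::HashMap;

/// Map every 3-byte sequence that starts before position `upto`
/// to the list of positions where it starts.
fn index_trigrams(data: &[u8], upto: usize) -> HashMap<[u8; 3], Vec<usize>> {
    let mut table: HashMap<[u8; 3], Vec<usize>> = HashMap::new();
    for (pos, window) in data.windows(3).enumerate().take(upto) {
        let key = [window[0], window[1], window[2]];
        table.entry(key).or_default().push(pos);
    }
    table
}
```

With the cursor at position 6 of `foobarfoo`, the table holds the six entries shown on the slide, and looking up the next three bytes (`foo`) yields position 0 as a match candidate without scanning the whole window.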
Streaming
zlib can stream compression and decompression
🏞️
zlib & rust
what are we even implementing
zlib-adler: the OG
🏛️
goal: stability
still supports 16-bit systems
does not use modern hardware well
zlib-ng: the next generation
🚀
goal: performance
removes legacy, but API-compatible
uses SIMD to speed up the algorithm
why rust
🎯
reduced surface area
why rust
Any sufficiently complicated C project contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of cargo
flate2
a nice rust API for zlib
used in cargo
miniz-oxide: better safe than sorry
🛡️
goal: safety
a safe (but slow) rust implementation
does not cover the full zlib API
zlib-rs: a safer zlib
⚙️
goal: safety & performance
faster through the use of SIMD
implements the full zlib API
unsafe sandwich
unsafe C API
unsafe SIMD
(mostly) safe business logic
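The outer layer of the sandwich can be sketched with the Adler-32 checksum that zlib uses. This is a minimal illustration under stated assumptions: `sketch_adler32` is a hypothetical name and not the zlib-rs API, the export attribute is left out, and the SIMD layer is omitted entirely.

```rust
use std::slice;

/// Largest prime below 2^16, the modulus used by Adler-32.
const MOD_ADLER: u32 = 65_521;

/// Safe business logic: works on a slice, no raw pointers involved.
fn adler32(data: &[u8]) -> u32 {
    let (mut a, mut b) = (1u32, 0u32);
    for &byte in data {
        a = (a + u32::from(byte)) % MOD_ADLER;
        b = (b + a) % MOD_ADLER;
    }
    (b << 16) | a
}

/// Hypothetical C-facing entry point: the only place that touches raw pointers.
///
/// # Safety
/// `buf` must be null or point to `len` readable bytes.
pub unsafe extern "C" fn sketch_adler32(buf: *const u8, len: usize) -> u32 {
    if buf.is_null() {
        // Treat a null pointer as empty input.
        return adler32(&[]);
    }
    // SAFETY: the caller guarantees `buf` points to `len` readable bytes.
    let data = unsafe { slice::from_raw_parts(buf, len) };
    adler32(data)
}
```

The unsafe wrapper only validates the pointer and builds a slice; everything after that point is ordinary safe Rust that can be tested and fuzzed without going through the C interface.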
compression speed
progress
[dependencies]
flate2 = { version = "1.0.29", default-features = false, features = ["zlib-rs"] }
early days, but give it a go!
a crab-like body plan
🦀
🌊
➡️
from C to rust
spectrum of porting
implementation
rewrite
🏭
🌳
implementation
- e.g. rustls, ntpd-rs
- reinvent the wheel
- innovate on architecture
- high risk, high reward
Rewrite
- e.g. rav1d
- use existing knowledge
- inherits architecture
- works on day 1
spectrum of porting
implementation
rewrite
🏭
🌳
// ...
i = 0 as libc::c_int;
while i < nblock {
    ftab[*eclass8.offset(i as isize) as usize] += 1;
    ftab[*eclass8.offset(i as isize) as usize];
    i += 1;
    i;
}
// ...
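For contrast, a hand-cleaned version of that machine-translated loop might look like the sketch below. The types are assumptions, since the slide does not show the declarations: `eclass8` is taken to be a byte slice (replacing the raw pointer plus `nblock`), `ftab` a 256-entry counter table, and `count_bytes` is a hypothetical name.

```rust
/// Count how often each byte value occurs in `eclass8`.
/// Assumed types: the original uses a raw pointer and a separate length.
fn count_bytes(eclass8: &[u8], ftab: &mut [u32; 256]) {
    for &byte in eclass8 {
        // A u8 index is always below 256, so it always fits the table.
        ftab[usize::from(byte)] += 1;
    }
}
```

The slice carries its own length, so the manual counter, the pointer arithmetic and the leftover no-op statements all disappear.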
spectrum of porting
implementation
rewrite
🏭
🌳
zlib
- reuse existing knowledge (correctness & performance)
- quick results
- architecture is constrained anyway
⬆️
RiiR and You
compatibility
just better
🚀
🧩
RiiR and You
funding
adoption
👪
💰
🤖
Summary
why use many bytes when few do trick
unreasonably effective on web content
use more (unglamorous) rust in production
🦀evolve crab-like body plans🦀
try zlib-rs
Thanks
Benchmark 1 (42 runs): target/release/examples/compress 1 rs silesia-small.tar
  measurement           mean ± σ            min … max        outliers      delta
  wall_time            119ms ± 1.97ms     117ms … 128ms       1 ( 2%)      0%
  peak_rss            26.7MB ± 85.7KB    26.6MB … 26.9MB      0 ( 0%)      0%
  cpu_cycles            406M ± 4.67M       399M … 424M        1 ( 2%)      0%
  instructions          660M ± 469         660M … 660M        0 ( 0%)      0%
  cache_references     8.06M ± 1.31M      5.65M … 11.3M       0 ( 0%)      0%
  cache_misses          461K ± 36.5K       433K … 555K        5 (12%)      0%
  branch_misses        3.59M ± 6.42K      3.58M … 3.61M       1 ( 2%)      0%
Benchmark 2 (43 runs): removed-bounds/release/examples/compress 1 rs silesia-small.tar
  measurement           mean ± σ            min … max        outliers      delta
  wall_time            118ms ± 2.53ms     115ms … 127ms       3 ( 7%)      - 1.3% ± 0.8%
  peak_rss            26.8MB ± 77.9KB    26.7MB … 27.0MB      0 ( 0%)      + 0.2% ± 0.1%
  cpu_cycles            400M ± 8.00M       391M … 437M        2 ( 5%)      - 1.4% ± 0.7%
  instructions          623M ± 522         623M … 623M        0 ( 0%)      ⚡ - 5.6% ± 0.0%
  cache_references     7.91M ± 1.45M      5.89M … 11.9M       1 ( 2%)      - 1.9% ± 7.4%
  cache_misses          458K ± 29.1K       433K … 550K        1 ( 2%)      - 0.5% ± 3.1%
  branch_misses        3.34M ± 7.35K      3.33M … 3.36M       0 ( 0%)      ⚡ - 6.8% ± 0.1%
is it bounds checks?
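The slides do not show what the "removed-bounds" build changed, so the following is only an assumption about the kind of edit being measured: the same loop written once with checked indexing and once with `get_unchecked`. The function names are hypothetical.

```rust
/// Checked indexing: every `table[i]` is compared against `table.len()`
/// and panics if it is out of range.
fn sum_checked(table: &[u32], indices: &[usize]) -> u32 {
    let mut total = 0u32;
    for &i in indices {
        total = total.wrapping_add(table[i]);
    }
    total
}

/// Unchecked indexing: no comparison, no panic path.
///
/// # Safety
/// Every value in `indices` must be less than `table.len()`.
unsafe fn sum_unchecked(table: &[u32], indices: &[usize]) -> u32 {
    let mut total = 0u32;
    for &i in indices {
        // SAFETY: the caller guarantees `i < table.len()`.
        total = total.wrapping_add(unsafe { *table.get_unchecked(i) });
    }
    total
}
```

Both functions return the same result for valid indices. The numbers above suggest why this trade is rarely worth it here: fewer instructions and branch misses, but a wall-time change of only about 1%.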
Lossy Compression
622kb
Atlantic Ghost Crab © Hans Hillewaert
Lossy Compression
622kb ➡ 87kb