GZIP & Brotli
Yatharth Khatri
Design Systems and Frontend Architect
Classical Pianist
GitHub: yatharthk
Twitter: yatharthkhatri
It's always nice to understand the under-the-hood concepts of the tools and tech we use every day.
You cannot build better if you don't know what has already been built.
"You cannot understand everything but you should always try to understand the system."
- Ryan Dahl, Creator of Node.js
GZip is a lossless data-compression tool.
(and it's not new)
WEB 2.0
Uses an algorithm called "DEFLATE"
LZ77
(Invented by Lempel and Ziv in 1977)
Huffman Coding
(Invented by David Huffman in the 1950s)
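To see that GZip really is DEFLATE underneath, here is a minimal sketch using Node.js's built-in `zlib` module. It assumes a Node.js runtime and CommonJS `require`; the `input` string is just sample data. Per the gzip file format (RFC 1952), the first two bytes are the magic number `1f 8b` and the third byte names the compression method, where 8 means DEFLATE.

```javascript
const zlib = require('zlib');

// Sample data only; any string or Buffer works.
const input = Buffer.from('function add(number1, number2) { return number1 + number2; }');

// A gzip stream is a DEFLATE stream wrapped in a small header and trailer.
const gzipped = zlib.gzipSync(input);

const magic = gzipped.subarray(0, 2).toString('hex'); // gzip magic number
const method = gzipped[2];                            // 8 = DEFLATE

console.log(magic, method); // prints: 1f8b 8

// The same algorithm without the gzip wrapper: raw DEFLATE, round-tripped.
const rawDeflate = zlib.deflateRawSync(input);
const restored = zlib.inflateRawSync(rawDeflate);
console.log(restored.equals(input)); // true
```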
function add(number1, number2) {
  return number1 + number2;
}
function subtract(number1, number2) {
  return number1 - number2;
}
function multiply(number1, number2) {
  return number1 * number2;
}
function divide(number1, number2) {
  return number1 / number2;
}
export default {
  add,
  subtract,
  multiply,
  divide
};
LZ77
Huffman codes
GZip Code
Server
Client (e.g. browser)
GZIP
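The server/client exchange can be sketched roughly as follows. This is a simplified illustration, not a production HTTP implementation: `encodeResponse` and `decodeResponse` are hypothetical helper names, and the parsing of `Accept-Encoding` ignores quality values such as `gzip;q=0`. Only `zlib.gzipSync` and `zlib.gunzipSync` are real Node.js APIs.

```javascript
const zlib = require('zlib');

// Hypothetical helper: how a server could pick an encoding for the body,
// based on the Accept-Encoding header sent by the client.
function encodeResponse(body, acceptEncoding = '') {
  const accepted = acceptEncoding
    .split(',')
    .map((token) => token.split(';')[0].trim().toLowerCase());

  if (accepted.includes('gzip')) {
    return {
      headers: { 'Content-Encoding': 'gzip' },
      body: zlib.gzipSync(Buffer.from(body)),
    };
  }
  // The client did not advertise gzip: send the bytes unchanged.
  return { headers: {}, body: Buffer.from(body) };
}

// Hypothetical helper: what the client (e.g. a browser) does on receipt.
function decodeResponse(response) {
  return response.headers['Content-Encoding'] === 'gzip'
    ? zlib.gunzipSync(response.body).toString()
    : response.body.toString();
}

const source = 'function add(number1, number2) { return number1 + number2; }\n'.repeat(4);
const response = encodeResponse(source, 'gzip, deflate, br');

console.log(response.headers, response.body.length, 'bytes on the wire');
console.log(decodeResponse(response) === source); // true
```

Because the compression is lossless, the client ends up with exactly the text the server started with.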
Mr. Buffer
Mr. Sliding Window
Mr. Sliding Window
Smart. And does all logical, heavy stuff.
Task: Reduce the text as much as possible without loss of data
He came up with a solution and asked for:
Mr. Sliding Window
A box with 32 KB capacity
A bag for backup
The solution:
Mr. Buffer
Mr. Sliding Window
Can only read and pass...
(text to be compressed)
ABCABC
ABC<3, 3>
<3, 3> = go 3 characters back and copy 3 characters
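The back-reference idea can be sketched as a toy compressor. This is an illustration of the LZ77 technique, not DEFLATE's actual implementation: the function names, the brute-force search, and the `windowSize` and `minMatch` parameters are all assumptions made for the example.

```javascript
// Toy LZ77: emits literal characters and { distance, length } back-references.
function lz77Compress(text, windowSize = 32 * 1024, minMatch = 3) {
  const tokens = [];
  let pos = 0;
  while (pos < text.length) {
    let bestLength = 0;
    let bestDistance = 0;
    const windowStart = Math.max(0, pos - windowSize);
    // Try every earlier position in the window and keep the longest match.
    for (let start = windowStart; start < pos; start++) {
      let length = 0;
      while (pos + length < text.length && text[start + length] === text[pos + length]) {
        length++;
      }
      if (length > bestLength) {
        bestLength = length;
        bestDistance = pos - start;
      }
    }
    if (bestLength >= minMatch) {
      tokens.push({ distance: bestDistance, length: bestLength });
      pos += bestLength;
    } else {
      tokens.push(text[pos]); // too short to be worth a reference
      pos += 1;
    }
  }
  return tokens;
}

function lz77Decompress(tokens) {
  let out = '';
  for (const token of tokens) {
    if (typeof token === 'string') {
      out += token;
    } else {
      // Copy one character at a time so overlapping matches also work.
      for (let i = 0; i < token.length; i++) {
        out += out[out.length - token.distance];
      }
    }
  }
  return out;
}

const tokens = lz77Compress('ABCABC');
console.log(tokens); // three literals, then { distance: 3, length: 3 }
console.log(lz77Decompress(tokens)); // ABCABC
```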
Did you notice the limitation?
If a character or phrase does not appear in the last 32 KB of data, it cannot be back-referenced.
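One way to observe this limitation, assuming a Node.js runtime, is to gzip a block of random bytes followed by an exact copy of itself. Random bytes are effectively incompressible on their own, so any saving has to come from back-references to the first copy. The helper name `gzipSizeOfRepeatedBlock` and the block sizes are choices made for this sketch.

```javascript
const zlib = require('zlib');
const crypto = require('crypto');

// Gzip a random block followed by an exact copy of itself and return the size.
function gzipSizeOfRepeatedBlock(blockSize) {
  const block = crypto.randomBytes(blockSize);
  return zlib.gzipSync(Buffer.concat([block, block])).length;
}

// The copy starts 16 KB back: inside the 32 KB window, so it can be referenced.
const nearRepeat = gzipSizeOfRepeatedBlock(16 * 1024);
// The copy starts 40 KB back: outside the window, so it cannot.
const farRepeat = gzipSizeOfRepeatedBlock(40 * 1024);

console.log('16 KB block x2:', nearRepeat, 'bytes');
console.log('40 KB block x2:', farRepeat, 'bytes');
```

The first result should be close to the size of a single block, while the second should be close to the size of both blocks, because the second copy is too far away to reference.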
Huffman Coding reduces the number of bits needed to store your code.
(Also called variable-length encoding.)
For example:
AAABCAD
(7 bytes, or 56 bits)
AAABCAD
(~50 bits)
Let's do some simple math
AAABCAD
(7 bytes, or 56 bits)
Plain text uses a fixed size for every character (8 bits each), so 7 characters × 8 bits = 56 bits.
Give the shortest bit codes to the most frequent characters and the longest to the least frequent.
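The rule above can be sketched in a few lines. This is a minimal illustration that only computes how many bits each character's code would get and how many bits the encoded text takes; it does not model the dictionary, and `huffmanCodeLengths` is a name invented for this example.

```javascript
// Build Huffman code lengths for each character of `text`.
// Returns a Map of character -> code length in bits.
function huffmanCodeLengths(text) {
  const counts = new Map();
  for (const ch of text) counts.set(ch, (counts.get(ch) || 0) + 1);

  // Each node tracks its total frequency and the characters beneath it.
  let nodes = [...counts].map(([ch, freq]) => ({ freq, chars: [ch] }));
  const lengths = new Map([...counts.keys()].map((ch) => [ch, 0]));

  while (nodes.length > 1) {
    // Repeatedly merge the two least frequent nodes.
    nodes.sort((a, b) => a.freq - b.freq);
    const [first, second, ...rest] = nodes;
    const merged = {
      freq: first.freq + second.freq,
      chars: [...first.chars, ...second.chars],
    };
    // Every character under the merged node sits one level deeper in the tree.
    for (const ch of merged.chars) lengths.set(ch, lengths.get(ch) + 1);
    nodes = [merged, ...rest];
  }
  return lengths;
}

const text = 'AAABCAD';
const lengths = huffmanCodeLengths(text);
let encodedBits = 0;
for (const ch of text) encodedBits += lengths.get(ch);

console.log(Object.fromEntries(lengths)); // A gets the shortest code
console.log('fixed-width bits:', text.length * 8);
console.log('Huffman-coded bits (text only):', encodedBits);
```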
Dictionary
But we need to send the dictionary as well 😯,
which is 41 bits.
Total = 52 bits
< 56 bits
function add(number1, number2) {
  return number1 + number2;
}
function subtract(number1, number2) {
  return number1 - number2;
}
function multiply(number1, number2) {
  return number1 * number2;
}
function divide(number1, number2) {
  return number1 / number2;
}
export default {
  add,
  subtract,
  multiply,
  divide
};
RAW - 343B
function add(t,u){return t+u}function subtract(t,u){return t-u}function multiply(t,u){return t*u}function divide(t,u){return t/u}export default{add:add,subtract:subtract,multiply:multiply,divide:divide};
Minified - 204B
function add(t,u){return t+u}` 4$subtract` 4)-` 7&multiply` V)*` Y÷` v)/u}export default{add:add,`!*#:`!3#,` z#:`!##,` l!:` s!};
After LZ77 (Using JS implementation of LZ77) - 134B
[binary GZip output of math.js.lz77; not printable as text]
After Huffman Coding (Complete GZip) - 152B
GZip (in binary) - 152B
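To reproduce a comparison like this yourself, here is a small sketch that gzips the minified source with Node.js's built-in `zlib`. The exact byte counts it prints may differ from the figures above, since they depend on the tool, its settings, and what metadata ends up in the gzip header.

```javascript
const zlib = require('zlib');

// The minified module from above, split across lines for readability.
const minified =
  'function add(t,u){return t+u}function subtract(t,u){return t-u}' +
  'function multiply(t,u){return t*u}function divide(t,u){return t/u}' +
  'export default{add:add,subtract:subtract,multiply:multiply,divide:divide};';

const gzipped = zlib.gzipSync(minified);
const roundTripOk = zlib.gunzipSync(gzipped).toString() === minified;

console.log('minified:', Buffer.byteLength(minified), 'bytes');
console.log('gzipped: ', gzipped.length, 'bytes');
console.log('round trip ok:', roundTripOk); // true, because GZip is lossless
```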
I hope you learned a few good things today.