3h Data Compression
- Explain what data compression is.
- Understand why data may be compressed and that there are different ways to compress data.
- Explain how data can be compressed using Huffman coding.
- Be able to interpret/create Huffman trees.
- Be able to calculate the number of bits required to store a piece of data compressed using Huffman coding.
- Be able to calculate the number of bits required to store a piece of uncompressed data in ASCII.
- Explain how data can be compressed using run length encoding (RLE).
- Represent data in RLE frequency/data pairs.
Data Compression
- Data compression is the reduction in file size to reduce download times and storage requirements.
- Compression results in smaller file sizes and faster transfer of data around a network.
- Compression is achieved by removing the repetition of identical sets of data bits.
Huffman Coding
Use Huffman encoding to compress "Mississippi river"
Count the frequency of each character:
M = 1
i = 5
s = 4
p = 2
space = 1
r = 2
v = 1
e = 1
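The frequency count can also be checked in code. This is a minimal Python sketch, not part of the original slides; it uses the standard library's `collections.Counter`, and the variable names are illustrative.

```python
from collections import Counter

text = "Mississippi river"

# Count how many times each character (including the space) occurs.
frequencies = Counter(text)

# List the characters from most to least frequent.
for char, count in frequencies.most_common():
    label = "space" if char == " " else char
    print(f"{label} = {count}")
```

Note that upper-case M is counted separately from any lower-case m, and the space counts as a character.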
Sort the characters by frequency, highest first:
i = 5, s = 4, p = 2, r = 2, M = 1, v = 1, e = 1, space = 1
Build the tree by repeatedly joining the two lowest-frequency nodes:
- M (1) + v (1) = 2
- e (1) + space (1) = 2
- 2 + 2 = 4
- p (2) + r (2) = 4
- 4 + 4 = 8
- i (5) + s (4) = 9
- 9 + 8 = 17
Label the two branches below each node 0 and 1, then read each character's code from the root (17) down to its leaf:
i = 00
s = 01
p = 100
r = 101
M = 1100
v = 1101
e = 1110
space = 1111
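The tree-building steps can be sketched in Python as follows. This is one possible implementation rather than the method used on the slides: it assumes non-empty text, uses a priority queue (`heapq`) to find the two lowest-frequency nodes, and the function name `huffman_codes` is made up for illustration. Ties between equal frequencies may be broken differently from the slides, so individual codes can differ, but the total number of bits is the same.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text):
    """Return a dict mapping each character in text to its Huffman code."""
    tie_breaker = count()  # keeps heap entries comparable when frequencies tie
    # Each heap entry is (frequency, tie_breaker, {char: code_so_far}).
    heap = [(freq, next(tie_breaker), {char: ""})
            for char, freq in Counter(text).items()]
    heapq.heapify(heap)

    if len(heap) == 1:  # a single distinct character still needs one bit
        return {char: "0" for char in heap[0][2]}

    while len(heap) > 1:
        # Join the two lowest-frequency nodes into one new node.
        freq_a, _, codes_a = heapq.heappop(heap)
        freq_b, _, codes_b = heapq.heappop(heap)
        # Characters under the first node gain a leading 0, the others a 1.
        merged = {char: "0" + code for char, code in codes_a.items()}
        merged.update({char: "1" + code for char, code in codes_b.items()})
        heapq.heappush(heap, (freq_a + freq_b, next(tie_breaker), merged))

    return heap[0][2]


codes = huffman_codes("Mississippi river")
encoded = "".join(codes[char] for char in "Mississippi river")
print(len(encoded))  # 46
```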
Huffman Coding
Mississippi river
Huffman encoded: 1100000101000101001001000011111010011011110101 = 46 bits
8-bit ASCII code: 17 characters × 8 bits = 136 bits
Text with more repeated characters compresses more, because the most frequent characters get the shortest codes.
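The two bit counts can be reproduced with a short calculation. This Python sketch is an illustration only; the `codes` table is copied from the Huffman tree for "Mississippi river", and the variable names are assumptions.

```python
# Code table read from the Huffman tree for "Mississippi river".
codes = {"i": "00", "s": "01", "p": "100", "r": "101",
         "M": "1100", "v": "1101", "e": "1110", " ": "1111"}

text = "Mississippi river"

# Huffman: add up the code length of every character in the text.
huffman_bits = sum(len(codes[char]) for char in text)

# Uncompressed: every character takes 8 bits.
ascii_bits = len(text) * 8

print(huffman_bits)  # 46
print(ascii_bits)    # 136
```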
Example
Use Huffman encoding to compress "access"
a = 1
c = 2
e = 1
s = 2
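Build the tree by joining the two lowest-frequency nodes each time:
- a (1) + e (1) = 2
- s (2) + 2 = 4
- c (2) + 4 = 6
Reading the branch labels from the root (6) down to each leaf gives:
a = 110
c = 0
e = 111
s = 10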
access
8-bit ASCII: 6 × 8 = 48 bits
Huffman: 3 + 1 + 1 + 3 + 2 + 2 = 12 bits
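To see why the 12 bits can be read back without any separators, here is a small Python sketch that encodes and decodes "access" with the code table above. The function names `encode` and `decode` are illustrative and do not come from the slides.

```python
codes = {"a": "110", "c": "0", "e": "111", "s": "10"}


def encode(text, codes):
    """Replace each character with its code and join the bits together."""
    return "".join(codes[char] for char in text)


def decode(bits, codes):
    """Read bits left to right, emitting a character whenever a code matches."""
    lookup = {code: char for char, code in codes.items()}
    text, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:  # no code is the start of another code
            text.append(lookup[current])
            current = ""
    return "".join(text)


bits = encode("access", codes)
print(bits, len(bits))      # 110001111010 12
print(decode(bits, codes))  # access
```

Decoding works because no Huffman code is the beginning of another code, so the first match is always the right one.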
Run Length Encoding
Run length encoding (RLE) works well with repeated data.
It is especially effective with bitmap images because you get blocks of the same colour.
Run Length Encoding
Consider this 2-colour bitmap, shown in binary (1 = black, 0 = white):
1 1 1 1 1 1
1 0 0 0 0 1
1 0 0 0 0 1
1 0 0 0 0 1
1 1 1 1 1 1
Image size = 6 × 5 × 1 = 30 bits
Could write this as:
7B 4W 2B 4W 2B 4W 7B
or:
71 40 21 40 21 40 71
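The same run counting can be sketched in Python. This is an illustration under assumptions: the bitmap is read row by row as one long string, runs are allowed to continue across the end of a row (as they do on the slide), and the function name `rle_encode` is made up.

```python
from itertools import groupby


def rle_encode(pixels):
    """Return a list of (run_length, value) pairs for a sequence of pixels."""
    return [(len(list(run)), value) for value, run in groupby(pixels)]


# The 6 x 5 bitmap, one string per row (1 = black, 0 = white).
rows = ["111111", "100001", "100001", "100001", "111111"]
bitmap = "".join(rows)

pairs = rle_encode(bitmap)
print(" ".join(f"{length}{value}" for length, value in pairs))
# 71 40 21 40 21 40 71
```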
Convert to RLE Format
Write as number of continuous colours:
9W 6B 3W 1B 2W 1B 4W 1B 2W 1B 4W 1B 2W 1B 4W 1B 2W 1B 10W
then:
90 61 30 11 20 11 40 11 20 11 40 11 20 11 40 11 20 11 100
(the final pair, 100, is a run of 10 followed by colour 0)
Draw from RLE Format
1 0 4 1 5 0 1 1 4 0 1 1 4 0 1 1 4 0 1 1 5 0 4 1 1 0
Split the data into number-colour pairs:
10 41 50 11 40 11 40 11 40 11 50 41 10
then replace the 0 with white and the 1 with black (or whatever colours you are given):
1W 4B 5W 1B 4W 1B 4W 1B 4W 1B 5W 4B 1W
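Going from the pairs back to pixels can be sketched in Python as follows. The function name `rle_decode` is illustrative, and the image width of 6 is an assumption, since the text of the slide does not state it.

```python
def rle_decode(pairs):
    """Expand (run_length, value) pairs back into a flat list of pixel values."""
    pixels = []
    for length, value in pairs:
        pixels.extend([value] * length)
    return pixels


# Number-colour pairs: 10 41 50 11 40 11 40 11 40 11 50 41 10
pairs = [(1, 0), (4, 1), (5, 0), (1, 1), (4, 0), (1, 1), (4, 0),
         (1, 1), (4, 0), (1, 1), (5, 0), (4, 1), (1, 0)]

pixels = rle_decode(pairs)

# Assumption: the image is 6 pixels wide (not stated in the slide text).
width = 6
for start in range(0, len(pixels), width):
    row = pixels[start:start + width]
    # Draw black (1) as '#' and white (0) as '.'.
    print("".join("#" if value == 1 else "." for value in row))
```

Storing each pair as a separate (length, colour) tuple avoids any doubt about where one pair ends and the next begins when a run is 10 or longer.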
Questions
- In computing, explain what compression is.
- Show how to compress this diagram using run length encoding (RLE).
- Show how to compress 'forever' using Huffman coding.
- What are the two forms of compression you need to know and explain?
Questions
- In computing, explain what compression is.
Using programming/algorithms to reduce the storage requirements for a set of data (text/image/video/sound).
- What are the two forms of compression you need to know and explain?
Huffman encoding
Run length encoding
Questions
- Show how to compress 'forever' using Huffman coding.
f = 1
o = 1
r = 2
e = 2
v = 1
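Build the tree by joining the two lowest-frequency nodes each time:
- o (1) + v (1) = 2
- f (1) + 2 = 3
- e (2) + r (2) = 4
- 4 + 3 = 7
Reading the branch labels from the root (7) down to each leaf gives:
e = 00
r = 01
f = 10
o = 110
v = 111
forever: 10 110 01 00 111 00 01
Huffman: 16 bits
8-bit ASCII: 7 × 8 = 56 bits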
Questions
- Show how to compress this diagram using run length encoding (RLE).
5B 2W 2B 2W 6B 2W 2B 2W 5B
51 20 21 20 61 20 21 20 51
3h Data Compression
By David James