3e Character Encoding
- Understand what a character set is and be able to describe the character encoding methods: 7-bit ASCII; Unicode.
- Understand that character codes are commonly grouped and run in sequence within encoding tables.
- Describe the purpose of Unicode and the advantages of Unicode over ASCII.
- Know that Unicode uses the same codes as ASCII up to 127.
American Standard Code for Information Interchange (ASCII)
A character set is a list of characters in which each character is represented by its own binary code.
Upper  Code        Lower  Code
A      100 0001    a      110 0001
B      100 0010    b      110 0010
C      100 0011    c      110 0011
D      100 0100    d      110 0100
E      100 0101    e      110 0101
F      100 0110    f      110 0110
Example
Computing = 1000011 1101111 1101101 1110000 1110101 1110100 1101001 1101110 1100111
ASCII
Example
Computing = 67, 111, 109, 112, 117, 116, 105, 110, 103
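The two examples above can be reproduced with a short Python sketch. The function names `to_codes` and `to_binary` are made up for illustration, and the sketch assumes the text contains only 7-bit ASCII characters (codes 0 to 127).

```python
def to_codes(text):
    """Return the decimal character code of each character in text."""
    return [ord(ch) for ch in text]


def to_binary(text):
    """Return each character code as a 7-bit binary string.

    Assumes every character is 7-bit ASCII (code 0 to 127); a larger
    code would produce a string longer than 7 bits.
    """
    return [format(ord(ch), "07b") for ch in text]


print(to_codes("Computing"))
# [67, 111, 109, 112, 117, 116, 105, 110, 103]
print(" ".join(to_binary("Computing")))
# 1000011 1101111 1101101 1110000 1110101 1110100 1101001 1101110 1100111
```

Swapping in your own first name, for instance `to_codes("David")`, gives the codes for the next exercise.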
Your first name in ASCII
David = 68 97 118 105 100
What's the word?
69 110 99 111 100 101 100
Answer: Encoded
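Going the other way, from codes back to characters, might look like the sketch below. `from_codes` and `from_binary` are hypothetical helper names, and the binary version assumes each code is written as a string of 0s and 1s, with or without a space in the middle.

```python
def from_codes(codes):
    """Turn a list of decimal character codes back into text."""
    return "".join(chr(code) for code in codes)


def from_binary(groups):
    """Turn binary strings such as "100 0100" back into text."""
    # Remove the spacing inside each group, then read it as a base-2 number.
    return "".join(chr(int(bits.replace(" ", ""), 2)) for bits in groups)


print(from_codes([69, 110, 99, 111, 100, 101, 100]))  # Encoded
print(from_binary(["100 0100", "110 0001"]))          # Da
```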
What's the character?
If the ASCII code for 'a' is 97, what is the code for 'g'?
If the ASCII code 110 is 'n', what character does the code 107 represent?
If the ASCII code for 'A' in binary is 100 0001, what is the code for 'D' in binary?
If the ASCII code 110 0100 is 'd', what character does the binary code 110 0110 represent?
Answers
- 'g' is 103
- 107 is 'k'
- 'D' is 100 0100
- 110 0110 is 'f'
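These questions rely on letter codes running in sequence, so an unknown code can be found by counting steps from a known one. The sketch below shows one way to do that counting. The helper names are invented for illustration, and it assumes both letters are in the same run (both upper case or both lower case).

```python
import string


def code_from_known(target, known_char, known_code):
    """Find the code of target by counting steps from a known letter."""
    alphabet = string.ascii_lowercase if known_char.islower() else string.ascii_uppercase
    steps = alphabet.index(target) - alphabet.index(known_char)
    return known_code + steps


def char_from_known(code, known_char, known_code):
    """Find the letter for code by counting steps from a known letter."""
    alphabet = string.ascii_lowercase if known_char.islower() else string.ascii_uppercase
    return alphabet[alphabet.index(known_char) + (code - known_code)]


print(code_from_known("g", "a", 97))                        # 103
print(char_from_known(107, "n", 110))                       # k
print(format(code_from_known("D", "A", 0b1000001), "07b"))  # 1000100
print(char_from_known(0b1100110, "d", 0b1100100))           # f
```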
Unicode
ASCII
- 128 characters (7-bit ASCII)
- 256 characters (8-bit extended ASCII)
- Smaller memory requirement
- Cannot represent most languages
Unicode
- 1 114 112 possible character codes (32-bit Unicode)
- ASCII characters have the same code in Unicode
- A is 41 (Hex value in ASCII)
- A is 0041 (Hex value in Unicode)
- Larger memory requirement
- Can represent the characters of almost all written languages: Chinese, Arabic, Greek...
- Only around 10% of the available codes are currently used
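The comparison can be checked with a small Python sketch. It assumes Python's built-in "ascii" codec stands in for 7-bit ASCII (stored one character per byte) and "utf-32-be" stands in for 32-bit Unicode. The function names are made up for illustration.

```python
def fits_in_ascii(text):
    """Return True if every character in text has a 7-bit ASCII code."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


def size_in_bytes(text, encoding):
    """Return how many bytes text needs in the given encoding."""
    return len(text.encode(encoding))


# ASCII characters keep the same code in Unicode: 'A' is hex 41 in both.
print(format(ord("A"), "04X"))               # 0041

# A Greek letter (lambda, written here as an escape) has no ASCII code.
print(fits_in_ascii("A"))                    # True
print(fits_in_ascii("\u03bb"))               # False

# Fixed-width 32-bit storage uses 4 bytes per character.
# "utf-32-be" is used so that no byte-order mark is added to the count.
print(size_in_bytes("Computing", "ascii"))      # 9
print(size_in_bytes("Computing", "utf-32-be"))  # 36
```

The last two lines show the memory trade-off: the same nine characters take four times the space in the 32-bit form.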
By David James
Computer Science - Fundamentals of Data Representation - Character Encoding