3e Character Encoding

  • Understand what a character set is and be able to describe the character encoding methods: 7-bit ASCII; Unicode.
  • Understand that character codes are commonly grouped and run in sequence within encoding tables.
  • Describe the purpose of Unicode and the advantages of Unicode over ASCII.
  • Know that Unicode uses the same codes as ASCII up to 127.

American Standard Code for Information Interchange

A character set is a list of characters, each represented by its own binary number code. The codes shown here are 7-bit ASCII.

Character   7-bit binary code
A           100 0001
B           100 0010
C           100 0011
D           100 0100
E           100 0101
F           100 0110
a           110 0001
b           110 0010
c           110 0011
d           110 0100
e           110 0101
f           110 0110

Example 

Computing = 1000011 1101111 1101101 1110000 1110101 1110100 1101001 1101110 1100111
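The conversion in this example can be reproduced with a short Python sketch. The function name `to_ascii_bits` is made up for illustration; `ord` and `format` are standard Python built-ins.

```python
def to_ascii_bits(text):
    """Return the 7-bit binary ASCII code of each character in text."""
    bits = []
    for ch in text:
        code = ord(ch)  # numeric code of the character
        if code > 127:
            # 7-bit ASCII only covers codes 0 to 127
            raise ValueError(f"{ch!r} is not a 7-bit ASCII character")
        bits.append(format(code, "07b"))  # pad to exactly 7 binary digits
    return bits


print("Computing =", " ".join(to_ascii_bits("Computing")))
```

Running it should print the same nine 7-bit groups as the example above.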

ASCII

Example 

Computing = 67, 111, 109, 112, 117, 116, 105, 110, 103

Your first name in ASCII

David = 68  97  118  105  100
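The same lookup can be done for any name with Python's built-in `ord`, which returns a character's code as a decimal number. This is only a small sketch; `to_ascii_codes` is a hypothetical helper name.

```python
def to_ascii_codes(text):
    """Return the decimal ASCII code of each character in text."""
    return [ord(ch) for ch in text]


print(to_ascii_codes("David"))      # [68, 97, 118, 105, 100]
print(to_ascii_codes("Computing"))  # [67, 111, 109, 112, 117, 116, 105, 110, 103]
```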

What's the word?

69  110  99  111  100  101  100

Answer: Encoded
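Decoding goes the other way: each code is turned back into its character. A minimal sketch using Python's built-in `chr` (the name `from_ascii_codes` is an illustrative assumption):

```python
def from_ascii_codes(codes):
    """Turn a list of decimal ASCII codes back into text."""
    return "".join(chr(code) for code in codes)


print(from_ascii_codes([69, 110, 99, 111, 100, 101, 100]))  # Encoded
```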

What's the character?

  • If the ASCII code for 'a' is 97, what is the code for 'g'? Answer: 103
  • If the ASCII code 110 is 'n', what character does the code 107 represent? Answer: k
  • If the ASCII code for 'A' in binary is 100 0001, what is the code for 'D' in binary? Answer: 100 0100
  • If the ASCII code 110 0100 is 'd', what character does the binary code 110 0110 represent? Answer: f
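These questions work because letter codes run in sequence, so an unknown code can be found by counting on from a known one. The sketch below checks the four answers; `shift_char` and `shift_bits` are hypothetical helper names, not part of any library.

```python
def shift_char(ch, offset):
    """Return the character that is `offset` places after ch in the table."""
    return chr(ord(ch) + offset)


def shift_bits(bits, offset):
    """Add offset to a 7-bit binary code given as a string such as '100 0001'."""
    code = int(bits.replace(" ", ""), 2) + offset
    return format(code, "07b")


print(ord("a") + 6)                # 103, the code for 'g' (6 places after 'a')
print(shift_char("n", 107 - 110))  # k, three places before 'n'
print(shift_bits("100 0001", 3))   # 1000100, 'D' is 3 places after 'A'
print(shift_char("d", 0b1100110 - 0b1100100))  # f, two places after 'd'
```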

Unicode

ASCII

  • 128 characters (7-bit ASCII)
  • 256 characters (8-bit extended ASCII)
  • Smaller memory requirement
  • Cannot represent most languages

Unicode

  • 1 114 112 possible codes (0 to 10FFFF in hex), stored in up to 32 bits per character
    • Only a little over 10% of available codes currently used
  • ASCII characters have the same code in Unicode
    • A is 41 (Hex value in ASCII)
    • A is 0041 (Hex value in Unicode)
  • Larger memory requirement
  • Can represent the characters of almost all written languages
    • Chinese, Arabic, Greek...
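The comparison can be explored in Python, whose strings are Unicode. This is a sketch under one assumption: the 32-bit storage is illustrated with Python's `utf-32-be` codec, which is one way of storing Unicode and is not named in the slides. The helper names are made up for illustration.

```python
import sys


def fits_in_ascii(ch):
    """True if the character's code is in the 7-bit ASCII range 0 to 127."""
    return ord(ch) <= 127


def bytes_as_32_bit(ch):
    """Number of bytes used when the character is stored in 32 bits (assumed UTF-32)."""
    return len(ch.encode("utf-32-be"))


omega = "\u03a9"  # Greek capital letter omega

print(format(ord("A"), "02X"), format(ord("A"), "04X"))  # 41 0041: same code, more digits
print(fits_in_ascii("A"), fits_in_ascii(omega))          # True False
print(len("A".encode("ascii")), bytes_as_32_bit("A"))    # 1 4: larger memory requirement
print(sys.maxunicode + 1)                                # 1114112 possible codes
```

The letter A keeps the code 41 (hex) in both systems, but takes four bytes instead of one when every character is stored in 32 bits.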

By David James
