Unicode

ASCII

2**7 = 128 symbols (2**8 = 256 with 8-bit extensions)

YUSCII

КОИ-7

There are many such encodings

  • Not every symbol fits into a single encoding
  • How text is displayed depends on the encoding used
  • Each encoding needs its own font

1991

Unicode

UTF-16

2**16 = 65536 symbols (the original fixed-width 16-bit design)

UTF-32 and UTF-8

Too simple)

ä !== ä

й !== й

a !== a

Normalization algorithms

NFD, NFC, NFKD, NFKC
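
To make the four forms concrete, here is a minimal sketch that prints the code points each one produces. The sample string and the `hex` helper are assumptions made for this illustration, not part of the talk.

```javascript
// Hypothetical sample: the "fi" ligature (U+FB01) followed by "é" (U+00E9).
const sample = "\uFB01\u00E9";

// Illustrative helper: list the code points of a string as hex strings.
const hex = (s) =>
  Array.from(s, (ch) => ch.codePointAt(0).toString(16).toUpperCase());

const forms = {
  NFC: hex(sample.normalize("NFC")),   // canonical composition
  NFD: hex(sample.normalize("NFD")),   // canonical decomposition
  NFKC: hex(sample.normalize("NFKC")), // compatibility decomposition, then composition
  NFKD: hex(sample.normalize("NFKD")), // compatibility decomposition
};

console.log(forms);
// NFC  keeps the ligature and the precomposed "é"
// NFD  keeps the ligature but splits "é" into "e" + U+0301
// NFKC replaces the ligature with "f" + "i" and keeps "é" composed
// NFKD replaces the ligature and splits "é"
```

The K forms change the text (the ligature is lost), so they suit searching and comparison better than storage.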

BOM
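
As a rough illustration of the byte order mark, the sketch below encodes a string that starts with U+FEFF and decodes it again. It assumes a runtime that exposes the standard `TextEncoder` and `TextDecoder` globals (modern browsers and Node.js); the string "hi" is just sample data.

```javascript
// U+FEFF at the start of a text stream acts as the byte order mark.
const bytes = new TextEncoder().encode("\uFEFFhi");

// In UTF-8 the BOM is the three bytes EF BB BF.
const bytesHex = Array.from(bytes, (b) => b.toString(16).toUpperCase());
console.log(bytesHex); // ["EF", "BB", "BF", "68", "69"]

// By default TextDecoder strips a leading BOM...
const stripped = new TextDecoder("utf-8").decode(bytes);

// ...and keeps it when ignoreBOM is set to true.
const kept = new TextDecoder("utf-8", { ignoreBOM: true }).decode(bytes);

console.log(stripped.length, kept.length); // 2 3
```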

What is faster:

UTF-16 or UTF-8

UTF-16 (Java)

What takes more space: English or Russian text

Russian
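
The size difference can be checked with a small sketch like the one below. It assumes the standard `TextEncoder` global is available; the two sample words and the `utf8Bytes` helper are made up for the illustration.

```javascript
// Illustrative helper: number of bytes a string occupies in UTF-8.
const utf8Bytes = (s) => new TextEncoder().encode(s).length;

const english = "hello";  // 5 Latin letters
const russian = "привет"; // 6 Cyrillic letters

// Latin letters take 1 byte each in UTF-8, Cyrillic letters take 2.
console.log(utf8Bytes(english)); // 5
console.log(utf8Bytes(russian)); // 12

// In UTF-16 both alphabets take one 16-bit code unit (2 bytes) per letter.
console.log(english.length * 2); // 10
console.log(russian.length * 2); // 12
```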

What does it mean for JavaScript?

"I \u2661 JavaScript"

"💩"

"𠮷"

"💩" is U+1F4A9

"\u{1F4A9}"

"\uD83D\uDCA9"

"💩".length === 2

"𝐀".length === 2

"\u{1D400}"
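
The reason for `length === 2` is that code points above U+FFFF are stored as two 16-bit code units (a surrogate pair). A minimal sketch of the standard UTF-16 arithmetic follows; the function name `toSurrogatePair` is an assumption of this example.

```javascript
// Split a code point above U+FFFF into its UTF-16 surrogate pair.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;      // 20 bits remain
  const high = 0xd800 + (offset >> 10);    // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);   // bottom 10 bits
  return [high, low];
}

const pair = toSurrogatePair(0x1f4a9);
console.log(pair.map((u) => u.toString(16))); // ["d83d", "dca9"]

// The two code units together are the same string as the code point escape.
console.log(String.fromCharCode(...pair) === "\u{1F4A9}"); // true
```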

Any solutions?

Array.from

punycode

Array.from(document.querySelectorAll("a"))

Array.from("asd").length === 3

Array.from("💩").length === 1

Done)
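
A short sketch of counting with `Array.from`, which iterates a string by code points rather than by 16-bit code units. The helper name `countCodePoints` is an assumption; note that this still counts combining marks separately, so it is not a count of visible characters.

```javascript
// Illustrative helper: count code points instead of UTF-16 code units.
const countCodePoints = (s) => Array.from(s).length;

console.log("asd".length, countCodePoints("asd"));             // 3 3
console.log("\u{1F4A9}".length, countCodePoints("\u{1F4A9}")); // 2 1

// Caveat: "n" + combining tilde is still two code points.
console.log(countCodePoints("man\u0303ana")); // 7
```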

"mañana" !== "mañana"

"ma\xF1ana" !== "man\u0303ana"

Any solutions?

String.prototype.normalize

"mañana".normalize("NFC") === "mañana".normalize("NFC")

Done)
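
The comparison above can be wrapped in a small helper. This is only a sketch: the name `sameText` is hypothetical, and NFC is one reasonable choice of form, not the only one.

```javascript
// Compare two strings after bringing both to the same normalization form.
function sameText(a, b) {
  return a.normalize("NFC") === b.normalize("NFC");
}

const composed = "ma\xF1ana";      // "ñ" as one code point, U+00F1
const decomposed = "man\u0303ana"; // "n" followed by combining tilde, U+0303

console.log(composed === decomposed);        // false
console.log(sameText(composed, decomposed)); // true
```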

Reverse string

string.split('').reverse().join('')

"mañana".split('').reverse().join('')

"anañam"

"mañana".split('').reverse().join('')

"anãnam"

"💩".split('').reverse().join('')

"��"

Any solutions?

Array.from

Normalize
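
Combining the two ideas gives a reversal that survives the examples above. This is a sketch under stated assumptions: the name `reverseString` is made up here, and sequences with no precomposed form (or emoji joined with zero-width joiners) would still need grapheme-aware splitting.

```javascript
// Reverse by code points after composing combining marks where possible.
function reverseString(s) {
  return Array.from(s.normalize("NFC")).reverse().join("");
}

// Decomposed input: the tilde stays on the "n".
console.log(reverseString("man\u0303ana") === "ana\xF1am"); // true

// Astral symbol: the surrogate pair is kept together.
console.log(reverseString("a\u{1F4A9}b") === "b\u{1F4A9}a"); // true
```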

charCodeAt => codePointAt

String.fromCharCode => String.fromCodePoint
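
A brief sketch of why the code point variants matter for symbols outside the first 65536 code points. The variable names are assumptions of this example.

```javascript
const poo = "\u{1F4A9}";

// charCodeAt sees only the first 16-bit code unit (the high surrogate).
const unit = poo.charCodeAt(0).toString(16);   // "d83d"

// codePointAt reads the whole surrogate pair.
const point = poo.codePointAt(0).toString(16); // "1f4a9"

// fromCodePoint builds the full symbol; fromCharCode truncates to 16 bits.
const rebuilt = String.fromCodePoint(0x1f4a9);
const truncated = String.fromCharCode(0x1f4a9); // "\uF4A9", a different character

console.log(unit, point, rebuilt === poo, truncated === poo);
// d83d 1f4a9 true false
```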

Regex

/./.test('💩')

/./.test('💩') === true

/^./.test('💩')

/^./.test('💩') === true

/^.$/.test('💩')

/^.$/.test('💩') === false

Any solutions?

/^.$/u.test('💩') === true
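
The effect of the `u` flag can be seen side by side in a sketch like this one. Without the flag, `.` matches a single 16-bit code unit; with it, `.` matches a whole code point.

```javascript
const symbol = "\u{1F4A9}";

const withoutFlag = /^.$/.test(symbol); // false: the string is two code units
const withFlag = /^.$/u.test(symbol);   // true: one code point

// The same difference shows up when matching globally.
const unitsMatched = symbol.match(/./g).length;   // 2 lone surrogates
const pointsMatched = symbol.match(/./gu).length; // 1 full symbol

console.log(withoutFlag, withFlag, unitsMatched, pointsMatched);
// false true 2 1
```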

💩 => \u{1F4A9}

💪 => \u{1F4AA}

💫 => \u{1F4AB}

💬 => \u{1F4AC}

💭 => \u{1F4AD}

/[💩-💭]/.test('💫')

Error(

/[💩-💭]/u.test('💫') === true
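
The error and the fix can be reproduced with the `RegExp` constructor, which lets the failure be caught at run time instead of at parse time. This is a sketch; the variable names are assumptions. Without `u`, the class is read as individual surrogates, so the range runs from a low surrogate to a high one and is rejected as out of order.

```javascript
// Character class from U+1F4A9 to U+1F4AD, written with escapes.
const rangeSource = "[\u{1F4A9}-\u{1F4AD}]";

// Without the u flag the pattern is invalid.
let errorWithoutFlag = null;
try {
  new RegExp(rangeSource);
} catch (e) {
  errorWithoutFlag = e;
}
console.log(errorWithoutFlag instanceof SyntaxError); // true

// With the u flag the range is over code points and works as intended.
const inRange = new RegExp(rangeSource, "u");
console.log(inRange.test("\u{1F4AB}")); // true:  inside the range
console.log(inRange.test("\u{1F4AE}")); // false: just past the end
```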

Questions?

Unicode

By Vladimir
