Unicode
ASCII
2**7 = 128 symbols (2**8 = 256 with 8-bit extensions)
YUSCII
KOI-7 (КОИ-7)
There are a lot of them
- No single encoding can hold all symbols
- How text is read depends on the encoding
- Each encoding needs its own font
1991
Unicode
UTF-16
2**16 = 65536 symbols per 16-bit code unit
UTF-32 and UTF-8
Too simple)
ä !== ä
й !== й
a !== a
Normalization algorithms
NFD, NFC, NFKD, NFKC
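A minimal sketch of what the four forms do, using `String.prototype.normalize`. The sample characters (a precomposed "é" and the "ﬁ" ligature) are chosen here for illustration and are not from the slides.

```javascript
// "é" as a single precomposed code point, and the ligature "ﬁ" (U+FB01).
const composed = "\u00E9";
const ligature = "\uFB01";

// NFD decomposes into base letter + combining mark.
const nfd = composed.normalize("NFD"); // "e" followed by U+0301

// NFC composes the pair back into one code point.
const nfc = nfd.normalize("NFC");

// NFC leaves the ligature alone; the K (compatibility) forms replace it
// with the plain letters "f" and "i".
const nfcLigature = ligature.normalize("NFC");
const nfkcLigature = ligature.normalize("NFKC");
const nfkdLigature = ligature.normalize("NFKD");

console.log(nfd.length, nfc.length, nfkcLigature);
```

The K forms lose information (the ligature cannot be recovered), so they probably suit search and comparison better than storage.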
BOM
What is faster:
UTF-16 or UTF-8
UTF-16 (Java)
What takes more space: English or Russian text?
Russian (in UTF-8 a Cyrillic letter takes 2 bytes, an ASCII letter takes 1)
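The size difference can be checked with `TextEncoder`, which always encodes to UTF-8. This is a sketch; the helper names `utf8Bytes` and `utf16Bytes` and the sample words are made up for the example.

```javascript
const encoder = new TextEncoder(); // TextEncoder only produces UTF-8

// Number of bytes the text occupies when encoded as UTF-8.
function utf8Bytes(text) {
  return encoder.encode(text).length;
}

// Number of bytes in UTF-16: two bytes per 16-bit code unit.
function utf16Bytes(text) {
  return text.length * 2;
}

const english = "hello";  // 5 ASCII letters
const russian = "привет"; // 6 Cyrillic letters

console.log(utf8Bytes(english), utf8Bytes(russian));   // 5 12
console.log(utf16Bytes(english), utf16Bytes(russian)); // 10 12
```

In UTF-16 both words cost 2 bytes per letter, so the gap between English and Russian shows up only in UTF-8.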
What does it mean for JavaScript?
"I \u2661 JavaScript"
"💩"
"𠮷"
"💩".codePointAt(0) === 0x1F4A9
"\u{1F4A9}"
"\uD83D\uDCA9"
"💩".length === 2
"𝐀".length === 2
"\u{1D400}"
Any solutions?
Array.from
punycode
Array.from(document.querySelectorAll("a"))
Array.from("asd").length === 3
Array.from("💩").length === 1
Done)
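A sketch of why `.length` reports 2 and how `Array.from` fixes the count. The helpers `toSurrogatePair` and `countCodePoints` are hypothetical names introduced here; the arithmetic is the standard UTF-16 surrogate formula.

```javascript
// Split a code point above U+FFFF into its two UTF-16 code units.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw new RangeError("code point has no surrogate pair");
  }
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);  // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF); // bottom 10 bits
  return [high, low];
}

// Array.from iterates a string by code point, not by code unit.
function countCodePoints(text) {
  return Array.from(text).length;
}

const [high, low] = toSurrogatePair(0x1F4A9);
const rebuilt = String.fromCharCode(high, low);

console.log(high.toString(16), low.toString(16)); // d83d dca9
console.log(rebuilt.length, countCodePoints(rebuilt)); // 2 1
```

Counting code points is still not counting visible characters: a decomposed "ñ" ("n" plus U+0303) is two code points, so `countCodePoints("man\u0303ana")` returns 7.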
"mañana" !== "mañana"
"ma\xF1ana" !== "man\u0303ana"
Any solutions?
String.prototype.normalize
"mañana".normalize("NFC") === "mañana".normalize("NFC")
Done)
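As a sketch, the comparison can be wrapped in a small helper. `sameText` is a hypothetical name, and defaulting to NFC is an assumption of this example.

```javascript
// Compare two strings after bringing both to the same normalization form.
function sameText(a, b, form = "NFC") {
  return a.normalize(form) === b.normalize(form);
}

const precomposed = "ma\xF1ana";   // ñ as one code point (U+00F1)
const decomposed = "man\u0303ana"; // n followed by combining tilde (U+0303)

console.log(precomposed === decomposed);        // false
console.log(sameText(precomposed, decomposed)); // true
```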
Reverse string
string.split('').reverse().join('')
"mañana".split('').reverse().join('')
"anañam"
"mañana".split('').reverse().join('')
"anãnam"
"💩".split('').reverse().join('')
"��"
Any solutions?
Array.from
Normalize
charCodeAt => codePointAt
String.fromCharCode => String.fromCodePoint
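Putting the solutions together, a minimal sketch of a safer reverse. `reverseString` is a hypothetical helper; it assumes that composing with NFC first is acceptable for the text at hand.

```javascript
// Normalize first so that "n" + combining tilde becomes a single "ñ",
// then split by code point so surrogate pairs stay together.
function reverseString(text) {
  return Array.from(text.normalize("NFC")).reverse().join("");
}

console.log(reverseString("man\u0303ana")); // "anañam"
console.log(reverseString("a💩b"));         // "b💩a"

// The code-point-aware methods see the whole symbol,
// the older ones see only one 16-bit code unit.
console.log("💩".charCodeAt(0).toString(16));  // d83d
console.log("💩".codePointAt(0).toString(16)); // 1f4a9
console.log(String.fromCodePoint(0x1F4A9));    // 💩
```

This still breaks sequences that have no precomposed form (a base letter with several combining marks, for example), so it is an improvement rather than a complete fix.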
Regex
/./.test('💩')
/./.test('💩') === true
/^./.test('💩')
/^./.test('💩') === true
/^.$/.test('💩')
/^.$/.test('💩') === false
Any solutions?
/^.$/u.test('💩') === true
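A short sketch of what changes with the `u` flag: without it, `.` matches one 16-bit code unit, so the emoji looks like two characters to the regex. The variable names are illustrative only.

```javascript
const pile = "💩";

// Without the u flag the string is seen as two code units.
const oneDotNoFlag = /^.$/.test(pile);   // false
const twoDotsNoFlag = /^..$/.test(pile); // true

// With the u flag the dot matches a whole code point.
const oneDotWithFlag = /^.$/u.test(pile); // true

// The same flag makes a global match split a string by code point.
const symbols = "a💩b".match(/./gu); // ["a", "💩", "b"]

console.log(oneDotNoFlag, twoDotsNoFlag, oneDotWithFlag, symbols.length);
```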
💩 => \u{1F4A9}
💪 => \u{1F4AA}
💫 => \u{1F4AB}
💬 => \u{1F4AC}
💭 => \u{1F4AD}
/[💩-💭]/.test('💫')
SyntaxError: range out of order in character class (
/[💩-💭]/u.test('💫') === true
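A sketch of why the range fails without the flag: the class is read as code units, so the range that actually gets parsed is `\uDCA9-\uD83D`, which runs backwards. `tryCompile` is a hypothetical helper that builds the regex at run time so the error can be caught.

```javascript
// Return the compiled RegExp, or null if the pattern is a syntax error.
function tryCompile(source, flags) {
  try {
    return new RegExp(source, flags);
  } catch (error) {
    if (error instanceof SyntaxError) return null;
    throw error;
  }
}

const withoutFlag = tryCompile("[💩-💭]", ""); // null: range out of order
const withFlag = tryCompile("[💩-💭]", "u");   // U+1F4A9 to U+1F4AD

console.log(withoutFlag === null);  // true
console.log(withFlag.test("💫"));   // true, U+1F4AB is inside the range
console.log(withFlag.test("💨"));   // false, U+1F4A8 is just below it
```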
Questions?
Unicode
By Vladimir