Unicode
10
2017-09-01 pixiv Inc. Study Session
@hakatashi
CJK UNIFIED
IDEOGRAPHS
EXTENSION
F
Unicode
10.0.0
Release: June 2017
8,518 NEW characters
Unicode 10
Highlight
HENTAIGANA
HENTAIGANA
- Archaic form of Japanese Hiragana
- Transitional form between Kanji and current style of Hiragana
- Still in use with some Japanese traditional proper names
- Caused much controversy during its addition to Unicode...
HENTAIGANA IS
TRANSITIONAL
禮 U+79AE
U+????
U+????
れ U+308C
Are these the same characters?
U+1B0FE
U+1B0FF
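The code points on this slide can be inspected with Python's standard `unicodedata` module. This is only a minimal sketch: the helper name `describe` is ours, and the names printed for the hentaigana code points depend on the Unicode version bundled with the interpreter (Unicode 10 or later is assumed, roughly Python 3.7+).

```python
import unicodedata

def describe(code_point: int) -> str:
    """Return 'U+XXXX NAME' for a code point (hypothetical helper for illustration)."""
    # Databases older than Unicode 10 have no name for most hentaigana,
    # so fall back to a placeholder instead of raising ValueError.
    name = unicodedata.name(chr(code_point), "<no name in this Unicode database>")
    return f"U+{code_point:04X} {name}"

# Code points taken from the slide: the kanji, the modern hiragana,
# and the two Kana Supplement code points.
for cp in (0x79AE, 0x308C, 0x1B0FE, 0x1B0FF):
    print(describe(cp))
```

The kanji and the modern hiragana have had names for a long time, so the first two lines are stable across versions; the last two show whether the local database already knows Unicode 10.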
PREVIOUS VERSIONS' (ALMOST) MIS-ENCODING OF KANA SYMBOLS
Unicode 9
Unicode 10
CJK UNIFIED IDEOGRAPHS EXTENSION-F
8,518 NEW characters
Unicode 10
88% (7,494) OF THESE ARE KANJI (漢字)!!!
and
CJK UNIFIED IDEOGRAPHS EXTENSION-F
7,473 characters
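The figures above can be checked with a few lines of arithmetic. This is a small sketch using only the numbers on the slides; the variable names are ours.

```python
# Figures quoted on the slides.
TOTAL_NEW = 8_518        # new characters in Unicode 10.0.0
NEW_KANJI = 7_494        # new characters that are kanji
EXT_F = 7_473            # characters in CJK Unified Ideographs Extension F

kanji_share = NEW_KANJI / TOTAL_NEW      # fraction of new characters that are kanji
kanji_outside_ext_f = NEW_KANJI - EXT_F  # new kanji that are not part of Extension F

print(f"Kanji share of new characters: {kanji_share:.0%}")
print(f"New kanji outside Extension F: {kanji_outside_ext_f}")
```

So nearly all of the new kanji, and therefore most of Unicode 10 itself, come from Extension F.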
How were these characters chosen?
ISO/IEC JTC 1
ISO/IEC JTC 1/SC 2
ISO/IEC JTC 1/SC 2/WG 2
Ideographic Rapporteur Group (IRG)
Who is making Unicode™?
Japan
Committee
Korea
Committee
SAT
Committee
Unicode Consortium
Information Technology Standards Commission of Japan (情報規格調査会)
(ITSCJ)
Information Processing Society of Japan (情報処理学会)
(IPSJ)
History
- 2012-06 (IRG38) Plan for composition of Ext. F was approved
- 2012-10 (IRG39) Character submissions from each committee & Ext. F ver 1.0 released
- 2013-11 (IRG41) Ext. F ver 2.0
- 2014-05 (IRG42) Ext. F ver 3.0
- 2014-11 (IRG43 & UTC141) Ext. F ver 4.0 & Submission to Unicode
- 2016-01 (UTC146) Unicode 9.0.0 BETA issued and inclusion of Ext. F was postponed
- 2016-08 (UTC148) Ext. F was approved for Unicode
- 2017-06 Unicode 10.0.0 released
MORE REVIEW, MORE VALIDATION
SO TROUBLESOME?
NO MORE CHAOS
ISO/IEC JTC 1
ISO/IEC JTC 1/SC 2
ISO/IEC JTC 1/SC 2/WG 2
Ideographic Rapporteur Group (IRG)
Who is making Unicode™?
Japan
Committee
China
Committee
SAT
Committee
Unicode Consortium
Information Technology Standards Commission of Japan (情報規格調査会)
(ITSCJ)
Information Processing Society of Japan (情報処理学会)
(IPSJ)
IRG JAPAN SUBCOMMITTEE
- Through earlier standardization work, most of the kanji (漢字) needed to write Japanese text have already been included in Unicode.
- Other kanji that originated in China have been (and will continue to be) submitted by the other subcommittees.
- So the Japan subcommittee's current activity is mainly submitting (strange) Japanese kokuji (国字, kanji created in Japan) for addition to Unicode.
But...
Poor Sources
7,473
Ext. F characters
1,645
submitted by Japan
(originating from the Moji Joho Kiban character information infrastructure project, 文字情報基盤整備事業)
909
source identified
※http://kanji.zinbun.kyoto-u.ac.jp/~yasuoka/publications/2017-08-04.pdf
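To put the counts above in proportion, here is a short arithmetic sketch. It assumes that the 909 source-identified characters are a subset of the 1,645 characters submitted by Japan, which is how the slide layout reads; the variable names are ours.

```python
# Figures quoted on the slide.
EXT_F_TOTAL = 7_473        # all Extension F characters
FROM_JAPAN = 1_645         # characters submitted by Japan
SOURCE_IDENTIFIED = 909    # assumed to be a subset of FROM_JAPAN

japan_share = FROM_JAPAN / EXT_F_TOTAL            # Japan's part of Extension F
identified_share = SOURCE_IDENTIFIED / FROM_JAPAN # part with an identified source
unidentified = FROM_JAPAN - SOURCE_IDENTIFIED     # characters without an identified source

print(f"Japan's share of Ext. F: {japan_share:.0%}")
print(f"Source identified: {identified_share:.0%}")
print(f"Source not identified: {unidentified}")
```

Under that assumption, a little under half of the Japanese submissions have no identified source.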
Some Examples
U+2D92A
「はかた」 (hakata)
U+2CF01
「ダラー」 (darā)
U+2E282
「ぼんのう」 (bonnō)
U+2D475
「いと」 (ito)
(possibly a Kamigata dialect word for "eldest daughter")
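As a sanity check, the example code points can be tested against the Extension F block. This sketch assumes the block range U+2CEB0 to U+2EBE0 from the Unicode 10 block listing; the helper name `in_ext_f` is ours.

```python
# CJK Unified Ideographs Extension F, assigned range as of Unicode 10.0.0.
EXT_F_FIRST = 0x2CEB0
EXT_F_LAST = 0x2EBE0

def in_ext_f(code_point: int) -> bool:
    """True if the code point lies in the Extension F range (illustrative helper)."""
    return EXT_F_FIRST <= code_point <= EXT_F_LAST

# The range should contain exactly the 7,473 characters quoted on the slides.
ext_f_size = EXT_F_LAST - EXT_F_FIRST + 1

# Example code points from the slide.
examples = [0x2D92A, 0x2CF01, 0x2E282, 0x2D475]
membership = {f"U+{cp:05X}": in_ext_f(cp) for cp in examples}

print(ext_f_size)
print(membership)
```

All four examples fall inside the range, and the range size matches the 7,473 figure.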
BEYOND...
UPCOMING...?
End
Unicode 10
By Koki Takahashi
Unicode 10
2017-09-01 Weekly Study Session at pixiv Inc.
- 1,647