Unicode
10
2017-09-01 pixiv Inc. Study Session
@hakatashi

CJK UNIFIED
IDEOGRAPHS
EXTENSION
F
Unicode
10.0.0
Release: June 2017
8,518 NEW characters
Unicode 10
Highlight
HENTAIGANA

HENTAIGANA
- Archaic form of Japanese Hiragana
- Transitional form between Kanji and the current style of Hiragana
- Still in use in some traditional Japanese proper names
- Caused much controversy during its addition to Unicode...

HENTAIGANA IS
TRANSITIONAL
禮
れ


U+79AE
U+????
U+????
U+308C
Are these the same characters?
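One way to approach the question is to ask a Unicode character database what it records for the two endpoints of the transition. The following is a minimal sketch using Python's standard unicodedata module. The helper name describe is made up for illustration, and the names returned depend on the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    cp = ord(ch)
    # Fall back to a placeholder if this Python's database has no name for it.
    name = unicodedata.name(ch, "<no name in this Unicode database>")
    return f"U+{cp:04X} {name}"


kanji = "\u79ae"     # 禮, the Kanji end of the transition
hiragana = "\u308c"  # れ, the current Hiragana end

for ch in (kanji, hiragana):
    print(describe(ch))
```

The same helper accepts the Hentaigana code points on these slides, for example describe(chr(0x1B0FE)). An interpreter whose database predates Unicode 10 would be expected to return the fallback text for those.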
U+1B0FE
U+1B0FF
PREVIOUS VERSIONS' (ALMOST) MIS-ENCODING OF KANA SYMBOLS


Unicode 9
Unicode 10
CJK UNIFIED IDEOGRAPHS EXTENSION-F
8,518 NEW characters
Unicode 10
88% (7,494) OF THESE ARE KANJI (漢字)!!!
and
CJK UNIFIED IDEOGRAPHS EXTENSION-F

7,473 characters
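The figures on these slides can be cross-checked with a few lines of arithmetic. This is only a sketch that takes the three counts from the slides at face value. The variable names are illustrative, and the slides do not say where the ideographs outside Ext. F come from.

```python
# Counts as quoted on the slides.
total_new = 8518    # new characters in Unicode 10
ideographs = 7494   # of which are kanji
ext_f = 7473        # characters in CJK Unified Ideographs Extension F

# Share of the new characters that are kanji.
share = ideographs / total_new

# Kanji added in Unicode 10 that are not part of Extension F.
outside_ext_f = ideographs - ext_f

print(f"kanji share: {share:.1%}")
print(f"kanji outside Ext. F: {outside_ext_f}")
```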
How were these characters chosen?

ISO/IEC JTC 1
ISO/IEC JTC 1/SC 2
ISO/IEC JTC 1/SC 2/WG 2
Ideographic Rapporteur Group (IRG)
Who is making Unicode™?
Japan
Committee
Korea
Committee
SAT
Committee
Unicode Consortium
情報規格調査会
(ITSCJ, Information Technology Standards Commission of Japan)
情報処理学会
(IPSJ, Information Processing Society of Japan)
History
- 2012-06 (IRG38) Plan for compiling Ext. F was approved
- 2012-10 (IRG39) Each committee submitted its characters & Ext. F ver 1.0 was released
- 2013-11 (IRG41) Ext. F ver 2.0
- 2014-05 (IRG42) Ext. F ver 3.0
- 2014-11 (IRG43 & UTC141) Ext. F ver 4.0 & Submission to Unicode
- 2016-01 (UTC146) Unicode 9.0.0 BETA was issued and inclusion of Ext. F was postponed
- 2016-08 (UTC148) Ext. F was approved for Unicode
- 2017-06 Unicode 10.0.0 released
MORE REVIEW, MORE VALIDATION
SO TROUBLESOME?

NO MORE CHAOS
ISO/IEC JTC 1
ISO/IEC JTC 1/SC 2
ISO/IEC JTC 1/SC 2/WG 2
Ideographic Rapporteur Group (IRG)
Who is making Unicode™?
Japan
Committee
China
Committee
SAT
Committee
Unicode Consortium
情報規格調査会
(ITSCJ, Information Technology Standards Commission of Japan)
情報処理学会
(IPSJ, Information Processing Society of Japan)
IRG JAPAN SUBCOMMITTEE
- Through earlier standardization work, most of the kanji (漢字) needed to write Japanese text have already been included in Unicode.
- Other kanji that originated in China were (and will continue to be) submitted by the other sub-committees.
- So the Japan sub-committee's current activity is mainly submitting (strange) Japanese kokuji (国字, kanji coined in Japan) for addition to Unicode.
But...
Poor Sources

7,473
Ext. F characters
1,645
submitted by Japan
(originating from the 文字情報基盤整備事業 project)
909
source identified
※http://kanji.zinbun.kyoto-u.ac.jp/~yasuoka/publications/2017-08-04.pdf
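To put the three counts in proportion, here is a small sketch that takes the slide's figures at face value. It assumes that "909 source identified" is a subset of the 1,645 characters submitted by Japan, which is one reading of the slide, and the variable names are illustrative.

```python
# Counts as quoted on the slide.
ext_f_total = 7473       # Ext. F characters
japan_submitted = 1645   # submitted by Japan
source_identified = 909  # assumed here to be a subset of the Japanese submissions

# Share of Ext. F that came from the Japanese submission.
japan_share = japan_submitted / ext_f_total

# Share of the Japanese submission with an identified source,
# and how many are left over under this reading.
identified_share = source_identified / japan_submitted
remaining = japan_submitted - source_identified

print(f"Japan's share of Ext. F: {japan_share:.1%}")
print(f"source identified: {identified_share:.1%} ({remaining} remaining)")
```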

Some Examples

U+2D92A
「はかた」 (reading: hakata)

U+2CF01
「ダラー」 (reading: darā)

U+2E282
「ぼんのう」 (reading: bonnō)

U+2D475
「いと」 (reading: ito)
(possibly Kamigata dialect for "eldest daughter")
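The four example code points can be checked against the Extension F range. The sketch below assumes the assigned range U+2CEB0 to U+2EBE0, which comes from the Unicode 10 code charts rather than from these slides. The function name in_ext_f is made up for illustration. Whether the characters render when printed depends on the fonts installed.

```python
# Assigned range of CJK Unified Ideographs Extension F in Unicode 10
# (taken from the Unicode code charts, not from the slides).
EXT_F_FIRST = 0x2CEB0
EXT_F_LAST = 0x2EBE0


def in_ext_f(cp: int) -> bool:
    """True if the code point lies in the assigned Ext. F range (hypothetical helper)."""
    return EXT_F_FIRST <= cp <= EXT_F_LAST


# Code points and readings as listed on the slides.
examples = {
    0x2D92A: "はかた",
    0x2CF01: "ダラー",
    0x2E282: "ぼんのう",
    0x2D475: "いと",
}

for cp, reading in examples.items():
    print(f"U+{cp:05X} {chr(cp)} {reading} in Ext. F: {in_ext_f(cp)}")

# The size of the range matches the 7,473 characters quoted earlier.
print(EXT_F_LAST - EXT_F_FIRST + 1)
```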
BEYOND...
UPCOMING...?




End

Unicode 10
By Koki Takahashi
2017-09-01 Weekly Study Session in pixiv Inc.
- 1,758