Unicode

10

2017-09-01 pixiv Inc. Study Session

@hakatashi

CJK UNIFIED
IDEOGRAPHS
EXTENSION

F

Unicode
10.0.0

Release: June 2017

8,518 NEW characters

Unicode 10

Highlight

HENTAIGANA

HENTAIGANA

  • Archaic form of Japanese Hiragana
  • Transitional form between Kanji and current style of Hiragana
  • Still in use with some Japanese traditional proper names
  • Its addition to Unicode caused much controversy...

HENTAIGANA IS
TRANSITIONAL

U+79AE

U+????

U+????

U+308C

Are these the same characters?

U+1B0FE

U+1B0FF
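
As a rough way to inspect the code points quoted on these slides, the Python sketch below prints each character with the name recorded in the interpreter's bundled Unicode database. The helper name `describe` is illustrative only. The hentaigana names appear only when that database is at Unicode 10.0 or later; otherwise the placeholder text is printed.

```python
import unicodedata

# Code points quoted on the slides: the kanji, the modern hiragana,
# and the two hentaigana code points.
CODE_POINTS = [0x79AE, 0x308C, 0x1B0FE, 0x1B0FF]


def describe(cp: int) -> str:
    """Return 'U+XXXX <char> <name>' for one code point (illustrative helper)."""
    ch = chr(cp)
    # Databases older than Unicode 10.0 have no entry for hentaigana,
    # so pass a default instead of letting unicodedata.name raise ValueError.
    name = unicodedata.name(ch, "<not in this Unicode database>")
    return f"U+{cp:04X} {ch} {name}"


for cp in CODE_POINTS:
    print(describe(cp))

# Which Unicode version the running interpreter actually knows about.
print("Unicode database version:", unicodedata.unidata_version)
```

Whether the glyphs render depends on the installed fonts; the names come from the database, not from the font.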

PREVIOUS VERSIONS' (ALMOST) MIS-ENCODING OF KANA SYMBOLS

Unicode 9

Unicode 10

CJK UNIFIED IDEOGRAPHS EXTENSION-F

8,518 NEW characters

Unicode 10

88% (7,494) OF THESE ARE KANJI (漢字)!!!

and

CJK UNIFIED IDEOGRAPHS EXTENSION-F

7,473 characters
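
The figures on these slides can be cross-checked with a little arithmetic. The sketch below assumes the Extension F block spans U+2CEB0 to U+2EBE0; that range is not stated on the slides and comes from the Unicode block definitions. The other numbers are the ones quoted above.

```python
# Assumed block range for CJK Unified Ideographs Extension F
# (not given on the slides).
EXT_F_FIRST = 0x2CEB0
EXT_F_LAST = 0x2EBE0

# Figures quoted on the slides.
NEW_CHARACTERS = 8518  # all characters new in Unicode 10
NEW_KANJI = 7494       # how many of those are kanji

ext_f_count = EXT_F_LAST - EXT_F_FIRST + 1
kanji_share = NEW_KANJI / NEW_CHARACTERS

print(ext_f_count)               # size of the Extension F range
print(f"{kanji_share:.0%}")      # share of new characters that are kanji
print(NEW_KANJI - ext_f_count)   # kanji added outside Extension F
```

The range size matches the 7,473 on the slide, and the difference of 21 presumably corresponds to ideographs added outside Extension F in the same release.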

How were these characters chosen?

ISO/IEC JTC 1

ISO/IEC JTC 1/SC 2

ISO/IEC JTC 1/SC 2/WG 2

Ideographic Rapporteur Group (IRG)

Who is making Unicode™?

Japan
Committee

Korea
Committee

SAT
Committee

Unicode Consortium

Information Technology Standards Commission of Japan
(情報規格調査会, ITSCJ)

Information Processing Society of Japan
(情報処理学会, IPSJ)

History

  • 2012-06 (IRG38) Plan for compiling Ext. F approved
  • 2012-10 (IRG39) Each committee submitted characters; Ext. F ver 1.0 released
  • 2013-11 (IRG41) Ext. F ver 2.0
  • 2014-05 (IRG42) Ext. F ver 3.0
  • 2014-11 (IRG43 & UTC141) Ext. F ver 4.0; submitted to Unicode
  • 2016-01 (UTC146) Unicode 9.0.0 beta issued; inclusion of Ext. F postponed
  • 2016-08 (UTC148) Ext. F approved for Unicode
  • 2017-06 Unicode 10.0.0 released

MORE REVIEW, MORE VALIDATION

SO TROUBLESOME?

NO MORE CHAOS

ISO/IEC JTC 1

ISO/IEC JTC 1/SC 2

ISO/IEC JTC 1/SC 2/WG 2

Ideographic Rapporteur Group (IRG)

Who is making Unicode™?

Japan
Committee

China
Committee

SAT
Committee

Unicode Consortium

Information Technology Standards Commission of Japan
(情報規格調査会, ITSCJ)

Information Processing Society of Japan
(情報処理学会, IPSJ)

IRG JAPAN SUBCOMMITTEE

  • Thanks to earlier standardization work, most of the kanji (漢字) needed to write Japanese text are already included in Unicode.
  • Other kanji that originated in China have been (and will continue to be) submitted by the other subcommittees.
  • So the Japan subcommittee's current activity is mainly to submit (strange) Japanese kokuji (国字, kanji coined in Japan) for addition to Unicode.

But...

Poor Sources

7,473
Ext. F characters

1,645
submitted by Japan

(originating from the 文字情報基盤整備事業, the Moji Joho Kiban character information infrastructure project)

909
source identified

※http://kanji.zinbun.kyoto-u.ac.jp/~yasuoka/publications/2017-08-04.pdf

Some Examples

U+2D92A

「はかた」 (hakata)

U+2CF01

「ダラー」 (darā, "dollar")

U+2E282

「ぼんのう」 (bonnō)

U+2D475

「いと」 (ito)

(Possibly Kamigata dialect for "eldest daughter"?)
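
The example code points above can be checked against the Extension F block with a short Python sketch. The block range U+2CEB0 to U+2EBE0 is an assumption taken from the Unicode block definitions, not from the slides, and the dictionary simply restates the readings shown above.

```python
# Assumed range of CJK Unified Ideographs Extension F (not stated on the slides).
EXT_F = range(0x2CEB0, 0x2EBE0 + 1)

# Example code points and the readings given on the slides.
EXAMPLES = {
    0x2D92A: "はかた",
    0x2CF01: "ダラー",
    0x2E282: "ぼんのう",
    0x2D475: "いと",
}

for cp, reading in EXAMPLES.items():
    ch = chr(cp)
    utf8_len = len(ch.encode("utf-8"))
    # Supplementary-plane characters take two UTF-16 code units (a surrogate pair).
    utf16_units = len(ch.encode("utf-16-le")) // 2
    print(
        f"U+{cp:X} {reading}: in Ext. F = {cp in EXT_F}, "
        f"UTF-8 bytes = {utf8_len}, UTF-16 units = {utf16_units}"
    )
```

All four lie outside the Basic Multilingual Plane, so software that assumes one UTF-16 code unit per character will mishandle them.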

BEYOND...

UPCOMING...?

End

Unicode 10

By Koki Takahashi

2017-09-01 Weekly Study Session in pixiv Inc.