ISO/IEC 2022

ISO/IEC 2022, Information technology - Character code structure and extension techniques, is an ISO standard that defines a technique for encoding multiple character sets and languages that cannot be encoded in 7 bits.

The standard was intended to solve the problem of mutually incompatible character encodings and to allow the encoding of East Asian writing systems. A string encoded in ISO 2022 can easily be transported over 7-bit channels, which allows the encoding to be used in e-mail and Usenet traffic. Escape sequences, mostly three or four bytes long, switch between multiple character sets. Depending on its definition, the character set selected by an escape sequence can encode either 94 characters, 8,836 characters (in a 94 × 94 matrix) or 830,584 characters (in a three-dimensional 94 × 94 × 94 matrix).
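The three capacities follow from the number of byte positions available per character. A minimal sketch of the arithmetic, assuming (as background not stated above) that each byte of a graphic character takes one of the 94 values 0x21 to 0x7E; the function name capacity is hypothetical:

```python
# Assumption: each byte of a graphic character lies in 0x21..0x7E,
# which gives 94 usable positions per byte.
POSITIONS_PER_BYTE = 0x7E - 0x21 + 1


def capacity(num_bytes: int) -> int:
    """Number of characters a set with num_bytes bytes per character can hold."""
    return POSITIONS_PER_BYTE ** num_bytes


for num_bytes in (1, 2, 3):
    print(num_bytes, "byte(s):", capacity(num_bytes))
```

The loop prints the capacity of a one-byte, two-byte and three-byte set in turn, matching the 94, 94 × 94 and 94 × 94 × 94 layouts.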

However, ISO/IEC 2022 became established only in East Asian mail transport; no version was ever released for Western languages. Instead, Unicode was developed to accomplish this task.

There are three versions of ISO/IEC 2022, one for each of the three East Asian scripts: ISO-2022-JP, ISO-2022-KR and ISO-2022-CN.

ISO-2022-JP

ISO-2022-JP encodes the Japanese script. It is frequently used in e-mail traffic; elsewhere, Shift_JIS or EUC-JP is used instead.

The original version is described in RFC 1468 and contains the following four escape sequences:

  • ESC ( B switches to ASCII (1-byte)
  • ESC ( J switches to JIS-Roman (1-byte)
  • ESC $ @ switches to JIS X 0208-1978 (2-byte)
  • ESC $ B switches to JIS X 0208-1983 (2-byte)
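As a rough illustration, Python's standard iso2022_jp codec can be used to observe such switches in practice. The sample string is an arbitrary assumption; the sketch only checks that the output stays within 7 bits, that the sequences ESC $ B and ESC ( B occur in it, and that decoding restores the text.

```python
ESC_TO_ASCII = b"\x1b(B"            # ESC ( B
ESC_TO_JIS_X_0208_1983 = b"\x1b$B"  # ESC $ B

text = "ABC 日本語 xyz"  # arbitrary sample mixing ASCII and Japanese
encoded = text.encode("iso2022_jp")

print(encoded)
print("7-bit clean:", all(byte < 0x80 for byte in encoded))
print("switches to JIS X 0208-1983:", ESC_TO_JIS_X_0208_1983 in encoded)
print("switches back to ASCII:", ESC_TO_ASCII in encoded)
print("round trip ok:", encoded.decode("iso2022_jp") == text)
```

Which of the two JIS X 0208 editions an encoder selects is its own choice; the check for ESC $ B reflects what the Python codec is understood to emit, not a requirement of the format.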

ISO-2022-JP-1 is described in RFC 2237 and adds another escape sequence:

  • ESC $ ( D switches to JIS X 0212-1990 (2-byte)
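The effect of the added sequence can be sketched with Python's iso2022_jp and iso2022_jp_1 codecs. The sample character U+4E02 is an assumption: it is taken here to be part of JIS X 0212-1990 but not of JIS X 0208, so the base encoding is expected to reject it while the extended one uses ESC $ ( D.

```python
sample = "\u4e02"  # assumed: in JIS X 0212-1990, absent from JIS X 0208

# Try the base encoding first; it has no designation for JIS X 0212.
try:
    sample.encode("iso2022_jp")
    base_ok = True
except UnicodeEncodeError:
    base_ok = False

# The extended encoding can switch to JIS X 0212-1990.
extended = sample.encode("iso2022_jp_1")

print("plain ISO-2022-JP can encode it:", base_ok)
print("ISO-2022-JP-1 bytes:", extended)
print("uses ESC $ ( D:", b"\x1b$(D" in extended)
```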

ISO-2022-JP-2 is described in RFC 1554 and adds further escape sequences to support additional languages. It extends ISO-2022-JP-1 with the following escape sequences:

  • ESC $ A switches to GB2312-1980 (2-byte)
  • ESC $ ( C switches to KS C 5601-1987 (2-byte)
  • ESC . A switches to ISO 8859-1 (1-byte)
  • ESC . F switches to ISO 8859-7 (1-byte)
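The sequences listed so far for the ISO-2022-JP family can be collected in a lookup table and used to locate character set switches in a byte string. This is only a sketch: the names DESIGNATIONS and find_designations are hypothetical, and the scanner reports positions without decoding any text.

```python
ESC = b"\x1b"

# Escape sequences of the ISO-2022-JP family as listed above.
DESIGNATIONS = {
    b"\x1b(B": "ASCII",
    b"\x1b(J": "JIS-Roman",
    b"\x1b$@": "JIS X 0208-1978",
    b"\x1b$B": "JIS X 0208-1983",
    b"\x1b$(D": "JIS X 0212-1990",
    b"\x1b$A": "GB2312-1980",
    b"\x1b$(C": "KS C 5601-1987",
    b"\x1b.A": "ISO 8859-1",
    b"\x1b.F": "ISO 8859-7",
}

# Try longer sequences first so a four-byte sequence is never cut short.
_BY_LENGTH = sorted(DESIGNATIONS, key=len, reverse=True)


def find_designations(data: bytes) -> list:
    """Return (offset, character set name) for each known escape sequence."""
    found = []
    pos = data.find(ESC)
    while pos != -1:
        for seq in _BY_LENGTH:
            if data.startswith(seq, pos):
                found.append((pos, DESIGNATIONS[seq]))
                break
        pos = data.find(ESC, pos + 1)
    return found


sample = b"Hi \x1b$B$3$s\x1b(B!"  # hand-built sample, not taken from an RFC
print(find_designations(sample))
```

Unknown escape sequences are silently skipped in this sketch; a real decoder would need to treat them as errors or handle them explicitly.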

ISO-2022-JP-3 extends the original version with the following escape sequences:

ISO-2022-JP-2004 extends ISO-2022-JP-3 with the following escape sequence:

ISO-2022-KR

ISO-2022-KR encodes the Korean script and is used alongside EUC-KR on Korean websites. It contains only one escape sequence:

  • ESC $ ) C switches to KS C 5601-1987 (2-byte)
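A short sketch with Python's standard iso2022_kr codec shows where the escape sequence ends up in encoded data. The sample text is an arbitrary assumption; the first four bytes of the output are printed so that the sequence emitted by the codec can be inspected directly.

```python
text = "한국어 text"  # arbitrary sample mixing Korean and ASCII
encoded = text.encode("iso2022_kr")

header = encoded[:4]  # the escape sequence is expected at the very start
print("header bytes:", header)
print("7-bit clean:", all(byte < 0x80 for byte in encoded))
print("escape sequences in output:", encoded.count(b"\x1b"))
print("round trip ok:", encoded.decode("iso2022_kr") == text)
```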

ISO-2022-CN

ISO-2022-CN encodes Chinese writing (both simplified and traditional characters) and is described in RFC 1922. It is almost never used; EUC-CN and Big5, and HZ in mail traffic, are encountered much more frequently. The encoding contains the following escape sequences:

  • ESC $ ) A switches to GB2312-1980 (2-byte)

ISO-2022-CN-EXT extends the original character set with the following escape sequences:

  • ESC $ ) E switches to ISO-IR-165 (2-byte)