ISO/IEC 2022
ISO/IEC 2022, Information technology - Character code structure and extension techniques, is an ISO standard that defines a technique for encoding multiple character sets, including those of languages that cannot be encoded in 7 bits.
The standard was intended to solve the problem of mutually incompatible character encodings and to allow the encoding of East Asian writing systems. A string encoded in ISO 2022 can be transported over 7-bit channels, which allows its use in e-mail and Usenet traffic. Escape sequences, mostly three or four bytes long, switch between multiple character sets. Depending on its definition, each escape sequence selects a set of either 94 characters, 8,836 characters (a 94 × 94 matrix) or 830,584 characters (a three-dimensional 94 × 94 × 94 matrix).
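The three set sizes follow from the 94 printable byte values available in a 7-bit code (0x21 to 0x7E). As a quick check, here is a minimal Python sketch; the helper name `set_size` is purely illustrative and not part of the standard:

```python
# A 7-bit graphic character set uses the 94 byte values 0x21..0x7E.
POSITIONS_PER_BYTE = 0x7E - 0x21 + 1

def set_size(bytes_per_char: int) -> int:
    """Number of code points in a set with the given bytes per character
    (hypothetical helper for illustration)."""
    return POSITIONS_PER_BYTE ** bytes_per_char

sizes = [set_size(n) for n in (1, 2, 3)]
print(sizes)  # [94, 8836, 830584]
```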
However, ISO/IEC 2022 became established only in East Asian mail transport; no version was ever released for Western languages. Instead, Unicode was developed to accomplish this task.
There are three variants of ISO/IEC 2022, one for each of the three East Asian scripts: ISO-2022-JP, ISO-2022-KR and ISO-2022-CN.
ISO-2022-JP
ISO-2022-JP encodes Japanese script. It is frequently used in e-mail traffic; elsewhere, Shift_JIS or EUC-JP is more common.
The original version is described in RFC 1468 and contains the following four escape sequences:
- ESC ( B switches to ASCII (1 byte)
- ESC ( J switches to JIS-Roman (1 byte)
- ESC $ @ switches to JIS X 0208-1978 (2 bytes)
- ESC $ B switches to JIS X 0208-1983 (2 bytes)
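To see these sequences in actual output, the following sketch uses Python's built-in `iso2022_jp` codec. The sample text and variable names are illustrative assumptions; only the codec name comes from the Python standard library.

```python
ESC = b"\x1b"

text = "Hello, こんにちは"
encoded = text.encode("iso2022_jp")  # stdlib codec for the RFC 1468 encoding

# Every byte fits in 7 bits, so the string survives 7-bit mail channels.
seven_bit_clean = all(b < 0x80 for b in encoded)

# The kana are announced by ESC $ B, and the text returns to ASCII at the end.
has_jis_switch = (ESC + b"$B") in encoded
ends_in_ascii = encoded.endswith(ESC + b"(B")

print(encoded)
print(seven_bit_clean, has_jis_switch, ends_in_ascii)

# Decoding restores the original string.
roundtrip = encoded.decode("iso2022_jp")
```

The ASCII prefix needs no escape sequence because the text is in ASCII mode from the start.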
ISO-2022-JP-1 is described in RFC 2237 and adds another escape sequence:
- ESC $ ( D switches to JIS X 0212-1990 (2 bytes)
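The switching mechanism can be sketched as a small scanner that groups raw bytes by the character set currently in effect. This is only an illustration under stated assumptions: `DESIGNATIONS` and `split_segments` are hypothetical names, the table covers just the ISO-2022-JP and ISO-2022-JP-1 sequences, and the bytes are not mapped to Unicode.

```python
# Escape sequences of ISO-2022-JP and ISO-2022-JP-1,
# mapped to (character set, bytes per character).
DESIGNATIONS = {
    b"\x1b(B": ("ASCII", 1),
    b"\x1b(J": ("JIS-Roman", 1),
    b"\x1b$@": ("JIS X 0208-1978", 2),
    b"\x1b$B": ("JIS X 0208-1983", 2),
    b"\x1b$(D": ("JIS X 0212-1990", 2),
}

def split_segments(data: bytes) -> list[tuple[str, list[bytes]]]:
    """Group raw bytes by active character set (hypothetical helper).

    Returns (set name, list of per-character byte strings).
    """
    segments = []
    name, width = DESIGNATIONS[b"\x1b(B"]  # text starts in ASCII
    start = pos = 0

    def flush(end: int) -> None:
        # Emit the run collected since the last escape sequence, if any.
        if end > start:
            run = data[start:end]
            chars = [run[k:k + width] for k in range(0, len(run), width)]
            segments.append((name, chars))

    while pos < len(data):
        if data[pos] != 0x1B:
            pos += 1
            continue
        for seq, target in DESIGNATIONS.items():
            if data.startswith(seq, pos):
                flush(pos)
                name, width = target
                pos += len(seq)
                start = pos
                break
        else:
            raise ValueError(f"unknown escape sequence at offset {pos}")
    flush(len(data))
    return segments

for set_name, chars in split_segments("abc 日本語".encode("iso2022_jp")):
    print(set_name, chars)
```

A real decoder would additionally look up each two-byte pair in the corresponding JIS table; the sketch stops at the segmentation step.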
ISO-2022-JP-2 is described in RFC 1554 and adds further escape sequences to support additional languages. It extends ISO-2022-JP-1 with the following escape sequences:
- ESC $ A switches to GB 2312-1980 (2 bytes)
- ESC $ ( C switches to KS C 5601-1987 (2 bytes)
- ESC . A switches to ISO 8859-1 (1 byte)
- ESC . F switches to ISO 8859-7 (1 byte)
ISO-2022-JP-3 extends the original version with the following escape sequence:
- ESC ( I switches to JIS X 0201 (1 byte)
ISO-2022-JP-2004 extends ISO-2022-JP-3 with one further escape sequence.
ISO-2022-KR
ISO-2022-KR encodes the Korean script and is used alongside EUC-KR on Korean websites. It contains only one escape sequence:
- ESC $ ) C switches to KS C 5601-1987 (2 bytes)
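As a quick look at how this appears on the wire, the sketch below uses Python's built-in `iso2022_kr` codec; the sample string and variable names are illustrative assumptions. In that codec's output the character set is announced once, and the Korean run is then bracketed by the shift-out and shift-in control bytes (0x0E and 0x0F).

```python
SO, SI = b"\x0e", b"\x0f"  # shift-out / shift-in control bytes

text = "서울 Seoul"
encoded = text.encode("iso2022_kr")  # stdlib codec

# The two Korean syllables sit between SO and SI, two bytes each.
korean_run = encoded[encoded.index(SO) + 1 : encoded.index(SI)]

# Like the Japanese variant, the result is 7-bit clean.
seven_bit_clean = all(b < 0x80 for b in encoded)

print(encoded)
print(len(korean_run), seven_bit_clean)

# Decoding restores the original string.
roundtrip = encoded.decode("iso2022_kr")
```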
ISO-2022-CN
ISO-2022-CN encodes Chinese script (both simplified and traditional characters) and is described in RFC 1922. It is almost never used; EUC-CN and Big5, and in mail traffic HZ, are encountered much more frequently. The encoding contains the following escape sequences:
- ESC $ ) A switches to GB 2312-1980 (2 bytes)
ISO-2022-CN-EXT extends the original character set with the following escape sequences:
- ESC $ ) E switches to ISO-IR-165 (2 bytes)