GB2312 is a coded character set for simplified Chinese characters, introduced in 1980. It comprises 7,445 characters in total, of which 6,763 are Chinese characters.
All characters are arranged in a 94 × 94 matrix, which allows a maximum of 8,836 characters. The same layout is also used by JIS X 0208 and KS X 1001.
The first region (rows 1-9) encodes punctuation as well as the Greek and Cyrillic alphabets, Japanese kana, Zhuyin and Pinyin letters. The other two regions contain Chinese characters: rows 16-55 hold Chinese characters sorted by their Pinyin transliteration, and rows 56-87 hold Chinese characters in the order of the Kangxi Dictionary.
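The row ranges described above can be turned into a small lookup. The following is a minimal sketch, not part of any standard library: the function name gb2312_region and the region labels are hypothetical, and rows that the text does not describe are simply reported as "other".

```python
def gb2312_region(row: int) -> str:
    """Classify a GB2312 row number (1..94) by the regions described in the text.

    The labels are illustrative names chosen for this sketch.
    """
    if not 1 <= row <= 94:
        raise ValueError(f"row must be in 1..94, got {row}")
    if row <= 9:
        return "symbols"        # punctuation, Greek, Cyrillic, kana, Zhuyin, Pinyin
    if 16 <= row <= 55:
        return "hanzi-pinyin"   # Chinese characters sorted by Pinyin
    if 56 <= row <= 87:
        return "hanzi-kangxi"   # Chinese characters in Kangxi Dictionary order
    return "other"              # rows not covered by the description above


if __name__ == "__main__":
    for row in (1, 16, 56, 90):
        print(row, gb2312_region(row))
```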
The encoding (character encoding scheme) must be distinguished from the character set itself. GB2312 is normally used in the form of EUC-CN, which combines two character sets: US-ASCII (as 1-byte characters) and GB2312 (as 2-byte characters). To distinguish them from ASCII characters, 160 (0xA0) is added to both the row number and the column number of a GB2312 character, so that the resulting bytes lie in the range 0xA1 to 0xFE. The first byte corresponds to the row number, the second byte to the column number. In e-mail traffic, the 7-bit encoding HZ was also common.
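The offset arithmetic of EUC-CN can be sketched in a few lines. The helper names row_col_to_euc_cn and euc_cn_to_row_col are hypothetical and chosen only for illustration; the example assumes the character 啊, which sits at row 16, column 1 of GB2312, and cross-checks the result against Python's built-in "gb2312" codec.

```python
from typing import Tuple

OFFSET = 0xA0  # 160, added to both the row and the column number


def row_col_to_euc_cn(row: int, col: int) -> bytes:
    """Return the two EUC-CN bytes for a GB2312 row/column pair (each 1..94)."""
    if not (1 <= row <= 94 and 1 <= col <= 94):
        raise ValueError("row and column must both be in 1..94")
    return bytes([row + OFFSET, col + OFFSET])


def euc_cn_to_row_col(pair: bytes) -> Tuple[int, int]:
    """Recover the GB2312 row/column pair from two EUC-CN bytes."""
    if len(pair) != 2 or not all(0xA1 <= b <= 0xFE for b in pair):
        raise ValueError("expected two bytes in the range 0xA1..0xFE")
    return pair[0] - OFFSET, pair[1] - OFFSET


if __name__ == "__main__":
    encoded = row_col_to_euc_cn(16, 1)
    print(encoded.hex())                # b0a1
    print(encoded.decode("gb2312"))     # 啊
    print(euc_cn_to_row_col(encoded))   # (16, 1)
```

Because both bytes have their high bit set, a decoder can tell them apart from 7-bit ASCII bytes without any escape sequences.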
In 1995, GB2312 was extended by the GBK specification, which, however, never became an official standard and therefore received no GB number. It nevertheless became widespread through its use in Windows. In 2000, GB2312 was officially superseded by GB18030, but it is still in common use.
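The relationship between GB2312 and GBK can be observed with Python's built-in "gb2312" and "gbk" codecs. This is only an illustrative sketch: the helper can_encode is a hypothetical name, and the sample characters (simplified 汉 and traditional 龍) were picked for this example rather than taken from the text above.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text is representable in codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    # A simplified character that is part of GB2312 keeps the same bytes in GBK.
    print("汉".encode("gb2312") == "汉".encode("gbk"))
    # A traditional character lies outside GB2312 but is covered by GBK.
    print(can_encode("龍", "gb2312"), can_encode("龍", "gbk"))
```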
On Windows, GB2312 in the EUC-CN encoding is available as code page 20936, provided the optional component "Install files for East Asian languages" is installed. In some places, however, code page 936 is incorrectly referred to as GB2312 under Windows; in reality, code page 936 is an implementation of GBK. The "File Conversion" dialog of Word 2003 offers code page 936 as "Chinese Simplified (GB2312)" and code page 20936 as "Chinese Simplified (GB2312-80)".