8-bit clean

A system is 8-bit clean if it takes account of all eight bits of a byte and processes them correctly.

Early code pages such as ASCII encode characters in 7 bits, leaving the eighth bit available for parity checks or other purposes.
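As an illustration of the eighth bit used for parity, the following is a minimal sketch assuming even parity stored in the most significant bit. The function names are hypothetical and chosen only for this example; real links used various parity conventions.

```python
def add_even_parity(code: int) -> int:
    """Store an even-parity bit in the eighth bit of a 7-bit ASCII code."""
    if not 0 <= code <= 0x7F:
        raise ValueError("not a 7-bit code")
    # The parity bit is 1 when the 7 data bits contain an odd number of ones,
    # so that the full byte always has an even number of ones.
    parity = bin(code).count("1") % 2
    return code | (parity << 7)


def parity_ok(byte: int) -> bool:
    """Check that a received byte has an even number of one bits."""
    return bin(byte).count("1") % 2 == 0


def strip_parity(byte: int) -> int:
    """Recover the 7-bit character code by clearing the eighth bit."""
    return byte & 0x7F


sent = add_even_parity(ord("C"))      # 'C' is 0x43, which has three one bits
print(hex(sent), parity_ok(sent), chr(strip_parity(sent)))
```

A system that treats the eighth bit this way cannot be 8-bit clean, because the bit is not available to carry character data.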

Later code pages such as CP437 or CP850, as well as the ISO 8859 series and UTF-8, are based on ASCII. In each of them, a byte whose eighth bit is 0 represents an ASCII character, while a byte whose eighth bit is 1 has a different meaning (an umlaut, a graphic symbol, part of a multibyte sequence, etc.).
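The relationship between the eighth bit and ASCII can be sketched with Python's built-in codecs. The helper name `is_ascii_byte` is hypothetical; "latin-1" is Python's name for ISO 8859-1.

```python
def is_ascii_byte(byte: int) -> bool:
    """A byte is plain ASCII when its eighth (most significant) bit is 0."""
    return (byte & 0x80) == 0


for encoding in ("cp437", "cp850", "latin-1", "utf-8"):
    plain = "A".encode(encoding)      # ASCII character: same byte everywhere
    umlaut = "ä".encode(encoding)     # non-ASCII: encoding-specific bytes
    print(
        encoding,
        plain.hex(), all(is_ascii_byte(b) for b in plain),
        umlaut.hex(), any(is_ascii_byte(b) for b in umlaut),
    )
```

In every one of these encodings the letter "A" is the single byte 0x41, while "ä" is encoded differently in each and always with the eighth bit set on every byte.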

Conversely, text written in code pages that define the eighth bit, as well as binary data, must first be converted to a 7-bit representation before being processed by a 7-bit system. Otherwise, special characters with the eighth bit set would be misinterpreted and binary data would be corrupted. This procedure is commonly used in e-mail systems (SMTP, MIME, uuencode).
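The following sketch shows both effects, using Python's standard `base64` and `quopri` modules as stand-ins for the MIME transfer encodings. The helper `strip_high_bit` is a hypothetical model of a channel that is not 8-bit clean; real systems may damage data in other ways.

```python
import base64
import quopri


def strip_high_bit(data: bytes) -> bytes:
    """Model a 7-bit channel that clears the eighth bit of every byte."""
    return bytes(b & 0x7F for b in data)


def is_seven_bit(data: bytes) -> bool:
    """True when no byte in the data has its eighth bit set."""
    return all(b < 0x80 for b in data)


original = "Grüße".encode("latin-1")   # ü and ß have the eighth bit set

# Sent unencoded, the special characters are misinterpreted.
print(strip_high_bit(original))        # the umlaut and sharp s are lost

# Encoded first, the data contains only 7-bit bytes and survives the channel.
for encode, decode in (
    (base64.b64encode, base64.b64decode),
    (quopri.encodestring, quopri.decodestring),
):
    wire = encode(original)
    received = strip_high_bit(wire)    # no change: the eighth bit is already 0
    print(wire, decode(received) == original)
```

Base64 suits arbitrary binary data, while quoted-printable keeps mostly-ASCII text readable on the wire.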

It has become common to set the eighth bit of ASCII-encoded data to zero from the outset, that is, to ensure that the data is 8-bit clean. An explicit conversion to the specified code page is then unnecessary. The loss of parity information is not critical today, because error-prone data transfers are now protected by packet checksums.

Since the 1990s, popular applications and operating systems have been 8-bit clean, while SMTP still processes 7-bit data for backward compatibility.
