Uuencoding

UUencode was the first widely used program for converting binary files (for example, images or programs) into a form consisting only of printable ASCII characters, so that they could easily be sent by e-mail, where only ASCII characters are allowed.

History

The UU reflects the program's UNIX roots. As in UUCP (UNIX to UNIX Copy Protocol), the UU in uuencode and uudecode stands for "UNIX to UNIX", that is, transfer from one UNIX computer to another UNIX computer.

The principle is similar to the Base64 method commonly used today for e-mail attachments: three bytes of binary data (= 24 bits) are divided into four 6-bit values, and each 6-bit value is assigned a printable ASCII character. The first versions of UUencode simply used the ASCII characters with values from 32 to 95.

However, the ASCII character with value 32 is the space, which often does not survive e-mail transmission intact, so the ASCII character with value 96 ("`") was used in its place.
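The split into 6-bit values and the character mapping can be sketched as follows. This is a minimal illustration, not the code of any particular uuencode implementation; the function names are hypothetical.

```python
def split_triple(b0: int, b1: int, b2: int) -> list[int]:
    """Split three bytes (24 bits) into four 6-bit values."""
    word = (b0 << 16) | (b1 << 8) | b2
    return [(word >> shift) & 0x3F for shift in (18, 12, 6, 0)]


def to_char(value: int) -> str:
    """Map a 6-bit value to a printable character.

    The value 0 becomes "`" (96) instead of the space (32).
    """
    if not 0 <= value <= 63:
        raise ValueError("expected a 6-bit value")
    return "`" if value == 0 else chr(value + 32)


def encode_triple(triple: bytes) -> str:
    """Encode exactly three bytes as four printable characters."""
    if len(triple) != 3:
        raise ValueError("expected exactly three bytes")
    return "".join(to_char(v) for v in split_triple(*triple))


print(encode_triple(b"Cat"))  # 0V%T
```

The bytes of "Cat" split into the values 16, 54, 5 and 52; adding 32 to each gives the characters "0", "V", "%" and "T".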

File Format

UUencode uses a special format for the encoded file. It begins with a header line consisting of the keyword begin, the mode, and the file name.

The mode is the file permissions customary on Unix, written as a 3- or 4-digit octal number. The file name is the name of the original file, without any directory.

Each data line begins with a one-character length value that specifies how many bytes of the original data are encoded in that line. The length is a number between 1 and 63 and is itself uuencoded, giving a character from "!" to "_". The usual length is 45 bytes (encoded as "M"), which are encoded in 60 characters.
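A data line can be built as in the following sketch. It assumes that a final group shorter than three bytes is padded with zero bytes, which the text does not state; the function names are hypothetical.

```python
def to_char(value: int) -> str:
    """Map a 6-bit value to a printable character (0 becomes "`")."""
    return "`" if value == 0 else chr(value + 32)


def encode_line(chunk: bytes) -> str:
    """Encode 1 to 63 bytes as one data line: length character, then data."""
    if not 1 <= len(chunk) <= 63:
        raise ValueError("a data line holds between 1 and 63 bytes")
    # Assumption: pad the last group with zero bytes up to a multiple of 3.
    padded = chunk + b"\x00" * (-len(chunk) % 3)
    out = [to_char(len(chunk))]
    for i in range(0, len(padded), 3):
        word = int.from_bytes(padded[i:i + 3], "big")
        out.extend(to_char((word >> shift) & 0x3F) for shift in (18, 12, 6, 0))
    return "".join(out)


line = encode_line(bytes(45))      # 45 zero bytes, the usual line length
print(line[0], len(line) - 1)      # M 60
```

The length 45 maps to 45 + 32 = 77, the character "M", and 45 bytes are 15 groups of three, giving 15 × 4 = 60 data characters.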

Coding method

uuencode encodes three bytes of source data as four bytes. In the uuencoded file the data are stored in the lower six bits of each byte; the upper bits are set by the encoding:

Uncoded bit stream            ↔  Coded bit stream
aaaaaaaa bbbbbbbb cccccccc    ↔  0kaaaaaa 0kaabbbb 0kbbbbcc 0kcccccc

For coding, each new six-bit group "00eeeeee" is first XORed with 32. If the resulting value is ≤ 32, the bit k is set.

Uncoded                            (XOR 32)                  (set k?)    Encoded
0 = [00000000]                  →  [00100000]             →  yes   →  [01100000] = 96
[1..31] = [00000001..00011111]  →  [00100001..00111111]   →  no    →  [00100001..00111111] = [33..63]
[32..63] = [00100000..00111111] →  [00000000..00011111]   →  yes   →  [01000000..01011111] = [64..95]

Or more simply: for the value 0, 96 is added; for all other values, 32 is added.
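That the XOR-and-k formulation and the simpler additive rule agree can be checked for all 64 possible values. The sketch below is only an illustration; both function names are hypothetical.

```python
def encode_value_xor(e: int) -> int:
    """XOR the 6-bit value with 32, then set bit k (value 64) if the result is <= 32."""
    x = e ^ 32
    if x <= 32:
        x |= 64  # set bit k
    return x


def encode_value_simple(e: int) -> int:
    """Add 96 for the value 0 and 32 for every other value."""
    return 96 if e == 0 else e + 32


# Both rules give the same character code for every 6-bit value.
assert all(encode_value_xor(e) == encode_value_simple(e) for e in range(64))

print([encode_value_xor(e) for e in (0, 1, 31, 32, 63)])  # [96, 33, 63, 64, 95]
```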

To mark the end of the data, an "empty line" must always be encoded, that is, a line containing only the length byte 0 (encoded as "`"). The file ends with a line containing the keyword end.
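Putting the pieces together, a complete encoded file could be produced roughly as follows. This is a sketch under stated assumptions: the header has the form "begin mode filename", data lines carry 45 bytes each, and a short final group is padded with zero bytes. The function names and the default mode 644 are illustrative choices, not taken from a specific implementation.

```python
def to_char(value: int) -> str:
    """Map a 6-bit value to a printable character (0 becomes "`")."""
    return "`" if value == 0 else chr(value + 32)


def encode_line(chunk: bytes) -> str:
    """Length character followed by four characters per three bytes."""
    padded = chunk + b"\x00" * (-len(chunk) % 3)  # assumption: zero padding
    out = [to_char(len(chunk))]
    for i in range(0, len(padded), 3):
        word = int.from_bytes(padded[i:i + 3], "big")
        out.extend(to_char((word >> shift) & 0x3F) for shift in (18, 12, 6, 0))
    return "".join(out)


def uuencode(data: bytes, filename: str, mode: int = 0o644) -> str:
    """Header line, data lines of 45 bytes, the zero-length line, then "end"."""
    lines = [f"begin {mode:o} {filename}"]
    for i in range(0, len(data), 45):
        lines.append(encode_line(data[i:i + 45]))
    lines.append("`")   # "empty line": only the encoded length 0
    lines.append("end")
    return "\n".join(lines) + "\n"


print(uuencode(b"Cat", "cat.txt"), end="")
```

For the three input bytes "Cat" this prints four lines: the header "begin 644 cat.txt", the data line "#0V%T", the terminating line "`", and "end".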

Decoding (uudecode) works in reverse: clear the k bit if it is set and XOR with 32 again (equivalently, subtract 32 and keep only the lower six bits), then combine the four 6-bit values into 24 bits and output the three bytes.
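A decoder for a single data line might look like the sketch below. It assumes a well-formed line whose data part is a multiple of four characters long; the function names are hypothetical.

```python
def from_char(ch: str) -> int:
    """Undo the character mapping: subtract 32 and keep the lower six bits.

    Both "`" (96) and the space (32) therefore decode to 0.
    """
    return (ord(ch) - 32) & 0x3F


def decode_line(line: str) -> bytes:
    """Decode one data line: length character followed by groups of four characters."""
    length = from_char(line[0])
    body = line[1:]
    out = bytearray()
    for i in range(0, len(body), 4):
        v = [from_char(c) for c in body[i:i + 4]]
        word = (v[0] << 18) | (v[1] << 12) | (v[2] << 6) | v[3]
        out += word.to_bytes(3, "big")
    # The length value tells the decoder how many bytes are real data.
    return bytes(out[:length])


print(decode_line("#0V%T"))  # b'Cat'
```

The terminating line "`" decodes to a length of 0 and therefore to no data, which is how a decoder can recognize the end of the data lines.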

Example

A paragraph of text from above serves as input:

History The UU stands for the roots in UNIX. The UU in UUencode and -decode is as well as the UU UUCP UNIX to UNIX copy protocol. So the transmission from a UNIX computer to another UNIX computer.

UUencoding turns it into:

begin 644 uuencode test.txt
M1V5S8VAI8VAT90T * # 0I $ 87, 554 @ @ < W1E: '0 @ 9OQR ( 1I92 & 7 = 7! ) Z96QN ( & EN
M (% @ N 5:25 ($ 1A < R 552 I; 5565N8V B ] D92 U; F0 @ 61 E8V ] D92 = S & 5H = "` -! !
M " F5B96YS, R W: 64 @ 9 &% S ( % 55 (&) E: 2 556- P ( &, \ < B 53DE8 ('1 O ( 5.25 % @ @!
M8V ] P> 2 P

XXencode

XXencode works exactly like UUencode but uses only letters and digits and the two special characters plus (+) and minus (-). This minimizes the risk that some characters in the text file are irreparably damaged during transmission by automatic character set conversions (for example, from ASCII to EBCDIC).

In addition, some xxencode versions can optionally send along a list of all characters used. If this list is also altered by incorrect character set conversions, the receiver can detect this and still decode the file correctly, as long as the alterations are unambiguously reversible.
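The only difference from the uuencode mapping is the alphabet, as this sketch shows. The order of the 64 characters used here (plus, minus, digits, upper-case letters, lower-case letters) is an assumption based on the usual xxencode table and is not stated in the text above; the names are hypothetical.

```python
# Assumed xxencode alphabet: index 0 is "+", index 1 is "-", then 0-9, A-Z, a-z.
XX_ALPHABET = "+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def xx_encode_triple(triple: bytes) -> str:
    """Encode three bytes as four characters from the xxencode alphabet."""
    if len(triple) != 3:
        raise ValueError("expected exactly three bytes")
    word = int.from_bytes(triple, "big")
    return "".join(XX_ALPHABET[(word >> shift) & 0x3F] for shift in (18, 12, 6, 0))


print(xx_encode_triple(b"Cat"))  # Eq3o
```

Because the mapping is a table lookup rather than an offset of 32, a decoder that receives the table along with the data can rebuild the lookup even if a character set conversion has changed individual characters consistently.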

Related Topics

  • 7plus - a more efficient and also fault-tolerant encoding method used in amateur radio
  • Kermit - a protocol that likewise maps binary data to ASCII characters
  • Base64 - the transfer encoding used by MIME for binary files in e-mails