bzip2


Magic number: 42 5A 68 (hex), "BZh" (string)

Bzip2 is a freely available compression program for lossless compression of files, developed by Julian Seward. It is free of any patented algorithms and is distributed under a BSD-like license.

Bzip2 compresses data in a three-step process. First, the input data is sorted block by block with the reversible Burrows-Wheeler transform. The result is then subjected to a move-to-front transformation. The output of that step is finally subjected to Huffman coding, which performs the actual data compression.
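The first two steps can be illustrated with a minimal Python sketch. It is a naive teaching version, not bzip2's implementation: it sorts full rotations in memory, leaves out the Huffman stage, and the function names bwt, inverse_bwt and move_to_front are chosen here purely for illustration.

```python
from __future__ import annotations


def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column."""
    n = len(data)
    if n == 0:
        return b"", 0
    # Each rotation is identified by its starting offset in the input.
    order = sorted(range(n), key=lambda i: data[i:] + data[:i])
    last_column = bytes(data[(i - 1) % n] for i in order)
    # The row holding the unrotated input is needed to reverse the transform.
    return last_column, order.index(0)


def inverse_bwt(last_column: bytes, orig_ptr: int) -> bytes:
    """Undo bwt() by repeatedly prepending the last column and re-sorting."""
    n = len(last_column)
    if n == 0:
        return b""
    table = [b""] * n
    for _ in range(n):
        table = sorted(bytes([last_column[i]]) + table[i] for i in range(n))
    return table[orig_ptr]


def move_to_front(data: bytes) -> list[int]:
    """Replace each byte by its current position in a self-reordering alphabet."""
    alphabet = list(range(256))
    output = []
    for byte in data:
        index = alphabet.index(byte)
        output.append(index)
        alphabet.insert(0, alphabet.pop(index))
    return output
```

For the input b"banana", bwt returns (b"nnbaaa", 3): equal bytes end up next to each other, so the subsequent move-to-front step yields many small values and zeros ([110, 0, 99, 99, 0, 0]), which a Huffman coder can then encode compactly.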

Compression with bzip2 is often more effective, but usually much slower, than compression with gzip or RAR. Since 2003, however, there has also been the variant pbzip2, which supports multi-threading and is significantly faster on current multi-core processors. This parallelization is possible only because bzip2 splits the input stream into blocks and compresses each block independently of the previous ones.

Files compressed with bzip2 are identified by the file extension .bz2. Tar files that have been compressed with bzip2 usually have the extension .tar.bz2 or .tbz2. An advantage of such bzip2-compressed tar files is that, after read errors or damage, all blocks that are still readable can be copied out with bzip2recover and then decompressed, whereas other compression methods cannot continue after a read error.
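The independence of the compressed pieces can be tried out with Python's standard bz2 module. The following is only a sketch of the general idea and not pbzip2's actual implementation: the function name compress_parallel, the chunk size and the worker count are assumptions made for this example. Each chunk is compressed separately into its own bzip2 stream, and the streams are simply concatenated.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

# Assumption for this sketch: chunks as large as bzip2's biggest block size.
CHUNK_SIZE = 900_000


def compress_parallel(data: bytes, chunk_size: int = CHUNK_SIZE, workers: int = 4) -> bytes:
    """Compress each chunk independently and concatenate the resulting streams."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        chunks = [b""]  # still emit one valid (empty) stream
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the input order, so the chunks stay in sequence.
        streams = list(pool.map(bz2.compress, chunks))
    return b"".join(streams)
```

Since Python 3.3, bz2.decompress accepts input consisting of several concatenated streams, so bz2.decompress(compress_parallel(data)) should return the original data unchanged.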

Bzip2 is the successor of bzip, which originally used arithmetic coding after the block sort; for patent reasons, however, bzip was not developed any further.

File Format

A .bz2 data stream begins with a signature (4 bytes), followed by zero or more compressed blocks, immediately followed by an end-of-stream marker and a CRC (32-bit) for the original contents of the entire file. The compressed blocks are bit-aligned (no padding).
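The start of a stream can be inspected with a few lines of Python. This is a sketch under stated assumptions: read_stream_header is a hypothetical helper, not part of any library, and it looks only at the first ten bytes, which works because the marker directly after the signature is still byte-aligned. The two marker constants are the block and end-of-stream values from the field list.

```python
import bz2

BLOCK_MAGIC = bytes.fromhex("314159265359")  # start of a compressed block
EOS_MAGIC = bytes.fromhex("177245385090")    # end-of-stream marker


def read_stream_header(stream: bytes) -> dict:
    """Check the 4-byte signature and report what follows it (hypothetical helper)."""
    if len(stream) < 10 or stream[:2] != b"BZ":
        raise ValueError("not a bzip2 stream")
    if stream[2:3] != b"h":
        raise ValueError("unsupported version byte")
    level = stream[3] - ord("0")
    if not 1 <= level <= 9:
        raise ValueError("invalid block-size digit")
    marker = stream[4:10]
    if marker == BLOCK_MAGIC:
        first = "block"
    elif marker == EOS_MAGIC:
        first = "end-of-stream"
    else:
        raise ValueError("unexpected marker after signature")
    return {"block_size_kb": level * 100, "first_marker": first}


if __name__ == "__main__":
    print(read_stream_header(bz2.compress(b"hello", 5)))
```

A stream produced from empty input contains zero blocks, so the end-of-stream marker follows the signature directly.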

VarName                 Bits        Description
________________________________________________________________
.magic:                 2 * 8       'BZ' signature / magic number
.version:               1 * 8       'h' for Bzip2 ('H'uffman coding), '0' for Bzip1 (deprecated)
.hundred_k_blocksize:   1 * 8       '1'..'9' block size 100 kB - 900 kB (uncompressed)

.compressed_magic:      6 * 8       '1AY&SY' -> 0x314159265359 (BCD (pi))
.crc:                   4 * 8       checksum for this block
.randomized:            1 (bit!)    0 => normal, 1 => randomized (deprecated)
.origPtr:               3 * 8       starting pointer into BWT for after untransform
.huffman_used_map:      2 * 8       bitmap for the following 'huffman_used_bitmaps', of ranges of 16 bytes, present / not present
.huffman_used_bitmaps:  0..32 * 8   bitmap of symbols used, present / not present (multiples of 16)
.huffman_groups:        3           2..6, number of different Huffman tables in use
.selectors_used:        15          number of times the Huffman tables are swapped (each 50 symbols)
*.selector_list:        1..6        zero-terminated bit runs (0..62) of MTF'ed Huffman table (* selectors_used)
.start_huffman_length:  5           0..20, starting bit length for Huffman deltas
*.delta_bit_length:     1..5 * 8    0 => next symbol; 1 => alter length
                                    {1 => decrement length; 0 => increment length} (* (symbols + 2) * groups)
.contents:              2..∞        Huffman-encoded data stream until end of block

.eos_magic:             6 * 8       \x17'rE8P'\x90 -> 0x177245385090 (BCD sqrt(pi))
.crc:                   4 * 8       checksum for whole stream
.padding:               0..7        to align to a whole byte

See also

  • Gzip
  • Data compression programs