Adler-32

Adler -32 is a simple, developed by Mark Adler checksum algorithm. It is used, inter alia, by the zlib to detect (Random transmission ) errors in the compressed data stream. In RFC 1950, the algorithm is described in detail.

The Adler -32 algorithm is simpler and can be calculated faster than the well-known cyclic redundancy check ( cyclic redundancy check), but also offers less security in detecting random bit errors.

Algorithm

The algorithm calculates two sums S1 and S2. S1 is the sum of all the data bytes, and is initialized at the beginning of the algorithm 1, s2 is the sum of all values ​​of S1. Both sums are calculated modulo 65521 ( the largest prime < 216).

Although the algorithm is very simple, here is an example implementation in C is specified:

/ * Sample code for calculating the Adler -32 checksum * /      # include / / for uint8_t/uint32_t    # include / / for size_t    uint32_t adler32 (const void * buf, size_t buflength ) {         const uint8_t * buffer = (const uint8_t *) buf;         uint32_t s1 = 1;       uint32_t s2 = 0;         for ( size_t n = 0; n < buflength, n ) {          S1 = ( S1 buffer [ n]) % 65 521;          s2 = (s2 s1) % 65521;       }         return ( s2 << 16) | s1;    } Note sample implementation

This sample implementation is not optimized for speed, but for clarity and legibility. Such as the pretty slow modulo operation need not be performed on each byte of data, but only when an overflow of the variables s1, s2, or imminent. With a bit width of 32 bits (which is not guaranteed when using int, so uint32_t above, according to C99 ) satisfies an implementation of the modulo operation, all 5552 bytes. 64 bits ( uint64_t ) wide S1 and S2 would be sufficient even performing the modulo operation all 380,368,439 bytes, thereby but no significant increase in speed can be achieved. Details, see the residue class ring.

The high number of jumps decreases in pipelined processors, the effective utilization of the quasi-parallel execution of individual commands. It is therefore advisable to disassemble the individual data in a maximum of 5552 -byte sub-blocks according to which only a modulo operation is performed. These blocks should turn large subunits are broken down into 16 bytes, which are summed up in one iteration. Through this so-called loop unrolling can be used in about 25-30% speed increase on modern processors with sufficiently large data produce.

Weaknesses of Adler -32

An optimal checksum algorithm generates a checksum that is uniformly distributed as possible over their range of values. This is (< 128 bytes ) not present in Adler -32 for short data sequences, as the value of S1 does not overflow.

By simple arithmetic structure of Adler -32 it also leads to many collisions, especially at similar input values. For example, byte n is increased by the input k, Byte k n 1 reduced to 2 x n 2 and Byte k increases to remain both s1 (the sum of all the bytes ), and S2 ( the sum of all intermediate values ​​of s1 ) Unchanged. This applies to any position n in the input and all the positive or negative values ​​of k, provided that the consideration bytes uncrowded. As a result, the 32- bit checksum Adler -32 can not even secure a 24 -bit input reliable.

Only single and double bit errors are detected reliably, by the sums s1 or s2. However Join bursts of three or more bit errors on, as in the example above, a reliable detection is not guaranteed.

For this reason it was replaced with CRC -32, since quite often only short data streams are used and the weakness of Adler -32 is revealed, inter alia, the implementation of the Stream Control Transmission Protocols of the checksum algorithm used Eagle 32.

Also, it is relatively easy to produce by intentional modification of a data stream with the same checksum. Therefore, Adler - 32 does not guarantee the integrity of data. If such security is required, cryptographic hash functions such as SHA must be used.

  • Algorithm
30110
de