Hamming-Code

The Hamming code is a developed by Richard Wesley Hamming linear block error-correcting code that is used in digital signal processing and communications technology for secure data transmission or data storage.

In the Hamming code is a class of block codes of different lengths, which are formed by a general education requirement. The peculiarity of this code is to use multiple parity bits. These bits complement each different selected groups from the information carrying user data bits. By a judicious choice of grouping whose mathematical principles are described in the following, not only error detection, but also error correction of the transmitted data bits is possible.

The individual code words of Hamming codes have a Hamming distance of three. This difference of three bit positions, the decoder can detect one or two bit errors in a data block, but only correct one bit error. For two bit errors, the decoder provides a valid but wrong codeword. The extended Hamming code having a Hamming distance of four may be identified by an additional parity bit for up to three bit errors in a data block, but only one correct bit errors. Two bit errors are detected in the extended Hamming code as bad (invalid ) code word is not correctable.

  • 3.1 correction performance
  • 3.2 linearity
  • 3.3 separable Hamming code
  • 3.4 Cyclic Hamming code
  • 5.1 decoding using hard decision

History

In the 1940s, Richard Hamming worked in the company Bell Labs on a computer named Bell Model V, which was equipped with error-prone electromechanical relay with two machine cycles per second. The punch cards used to input data could have errors due to wear in the read operation that had to be corrected to normal office hours by employees of Bell Labs by hand. The normal working hours of Richard Hamming, outside office hours and on weekends, these read errors meant that the computer skipped the faulty punch cards and continued to work on other punch cards.

Hamming, frustrated by this error and the multiple effort, developed in consequence of a special code that the machine could automatically detect and correct read errors from punch cards to a certain extent. In 1950 he published this Hamming code, which finds widespread use in partially expanded forms in the field of channel coding and today.

Structure of the code

In the following, only binary Hamming codes are presented. Binary Hamming codes are based on Paritycodes over a data block of fixed length. The block of data, also referred to as a " data word " or " message word ", comprises n bits, and includes the actual payload. n can take on the Hamming code only specific integer values ​​, which result from the formation of this default code. The bit combinations in the n bit block of data can in principle be chosen arbitrarily, ie there are any bit combinations allowed.

The codeword of the Hamming code is obtained from the data word through the integration of additional control points, the so-called parity bits, won. Here, in each data word of n bits in length, a fixed number of k checkpoints is added. Hence, the codeword results with a length of N = n k only certain bit combinations are for a code word, since the redundant control bodies carried out of the data block derived information possible. This allows both error detection and error correction.

The default for code formation by another equation from N to k is described in the following form:

This means that if, for example, three binary digits for control bits (parity bits) are available, the codeword must necessarily have a length of N = 7. For the data word is then obtained a length of n = 7 - 3 = 4 bits per code word, or in general:

The following table lists all possible Hamming codes of different codeword lengths are shown to the block size of 255 bits:

In order to classify the different length Hamming codes usually following notation is used in the literature: (N, n ) code. The first number indicates the number of message bits (N) of the codeword, the second number is the number of data bits ( n ) bits per codeword. Especially in demonstration examples of simplicity is often encountered because of the ( 7,4) Hamming code. Here is the overhead for real-world applications, ie the ratio of check bits to data bits, too unfavorable, so long as the Hamming code ( 63,57 ) Hamming code can be used.

Sometimes in the classification specifying the distance of the code is not specified in the third position. Due to the fixed Hamming distance, however, is only " ( 63,57 ) Hamming code" written mostly take place " ( 63,57,3 ) Hamming code".

Calculation of the parity locations

The k parity locations ( control bits ) in a code word are calculated by a method as it is used also in the simple parity check bit. As a rule, even parity is chosen by convention ( even parity ) for all control bodies: Is the number of logical -1 is calculated in the respective control point data bit an even number, the respective parity bit logic -0. If the number of logic 1 in the data bit is an odd number, the respective parity bit is set to logic -1, so that. In sum in the data bit and the parity bit, always results in an even number of logic 1 bits

The individual parity bits of a codeword are not formed on all points ( bits) of the data word, but only individual, selected data bits. To construct which data bit locations are included in which control bits can be taken according to one illustrative method. The context of the code word from the data bits and the control bodies is formed to:

P1, p2, ... pk parity bits d1, d2, ... dN -k bits of the data word, and c1, c2, ... cN the bits of to forming the code word, a code word of the thus constructed Hamming code has the following form ( this description can be extended accordingly for larger codeword lengths):

Only those data bits are included in the first parity bit p1, which are further to the right one bit position in the code word, and include a bit as a data width. For the first parity bit is thus the result of codeword positions all the data bits, which are in odd position in the code word:

The symbol is the logical XOR function and also represents the formation rule for the control bits dar. As can also be seen with the help of the above table, come to the bit positions mentioned before in the code word on the right side of the equation only data bits. This applies to all parity bits.

In the second parity bit p2 is the right of p2 standing in the codeword bit c3 factored, then skipped two places in the code word, the next two bits c6 and c7 factored again skipped two places, and so on. Instead of a data bit ie two adjacent data bits are taken and skipped two places in the code word. This results in the formation rule for p2:

The right side of the following code word to be in three positions, the third parity bit P3 factored, there are four locations of the skipped code word then factored four bits skipped four digits, and so on. Thus, groups are used to four bits for the formation of the parity bits. This produces for p3:

The parity bit pi is thus calculated over all points cj of the codeword, in which a logical one is at the i -th digit of the binary coding of the index j. According to this process is continued similarly for the remaining parity bits, until all the parity bits of the selected Hamming code are determined. The determination of the codeword is done in practical applications by the so-called encoder.

In the specific case of the ( 7,4) Hamming code, this results in the following code word table. Among the linkages between the individual parity bits are entered in the respective columns, from which immediately called the check matrix H shown later, also check matrix results for this example. In the columns of the last three lines arrows are registered in those places where then logical '1 ' refer to the check matrix H. This pattern can be determined at each Hamming code with some effort the check matrix. It is always a parity bit per line with an upward arrow ( ↑) drawn in, and the data bits for the determination of the relevant parity bit with a down arrow (↓ ) marks.

The arrangement of parity and data bits here is arbitrary. It may be selected in the code word, without limitation, a different sequence of the bits without changing the property of the Hamming code. This property is exploited in the following systematic or separable Hamming code, where the parity bits are always appended to the end of the data word to form the code word. The separable Hamming code is obtained by the same formation procedure, but with a different arrangement of the rows of the generator matrix G, and thus connected in other check matrix H.

Shortest Hamming code

The shortest Hamming code is the (3,1) Hamming code. Here, a user data bit is assigned to a three -bit code word. With the above general calculation shows that the two " parity " p1 and p2 only in this case correspond directly to the one user data bit. There can be only two valid code words 000 and 111. Invalid code words 001, 010 and 100 are allocated to the code word 000 in the decoding process by the method of majority decision. 110, 101 and 011 of the code word 111 of the (3,1) Hamming code is thus as a special case like a repetition code with a length of 3 The (3,1) Hamming code is also the only Hamming code which is specified only by the specification (3.1 ) is clearly in the structure of the code word.

Properties

Correction performance

The Hamming code has, regardless of the selected block size is always a distance of three to. This means that adjacent code words differ by always three bits. If an error occurs at a point of a codeword on, this will be recognized as invalid and can clearly associated with the correct code word, the transmission errors are therefore corrected. However, if two errors occur in a code word on this assignment does not work anymore - the correction of the decoder maps the received codeword falsely to another. This is referred to as a " false correct ". Another form of failure occurs when three transmission errors: Here, the decoder identifies the faulty code word to be valid.

Hamming codes can therefore only correct one bit error per data word correctly. Up to two bit errors can be detected. Because of its ability to assign all received codewords a valid codeword of the Hamming code is a perfect code. Most other binary codes - such as the extended Hamming code - are not perfect. These codes can be due to transmission errors codewords occur, although they may recognize as false the decoder, but can not assign valid word ( Dekodierversagen ).

Linearity

Hamming codes are basically linear codes. At this - also referred to as a binary group codes - codes each modulo -2 addition ( XOR ) of two code words results back to a valid codeword. One of the conditions of a linear code counts the existence of a neutral element. In the case of a Hamming code, this means that the null word - whose points are all logical '0 ' - must be valid. The so-called " weight code " thus corresponds with the Hamming code Hamming distance of three.

Another general property of group codes is that, the individual valid code words c from a generator matrix G and the data words d produce according to the following form:

From this equation with the formation rule shown in the previous section, and the rules for computing matrices for a non-separable ( 7,4) Hamming code, the following generator matrix G:

The first two rows of the matrix form, the first two parity bits P1 and P2 of the code word. The ones of the lines indicate here which data bit locations are included in the respective parity bit. The third line, as are all subsequent lines with only a one per line, form the data bits dn from the code word. The fourth line is p3 the third parity bit.

From the generator matrix, the control matrix H can be directly derived, which is used by the decoder to identify erroneous bit positions by means of matrix multiplication. The test matrix must be chosen such that it is orthogonal to all valid code words c:

For the above generator matrix G, the following check matrix H can be determined:

The content of the matrix in this case can, for example, presented in the previous section on the tabular method for determining the parity bits are obtained from the generator matrix.

If a single bit error, so apply to the resulting, invalid code word:

A value of this equation, the faulty bit location can be uniquely determined and corrected by inverting a syndrome table.

Separable Hamming code

Due to the linearity of any row of the generator matrix G can be interchanged without affecting the code properties. The particular form of the generator matrix only needs to be coordinated between the encoder and decoder. A separable code occurs when all data bits dn and then all parity bits pn are arranged in the code word first. Due to the separability of the encoder and decoder can be implemented with circuitry in electronic digital circuits such as ASICs or FPGAs with less memory cost and with less latency. Separable codes are also referred to as a " systematic code ."

The above generator matrix G can be transformed to the following generator matrix G ' by changing the line that the ( 7,4) Hamming code is a separable code. A possible generator matrix of the separable ( 7,4) Hamming codes is:

The associated control matrix H ' can be more easily determined, because separable block code applies with modulo-2 operations:

In order to determine H ' to:

The first four columns of the check matrix H ' consist of the last three rows of the generator matrix G'. The right part of the check matrix is filled with the unit matrix.

Thus it is shown that only the value ( 7,4) Hamming code for a specific implementation does not clearly describes the specific encoding process and decoding process. This is only ensured by specifying the generator matrix, or cyclic Hamming codes by the generator polynomial.

Cyclic Hamming code

In the practice of playing cyclic codes, in particular separable cyclic Hamming codes, an important role. With these, the calculation of the individual check bits in the encoder and the decoding in the decoder with minimal memory overhead in the form of linear feedback shift registers ( LFSR) can be realized. Cyclic codes are linear codes, where in addition, the requirement is that a rotation or cyclic shift of a code word in turn has lead to a valid codeword.

The cyclic Hamming code can generally be considered equivalent to in the description as a subset of the BCH code ( Bose - Chaudhuri - Hocquenghem code). BCH codes are a very large group of cyclic block codes, which can be varied more than the Hamming code in their parameters and configuration.

The production of cyclic Hamming codes is carried out depending on the block length by primitive generator polynomials of appropriate degrees. The generator can be directly mapped in the LFSR to calculate the parity bits.

Exemplary standard generator polynomials are given in the following table. It can be used in concrete implementations as well as other generator polynomials are selected without changing the properties of the Hamming codes, the selected polynomial is primitive and has been agreed between the encoder and decoder:

Practical realization of a Hamming encoder

Cyclic separable Hamming codes can be implemented in digital circuitry simple electrical circuits. In this diagram is to illustrate a ( 15,11 ) Hamming encoder with generator polynomial z4 z 1 shown.

In the illustration, the data bits d are as a serial data stream in the middle of the top and the right output is read, the code word C in series. The switches are located initially in position A: In order for the data bits in the code word first output directly and simultaneously pushed into the LFSR. All bits of a data word is read, change the two switches in position B: the contents of the LFSR will be bit output corresponding to the parity bit p. Are all inspection issued, the process repeats. To simplify the necessary clock lines and synchronization circuits are not shown.

The Hamming decoder designed similar: the received serial bit stream of the code words is shifted into a corresponding LFSR and simultaneously pushed in a shift register chain for the purpose of separate latency adjustment. The content of the LFSR is used at the decoder after the complete reception of a code word as an address pointer to a syndrome memory, which is usually implemented as a fixed ROM table in the circuit. The data output of the syndrome accumulator acts directly on the serial data stream of data bits and corrected if necessary erroneously detected data bit by inverting.

Decoding

In decoding, various methods can be used, which differ in the complexity of the decoder and the decoder performance. A key method is based on the above- identified syndrome table, which gives information about which point in the code word is wrong. This is a relatively simple method for received or read binary symbols. However, no general method is known by a linear block code of arbitrary code word length would be deterministic decodable in polynomial time. In a (N, n) Hamming code decoding cost of the code word length N and is dependent increases exponentially. By using the syndrome in the decoding can be the number of possible fault combinations of 2n 2n reduced. In complexity theory, the class time that decision problems is called NP- hard.

Another way to improve the decoding performance, consists in the fact that the individual code words generally does not get as binary, rather than multi-level signals in real communications systems, the decoder. The received analog signals are quantized by an upstream analog to digital converter at first. The resultant gradation of the signal between logic 0 and logic 1 are perceived by the decoder as the codeword probabilities and designed iteratively using this. This procedure is usually referred to in the English-language specialist literature as soft decision, and causes a higher code gain.

The counterpart thereto is the so-called hard decision which an extreme case, the soft decision may be considered as. In this case, the analog received signal prior to decoding by 1 - bit wide analog to digital converter, a " comparator ", as shown, a digital input signal for the code words. This is already determined before decoding whether a particular bit of the received codeword logical '1 ' or logic '0 ' is.

Decoding using hard decision

In this case, the received codewords are already available as digital effects, which is why the decoding process is reduced in a one-step process on the evaluation of the syndrome table. This method is mostly used when the decoder should be as simple as possible and no " concatenated codes ", ie, consisting of combinations of different Hamming codes are used.

In the above- introduced by means representation of parity check matrix has already been explained, that the product of the received code word C, and the check matrix has a value other than 0 if an error takes:

By appropriate arrangement of the parity locations, and thus due to the shape of the control matrix, the value of this equation as a syndrome can be used directly to correct the relevant bit position in the simplest case. When this equation at ( 7,4) Hamming code the error code of 1, the first bit of the received codeword is exactly wrong. 2 at the value of the equation, the second bit, and so on. The error can be corrected by negation of the relevant bit position in the code word. In the error-free case above equation returns the value 0, and no bit position is corrected.

This simple correspondence between syndrome value to faulty bit location is at a Hamming code only the case when the individual parity bits are exactly at the positions in the codeword which represent powers of two. This is the case for the generator matrix G outset. This eliminates a syndrome table ( ROM table ), which first converts the current value of the equation by the check matrix, and code word to a particular bit position. This simplification for the decoding is also the reason why Hamming codes are usually performed in the examples in the form of the generator matrix G shown above.

For decoding the presented above separable ( 7,4) Hamming code with its check matrix H ', however, is to determine the faulty point in the code word, a conversion of the value of the Prüfgleichung necessary. In the error-free case provides the Prüfgleichung, as with all Hamming codes, the value 0 in case of error it returns a value other than 0, which correspond to the non- faulty bit position in the code word to meet needs. The reaction to the faulty bit position, by means of a ROM memory take place whose address is given the value of the test matrix, and specify the data outputs, which bit position must be corrected by inverting. In the case of the above, separable ( 7,4) Hamming code of the ROM memory must have the following data content:

Extended Hamming code

Since the Hamming code can detect and correct only one bit error per data word and run two bit errors per data word at the decoder to a wrong codeword, there is a desire to improve these properties. This code is referred to as "extended Hamming code " (English extended Hamming code). Purpose, in the Hamming code another parity bit is added, which will incorporate all the binary digits of the non-extended Hamming code. Thus, for example, from the ( 7,4) Hamming code, an extended (8,4 ) Hamming code.

The extension of a general block codes with an additional control point only makes sense if the " Code Weight " is odd, since additional information in this additional control bit is only available. This is always true for the Hamming code having a code weight of 3. Thus, the Hamming distance in the extended Hamming code is increased from three to four, and the extended Hamming code can detect and correct the following errors per code word:

For the decoder of extended Hamming codes, which operates only by means of hard-decision, can ensure that the following truth table set up, you can opt for their inputs in the form of the syndrome vector and the additional parity check decoder if no error, a correctable error or an uncorrectable error occurs:

The extended Hamming code is not a perfect code, there is no longer any invalid code words can be unambiguously assigned to valid code words. What happens in the cases with detected but uncorrectable data errors, further processing levels have to decide according to the Hamming decoder. Furthermore, in three or more bit errors per code word may be a " decoding error" occur. That is, these multiple errors are either not recognized or assigned unsent valid code words. This is in particular the case Hamming codes with long code words, because this behavior is not changed by choosing the codeword length.

Applies the extended Hamming code, for example, a so-called "inner block code " in Turbo Product Codes as they are used in wireless 802.16 wireless networks for data transmission according to the IEEE standard in the context of WiMAX on the radio interface.

Code shortening

Both of the Hamming code and the extended Hamming code can be shortened to the length of their code words in order to obtain codewords with a specific application, a fixed length. This is known as code shortening. All Hamming codes have, as shown, only relatively few selectable codeword lengths in rough increments to 2k -1, where k is an integer greater than 1 must be selected. Intermediate code word lengths are not possible when the Hamming code.

By the method of code shortening codeword lengths between each of those rough stages are selectable, but this advantage is bought depending on the process either by a worse ratio of data bit ( data) to the number of control points in the code word, or it is and by the process of the minimum distance of the code therefore its correction power is reduced.

Code shortening following procedure can be used in principle for all codes:

Practical applications of condensed and extended Hamming codes are found for example in the correction of simple memory errors and the reliable detection of dual memory errors per address in DRAM. These cost- per-bit memory need only a small capacitor for storing data, and it can be relatively easy to come by interference effects to bit errors. Commercial memory modules have per memory address a data bus width of typically 36 bits or 72 bits on - both are values ​​that can not be reached directly by a corresponding code word lengths of the Hamming codes.

By shortening code by the first method can be constructed with exact matching codeword length is relatively easy, a shortened Hamming code. In the Xilinx application notes for error correction using Hamming codes from an extended Hamming code with parameters ( 128.120 ) is assumed as a base. From a shortened Hamming code ( 72,64 ) is formed, the codeword length corresponds exactly to the data bus of the DRAM memory module and can store 64 user bits per address. In this case, all the parity bits of the ( 128.120 ) Hamming codes are included in the truncated to 72 points codeword. The data bit positions 65-120 are always set to logic -0 and are not stored.

Other number systems

In the binary Hamming code presented in the article, there are only two possible states per site of the data word or code word ( exactly means " binary "). In number theory, this situation is by means of Galois fields of characteristic two, abbreviated as GF ( 2), and expressed. Special feature of all binary error- correcting codes is that already the determination of the fault to the error correction sufficient: Since only two possible states per site exist, an error with the determined position is always corrected by inversion (0 ↔ 1) the body concerned.

In addition to the binary Hamming code, there is also generalize to other, higher speed systems such as the ternary Hamming code in GF ( 3). The ternary Hamming code has three states per site at: {0, 1, 2}. For error correction for all non - binary code, in addition to the localization of the position error, and additional information, to which the other means a certain place has to be changed, necessary.

Generally can also Hamming code on Galois fields GF (q), where q is a prime power needs to be, i.e. a prime number and.

372383
de