Kolmogorov complexity

The Kolmogorov complexity (after Andrei Nikolaevich Kolmogorov ) is a measure of the structured nature of a string and is given by the length of the shortest program that generates this string. This shortest program is thus a best compression of the string without that information is lost.

If the Kolmogorov complexity of a character string is at least as large as the string itself, the string is then referred to as incompressible, random or unstructured. The closer the Kolmogorov complexity is due to the length of the string, the more ' random ' is the string ( and the more information it contains ).

The principle of Kolmogorov complexity became independent in 1964 by RJ Solomonoff, developed in 1965 by Andrei Kolmogorov and 1969 by Gregory Chaitin, and has references to Shannon's information theory.

The Kolmogorov complexity is sometimes also called Algorithmic complexity or description complexity, but should not be confused with the time or space complexity of algorithms. More precisely, is the name Algorithmic information content, which establishes the connection to the concept of information content to Shannon. A related, but clearly accrued Algorithmic approach is the depth that refers to the effort, which must be operated to produce a specific message or decrypt. The Algorithmic Information Theory by Gregory Chaitin clarified the approach to Kolmogorov in relation to the machine model. Jorma Rissanen describes the Minimum Description Length a similar concept, but based on data compression.

Example

An example for generating a sequence of 1000 zeros may illustrate the compression: The number 1000 can be represented ( in binary form ) by 10 bits. For a given (constant) program to print a null sequence, you can specify the Kolmogorov complexity of a sequence of n zeros as " constant log (n) ":

Program zero sequence ( s)   begin     for i: = 1 to n / / in the example n = 1000      print " 0"   end The above program can also be encoded with a constant number of bits, for example, as machine code or as a simple text file. The coding of the number n requires log (n ) bits. The entire coding thus requires aggregated " constant log (n) " bits, and thus for large n, substantially less than n bits. Therefore, the null sequence is compressible.

The following diagram illustrates the compressibility:

Program zero sequence ( s) 00000000000000000000000000000 0begin00000000000000000000000000000000000000000000 00for i: = 1 to n0000000000000000000000000000000000 000print "0" 00000000000000000000000000000000000000 0end0000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000 The program which generates the sequence with zeros 1000, taking no more than 5% of a sequence itself.

Coincidence or right?

However, there is (in this sense) even seemingly random sequences of numbers. For example, there is a short program that generates the decimal expansion of the circle of pi in arbitrary precision. Also This gives a complexity of the form " constant log (n) ", where n is the accuracy of the representation indicates.

Calculation

The Kolmogorov complexity is not computable, it is, however, from above recursively enumerable.

Applications

Today, the Kolmogorov complexity has applications in computer science, biology, and other sciences, consider the structures or information.

48392
de