Holland's schema theorem

The schema set by John H. Holland deals with the convergence behavior of genetic algorithms. The theorem proves that prevail individuals with above-average fitness are more likely.

Derivation

Thus, the schema set considering the genome of an individual, usually in a bit string, the encoding values ​​.

First, the conceptual scheme must be explained:

A schema is a bit pattern that represents a set of bit strings. A schema consists of the characters 0 or 1 #. The # character acts as a placeholder for a 0 or 1

For example, denotes the schema # 101 # the following set of bit strings:

01010, 01011, 11011, 11010th

The schema set will now calculate the expected value that a certain schema H "survived" from one generation to the next. To this end, the three key steps of a genetic algorithm are investigated:

  • Selection
  • Crossover ( recombination)
  • Mutation

There will be a population of N binary genomes of length l at an instant t consisting considered. The used fitness function f is normalized and defined for all bit strings of length l.

As part of the derivation of the following notations are used:

Number of the genomes at the time t, which includes the scheme of H.

The diameter of Scheme H is defined as the length of the shortest       Substring that does not contain any solid bits of the schema.       For example, d ( # 0101 # # 1 # # ) = 7

Represents the number of fixed bits in HZB b ( # # 0 # # 11 # ) = 3 selection

Since the fitness is normalized, is true for the probability p that a certain parent chain is selected from a population:

Be now without limitation, all those bit strings of the population at time t that contain the schema H The fitness of schema H is then defined by: .

The probability that a chain is selected, which contains H, thus:

The probability that two parents chains selected containing both H, are applicable.

Recombination (crossover )

With the 1 -point recombination, a dividing point between the bit positions 1 and l-1 is initially selected. If both parents H included, like daughter chain includes this scheme. If only one parent chain the schema, so H is passed on average, in half of the cases, if it is not cut at the crossover itself. The probability that it will not cut, is:

This is true for the probability W that the crossover the schema H is passed:

If the pattern H is cut through the crossover, there is a possibility that the missing portion is present at the corresponding position in the other parent chain. Therefore, the inequality stems.

If two -point crossover is performed, which has to only impact the probability that the pattern is cut through, rises.

Mutation

Let p be the probability of mutation, that is, each bit of the new chain is negated with probability p. This means that the pattern H is maintained with fixed bits with the probability.

If this effect is taken into account, we obtain for the probability that a chain generated by the operators crossover and mutation contains the schema H:

With.

Conclusion

Thus, if a total of N new chains generated, then for the expectation value of the number of chains containing the schema H at the time:

The last two formulas that schemes prevail with above-average fitness and small diameter with high probability. However, the reproduction probability increases with the frequency of a schema. That is, an average scheme can prevail within a population, if it happens often enough. This effect is called genetic drift.

Furthermore, the factor deserves attention. A high mutation rate of p causes an enhanced destruction successful pattern. On the other hand, a certain frequency of mutations is necessary to search through the search space as comprehensively as possible. By adjustment of P that is the activity of the search algorithm can be controlled.

  • Evolutionary algorithm
712786
de