Myhill–Nerode theorem

The Myhill - Nerode are in the field of theoretical computer science formal languages ​​to a necessary and sufficient condition that a formal language is regular. He was introduced and proved in 1957/1958 by John Myhill and Nerode Anil.

Colloquially speaking, the set is mainly used to find out whether a formal language is so " benign" or " simple one " that a computer with constant memory (ie, with a finite limited memory whose size does not depend on the input ) can automatically detect, whether a string is a word of the language or not.

Set

Note: The following technical terms are explained in the article Formal language.

Given a formal language over the alphabet and the associated Nerode relation ~. Then:

There exists if a deterministic finite automaton that accepts, if the index of the associated Nerode relation is finite.

Formal:

With a deterministic finite automaton, and the language, which he accepted.

Application

The existence of a deterministic finite automaton that accepts, is a necessary and sufficient condition that a regular language is. The logical equivalence of the sentence thus makes it possible to both show that a formal language is regular, as well as to show that it is not. Since this is the most important application of the theorem of Myhill - Nerode, he is also often read like this:

The language is regular if the index of the associated Nerode relation is finite.

Next it can be concluded that the number of states of a minimal deterministic finite automaton that accepts the index of the associated Nerode relation corresponds.

Examples

Finite languages ​​are regular

The language over the alphabet contains a finite number of words. ( All words from have finite length. ) That is, there exist natural numbers and, such that:

  • .

Since each word as many prefixes exist as it contains letters, and the empty word counts as a prefix, the language has a maximum prefixes and as many equivalence classes. Thus:

That is, the number of equivalence classes is finite, and it follows from the Myhill - Nerode that the language is regular. So you can say: Every language that finally contains many words is regular.

The language L = { a, aa, aaa, ...} is regular

The language over the alphabet is defined by:

This results in exactly one equivalence class with respect to the Nerode relation, namely himself:

That is, all prefixes of the language can be with the same suffixes add to words from. The index of the Nerode relation is finite:

From the set of Myhill - Nerode finally follows that the language is regular.

The language L: = {ab, aabb, aaabbb, ...} is not regular

The language over the alphabet is defined by:

This results in particular the following equivalence classes with respect to the Nerode relation ( any prefix of a word of this language can be only a suffix to complete to ):

These equivalence classes are pairwise different, ie we have:

It follows that the number of equivalence classes is an infinite and - since the number of all equivalence classes of again is greater - so that the index of Nerode relation with respect to infinity. From the set of Myhill - Nerode finally follows that the language is not regular.

Remark

It is not necessary to completely clarify the class structure of a language associated equivalence relation, to show the non- regularity of the language. Otherwise, another equivalence classes should be set up to meet the requirement of equivalence relations justice, a certain basic amount (in this case, ie all words over the input alphabet ) fully divide into disjoint equivalence classes.

The suffixes

Principle can be used on the input alphabet as a suffix of each word, eg etc. Here only each single suffix is specified for each equivalence class, for which the elements of the respective class, all produced in this way include words in its attachment to the language. For any other suffix all resulting words would not belong to the language. This is the basis Nerode relation.

710518
de