Hidden Markov Model

The Hidden Markov Model (HMM) is a stochastic model in which a system is modeled by a Markov chain with unobserved states; the Markov chain is named after the Russian mathematician A. A. Markov. An HMM can be regarded as the simplest special case of a dynamic Bayesian network.

Modeling a system as a Markov chain means that the system passes from one state to another in a random manner, where the transition probabilities depend only on the current state and not on the previously occupied states. It is also assumed that the transition probabilities are constant over time. In an HMM, however, these states are not observed from the outside; they are hidden. Instead, observable output symbols, so-called emissions, are associated with each of these internal states and occur with certain probabilities depending on the state. The task usually consists of deriving probabilistic statements about the hidden states from the observed sequence of emissions.

Important application areas include speech recognition, computational linguistics and computational biology, as well as spam filters, gesture recognition in human-machine communication (robotics), handwriting recognition and psychology.

Markov approach

Given are two discrete-time random processes (X_t) and (Y_t), of which only the second is observable. From it, conclusions are to be drawn about the course of the first process; this requires a mathematical model that relates the two processes to each other.

The approach described here is characterized by the following two assumptions:

The current value of the first process depends solely on its most recent value:

P(X_t | X_1, …, X_{t-1}) = P(X_t | X_{t-1})

The current value of the second process depends solely on the current value of the first:

P(Y_t | X_1, …, X_t, Y_1, …, Y_{t-1}) = P(Y_t | X_t)

If the two processes each take values in a finite set, the model obtained in this way can be understood as a probabilistic automaton, more precisely as a Markov chain; one also says that (X_t) is a Markov process. Following the language of theoretical computer science, in particular automata theory and the theory of formal languages, the values of the first process are called states and those of the second emissions or outputs.

Definition

A Hidden Markov Model is a 5-tuple λ = (S, V, A, B, π) with:

  • S = {s_1, …, s_n}, the set of all states; these are the possible values of the random variables X_t,
  • V = {v_1, …, v_m}, the alphabet of the possible observations, i.e. the emissions of Y_t,
  • A = (a_ij), the transition matrix between the states, where a_ij = P(X_{t+1} = s_j | X_t = s_i) is the probability that the state changes from s_i to s_j,
  • B = (b_ij), the observation matrix, where b_ij = P(Y_t = v_j | X_t = s_i) is the probability of making the observation v_j in state s_i,
  • π = (π_i), the initial distribution, where π_i = P(X_1 = s_i) is the probability that s_i is the initial state.

An HMM is called stationary (or time-invariant) if the transition and emission probabilities do not change over time. This assumption is often reasonable, because the modeled laws of nature are constant.
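As a concrete illustration, here is a minimal Python sketch of such a 5-tuple with made-up toy parameters; it also shows the generative reading of the model: the hidden chain evolves according to A, and each state emits according to B.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical two-state HMM as a 5-tuple (S, V, A, B, pi);
# all numbers below are invented toy parameters.
S = ["s1", "s2"]                  # set of hidden states
V = ["v1", "v2", "v3"]            # output alphabet
A = np.array([[0.7, 0.3],         # A[i, j] = P(X_{t+1} = s_j | X_t = s_i)
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],    # B[i, j] = P(Y_t = v_j | X_t = s_i)
              [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])         # pi[i] = P(X_1 = s_i)

def sample(T):
    """Draw a hidden state sequence and its emissions of length T."""
    states, emissions = [], []
    x = rng.choice(len(S), p=pi)
    for _ in range(T):
        states.append(S[x])
        emissions.append(V[rng.choice(len(V), p=B[x])])
        x = rng.choice(len(S), p=A[x])
    return states, emissions

print(sample(5))
```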

Illustration

The picture shows the general architecture of an instantiated HMM. Each oval represents a random variable X_t or Y_t, which can take on arbitrary values. The first random variable X_t is the hidden state of the HMM at time t; the second, Y_t, is the observation at that time. The arrows in the trellis diagram denote a conditional dependency.

In the diagram one can see that the state of the hidden variable X_t depends only on the state of the preceding variable X_{t-1}; earlier values have no further influence. The model is therefore a first-order Markov model. If higher orders are needed, they can always be reduced to first order by inserting new hidden states. The value of Y_t depends solely on X_t.

Example

Prisoner in the dungeon

A prisoner in the dungeon wants to find out the current weather. He knows that a sunny day is followed by a rainy day with a probability of 70 %, and that a rainy day is followed by a sunny day with a probability of 50 %. If he additionally knows that the guards' shoes are dirty with a probability of 90 % when it rains, but only with a probability of 60 % in sunny weather, he can draw conclusions about the weather by observing the guards' shoes (that is, he can estimate the probability of rainy weather versus sunny weather).
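A minimal sketch of this inference in Python, assuming (this is not fixed by the text) that the prisoner uses the stationary distribution of the weather chain as his prior:

```python
import numpy as np

# Transition matrix over (sun, rain), taken from the text:
# P(rain | sun) = 0.7 and P(sun | rain) = 0.5.
A = np.array([[0.3, 0.7],
              [0.5, 0.5]])
# Emission probabilities of the observation "dirty shoes".
p_dirty = np.array([0.6, 0.9])

# Prior over today's weather: the stationary distribution of A
# (an assumption; the text does not fix the prisoner's prior).
evals, evecs = np.linalg.eig(A.T)
prior = np.real(evecs[:, np.argmax(np.real(evals))])
prior /= prior.sum()                      # roughly [5/12, 7/12]

# Bayes' rule after observing dirty shoes.
posterior = p_dirty * prior
posterior /= posterior.sum()
print(dict(zip(["sun", "rain"], posterior.round(3))))
# -> about {'sun': 0.323, 'rain': 0.677}
```

Observing the shoes over several consecutive days would be handled by the forward algorithm described below.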

DNA sequence: detecting CpG islands

HMMs can be used for the investigation of DNA sequences with bioinformatic methods, for example to find CpG islands in a DNA sequence. Here the DNA sequence is the observation, whose characters form the output alphabet. In the simplest case, the HMM has two hidden states, namely CpG island and non-CpG island. These two states differ in their output distributions, so that in the CpG-island state the characters C and G are emitted with higher probability.

Speech recognition

In automatic speech recognition with HMMs, the spoken sounds are taken as hidden states and the actually audible tones as emissions.

Problems

Several fundamental problems arise in the context of HMMs.

Determining the model size

Given are the observable emissions. It has to be clarified which model properties, in particular which orthogonal dimensionality, allow inference on the not directly observable states while at the same time permitting meaningful computability. In particular, it must be decided which observation period may be required for the model calculations in order to preserve the usefulness of the estimates.

Implementation

The calculation of the estimates of the unobservable states from the observable output sequences must take the achievable numerical accuracy into account. Criteria for classifying statistical significance must also be implemented. When an HMM is used for a given feature vector, the significance determines the probability of a correct or incorrect model hypothesis as well as its information content (entropy, conditional entropy) or its information quality.

Filtering

Given an HMM λ and an observation sequence Y = (y_1, …, y_T) of length T. Sought is the probability P(X_T = s_i | Y), that is, the probability that the hidden state at the last time T is exactly s_i. An efficient method for solving this problem is the forward algorithm.
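A minimal sketch of the forward algorithm in Python/NumPy; the model parameters and observation indices are assumed toy values:

```python
import numpy as np

def forward(A, B, pi, obs):
    """Forward variables: alpha[t, i] = P(y_1, ..., y_t, X_t = s_i)."""
    T, n = len(obs), len(pi)
    alpha = np.zeros((T, n))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

# Assumed toy model and observation indices, purely for illustration.
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
obs = [0, 0, 1]

alpha = forward(A, B, pi, obs)
filtered = alpha[-1] / alpha[-1].sum()   # P(X_T = s_i | y_1, ..., y_T)
print(filtered)
```

For long sequences the unnormalized alpha values underflow, so practical implementations rescale them at every step or work in log space.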

Prediction / forecast

Given again an HMM λ, the observation sequence Y = (y_1, …, y_T) and a k > 0. Sought is the probability P(X_{T+k} = s_i | Y), i.e. the probability that the HMM is in state s_i at time T + k, given that the outputs Y were observed. Prediction is, so to speak, repeated filtering without new observations, and it can likewise be computed easily with the forward algorithm.
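Prediction then amounts to propagating the filtered distribution k steps through the transition matrix; a minimal sketch under the same assumed toy model:

```python
import numpy as np

def forward(A, B, pi, obs):
    """Unnormalized filtering distribution after the last observation."""
    alpha = pi * B[:, obs[0]]
    for y in obs[1:]:
        alpha = (alpha @ A) * B[:, y]
    return alpha

# Assumed toy model, as in the filtering sketch.
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
obs, k = [0, 0, 1], 2

filtered = forward(A, B, pi, obs)
filtered /= filtered.sum()                           # P(X_T | Y)
predicted = filtered @ np.linalg.matrix_power(A, k)  # P(X_{T+k} | Y)
print(predicted)
```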

Smoothing

Given again λ, Y and a k > 0. Sought is the probability P(X_{T-k} = s_i | Y), i.e. the probability that the model was in state s_i at the earlier time T - k, under the condition that Y was observed. With the backward algorithm this probability can be computed efficiently.
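A minimal sketch of forward-backward smoothing under the same assumed toy model; the backward variables beta are combined with the forward variables alpha:

```python
import numpy as np

def forward(A, B, pi, obs):
    T, n = len(obs), len(pi)
    alpha = np.zeros((T, n))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

def backward(A, B, obs):
    """Backward variables: beta[t, i] = P(y_{t+1}, ..., y_T | X_t = s_i)."""
    T, n = len(obs), A.shape[0]
    beta = np.ones((T, n))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

# Assumed toy model and observations.
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
obs, k = [0, 0, 1], 1

alpha, beta = forward(A, B, pi, obs), backward(A, B, obs)
gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)  # gamma[t, i] = P(X_t = s_i | Y)
print(gamma[len(obs) - 1 - k])             # smoothed distribution at T - k
```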

Decoding

Given again λ and Y. Sought is the most likely state sequence X* = (x_1, …, x_T) that could have generated a given output sequence Y. This problem can be solved efficiently with the Viterbi algorithm.
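A minimal sketch of the Viterbi algorithm in log space, again with assumed toy parameters:

```python
import numpy as np

def viterbi(A, B, pi, obs):
    """Most likely hidden state sequence for the given observations."""
    T, n = len(obs), len(pi)
    delta = np.log(pi) + np.log(B[:, obs[0]])
    back = np.zeros((T, n), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + np.log(A)   # scores[i, j]: i -> j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + np.log(B[:, obs[t]])
    # Trace the best path backwards through the stored argmax pointers.
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Assumed toy model.
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
print(viterbi(A, B, pi, [0, 0, 1, 1]))
```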

Learning problem

Given only the output sequence Y. The parameters of an HMM that most likely generate the output sequence Y are to be determined. This is solved with the help of the Baum-Welch algorithm.
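A compact sketch of one possible Baum-Welch implementation (forward-backward E-step, re-estimation M-step), with random initialization and made-up observations; it omits the numerical scaling a real implementation needs for long sequences:

```python
import numpy as np

def baum_welch(obs, n, m, iters=50, seed=0):
    """EM re-estimation of (A, B, pi) from a single observation sequence."""
    rng = np.random.default_rng(seed)
    A = rng.dirichlet(np.ones(n), n)      # random initial parameters
    B = rng.dirichlet(np.ones(m), n)
    pi = rng.dirichlet(np.ones(n))
    T = len(obs)
    for _ in range(iters):
        # E-step: forward and backward variables.
        alpha = np.zeros((T, n)); beta = np.ones((T, n))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        for t in range(T - 2, -1, -1):
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        gamma = alpha * beta                      # state posteriors
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = alpha[:-1, :, None] * A * (B[:, obs[1:]].T * beta[1:])[:, None, :]
        xi /= xi.sum(axis=(1, 2), keepdims=True)  # transition posteriors
        # M-step: re-estimate pi, A and B from expected counts.
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        for v in range(m):
            B[:, v] = gamma[obs == v].sum(axis=0) / gamma.sum(axis=0)
    return A, B, pi

obs = np.array([0, 1, 1, 0, 2, 2, 2, 0, 1, 2])  # made-up observation indices
A, B, pi = baum_welch(obs, n=2, m=3)
print(A.round(2), B.round(2), pi.round(2), sep="\n")
```

As an EM method, Baum-Welch only finds a local maximum of the likelihood, so the result depends on the random initialization.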

Interpretation problem

Given only the possible outputs. The states in the model system, and the corresponding effects in the real system that the state set of the model describes, are to be identified. To do this, the meaning of the individual emissions must be determined beforehand.

Areas of application

HMMs are frequently used in pattern recognition for processing sequential data, for example physical measurement series, recorded speech signals or protein sequences. For this, the models are constructed in such a way that the hidden states correspond to semantic units (e.g. phonemes in speech recognition) that are to be recognized in the sequential data (e.g. short-time spectra of the speech signal). Another application is, for a given HMM, to search a sample of sequential data for those sequences that could very likely have been generated by this HMM. For example, an HMM that has been trained with members of a protein family can be used to find further members of this family in large protein databases.

History

Hidden Markov models were first published by Leonard E. Baum and other authors in the second half of the 1960s. One of the first applications, beginning in the mid-1970s, was speech recognition. Since the mid-1980s, HMMs have been used for the analysis of nucleotide and protein sequences and have since been an integral part of bioinformatics.
