Hypergeometric distribution

The hypergeometric distribution is a discrete probability distribution in probability theory.

It assumes a dichotomous population, that is, one in which every element either has a given property or does not. A random sample is drawn from this population one element at a time without replacement. The hypergeometric distribution then gives the probability that a certain number of elements in the sample have the desired property. The distribution is important, for example, in quality control.

The hypergeometric distribution corresponds to the urn model without replacement (see also combination without repetition). In this context, consider an urn containing two kinds of balls. $n$ balls are drawn without replacement. The random variable $X$ is the number of balls of the first kind in this sample.

The hypergeometric distribution thus describes the probability that, given $N$ elements ("population of size $N$"), $M$ of which have the desired property, drawing $n$ elements ("sample of size $n$") yields exactly $k$ hits, i.e. the probability of $X = k$ successes in $n$ trials.

An example problem: an urn contains 45 balls, 20 of which are yellow. What is the probability of drawing exactly 4 yellow balls in a sample of 10? This example is worked through below.

Definition

The hypergeometric distribution depends on three parameters:

  • The number $N$ of elements in the population.
  • The number $M \le N$ of elements in the population that have a particular property (the number of possible successes).
  • The number $n \le N$ of elements in a sample.

The distribution gives the probability that the sample contains $k$ elements with the property under test (successes or hits). The sample space is therefore $\Omega = \{\max(0, n + M - N), \dots, \min(n, M)\}$.

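A discrete random variable $X$ follows the hypergeometric distribution with parameters $N$, $M$ and $n$ if it has the probabilities

$$h(k \mid N; M; n) := P(X = k) = \frac{\binom{M}{k}\binom{N-M}{n-k}}{\binom{N}{n}}$$

for every $k$ in the sample space. Here $\binom{N}{n}$ denotes the binomial coefficient "$N$ choose $n$".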

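The distribution function $H(k \mid N; M; n)$ then gives the probability that at most $k$ elements with the property under test are in the sample. This cumulative probability is the sum

$$H(k \mid N; M; n) := P(X \le k) = \sum_{y=0}^{k} \frac{\binom{M}{y}\binom{N-M}{n-y}}{\binom{N}{n}}.$$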
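
As a minimal sketch, the probability function and the distribution function can be evaluated with Python's standard library. The function names `hypergeom_pmf` and `hypergeom_cdf` are chosen here for illustration and do not come from the text; the argument order (k, N, M, n) assumes the population size N, the number of possible successes M and the sample size n.

```python
from math import comb


def hypergeom_pmf(k: int, N: int, M: int, n: int) -> float:
    """P(X = k): k successes in a sample of n drawn without replacement
    from N elements, M of which count as successes."""
    # Outside the sample space the probability is zero.
    if k < max(0, n + M - N) or k > min(n, M):
        return 0.0
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)


def hypergeom_cdf(k: int, N: int, M: int, n: int) -> float:
    """P(X <= k): sum of the probability function up to and including k."""
    return sum(hypergeom_pmf(y, N, M, n) for y in range(0, k + 1))


# Urn from the introduction: 45 balls, 20 yellow, sample of 10.
print(hypergeom_pmf(4, 45, 20, 10))  # probability of exactly 4 yellow balls
print(hypergeom_cdf(4, 45, 20, 10))  # probability of at most 4 yellow balls
```

The guard at the top of `hypergeom_pmf` keeps the arguments of `math.comb` non-negative, so the function can be called for any integer k.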

Properties of the hypergeometric distribution

Symmetry

Writing $h(k \mid N; M; n)$ for the probability of $k$ successes with population size $N$, $M$ possible successes and sample size $n$, the following symmetries hold:

  • Swapping the roles of drawn elements and successes: $h(k \mid N; M; n) = h(k \mid N; n; M)$.
  • Swapping successes and failures: $h(k \mid N; M; n) = h(n - k \mid N; N - M; n)$.

Expected value

The expected value of the hypergeometrically distributed random variable $X$ is

$$\operatorname{E}(X) = n \, \frac{M}{N}.$$

Variance

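The variance of the hypergeometrically distributed random variable $X$ is

$$\operatorname{Var}(X) = n \, \frac{M}{N} \left(1 - \frac{M}{N}\right) \frac{N - n}{N - 1},$$

where the last fraction is the so-called correction factor (finite population correction) for the model without replacement.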
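
The closed forms for expected value and variance can be checked against the definition by summing over the probability function. The following sketch uses exact rational arithmetic; the helper name `pmf` and the parameter choice (the urn with 45 balls, 20 of them yellow, and a sample of 10) are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def pmf(k: int, N: int, M: int, n: int) -> Fraction:
    """Exact hypergeometric probability P(X = k) as a fraction."""
    return Fraction(comb(M, k) * comb(N - M, n - k), comb(N, n))


N, M, n = 45, 20, 10
support = range(max(0, n + M - N), min(n, M) + 1)

# Moments computed directly from the definition.
mean = sum(k * pmf(k, N, M, n) for k in support)
var = sum((k - mean) ** 2 * pmf(k, N, M, n) for k in support)

# Closed forms, including the finite population correction (N - n) / (N - 1).
p = Fraction(M, N)
mean_closed = n * p
var_closed = n * p * (1 - p) * Fraction(N - n, N - 1)

print(mean, mean_closed)
print(var, var_closed)
```

Because `Fraction` avoids rounding, the summed and the closed-form values can be compared with `==` rather than with a tolerance.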

Characteristic function

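The characteristic function takes the following form:

$$\varphi_X(s) = \frac{\binom{N-M}{n} \, {}_2F_1\!\left(-n, -M; N - M - n + 1; e^{is}\right)}{\binom{N}{n}},$$

where ${}_2F_1$ denotes the Gaussian hypergeometric function.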

Relationship to other distributions

Relationship to the binomial distribution

In contrast to the binomial distribution, in the hypergeometric distribution the drawn elements are not returned to the population for re-selection. If the sample size $n$ is relatively small compared to the population size $N$ (roughly $n/N \le 0.05$), the probabilities calculated with the binomial and the hypergeometric distribution do not differ significantly from each other. In such cases the hypergeometric distribution is often approximated by the binomial distribution, which is mathematically easier to handle.
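
The effect of the sampling fraction can be illustrated numerically. The sketch below compares the two probability functions for the same success fraction but different population sizes; the function names and the parameter values (25 versus 1000 elements, sample of 10) are hypothetical choices made only for this illustration.

```python
from math import comb


def hypergeom_pmf(k, N, M, n):
    """Probability of k successes when drawing n of N without replacement."""
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)


def binom_pmf(k, n, p):
    """Probability of k successes in n independent draws with replacement."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def max_abs_diff(N, M, n):
    """Largest pointwise gap between the two probability functions."""
    p = M / N
    return max(
        abs(hypergeom_pmf(k, N, M, n) - binom_pmf(k, n, p))
        for k in range(0, min(n, M) + 1)
    )


# Same success fraction M/N = 0.4 and sample size 10,
# but very different sampling fractions n/N.
print(max_abs_diff(25, 10, 10))     # n/N = 0.4: noticeable difference
print(max_abs_diff(1000, 400, 10))  # n/N = 0.01: small difference
```

The maximum pointwise difference is only one of several possible measures of agreement; it is used here because it needs no further definitions.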

Relationship to the Pólya distribution

The hypergeometric distribution is a special case of the Pólya distribution (choose $c = -1$).
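
As a sketch of this relationship, assume the common Pólya urn parametrization in which each drawn ball is returned together with c further balls of the same colour. This parametrization and the function names are assumptions that the text does not spell out. With c = -1 the drawn ball is effectively removed, which reproduces drawing without replacement, while c = 0 corresponds to drawing with replacement.

```python
from fractions import Fraction
from math import comb


def polya_pmf(k: int, N: int, M: int, n: int, c: int) -> Fraction:
    """P(k successes in n draws) from an urn with N balls, M of them successes,
    where each drawn ball is returned together with c balls of its colour."""
    num = 1
    for i in range(k):        # the k draws that are successes
        num *= M + i * c
    for j in range(n - k):    # the n - k draws that are failures
        num *= (N - M) + j * c
    den = 1
    for t in range(n):        # urn size before each of the n draws
        den *= N + t * c
    return Fraction(comb(n, k) * num, den)


def hypergeom_pmf(k: int, N: int, M: int, n: int) -> Fraction:
    """Exact hypergeometric probability for comparison."""
    return Fraction(comb(M, k) * comb(N - M, n - k), comb(N, n))


# Compare both functions for every k in the urn example (45 balls, 20 yellow).
agree = all(
    polya_pmf(k, 45, 20, 10, -1) == hypergeom_pmf(k, 45, 20, 10)
    for k in range(11)
)
print(agree)
```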

Examples

A container holds 45 balls, 20 of which are yellow. 10 balls are drawn without replacement.

The hypergeometric distribution gives the probability that exactly x = 0, 1, 2, 3, ..., 10 of the drawn balls are yellow.

An example of the application of the hypergeometric distribution is the lottery: in the number lottery there are 49 numbered balls, 6 of which are drawn, and 6 numbers are ticked on the lottery ticket.

$h(x \mid 49; 6; 6) = \binom{6}{x}\binom{43}{6-x} / \binom{49}{6}$ gives the probability of achieving exactly x = 0, 1, 2, 3, ..., 6 "hits".
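
The lottery probabilities can be computed exactly with a few lines of Python. This is only a sketch; the variable names are illustrative, and it assumes the plain "6 out of 49" draw described above without any bonus number.

```python
from fractions import Fraction
from math import comb

N, M, n = 49, 6, 6   # balls in the drum, numbers drawn, numbers ticked
total = comb(N, n)   # number of distinct tickets

# Exact probability of x hits: x of the 6 drawn numbers, 6 - x of the other 43.
probs = {
    x: Fraction(comb(M, x) * comb(N - M, n - x), total)
    for x in range(n + 1)
}

for x, p in probs.items():
    print(f"{x} hits: p = {float(p):.3e}  (about 1 in {float(1 / p):,.1f})")
```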

[Figures: probabilities in the German Lotto; the same probabilities in a logarithmic plot]

Detailed calculation for the balls

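For the example of the colored balls given above, we want to find the probability that exactly 4 yellow balls are drawn.

  • Total number of balls: $N = 45$
  • Number of yellow balls: $M = 20$
  • Sample size: $n = 10$
  • Yellow balls wanted in the sample: $k = 4$

So we are looking for $h(4 \mid 45; 20; 10)$.

The probability is given by:

$$\frac{\text{number of ways to draw exactly 4 yellow and 6 violet balls}}{\text{number of ways to draw 10 balls}}$$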

There are

$$\binom{20}{4} = 4845$$

ways to select exactly 4 yellow balls.

There are

$$\binom{25}{6} = 177{,}100$$

ways to select exactly 6 violet balls from the 25 balls that are not yellow.

Since each "yellow possibility" can be combined with each "violet possibility", this results in

$$4845 \cdot 177{,}100 = 858{,}049{,}500$$

possibilities for exactly 4 yellow and 6 violet balls.

There are a total of

$$\binom{45}{10} = 3{,}190{,}187{,}286$$

ways to draw 10 balls.

Thus we obtain the probability

$$\frac{858{,}049{,}500}{3{,}190{,}187{,}286} \approx 0.269.$$

That is, in about 27 percent of cases exactly 4 yellow (and 6 violet) balls are drawn.

Alternatively, the result can be found using the following equation:

$$\frac{\binom{10}{4}\binom{35}{16}}{\binom{45}{20}} \approx 0.269.$$

The sample of size 10 contains 4 yellow balls. The remaining 16 yellow balls are among the 35 balls that do not belong to the sample.
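
Both ways of counting can be reproduced in a short sketch. The variable names are illustrative; the numbers are those of the urn example (45 balls, 20 yellow, sample of 10, 4 yellow in the sample).

```python
from math import comb

N, M, n, k = 45, 20, 10, 4  # balls, yellow balls, sample size, yellow in sample

# Direct count: choose 4 of the 20 yellow and 6 of the 25 other balls.
yellow_ways = comb(M, k)
other_ways = comb(N - M, n - k)
all_samples = comb(N, n)
p_direct = yellow_ways * other_ways / all_samples

# Alternative count: fix the sample and distribute the yellow colour instead,
# 4 yellow among the 10 sampled balls and 16 among the 35 others.
p_alt = comb(n, k) * comb(N - n, M - k) / comb(N, M)

print(yellow_ways, other_ways, all_samples)
print(p_direct, p_alt)
```

The two results agree up to floating-point rounding, which reflects the fact that the roles of sample size and number of successes can be exchanged.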

Numerical values of the examples
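
The numerical values for both examples can be tabulated with a short sketch such as the following. The function name `table` and the output layout are assumptions made for illustration; each row lists the number of hits together with the probability of exactly that many and of at most that many hits.

```python
from math import comb


def table(N: int, M: int, n: int):
    """Rows (k, P(X = k), P(X <= k)) over the sample space of the distribution."""
    rows, cumulative = [], 0.0
    for k in range(max(0, n + M - N), min(n, M) + 1):
        p = comb(M, k) * comb(N - M, n - k) / comb(N, n)
        cumulative += p
        rows.append((k, p, cumulative))
    return rows


examples = [
    ("urn (45 balls, 20 yellow, 10 drawn)", (45, 20, 10)),
    ("lottery (6 out of 49)", (49, 6, 6)),
]

for name, params in examples:
    print(name)
    for k, p, c in table(*params):
        print(f"  k = {k:2d}   P(X = k) = {p:.6f}   P(X <= k) = {c:.6f}")
```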
