Binomial distribution

The binomial distribution is one of the most important discrete probability distributions.

It describes the number of successes in a series of identical and independent trials, each of which has exactly two possible outcomes ("success" or "failure"). Such series of trials are also called Bernoulli processes.

If $p$ is the probability of success in a single trial and $n$ the number of trials, then $B(k \mid p, n)$ or $B_{n,p}(k)$ denotes the probability of achieving exactly $k$ successes (see the section Definition).

The binomial distribution and the Bernoulli trial can be illustrated with the help of a Galton board. This is a mechanical apparatus into which balls are dropped. The balls then fall at random into one of several compartments, and the way they are distributed over the compartments corresponds to the binomial distribution. Depending on the design, different parameters $n$ and $p$ are possible.

Although the binomial distribution was known long before, the term itself was first used in 1911 in a book by George Udny Yule.


Examples

The probability of rolling a number greater than 2 with an ordinary die is $p = 2/3$; the probability that this is not the case is $q = 1/3$. Suppose you roll the die 10 times ($n = 10$). Then there is a small probability that a number greater than 2 is never rolled, or conversely that it is rolled every time. The probability of rolling such a number exactly $k$ times ($0 \le k \le 10$) is described by the binomial distribution $B(k \mid 2/3, 10)$.

Frequently, the process described by the binomial distribution is also illustrated by a so-called urn model. Suppose an urn contains 6 balls, 2 of them black and the others white. Reach into the urn 10 times, take out a ball, note its color and put the ball back. In one particular interpretation of this process, drawing a white ball is regarded as a "positive event" with probability $p = 2/3$, and drawing a ball that is not white as a "negative result". The probabilities are distributed exactly as in the previous example of rolling a die.

Definition of the binomial distribution

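The discrete probability distribution with probability function

$$B(k \mid p, n) = \begin{cases} \binom{n}{k} p^k (1-p)^{n-k} & \text{if } k \in \{0, 1, \dots, n\}, \\ 0 & \text{otherwise} \end{cases}$$

is called the binomial distribution with the parameters $n$ (number of trials) and $p$ (the success probability). Instead of $B(k \mid p, n)$, the notation $B_{n,p}(k)$ is also common.

The probability of failure $1 - p$, which is complementary to the success probability $p$, is frequently abbreviated $q$. As is necessary for a probability distribution, the probabilities for all possible values of $k$ must sum to 1. This follows from the binomial theorem:

$$\sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} = \bigl(p + (1-p)\bigr)^n = 1^n = 1.$$

A random variable $X$ distributed according to $B(\cdot \mid p, n)$ is called binomially distributed with parameters $n$ and $p$. It has the distribution function

$$F_X(x) = P(X \le x) = \sum_{k=0}^{\lfloor x \rfloor} \binom{n}{k} p^k (1-p)^{n-k}.$$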
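The probability function and the distribution function can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard form $\binom{n}{k} p^k (1-p)^{n-k}$ of the probability function; the function names `binomial_pmf` and `binomial_cdf` are chosen here for illustration only.

```python
from math import comb, floor

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_cdf(x, n, p):
    """P(X <= x): sum of the probability function over k = 0 .. floor(x)."""
    upper = min(n, floor(x))
    return sum(binomial_pmf(k, n, p) for k in range(0, upper + 1))

# The probabilities over all k sum to 1 (up to floating-point rounding).
total = sum(binomial_pmf(k, 10, 2 / 3) for k in range(11))
```

For larger $n$ a numerically more careful evaluation (for example in log space) may be preferable; the direct product is used here only for readability.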

Derivation as Laplace probability

Experimental scheme: An urn contains $N$ balls in total, of which $M$ are black and $N - M$ are white. The probability of drawing a black ball is thus $p = M/N$. One after another, a total of $n$ balls are drawn at random, examined, and put back.

We calculate the number of ways in which exactly $k$ black balls are found, and from it the so-called Laplace probability ("number of favorable outcomes for the event divided by the total number of (equally probable) outcomes").

In each of the $n$ draws there are $N$ possibilities, making $N^n$ possibilities in total for the selection of the balls. For exactly $k$ of these $n$ balls to be black, exactly $k$ of the $n$ draws must yield a black ball. For each black ball there are $M$ possibilities, and for each white ball $N - M$ possibilities. The $k$ black balls can be distributed over the $n$ draws in $\binom{n}{k}$ ways, so there are

$$\binom{n}{k} M^k (N-M)^{n-k}$$

cases in which exactly $k$ black balls have been selected. The probability of finding exactly $k$ black balls among $n$ is therefore

$$\frac{\binom{n}{k} M^k (N-M)^{n-k}}{N^n} = \binom{n}{k} \left(\frac{M}{N}\right)^k \left(\frac{N-M}{N}\right)^{n-k} = \binom{n}{k} p^k (1-p)^{n-k}.$$
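The counting argument can be checked by brute force for a small urn. The sketch below enumerates all $N^n$ ordered draws with replacement and compares the count of draws containing exactly $k$ black balls with $\binom{n}{k} M^k (N-M)^{n-k}$. The labeling of balls $0, \dots, M-1$ as black and the function name are assumptions made for this illustration.

```python
from itertools import product
from math import comb

def count_draws_with_k_black(N, M, n, k):
    """Count ordered draws (with replacement) that contain exactly k black balls.

    Balls are labeled 0 .. N-1; labels below M are treated as black.
    """
    count = 0
    for draw in product(range(N), repeat=n):
        blacks = sum(1 for ball in draw if ball < M)
        if blacks == k:
            count += 1
    return count

# Small urn: 5 balls, 2 black, 4 draws (5**4 = 625 equally likely sequences).
N, M, n = 5, 2, 4
counts = [count_draws_with_k_black(N, M, n, k) for k in range(n + 1)]
formula = [comb(n, k) * M**k * (N - M)**(n - k) for k in range(n + 1)]
```

Dividing each count by $N^n$ gives the Laplace probability, which coincides with the binomial probability for $p = M/N$.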

Properties of the binomial distribution

Symmetry

  • The binomial distribution is symmetric in the special cases $p = 0$, $p = 1/2$ and $p = 1$, and asymmetric otherwise.
  • The binomial distribution has the property $B(k \mid p, n) = B(n-k \mid 1-p, n)$.

Expected value

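The binomial distribution has the expected value $np$.

Proof

The expected value $\mu$ is calculated directly from the definition and the binomial theorem:

$$\mu = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k} = np \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1} (1-p)^{(n-1)-(k-1)} = np \bigl(p + (1-p)\bigr)^{n-1} = np.$$

Alternatively, one can use the sum rule for expected values, noting that the identical individual trials $X_i$ follow the Bernoulli distribution with $E(X_i) = p$. For $X = X_1 + \dots + X_n$, which is $B(n, p)$-distributed, this gives

$$E(X) = E(X_1) + \dots + E(X_n) = np.$$

A further proof uses the binomial theorem. Differentiating both sides of the equation

$$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}$$

with respect to $a$ gives

$$n (a + b)^{n-1} = \sum_{k=0}^{n} k \binom{n}{k} a^{k-1} b^{n-k},$$

so

$$n a (a + b)^{n-1} = \sum_{k=0}^{n} k \binom{n}{k} a^{k} b^{n-k}.$$

With $a = p$ and $b = 1 - p$ the desired result follows.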

Variance

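With $q = 1 - p$, the binomial distribution has the variance $npq$.

Proof

Let $X$ be a $B(n, p)$-distributed random variable. The variance is determined directly from the shift theorem $\operatorname{Var}(X) = E(X^2) - (E(X))^2$:

$$\operatorname{Var}(X) = \sum_{k=0}^{n} k^2 \binom{n}{k} p^k (1-p)^{n-k} - n^2 p^2 = np \sum_{j=0}^{n-1} (j+1) \binom{n-1}{j} p^{j} (1-p)^{n-1-j} - n^2 p^2 = np\bigl((n-1)p + 1\bigr) - n^2 p^2 = np(1-p) = npq.$$

Alternatively, one can use the sum rule for the variance of independent random variables, noting that the identical individual trials $X_i$ follow the Bernoulli distribution with $\operatorname{Var}(X_i) = p(1-p) = pq$:

$$\operatorname{Var}(X) = \operatorname{Var}(X_1 + \dots + X_n) = \operatorname{Var}(X_1) + \dots + \operatorname{Var}(X_n) = n \operatorname{Var}(X_1) = npq.$$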

The second equality holds because the individual experiments are independent, so that the individual variables are uncorrelated.
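The closed forms $np$ for the expected value and $np(1-p)$ for the variance can be checked numerically by summing over the probability function. The sketch below is only a sanity check under the usual definition of the binomial probabilities; the function name `binomial_moments` is an assumption for this example.

```python
from math import comb

def binomial_moments(n, p):
    """Return (mean, variance) computed directly from the probability function."""
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    mean = sum(k * w for k, w in enumerate(pmf))
    second_moment = sum(k * k * w for k, w in enumerate(pmf))
    # Shift theorem: Var(X) = E(X^2) - (E(X))^2
    return mean, second_moment - mean**2

n, p = 10, 2 / 3
mean, variance = binomial_moments(n, p)
```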

Coefficient of variation

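From the expected value and the variance one obtains the coefficient of variation

$$\operatorname{VarK}(X) = \frac{\sqrt{npq}}{np} = \sqrt{\frac{1-p}{np}}.$$

Skewness

The skewness is given by

$$v(X) = \frac{1 - 2p}{\sqrt{np(1-p)}}.$$

Kurtosis

The kurtosis can also be expressed in closed form as

$$\beta_2 = 3 + \frac{1 - 6pq}{npq}.$$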

Mode

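The mode, i.e. the value with maximal probability, is $k = \lfloor np + p \rfloor$ for $p < 1$ and $k = n$ for $p = 1$. If $np + p$ is a natural number, then $k = np + p - 1$ is also a mode. If the expected value is a natural number, the expected value is equal to the mode.

Proof

Without loss of generality let $0 < p < 1$. We consider the quotient

$$\alpha_k = \frac{B(k+1 \mid p, n)}{B(k \mid p, n)} = \frac{n-k}{k+1} \cdot \frac{p}{1-p}.$$

Now $\alpha_k > 1$ if $k < np + p - 1$, and $\alpha_k < 1$ if $k > np + p - 1$. The probabilities therefore increase up to $k = \lfloor np + p \rfloor$ and decrease afterwards.

The quotient has the value 1 only in the case that $np + p - 1$ is a natural number, i.e. $B(np + p - 1 \mid p, n) = B(np + p \mid p, n)$.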
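The statement about the mode can be checked by a direct search over the probability function. The sketch below uses exact rational arithmetic so that ties between two modes are detected reliably; it assumes the mode formula $\lfloor (n+1)p \rfloor$ for $p < 1$, and the function name `binomial_modes` is chosen for illustration.

```python
from fractions import Fraction
from math import comb, floor

def binomial_modes(n, p):
    """Return all k with maximal probability, found by direct search.

    p should be a Fraction so that equal probabilities compare as equal.
    """
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    top = max(pmf)
    return [k for k, w in enumerate(pmf) if w == top]

# (n + 1) p = 22/3 is not an integer: a single mode at floor(22/3) = 7.
single = binomial_modes(10, Fraction(2, 3))
# (n + 1) p = 3 is an integer: two modes, 2 and 3.
double = binomial_modes(5, Fraction(1, 2))
```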

Characteristic function

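The characteristic function has the form

$$\varphi_X(s) = \bigl((1-p) + p\,e^{is}\bigr)^n = \bigl(q + p\,e^{is}\bigr)^n.$$

Generating function

For the generating function one obtains

$$g_X(s) = (ps + 1 - p)^n.$$

Moment generating function

The moment generating function of the binomial distribution is

$$m_X(s) = \bigl(p\,e^{s} + 1 - p\bigr)^n.$$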

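Sum of binomially distributed random variables

For the sum $Z = X + Y$ of two independent binomially distributed random variables $X$ and $Y$ with parameters $n_1, p$ and $n_2, p$, one obtains the individual probabilities with the help of the Vandermonde identity:

$$P(Z = k) = \sum_{i=0}^{k} \binom{n_1}{i} p^i (1-p)^{n_1 - i} \binom{n_2}{k-i} p^{k-i} (1-p)^{n_2 - k + i} = \binom{n_1 + n_2}{k} p^k (1-p)^{n_1 + n_2 - k}.$$

The sum is therefore again a binomially distributed random variable, but with the parameters $n_1 + n_2$ and $p$.

If the sum $Z = X + Y$ is known, each of the random variables $X$ and $Y$ follows a hypergeometric distribution under this condition. To see this, we calculate the conditional probability:

$$P(X = \ell \mid Z = k) = \frac{P(X = \ell,\, Y = k - \ell)}{P(Z = k)} = \frac{P(X = \ell)\, P(Y = k - \ell)}{P(Z = k)} = \frac{\binom{n_1}{\ell} \binom{n_2}{k-\ell}}{\binom{n_1 + n_2}{k}}.$$

This is a hypergeometric distribution.

In general, if the random variables $X_1, \dots, X_m$ are stochastically independent and $X_i$ is $B(n_i, p)$-distributed, then the sum $X_1 + \dots + X_m$ is also binomially distributed, with parameters $n_1 + \dots + n_m$ and $p$.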
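The addition property can be checked numerically: the distribution of a sum of independent random variables is the discrete convolution of their probability functions. The following sketch, with illustrative helper names `pmf` and `convolve`, compares the convolution of $B(3, p)$ and $B(5, p)$ with $B(8, p)$ for an arbitrarily chosen $p = 0.3$.

```python
from math import comb

def pmf(n, p):
    """List of binomial probabilities for k = 0 .. n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def convolve(a, b):
    """Discrete convolution: distribution of the sum of two independent variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

p = 0.3
sum_pmf = convolve(pmf(3, p), pmf(5, p))
direct_pmf = pmf(8, p)
max_diff = max(abs(s - d) for s, d in zip(sum_pmf, direct_pmf))
```

The property depends on both variables sharing the same $p$; with different success probabilities the convolution is in general no longer binomial.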

Relationship to other distributions

Relationship to the Bernoulli distribution

The Bernoulli distribution is the special case of the binomial distribution with $n = 1$. Accordingly, the sum of $n$ independent and identically Bernoulli-distributed random variables follows the binomial distribution.

Transition to the normal distribution

According to the de Moivre-Laplace theorem, the binomial distribution converges to a normal distribution in the limit $n \to \infty$. That is, the normal distribution can be used as a useful approximation to the binomial distribution when the sample size is sufficiently large and the proportion of the characteristic of interest is not too small.

Here $\mu = np$ and $\sigma^2 = npq$. Substituting into the distribution function $\Phi$ of the standard normal distribution gives

$$P(X \le x) \approx \Phi\!\left(\frac{x + 0.5 - np}{\sqrt{npq}}\right),$$

where the term $0.5$ is the usual continuity correction.

The convergence of the binomial distribution to the normal distribution is used in the normal approximation to determine binomial probabilities quickly when the number of trials is large.
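A minimal sketch of the normal approximation follows. It assumes the common form with continuity correction, $P(X \le x) \approx \Phi\bigl((x + 0.5 - np)/\sqrt{npq}\bigr)$, and expresses $\Phi$ through the error function from Python's `math` module. The function names and the example values $n = 100$, $p = 0.5$, $x = 55$ are chosen for illustration.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def binomial_cdf_exact(x, n, p):
    """Exact P(X <= x) by summing the binomial probabilities (x an integer)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(0, x + 1))

def binomial_cdf_normal(x, n, p):
    """Normal approximation of P(X <= x) with continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return normal_cdf((x + 0.5 - mu) / sigma)

exact = binomial_cdf_exact(55, 100, 0.5)
approx = binomial_cdf_normal(55, 100, 0.5)
```

The quality of the approximation degrades when $p$ is close to 0 or 1 or when $n$ is small, so the comparison with the exact sum is worth repeating for the parameters actually in use.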

Transition to a Poisson distribution

An asymptotically asymmetric binomial distribution whose expected value $np$ converges to a constant $\lambda$ for large $n$ and small $p$ can be approximated by the Poisson distribution:

$$\lim_{n \to \infty,\; np = \lambda} B(k \mid p, n) = \frac{\lambda^k}{k!}\, e^{-\lambda}.$$

The value $\lambda$ is then the expected value of all binomial distributions considered in the limit as well as of the resulting Poisson distribution. This approximation is also called the Poisson approximation, the Poisson limit theorem, or the law of rare events.

A rule of thumb states that this approximation is useful when $n > 50$ and $p < 0.05$.

The Poisson distribution is therefore the limiting distribution of the binomial distribution for large $n$ and small $p$.
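The Poisson approximation can be compared with the exact binomial probabilities in a short sketch. It assumes the Poisson probability function $\lambda^k e^{-\lambda} / k!$ with $\lambda = np$; the example parameters $n = 1000$ and $p = 0.001$ and the helper names are chosen for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Exact binomial probability of k successes."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson probability of k events with expected value lam."""
    return lam**k * exp(-lam) / factorial(k)

n, p = 1000, 0.001
lam = n * p  # expected value shared by both distributions

# Largest pointwise difference over the values of k that carry almost all mass.
max_diff = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
               for k in range(0, 21))
```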

Relationship to the geometric distribution

The number of failures before the first success occurs is described by the geometric distribution.

Relationship to the negative binomial distribution

The negative binomial distribution, by contrast, describes the probability distribution of the number of trials required to obtain a given number of successes in a Bernoulli process.

Relationship with the hypergeometric distribution

In the binomial distribution, the selected samples are returned to the population, so they can be selected again later. If, in contrast, the samples are not returned to the population, the hypergeometric distribution applies. For a large population size $N$ and a small sample size $n$, the two distributions merge into one another. As a rule of thumb, for $n/N \le 0.05$ the binomial distribution may be preferred to the mathematically more demanding hypergeometric distribution even without replacement, since the two give only marginally different results.

Relationship with the multinomial distribution

The binomial distribution is a special case of the multinomial distribution.

Relationship to the Panjer distribution

The binomial distribution is a special case of the Panjer distribution, which combines the binomial, negative binomial and Poisson distributions into one distribution class.

Relationship to the beta distribution

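For many applications it is necessary to calculate the distribution function

$$\sum_{i=0}^{k} B(i \mid p, n)$$

concretely (for example, for statistical tests or confidence intervals).

The following relationship to the beta distribution helps here:

$$\sum_{i=0}^{k} \binom{n}{i} p^i (1-p)^{n-i} = \operatorname{Beta}(1-p;\, n-k,\, k+1).$$

For positive integer parameters $a$ and $b$, the distribution function of the beta distribution is

$$\operatorname{Beta}(x;\, a,\, b) = \frac{(a+b-1)!}{(a-1)!\,(b-1)!} \int_0^x u^{a-1} (1-u)^{b-1}\, du.$$

To prove the equation

$$\sum_{i=0}^{k} \binom{n}{i} p^i (1-p)^{n-i} = \frac{n!}{(n-k-1)!\, k!} \int_0^{1-p} u^{n-k-1} (1-u)^k\, du$$

for $k < n$, one can proceed as follows:

  • The left and right sides agree for $p = 0$ (both sides are equal to 1).
  • The derivatives with respect to $p$ agree for the left and right sides of the equation; both are equal to $-\frac{n!}{(n-k-1)!\, k!}\, p^k (1-p)^{n-k-1}$.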
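The relationship to the beta distribution can be checked numerically. The sketch below assumes the identity in the form $\sum_{i=0}^{k} \binom{n}{i} p^i (1-p)^{n-i} = \frac{n!}{(n-k-1)!\,k!} \int_0^{1-p} u^{n-k-1} (1-u)^k\,du$ for $k < n$ and evaluates the integral with a composite Simpson rule. The function names, the step count and the example values are assumptions made for this illustration.

```python
from math import comb, factorial

def binomial_cdf(k, n, p):
    """Left-hand side: sum of binomial probabilities for i = 0 .. k."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def beta_side(k, n, p, steps=2000):
    """Right-hand side: scaled integral over [0, 1 - p], composite Simpson rule.

    steps must be even; requires k < n.
    """
    scale = factorial(n) // (factorial(n - k - 1) * factorial(k))

    def integrand(u):
        return u**(n - k - 1) * (1 - u)**k

    a, b = 0.0, 1.0 - p
    h = (b - a) / steps
    total = integrand(a) + integrand(b)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * integrand(a + i * h)
    return scale * total * h / 3

left = binomial_cdf(3, 10, 0.3)
right = beta_side(3, 10, 0.3)
```

In practice a library routine for the regularized incomplete beta function would replace the hand-written integration; the Simpson rule is used here only to keep the sketch self-contained.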

Relationship to the Pólya distribution

The binomial distribution is a special case of the Pólya distribution ( choose c = 0).

Examples

Symmetrical binomial distribution (p = 1 /2)

[Figure panels for p = 1/2: original distributions; mean subtracted; scaled with the standard deviation]

This case occurs when a fair coin is tossed $n$ times (probability of heads equal to that of tails, i.e. equal to 1/2). The first figure shows the binomial distribution for $p = 1/2$ and for different values of $n$ as a function of $k$. These binomial distributions are mirror-symmetric about the value $k = n/2$:

$$B(k \mid 1/2, n) = B(n - k \mid 1/2, n).$$

This is illustrated in the second figure. The width of the distribution increases in proportion to the standard deviation $\sigma = \sqrt{n}/2$. The function value at $k = n/2$, i.e. the maximum of the curve, decreases in inverse proportion to $\sigma$.

Accordingly, binomial distributions with different $n$ can be scaled onto one another by dividing the abscissa $k - n/2$ by $\sigma$ and multiplying the ordinate by $\sigma$ (third figure above).

The adjacent graph again shows rescaled binomial distributions, now for other values of $n$ and in a plot that illustrates better that all function values converge to a common curve with increasing $n$. By applying Stirling's formula to the binomial coefficients, one sees that this curve (solid black line in the figure) is a Gaussian bell curve:

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}.$$

This is the probability density of the standard normal distribution. In the central limit theorem, this finding is generalized: sequences of other discrete probability distributions also converge to the normal distribution.

The second adjacent graph shows the same data in a semilogarithmic plot, which is recommended if one wants to check whether rare events that deviate from the expected value by several standard deviations also follow a binomial or normal distribution.
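The rescaling can be checked at the peak of the distribution. The sketch below assumes $p = 1/2$, the standard deviation $\sigma = \sqrt{n}/2$, and the peak value $1/\sqrt{2\pi}$ of the standard normal density as the limit of the rescaled maximum; the function name is chosen for illustration.

```python
from math import comb, pi, sqrt

def rescaled_peak(n):
    """Maximum of B(k | 1/2, n) at k = n/2 (n even), multiplied by sigma."""
    peak = comb(n, n // 2) / 2**n  # exact integers, divided as a float
    sigma = sqrt(n) / 2
    return peak * sigma

gaussian_peak = 1 / sqrt(2 * pi)
peaks = [rescaled_peak(n) for n in (10, 100, 1000)]
gaps = [abs(value - gaussian_peak) for value in peaks]
```

The gap to the Gaussian peak shrinks as $n$ grows, which is the pointwise version of the convergence visible in the rescaled plots.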

Drawing balls

A container holds 80 balls, 16 of which are yellow. A ball is drawn 5 times and put back each time. Because of the replacement, the probability of drawing a yellow ball is the same for all draws, namely 16/80 = 1/5. The value $B(k \mid 1/5, 5)$ gives the probability that exactly $k$ of the drawn balls are yellow. As an example, we compute $k = 3$:

$$B(3 \mid 1/5, 5) = \binom{5}{3} \left(\frac{1}{5}\right)^3 \left(\frac{4}{5}\right)^2 = \frac{10 \cdot 16}{3125} = 0.0512.$$

So in about 5 % of cases exactly 3 yellow balls are drawn.
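The arithmetic of this example can be reproduced exactly with rational numbers. This is a small sketch using Python's `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

p = Fraction(16, 80)          # probability of a yellow ball, reduces to 1/5
n, k = 5, 3                   # 5 draws with replacement, exactly 3 yellow

probability = comb(n, k) * p**k * (1 - p)**(n - k)
as_decimal = float(probability)
```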

Number of people with birthday on the weekend

The probability that a person's birthday falls on a weekend this year is (for simplicity) 2/7. There are 10 people in a room. The value $B(k \mid 2/7, 10)$ gives (in this simplified model) the probability that exactly $k$ of those present have their birthday on a weekend this year.

Common anniversary in

253 people have come together. The value $B(k \mid 1/365, 253)$ gives the probability that exactly $k$ of those present have their birthday on a randomly selected day (without regard to the year of birth).

The probability that at least one of these 253 persons has a birthday on this day is therefore $1 - B(0 \mid 1/365, 253) = 1 - (364/365)^{253} \approx 0.5005$.

With 252 people, the probability is $1 - (364/365)^{252} \approx 0.4991$. That is, the threshold for the number of persons above which the probability that at least one of them has a birthday on a randomly selected day exceeds 50 % is 253 persons (see also birthday paradox).

The direct calculation of the binomial distribution can be difficult because of the large factorials. An approximation by the Poisson distribution is admissible here ($n > 50$, $p < 0.05$). With the parameter $\lambda = np = 253/365 \approx 0.693$, the Poisson probability that nobody has a birthday on the chosen day is $e^{-\lambda} \approx 0.5000$, compared with the exact binomial value $(364/365)^{253} \approx 0.4995$.
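The threshold of 253 persons can be found by a direct search. The sketch below assumes the simplified model of 365 equally likely birthdays and computes the probability that at least one person has a birthday on the chosen day as $1 - (1 - p)^n$; the function names are illustrative.

```python
def prob_at_least_one(n_people, p=1 / 365):
    """Probability that at least one of n_people has a birthday on a fixed day."""
    return 1 - (1 - p) ** n_people

def smallest_group(threshold=0.5, p=1 / 365):
    """Smallest number of people for which the probability exceeds the threshold."""
    n = 1
    while prob_at_least_one(n, p) <= threshold:
        n += 1
    return n

threshold_size = smallest_group()
```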

Confidence interval for a probability

In a poll of $n$ people, $k$ people state that they will vote for party A. Determine a 95 % confidence interval for the unknown proportion of voters in the overall electorate who choose party A.

A solution to the problem that does not resort to the normal distribution can be found in the article on the confidence interval for an unknown probability.

Utilization model

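The following formula gives the probability that $k$ of $n$ people are simultaneously carrying out an activity that takes each of them an average of $m$ minutes per hour:

$$P(X = k) = \binom{n}{k} \left(\frac{m}{60}\right)^k \left(1 - \frac{m}{60}\right)^{n-k}.$$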

Statistical error of the class frequency histograms

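Representing independent measurements in a histogram leads to the grouping of the measured values into classes.

The probability of $n_i$ entries in class $i$ is given by the binomial distribution $B_{n,p_i}(n_i)$ with $n = \sum_i n_i$ and $p_i = n_i / n$.

The expected value and variance of $n_i$ are then $E(n_i) = n p_i = n_i$ and $V(n_i) = n p_i (1 - p_i) = n_i \left(1 - \frac{n_i}{n}\right)$.

The statistical error of the number of entries in class $i$ is therefore $\sigma(n_i) = \sqrt{n_i \left(1 - \frac{n_i}{n}\right)}$. With a large number of classes, $p_i$ becomes small and $\sigma(n_i) \approx \sqrt{n_i}$.

In this way one can determine, for example, the statistical accuracy of Monte Carlo simulations.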

Random numbers

Random numbers for the binomial distribution are usually generated using the inversion method.
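A minimal sketch of the inversion method follows: a uniform random number $u$ is drawn, and the smallest $k$ whose cumulative probability exceeds $u$ is returned. The function name `sample_binomial` and the optional `rng` parameter are assumptions for this illustration, not part of any particular library.

```python
import random
from math import comb

def sample_binomial(n, p, rng=random):
    """Draw one binomially distributed random number by inversion."""
    u = rng.random()          # uniform on [0, 1)
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p)**(n - k)
        if u < cumulative:
            return k
    return n                  # guard against floating-point rounding

# Usage: 20000 samples with a fixed seed, for reproducibility.
generator = random.Random(0)
samples = [sample_binomial(10, 0.3, generator) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
```

Each call recomputes the cumulative probabilities; when many samples with the same parameters are needed, precomputing the cumulative table once and searching it is a natural refinement.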
