Beta-binomial distribution

The beta-binomial distribution is a univariate discrete probability distribution that can be regarded as a generalization of the binomial distribution. The binomial distribution gives the probability of a number of successes for a known probability of a single success, whereas in the beta-binomial distribution the probability of success is known only imprecisely and is described by a beta distribution $\mathrm{Beta}(a, b)$. It is therefore a mixture distribution.

The beta-binomial distribution has three parameters: $n$, $a$ and $b$.


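A random variable $X$ has a beta-binomial distribution with parameters $n \in \mathbb{N}$, $a > 0$ and $b > 0$ if, for each $x$ in the support $\{0, 1, \dots, n\}$:

$$P(X = x) = C \binom{n}{x} \Gamma(a + x)\, \Gamma(b + n - x)$$

The constant $C$ is calculated as follows:

$$C = \frac{\Gamma(a + b)}{\Gamma(a)\, \Gamma(b)\, \Gamma(a + b + n)}$$

where $\Gamma$ is the gamma function.

An alternative notation is

$$P(X = x) = \binom{n}{x} \frac{B(a + x,\, b + n - x)}{B(a, b)}$$

where $B$ is the beta function.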
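The probability function can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names `log_beta` and `beta_binomial_pmf` are chosen here for the example, and the computation goes through the logarithm of the gamma function to avoid overflow for large arguments.

```python
from math import exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Logarithm of the beta function B(a, b), computed via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(x: int, n: int, a: float, b: float) -> float:
    """P(X = x) for a beta-binomial distribution with parameters n, a, b."""
    if not 0 <= x <= n:
        return 0.0  # outside the support {0, ..., n}
    # log of the binomial coefficient n choose x
    log_binom = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_binom + log_beta(a + x, b + n - x) - log_beta(a, b))


# Example parameters (arbitrary, for illustration only)
probs = [beta_binomial_pmf(x, 10, 2.0, 3.0) for x in range(11)]
print(sum(probs))  # close to 1, up to floating-point rounding
```

Summing the probabilities over the whole support is a quick sanity check that the normalization is right.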


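The expected value depends on all three parameters:

$$\operatorname{E}(X) = \frac{n a}{a + b}$$

as does the variance:

$$\operatorname{Var}(X) = \frac{n a b\, (a + b + n)}{(a + b)^2 (a + b + 1)}$$

The skewness is given by

$$\frac{(a + b + 2n)(b - a)}{a + b + 2} \sqrt{\frac{1 + a + b}{n a b\, (n + a + b)}}$$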
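The closed-form moments of the beta-binomial distribution (mean $na/(a+b)$, variance $nab(a+b+n)/((a+b)^2(a+b+1))$, and the corresponding skewness) can be checked numerically by summing over the support. The sketch below is only an illustration; the function names and the example parameters are assumptions made for this check.

```python
from math import comb, exp, lgamma, sqrt


def log_beta(u, v):
    """Logarithm of the beta function B(u, v)."""
    return lgamma(u) + lgamma(v) - lgamma(u + v)


def pmf(x, n, a, b):
    """Beta-binomial probability P(X = x), written with beta functions."""
    return comb(n, x) * exp(log_beta(a + x, b + n - x) - log_beta(a, b))


def moments_closed_form(n, a, b):
    """Mean, variance and skewness from the closed-form expressions."""
    mean = n * a / (a + b)
    var = n * a * b * (a + b + n) / ((a + b) ** 2 * (a + b + 1))
    skew = ((a + b + 2 * n) * (b - a) / (a + b + 2)
            * sqrt((1 + a + b) / (n * a * b * (n + a + b))))
    return mean, var, skew


def moments_by_summation(n, a, b):
    """Mean, variance and skewness computed directly from the probabilities."""
    p = [pmf(x, n, a, b) for x in range(n + 1)]
    mean = sum(x * px for x, px in enumerate(p))
    var = sum((x - mean) ** 2 * px for x, px in enumerate(p))
    third = sum((x - mean) ** 3 * px for x, px in enumerate(p))
    return mean, var, third / var ** 1.5


closed = moments_closed_form(12, 2.5, 4.0)
summed = moments_by_summation(12, 2.5, 4.0)
print(closed)
print(summed)  # should agree with the line above up to rounding
```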

Special cases

If $a = 1$ and $b = 1$, the distribution is a discrete uniform distribution with $P(X = x) = \frac{1}{n + 1}$, since the support contains $n + 1$ values.
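A short derivation makes this special case explicit. It uses the form of the probability function written with the beta function $B$, together with $B(1, 1) = 1$ and $\Gamma(k) = (k - 1)!$ for positive integers $k$:

```latex
\[
P(X = x)
  = \binom{n}{x} \frac{B(1 + x,\; 1 + n - x)}{B(1, 1)}
  = \frac{n!}{x!\,(n - x)!} \cdot \frac{x!\,(n - x)!}{(n + 1)!}
  = \frac{1}{n + 1},
  \qquad x = 0, 1, \dots, n .
\]
```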


The beta-binomial distribution is typically applied in cases where one would normally use a binomial distribution but cannot assume that all individual events have the same probability of occurring; instead, these probabilities are scattered in a more or less bell-shaped way around some value.

For example, if you want to know how many light bulbs will fail within the next 12 months but assume that the probability of failure differs from bulb to bulb, a beta-binomial distribution is appropriate.

Empirically, you can suspect a beta-binomial distribution, even where a binomial model would seem more natural, when the data scatter more than a binomial distribution would lead you to expect.
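The extra scatter can be illustrated with a small simulation. The sketch below makes a specific modelling assumption that the text does not spell out: each batch of $n$ trials shares one success probability, drawn once per batch from a beta distribution. The parameters, the seed and the function name `simulate_counts` are chosen only for illustration.

```python
import random
from statistics import mean, pvariance


def simulate_counts(n, a, b, batches, rng):
    """Draw one success probability per batch from Beta(a, b), then run n trials."""
    counts = []
    for _ in range(batches):
        p = rng.betavariate(a, b)  # shared by all n trials of this batch
        counts.append(sum(rng.random() < p for _ in range(n)))
    return counts


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n, a, b = 20, 2.0, 8.0  # illustrative values, not taken from the text
counts = simulate_counts(n, a, b, batches=20_000, rng=rng)

p_hat = mean(counts) / n
binomial_variance = n * p_hat * (1 - p_hat)  # what a binomial model predicts
dispersion_ratio = pvariance(counts) / binomial_variance
print(dispersion_ratio)  # clearly above 1 when the data are overdispersed
```

For these parameters the variance formula of the beta-binomial distribution implies a ratio of $(a + b + n)/(a + b + 1) = 30/11$, so the simulated ratio should land in that neighbourhood rather than near 1.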


Model in Bayesian statistics

An urn contains an unknown number of balls, and we know from other samples that the proportion of red balls is described by a beta distribution.

Balls are drawn $n$ times (with replacement). The probability that a red ball is drawn $x$ times is given by the beta-binomial distribution.
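The step from the urn model to the beta-binomial distribution can be written out as a mixture integral. In this sketch, $p$ denotes the unknown proportion of red balls and $\mathrm{Beta}(a, b)$ its distribution; the symbols $p$, $a$ and $b$ are notation assumed here for the derivation.

```latex
\[
P(X = x)
  = \int_0^1 \binom{n}{x} p^{x} (1 - p)^{n - x}
      \cdot \frac{p^{a - 1} (1 - p)^{b - 1}}{B(a, b)} \, dp
  = \binom{n}{x} \frac{1}{B(a, b)} \int_0^1 p^{a + x - 1} (1 - p)^{b + n - x - 1} \, dp
  = \binom{n}{x} \frac{B(a + x,\; b + n - x)}{B(a, b)} .
\]
```

The last step uses the definition of the beta function, $B(u, v) = \int_0^1 p^{u - 1} (1 - p)^{v - 1} \, dp$.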

Numerical example

Starting from complete ignorance, the prior distribution is described by a $\mathrm{Beta}(1, 1)$ distribution (other noninformative priors are possible alternatives). A "preliminary study" is carried out in which 15 balls are drawn (with replacement). One of these balls is red. The posterior distribution is therefore described by a $\mathrm{Beta}(1 + 1,\, 1 + 14) = \mathrm{Beta}(2, 15)$ distribution.
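The update from prior to posterior follows the usual conjugate rule for a beta prior with binomial data: add the number of successes to the first parameter and the number of failures to the second. A minimal sketch, assuming the uniform prior $\mathrm{Beta}(1, 1)$ as the description of complete ignorance; the function name `update_beta` is hypothetical.

```python
def update_beta(a, b, successes, trials):
    """Conjugate update: a Beta(a, b) prior plus binomial data gives a beta posterior."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return a + successes, b + (trials - successes)


# Preliminary study: 15 draws, 1 red ball, uniform prior Beta(1, 1) (assumed)
posterior = update_beta(1, 1, successes=1, trials=15)
print(posterior)  # (2, 15)
```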

The actual "study" consists of a drawing of 40 balls. We seek the probability that a red ball is drawn exactly twice.

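Since the number of red balls in this second drawing follows a beta-binomial distribution with $n = 40$, $a = 2$ and $b = 15$, the probability can be calculated as follows:

$$P(X = 2) = C \binom{40}{2} \Gamma(2 + 2)\, \Gamma(15 + 38)$$

in which

$$C = \frac{\Gamma(2 + 15)}{\Gamma(2)\, \Gamma(15)\, \Gamma(2 + 15 + 40)} = \frac{\Gamma(17)}{\Gamma(2)\, \Gamma(15)\, \Gamma(57)}$$

Since $\binom{40}{2} = 780$ and, in general, $\Gamma(k) = (k - 1)!$ for positive integers $k$, one obtains

$$P(X = 2) = 780 \cdot \frac{16! \cdot 3! \cdot 52!}{1! \cdot 14! \cdot 56!} = 780 \cdot \frac{16 \cdot 15 \cdot 6}{56 \cdot 55 \cdot 54 \cdot 53} \approx 0.1274$$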

This result differs significantly from the one that would have been calculated with a "simple" binomial distribution with $n = 40$ and $p = 1/15$, the proportion observed in the preliminary study. In that case the result would be approximately 0.252.
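Both probabilities can be checked with exact rational arithmetic. The sketch below rests on two assumptions that should be read as such: the posterior is taken to be $\mathrm{Beta}(2, 15)$, and the "simple" binomial model is taken to use $p = 1/15$, the proportion of red balls in the preliminary study. The function names are chosen for this example.

```python
from fractions import Fraction
from math import comb, factorial


def beta_int(a: int, b: int) -> Fraction:
    """Beta function B(a, b) for positive integers, using Gamma(k) = (k - 1)!."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))


def beta_binomial_pmf(x: int, n: int, a: int, b: int) -> Fraction:
    """Exact beta-binomial probability for integer parameters a and b."""
    return comb(n, x) * beta_int(a + x, b + n - x) / beta_int(a, b)


def binomial_pmf(x: int, n: int, p: Fraction) -> Fraction:
    """Exact binomial probability for a fixed success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Assumed inputs: posterior Beta(2, 15), 40 draws, exactly 2 red balls
p_beta_binomial = beta_binomial_pmf(2, 40, 2, 15)
# Assumed plug-in estimate p = 1/15 for the plain binomial model
p_binomial = binomial_pmf(2, 40, Fraction(1, 15))

print(float(p_beta_binomial))  # approximately 0.1274
print(float(p_binomial))       # approximately 0.252
```

Working with `Fraction` keeps the factorial ratios exact, so the only rounding happens in the final conversion to a float.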

The graph shows that the "simple" binomial distribution "allows" fewer outcomes than the beta-binomial distribution. This is because the Bayesian model does not neglect the fact that the "true" proportion of red balls is fundamentally unknown, so the results scatter more widely.