Exponential family

In probability theory and statistics, an exponential family is a class of probability distributions of a very specific form. This particular form is chosen for its computational convenience and for reasons of generality. Exponential families are, in a sense, very natural distributions. The concept goes back to E. J. G. Pitman, G. Darmois, and B. O. Koopman (1935–36).


Definitions

What follows is a series of increasingly general definitions of an exponential family.

Scalar parameters

A single-parameter exponential family is a set of probability distributions whose density function (or, in the discrete case, probability function) can be written in the following form:

f(x | θ) = h(x)·exp( η(θ)·T(x) − A(θ) )

The value θ is the parameter of the family.

Here x is often a vector of realizations of a random variable; in that case T is a function on the space of possible values of x.

The exponential family is in canonical form if η(θ) = θ. By defining a transformed parameter η = η(θ), it is always possible to bring an exponential family into canonical form. The canonical form is not unique: η(θ) may be multiplied by an arbitrary non-zero constant, provided T(x) is divided by the same constant.

An example, treated below, is the normal distribution with unknown expected value and known variance.
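As a minimal numerical sketch (not from the original article), one can check that the Poisson distribution fits this single-parameter form, with natural parameter η = ln λ, sufficient statistic T(x) = x, A(λ) = λ, and base measure h(x) = 1/x!:

```python
import math

# Sketch: verify numerically that the Poisson pmf matches the
# exponential-family factorization h(x) * exp(eta * T(x) - A),
# with eta = log(lambda), T(x) = x, A = lambda, h(x) = 1 / x!.

def poisson_pmf(x, lam):
    return lam**x * math.exp(-lam) / math.factorial(x)

def poisson_exp_family(x, lam):
    eta = math.log(lam)          # natural parameter
    T = x                        # sufficient statistic
    A = lam                      # log-normalizer (as a function of eta: exp(eta))
    h = 1.0 / math.factorial(x)  # base measure
    return h * math.exp(eta * T - A)

lam = 2.5
for x in range(10):
    assert abs(poisson_pmf(x, lam) - poisson_exp_family(x, lam)) < 1e-12
```

The same pattern (identify η, T, A, and h, then compare against the standard density) applies to every example in this article.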

Vectorial parameters

The single-parameter definition can be extended to a vector parameter θ = (θ1, ..., θs). A family of distributions belongs to an exponential family if the density function (or probability function in the discrete case) can be written in the following form:

f(x | θ) = h(x)·exp( Σ_{i=1}^{s} ηi(θ)·Ti(x) − A(θ) )

As in the case of a scalar parameter, one speaks of an exponential family in canonical form if and only if ηi(θ) = θi for all i.

Below you will find an example of a normal distribution with unknown mean and unknown variance.

Measure-theoretical formulation

We use distribution functions in order to treat the discrete and the continuous case at the same time.

Let H be a non-decreasing function of a real variable such that H(x) tends to 0 as x tends to −∞. Then Lebesgue–Stieltjes integrals with respect to dH(x) are integrals with respect to the "reference measure" of the exponential family generated by H.

All members of this exponential family have a (cumulative) distribution function of the form

dF(x | η) = e^{η·T(x) − A(η)} dH(x)

If F is a continuous distribution with a density, one can write dF(x) = f(x) dx.

H(x) is a Lebesgue–Stieltjes integrator for the reference measure. If the reference measure is finite, it can be normalized, and H is then the (cumulative) distribution function of a probability distribution. If F is continuous with a density function, then the same holds for H, and one can write dH(x) = h(x) dx. If F is discrete, then H is a step function (with jumps on the support of F).

Interpretation

The functions T, η, and A in the definitions above may appear arbitrary. However, they play an important role in the resulting probability distribution.

  • T(x) is a sufficient statistic of the distribution. Thus, for exponential families there exists a sufficient statistic whose dimension equals the number of parameters to be estimated. This important property is considered in more detail below.
  • η is referred to as the natural parameter. The set of values of η for which the function f(x | η) is finite is called the natural parameter space. It can be shown that the natural parameter space is always convex.
  • A(η) is a normalization factor, without which f would not be a probability distribution. The function A is itself important: in cases where the reference measure dH(x) is a probability measure (alternatively, where h(x) is a probability density), A is the cumulant-generating function of the probability distribution of the sufficient statistic T(x) when x is distributed according to dH(x).
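The last point can be illustrated with a small numerical sketch (using the Poisson family, which is not worked out in this section): in canonical form the Poisson log-normalizer is A(η) = e^η, and its derivatives at the natural parameter give the cumulants of T(x) = x, namely the mean and the variance, both equal to λ:

```python
import math

# Sketch: for the Poisson family in canonical form, A(eta) = exp(eta).
# Finite differences of A at eta = log(lambda) recover the cumulants of
# the sufficient statistic T(x) = x: A'(eta) = mean, A''(eta) = variance.
lam = 3.0
eta = math.log(lam)
eps = 1e-5
A = math.exp
mean = (A(eta + eps) - A(eta - eps)) / (2 * eps)            # approximately lambda
var = (A(eta + eps) - 2 * A(eta) + A(eta - eps)) / eps**2   # approximately lambda
assert abs(mean - lam) < 1e-4
assert abs(var - lam) < 1e-3
```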

Examples

The normal, exponential, gamma, chi-square, beta, Dirichlet, Bernoulli, binomial, multinomial, Poisson, negative binomial and geometric distributions are all exponential families. The Cauchy, Laplace, uniform and Weibull distributions are not exponential families.

In the following we consider some distributions and how they can be written in the representation of exponential families.

Normal Distribution: unknown expected value, variance known

In the first example, we assume that x is normally distributed with unknown mean μ and known variance 1. The density is then

f(x | μ) = (1/√(2π))·exp( −(x − μ)²/2 )

One can see that this is a single-parameter exponential family in canonical form if one defines

h(x) = (1/√(2π))·e^{−x²/2},  T(x) = x,  η(μ) = μ,  A(μ) = μ²/2
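A quick numerical sketch confirms this factorization of the N(μ, 1) density:

```python
import math

# Sketch: check that the N(mu, 1) density equals
# h(x) * exp(eta * T(x) - A(eta)) with h(x) = exp(-x**2 / 2) / sqrt(2*pi),
# T(x) = x, eta = mu, A(eta) = eta**2 / 2.
def normal_pdf(x, mu):
    return math.exp(-(x - mu) ** 2 / 2) / math.sqrt(2 * math.pi)

def exp_family_form(x, mu):
    h = math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)
    return h * math.exp(mu * x - mu**2 / 2)

for x in (-1.0, 0.0, 2.5):
    assert abs(normal_pdf(x, 0.7) - exp_family_form(x, 0.7)) < 1e-12
```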

Normal Distribution: unknown expected value, variance unknown

Next, we consider a normal distribution with unknown mean μ and unknown variance σ². The density is then

f(x | μ, σ²) = (1/√(2πσ²))·exp( −(x − μ)²/(2σ²) )

This is an exponential family, as one sees by defining

h(x) = 1/√(2π),  T(x) = (x, x²),  η(μ, σ²) = (μ/σ², −1/(2σ²)),  A(μ, σ²) = μ²/(2σ²) + ln σ
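The two-parameter factorization can likewise be checked numerically:

```python
import math

# Sketch: check the two-parameter exponential-family form of N(mu, sigma2)
# with h(x) = 1/sqrt(2*pi), T(x) = (x, x**2),
# eta = (mu / sigma2, -1 / (2 * sigma2)), A = mu**2 / (2 * sigma2) + log(sigma).
def normal_pdf(x, mu, sigma2):
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def exp_family_form(x, mu, sigma2):
    eta1, eta2 = mu / sigma2, -1.0 / (2 * sigma2)
    A = mu**2 / (2 * sigma2) + 0.5 * math.log(sigma2)  # log(sigma)
    h = 1.0 / math.sqrt(2 * math.pi)
    return h * math.exp(eta1 * x + eta2 * x**2 - A)

for x in (-2.0, 0.5, 3.0):
    assert abs(normal_pdf(x, 1.2, 0.8) - exp_family_form(x, 1.2, 0.8)) < 1e-12
```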

Binomial distribution

As an example of a discrete exponential family, we consider the binomial distribution. Its probability function is

f(x | p) = C(n, x)·p^x·(1 − p)^{n−x},  x = 0, 1, ..., n

This can also be written as

f(x | p) = C(n, x)·exp( x·ln(p/(1 − p)) + n·ln(1 − p) )

which shows that the binomial distribution is an exponential family. Its natural parameter is

η = ln( p/(1 − p) )
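A short sketch verifying this rewriting of the binomial probability function, with the logit as natural parameter:

```python
import math

# Sketch: check that the binomial pmf equals
# C(n, x) * exp(x * eta + n * log(1 - p)) with natural parameter
# eta = log(p / (1 - p)).
def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def exp_family_form(x, n, p):
    eta = math.log(p / (1 - p))
    return math.comb(n, x) * math.exp(x * eta + n * math.log(1 - p))

for x in range(11):
    assert abs(binom_pmf(x, 10, 0.3) - exp_family_form(x, 10, 0.3)) < 1e-12
```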

Role in the statistics

Classical estimation: sufficiency

By the Pitman–Koopman–Darmois theorem, among families of probability distributions whose support does not depend on the parameter, only exponential families admit sufficient statistics whose dimension remains bounded as the sample size grows. In a little more detail: let Xn, n = 1, 2, 3, ..., be independent and identically distributed random variables whose distribution is known to belong to some family. Only if that family is an exponential family does there exist a (possibly vector-valued) sufficient statistic T(X1, ..., Xn) whose number of scalar components does not grow as the sample size n increases.
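As an illustration (using the Poisson family, an example not in the original text): for i.i.d. Poisson data, T = Σ xi is sufficient, so two samples of the same size with the same sum carry the same information about λ. Concretely, the log-likelihood ratio between any two parameter values depends on the data only through T:

```python
import math

# Sketch: for i.i.d. Poisson data, the sufficient statistic is T = sum(x_i).
# Two samples with the same size and sum give identical log-likelihood
# ratios between any two parameter values.
def loglik(xs, lam):
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in xs)

xs1 = [1, 2, 3, 4]   # sum = 10
xs2 = [0, 5, 5, 0]   # sum = 10, same sample size
r1 = loglik(xs1, 2.0) - loglik(xs1, 3.0)
r2 = loglik(xs2, 2.0) - loglik(xs2, 3.0)
assert abs(r1 - r2) < 1e-12
```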

Bayesian estimation: conjugate distributions

Exponential families are important for Bayesian statistics. In Bayesian statistics, a prior probability distribution is multiplied by a likelihood function and then normalized to obtain the posterior probability distribution. If the likelihood belongs to an exponential family, there exists a conjugate prior, which is itself often an exponential family. A conjugate prior π for the parameter η of an exponential family is defined by

π(η | χ, ν) = f(χ, ν)·exp( η·χ − ν·A(η) )

where χ and ν are hyperparameters (parameters controlling the parameters).

A conjugate prior is a prior which, combined with the likelihood and the normalization term, yields a posterior distribution of the same type as the prior. For example, one can choose a beta distribution as the prior when estimating the success probability of a binomial distribution. Since the beta distribution is conjugate to the binomial distribution, the posterior is again a beta distribution. The use of conjugate priors simplifies the computation of the posterior distribution.
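The beta-binomial example reduces to a simple parameter update, sketched here: a Beta(a, b) prior combined with x successes in n trials yields a Beta(a + x, b + n − x) posterior.

```python
# Sketch: conjugate-prior mechanics for the binomial likelihood.
# Prior Beta(a, b) on the success probability, data: x successes in
# n trials -> posterior Beta(a + x, b + n - x).
def beta_binomial_update(a, b, x, n):
    return a + x, b + n - x

a_post, b_post = beta_binomial_update(2.0, 2.0, 7, 10)
posterior_mean = a_post / (a_post + b_post)  # (a + x) / (a + b + n)
```

No numerical integration is needed; the update acts only on the hyperparameters, which is precisely the computational convenience that conjugacy provides.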

In general, the likelihood does not belong to an exponential family, and then there is generally no conjugate prior distribution. The posterior must then be computed by numerical methods.

Hypotheses tests: uniformly most powerful test

The single-parameter exponential family has a monotone non-decreasing likelihood ratio in the sufficient statistic T(x), provided that η(θ) is non-decreasing. As a consequence, a uniformly most powerful test exists for testing the hypothesis H0: θ ≥ θ0 against H1: θ < θ0.
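The monotone likelihood ratio property can be checked numerically for the binomial family (an example chosen here for illustration): for p2 > p1 the ratio f(x | p2)/f(x | p1) is increasing in T(x) = x.

```python
import math

# Sketch: monotone likelihood ratio for the binomial family.
# For p2 > p1, f(x | p2) / f(x | p1) = c * r**x with r > 1,
# hence increasing in the sufficient statistic T(x) = x.
def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

n, p1, p2 = 10, 0.3, 0.6
ratios = [binom_pmf(x, n, p2) / binom_pmf(x, n, p1) for x in range(n + 1)]
assert all(r1 < r2 for r1, r2 in zip(ratios, ratios[1:]))
```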
