Chi-squared distribution

The chi-squared distribution (χ² distribution) is a continuous probability distribution on the set of non-negative real numbers. The term "chi-squared distribution" usually refers to the central chi-squared distribution. Its only parameter, n, must be a natural number and is called the number of degrees of freedom.

It is one of the distributions that can be derived from the normal distribution. Given n random variables that are independent and standard normally distributed, the chi-squared distribution with n degrees of freedom is defined as the distribution of the sum of the squared random variables. Such sums of squared random variables occur when estimating the variance of a sample. The chi-squared distribution therefore makes it possible, among other things, to judge the compatibility of a presumed functional relationship (as a function of time, temperature, pressure, etc.) with empirically determined measuring points. Can a straight line explain the data, for example, or is a parabola or perhaps a logarithm needed? One chooses among different models, and the one with the best goodness of fit, i.e. the smallest χ² value, offers the best explanation of the data. The χ² distribution thus puts the selection among different explanatory models on a numerical basis by quantifying the random fluctuations. It also allows, once the sample variance has been determined, the estimation of a confidence interval that contains the (unknown) value of the variance of the population with a specified probability. These and other applications are described below and in the article on the chi-squared test.

It was introduced in 1876 by Friedrich Robert Helmert; the name comes from Karl Pearson (1900).


Definition

The chi-squared distribution with n degrees of freedom describes the distribution of the sum of n squared, stochastically independent, standard normally distributed random variables X₁, …, Xₙ:

Z = X₁² + X₂² + ⋯ + Xₙ², Z ~ χ²(n).

The sign ~ is shorthand for "is distributed as". The sum of squared variables cannot assume negative values.

In contrast, the simple sum X₁ + X₂ + ⋯ + Xₙ has a distribution that is symmetric about the zero point.
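
As a quick illustration of the definition, the following minimal sketch (the helper name is my own) draws chi-squared samples by literally summing squared standard normal draws, and checks them against the moments n and 2n stated in the Properties section.

```python
import random
import statistics

def chi2_sample(n, size, seed=0):
    """Draw `size` realizations of a chi-squared variable with n degrees of
    freedom by summing n squared standard normal draws."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) for _ in range(size)]

samples = chi2_sample(5, 20000)
mean = statistics.fmean(samples)             # should be close to n = 5
var = statistics.variance(samples)           # should be close to 2n = 10
all_nonneg = all(s >= 0.0 for s in samples)  # a sum of squares is never negative
```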

Density

The density f of the χ² distribution with n degrees of freedom has the form

f(x) = x^(n/2 − 1) e^(−x/2) / (2^(n/2) Γ(n/2)) for x > 0, and f(x) = 0 otherwise.

Here Γ stands for the gamma function. The values of Γ(n/2) can also be calculated from Γ(1/2) = √π, Γ(1) = 1 and the recursion Γ(x + 1) = x Γ(x).
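
The density formula can be checked numerically. The sketch below (function name is mine) implements it with the standard library's gamma function, confirms that for n = 2 it reduces to an exponential density, and verifies that it integrates to approximately 1.

```python
from math import exp, gamma

def chi2_pdf(x, n):
    """Density of the chi-squared distribution with n degrees of freedom."""
    if x <= 0.0:
        return 0.0
    return x ** (n / 2 - 1) * exp(-x / 2) / (2 ** (n / 2) * gamma(n / 2))

# For n = 2 the density reduces to the exponential density 0.5 * exp(-x/2).
p2 = chi2_pdf(2.0, 2)

# Midpoint-rule integration over [0, 50] should capture almost all the mass.
h = 0.01
total = sum(chi2_pdf((k + 0.5) * h, 3) for k in range(5000)) * h
```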

Distribution function

The distribution function can be written with the help of the regularized incomplete gamma function P:

F(x) = P(n/2, x/2).

When n is a natural number, the distribution function can be represented (more or less) elementarily. For even n,

F(x) = 1 − e^(−x/2) Σ_{k=0}^{n/2−1} (x/2)^k / k!,

and for odd n,

F(x) = erf(√(x/2)) − e^(−x/2) Σ_{k=1}^{(n−1)/2} (x/2)^(k−1/2) / Γ(k + 1/2).

Here erf denotes the error function. The distribution function describes the probability that χ² lies in the interval [0, x].
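
These elementary forms are easy to verify in code. The sketch below (helper names are mine) implements the even-n series and the n = 1 case, and checks them at points where the answer is known in closed form.

```python
from math import erf, exp, factorial, sqrt

def chi2_cdf_even(x, n):
    """F(x) = 1 - exp(-x/2) * sum_{k=0}^{n/2-1} (x/2)^k / k!  (n even)."""
    assert n % 2 == 0
    return 1.0 - exp(-x / 2) * sum((x / 2) ** k / factorial(k) for k in range(n // 2))

def chi2_cdf_1(x):
    """F(x) = erf(sqrt(x/2)) for a single degree of freedom."""
    return erf(sqrt(x / 2))

F2 = chi2_cdf_even(2.0, 2)        # chi2(2) is Exp(1/2), so F(2) = 1 - exp(-1)
F1_median = chi2_cdf_1(0.454936)  # 0.4549... is approximately the chi2(1) median
```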

Properties

Expected value

The expected value of the chi-squared distribution with n degrees of freedom is

E(χ²) = n.

Assuming a standard normally distributed population, the value χ²/n should therefore be near 1 if the population variance has been estimated correctly.

Variance

The variance of the chi-squared distribution with n degrees of freedom is

Var(χ²) = 2n.

Mode

The mode of the chi-squared distribution with n degrees of freedom is n − 2 for n ≥ 2.

Skew

The skewness γ of the chi-squared distribution with n degrees of freedom is

γ = √(8/n).

The chi-squared distribution has a positive skewness, i.e. it is steep on the left and skewed to the right. The higher the number of degrees of freedom n, the less skewed the distribution is.

Kurtosis

The kurtosis (curvature) of the chi-squared distribution with n degrees of freedom is given by

β₂ = 3 + 12/n.

The excess over the normal distribution is therefore γ₂ = β₂ − 3 = 12/n. Hence, the higher the number of degrees of freedom, the smaller the excess.
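
The four moment formulas above can be cross-checked against the raw moments E[X^k] = n(n + 2)⋯(n + 2k − 2) of the χ² distribution. The small sketch below (helper name is mine) does exactly that.

```python
from math import sqrt

def chi2_moments(n):
    """Mean, variance, skewness and kurtosis of chi2(n), derived from the
    raw moments E[X^k] = n (n + 2) ... (n + 2k - 2)."""
    m1 = n
    m2 = n * (n + 2)
    m3 = n * (n + 2) * (n + 4)
    m4 = n * (n + 2) * (n + 4) * (n + 6)
    var = m2 - m1 ** 2                                   # second central moment
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3                 # third central moment
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return m1, var, mu3 / var ** 1.5, mu4 / var ** 2

mean5, var5, skew5, kurt5 = chi2_moments(5)
# Closed forms: mean n, variance 2n, skewness sqrt(8/n), kurtosis 3 + 12/n.
ok = (mean5 == 5 and var5 == 10
      and abs(skew5 - sqrt(8 / 5)) < 1e-12
      and abs(kurt5 - (3 + 12 / 5)) < 1e-12)
```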

Characteristic function

The characteristic function has the form

φ(t) = (1 − 2it)^(−n/2).

Entropy

The entropy of the chi-squared distribution (expressed in nats) is

H = n/2 + ln(2 Γ(n/2)) + (1 − n/2) ψ(n/2),

where ψ(p) denotes the digamma function.

Sum of χ2 - distributed random variables

If X₁, …, X_m are independent χ²-distributed random variables with n₁, …, n_m degrees of freedom, then

X₁ + ⋯ + X_m ~ χ²(n₁ + ⋯ + n_m).

This follows because all of the underlying standard normal random variables are independent, and therefore their combined sum of squares is again χ²-distributed. The chi-squared distribution is thus reproductive.
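
Reproductivity is easy to observe in simulation. In the hypothetical sketch below, sums of independent χ²(3) and χ²(4) draws show the mean 7 and variance 14 expected of χ²(7).

```python
import random
import statistics

rng = random.Random(42)

def chi2_draw(n):
    """One chi-squared draw with n degrees of freedom, built from normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))

# chi2(3) + chi2(4) should behave exactly like chi2(7):
sums = [chi2_draw(3) + chi2_draw(4) for _ in range(20000)]
mean7 = statistics.fmean(sums)      # expected value of chi2(7) is 7
var7 = statistics.variance(sums)    # variance of chi2(7) is 14
```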

Non- central chi -square distribution

If the normal random variables are not centred at expected value 0 (i.e., if not all μᵢ = 0), the non-central chi-squared distribution is obtained. In addition to n, it has the non-centrality parameter λ as a second parameter.

Let Xᵢ ~ N(μᵢ, 1), i = 1, …, n, be independent; then

X₁² + ⋯ + Xₙ² ~ χ²(n, λ) with λ = μ₁² + ⋯ + μₙ².

In particular, it follows that χ²(n) = χ²(n, 0), and that for independent X ~ χ²(n − 1) and Z ~ N(μ, 1) one has X + Z² ~ χ²(n, μ²).

A second way to create a non-central chi-squared distribution is as a mixture distribution of central chi-squared distributions: a variable drawn from

χ²(n + 2m)

follows the non-central distribution χ²(n, λ) if m is drawn from a Poisson distribution with parameter λ/2.
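
Both constructions can be compared by simulation. In the sketch below (all names and parameter values are mine), the non-central χ² is drawn once directly from shifted normals and once via the Poisson mixture of central χ² distributions; both should match the expected value n + λ.

```python
import random
import statistics
from math import exp

rng = random.Random(7)

def poisson_draw(lam):
    """Poisson draw via inversion; adequate for small lam."""
    limit, k, p = exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

mus = [1.0, 0.5, -0.5]           # expected values of the normals
n = len(mus)
lam = sum(m * m for m in mus)    # non-centrality lambda = 1.5

# Construction 1: sum of squared non-centred normals.
direct = [sum(rng.gauss(m, 1.0) ** 2 for m in mus) for _ in range(20000)]
# Construction 2: central chi2 with n + 2m dof, m ~ Poisson(lambda / 2).
mixture = [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n + 2 * poisson_draw(lam / 2)))
           for _ in range(20000)]

mean_direct = statistics.fmean(direct)    # theory: n + lam = 4.5
mean_mixture = statistics.fmean(mixture)  # theory: n + lam = 4.5
```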

Density function

The density function of the non-central chi-squared distribution is

f(x) = e^(−λ/2) Σ_{j=0}^∞ ((λ/2)^j / j!) f_{n+2j}(x) for x ≥ 0 (and 0 otherwise),

where f_{n+2j} denotes the density of the central chi-squared distribution with n + 2j degrees of freedom.

The density function can alternatively be represented with the help of the modified Bessel function of the first kind I_q:

f(x) = (1/2) e^(−(x+λ)/2) (x/λ)^((n−2)/4) I_{(n−2)/2}(√(λx)).

The expected value and variance of the non-central chi-squared distribution, n + λ and 2n + 4λ, pass over into the corresponding expressions of the central chi-squared distribution as λ → 0, just as the density does.

Distribution function

The distribution function of the non-central chi-squared distribution can be expressed using the Marcum Q-function Q_M(a, b).

Example

Suppose n measurements of a quantity are made, drawn from a normally distributed population. Let x̄ be the arithmetic mean of the measured values and

s² = (1/(n − 1)) Σᵢ (xᵢ − x̄)²

the sample variance. Then, for example, the 95 % confidence interval for the population variance σ² can be specified:

(n − 1)s² / χ²_{0.975} ≤ σ² ≤ (n − 1)s² / χ²_{0.025},

where the quantile χ²_a is determined by F(χ²_a) = a for n − 1 degrees of freedom. The limits arise from the fact that (n − 1)s²/σ² is distributed as χ²(n − 1).
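
This interval can be computed without statistical libraries. The sketch below (data and helper names are made up for illustration) builds the χ² distribution function from the power series of the regularized incomplete gamma function, inverts it by bisection, and applies the interval formula to a sample of ten values.

```python
from math import exp, lgamma, log

def gammainc_p(s, x, tol=1e-14):
    """Lower regularized incomplete gamma function P(s, x), via power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / s
    total = term
    k = 1
    while term > tol * total:
        term *= x / (s + k)
        total += term
        k += 1
    return total * exp(s * log(x) - x - lgamma(s))

def chi2_cdf(x, n):
    return gammainc_p(n / 2, x / 2)

def chi2_ppf(p, n):
    """p-quantile by bisection on the distribution function."""
    lo, hi = 0.0, 1000.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Made-up sample of n = 10 measurements:
data = [9.8, 10.2, 10.1, 9.7, 10.4, 9.9, 10.0, 10.3, 9.6, 10.0]
n = len(data)
xbar = sum(data) / n
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)

# 95 % confidence interval for the population variance sigma^2:
ci_low = (n - 1) * s2 / chi2_ppf(0.975, n - 1)
ci_high = (n - 1) * s2 / chi2_ppf(0.025, n - 1)
```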

Derivation of the distribution of the sample variance

Let x₁, …, xₙ be a sample of n readings, drawn from a normally distributed random variable, with the arithmetic mean

x̄ = (1/n) Σᵢ xᵢ

and the sample variance

s² = (1/(n − 1)) Σᵢ (xᵢ − x̄)²

as estimators for the mean and variance of the population.

Then it can be shown that (n − 1)s²/σ² is distributed as χ²(n − 1).

To see this, the xᵢ are transformed into new variables yᵢ by means of an orthonormal linear combination according to Helmert. The transformation is

y₁ = (1/√n)(x₁ + ⋯ + xₙ),
y_k = (x₁ + ⋯ + x_{k−1} − (k − 1)x_k) / √(k(k − 1)) for k = 2, …, n.

The new variables yᵢ are likewise normally distributed, independent, and have the same variance σ², but y₂, …, yₙ have expected value 0 (while y₁ has expected value √n μ), both consequences of the convolution invariance of the normal distribution.

Moreover, the coefficients a_{ki} of the transformation satisfy Σᵢ a_{ki} a_{li} = δ_{kl} (the Kronecker delta) because of the orthonormality, and thus

Σ_{i=1}^n yᵢ² = Σ_{i=1}^n xᵢ².

Therefore, since y₁² = n x̄², one now obtains

Σ_{i=2}^n yᵢ² = Σ_{i=1}^n xᵢ² − n x̄² = Σ_{i=1}^n (xᵢ − x̄)² = (n − 1)s²,

and finally, after division by σ²,

Σ_{i=2}^n (yᵢ/σ)² = (n − 1)s²/σ².

The expression on the left is evidently distributed as a sum of squared standard normal variables with n − 1 independent summands, as required for χ²(n − 1).

Thus, by definition, (n − 1)s²/σ² ~ χ²(n − 1); the chi-squared sum has n − 1 degrees of freedom. One degree of freedom is 'consumed' here because, in contrast to the population mean μ, the calculated mean x̄ depends on the xᵢ.
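
A simulation makes the result plausible. In the sketch below (parameters chosen arbitrarily), the quantity (n − 1)s²/σ² computed from normal samples shows the mean n − 1 and variance 2(n − 1) of a χ²(n − 1) distribution.

```python
import random
import statistics

rng = random.Random(1)
n, mu, sigma = 8, 3.0, 2.0   # arbitrary sample size and population parameters

def scaled_sample_variance():
    """One realization of (n - 1) s^2 / sigma^2 from a normal sample."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    return (n - 1) * s2 / sigma ** 2

vals = [scaled_sample_variance() for _ in range(20000)]
m = statistics.fmean(vals)      # theory: n - 1 = 7
v = statistics.variance(vals)   # theory: 2(n - 1) = 14
```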

Relationship to other distributions

Relationship to the gamma distribution

The chi-squared distribution is a special case of the gamma distribution. With the shape–rate parametrization,

χ²(n) = Gamma(n/2, 1/2),

i.e. a gamma distribution with shape parameter n/2 and rate 1/2 (equivalently, scale parameter 2).
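
The identity can be verified pointwise by comparing the two densities. In the sketch below (function names are mine), both are written out with the standard library's gamma function.

```python
from math import exp, gamma as Gamma

def chi2_pdf(x, n):
    """Chi-squared density with n degrees of freedom (x > 0)."""
    return x ** (n / 2 - 1) * exp(-x / 2) / (2 ** (n / 2) * Gamma(n / 2))

def gamma_pdf(x, shape, scale):
    """Gamma density with shape k and scale theta (x > 0)."""
    return x ** (shape - 1) * exp(-x / scale) / (Gamma(shape) * scale ** shape)

# chi2(5) should coincide with Gamma(shape = 5/2, scale = 2) everywhere:
diffs = [abs(chi2_pdf(x, 5) - gamma_pdf(x, 2.5, 2.0)) for x in (0.5, 1, 2, 5, 10)]
```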

Relation to the normal distribution

  • The sum X₁² + ⋯ + Xₙ² of n squared independent standard normal random variables satisfies a chi-squared distribution with n degrees of freedom.
  • For n > 30 (approximately), √(2χ²) − √(2n − 1) is approximately standard normally distributed.
  • For large n the random variable χ² is approximately normally distributed, with mean n and standard deviation √(2n), or, in the case of a non-central chi-squared distribution, with expected value n + λ and standard deviation √(2n + 4λ).

Relationship to the exponential distribution

A chi-squared distribution with 2 degrees of freedom is an exponential distribution with parameter λ = 1/2.

Relationship with the Erlang distribution

A chi-squared distribution with 2n degrees of freedom is identical to an Erlang distribution with n degrees of freedom and parameter λ = 1/2.

Relationship to the F-distribution

If X ~ χ²(m) and Y ~ χ²(n) are independent chi-squared distributed random variables with m and n degrees of freedom, then the quotient

F = (X/m) / (Y/n)

is a random variable which satisfies the F-distribution with (m, n) degrees of freedom.
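
A quick simulation with arbitrarily chosen degrees of freedom illustrates the construction: the mean of the simulated ratios should approach n/(n − 2), the known expected value of the F-distribution.

```python
import random
import statistics

rng = random.Random(5)

def chi2_draw(n):
    """One chi-squared draw with n degrees of freedom, built from normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))

m, n = 5, 12   # arbitrary degrees of freedom
ratios = [(chi2_draw(m) / m) / (chi2_draw(n) / n) for _ in range(20000)]
mean_f = statistics.fmean(ratios)   # theory for F(m, n): n / (n - 2) = 1.2
```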

Relationship to the Poisson distribution

The distribution functions of the Poisson distribution and the χ² distribution are related in the following manner: the probability of finding n or more events in an interval within which λ events are expected on average equals the probability that χ²(2n) ≤ 2λ. Namely,

1 − F_Poisson(n − 1; λ) = F_{χ²(2n)}(2λ) = P(n, λ) = 1 − Q(n, λ),

with P and Q denoting the regularized gamma functions.
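
The identity can be confirmed numerically. The sketch below (helper names are mine) evaluates the Poisson tail probability and the elementary χ² distribution function for even degrees of freedom at n = 4, λ = 2.5.

```python
from math import exp, factorial

def poisson_sf(n, lam):
    """P(N >= n) for N ~ Poisson(lam)."""
    return 1.0 - sum(exp(-lam) * lam ** k / factorial(k) for k in range(n))

def chi2_cdf_even(x, dof):
    """Elementary CDF of chi2 with an even number dof of degrees of freedom."""
    return 1.0 - exp(-x / 2) * sum((x / 2) ** k / factorial(k) for k in range(dof // 2))

# P(N >= n | lam = 2.5) should equal P(chi2(2n) <= 2 * lam) for n = 4:
lhs = poisson_sf(4, 2.5)
rhs = chi2_cdf_even(5.0, 8)
```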

Relationship to the continuous uniform distribution

For even n = 2m, the χ² distribution can be formed as an m-fold convolution with the help of the continuous uniform density:

χ²(2m) = −2 ln(u₁ u₂ ⋯ u_m) = −2 Σ_{i=1}^m ln uᵢ,

where the uᵢ are independent random variables, uniformly distributed on (0, 1). For odd n, on the other hand,

χ²(n) = χ²(n − 1) + Z², with Z ~ N(0, 1) independent of the uᵢ.
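
The convolution representation for even n can be tested by simulation. The sketch below draws −2 Σ ln uᵢ with m = 3 and checks the first two moments against those of χ²(6).

```python
import random
import statistics
from math import log

rng = random.Random(3)
m = 3   # gives chi2 with n = 2m = 6 degrees of freedom

vals = [-2.0 * sum(log(rng.random()) for _ in range(m)) for _ in range(20000)]
mean6 = statistics.fmean(vals)    # theory: n = 6
var6 = statistics.variance(vals)  # theory: 2n = 12
```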

Derivation of the density function

The density of the random variable Z = X₁² + ⋯ + Xₙ², with the Xᵢ independent and standard normally distributed, results from the joint density of the random variables X₁, …, Xₙ. This joint density is the n-fold product of the standard normal density:

f(x₁, …, xₙ) = (2π)^(−n/2) e^(−(x₁² + ⋯ + xₙ²)/2).

For the desired density f(z) one has

f(z) = lim_{h→0} (1/h) P(z ≤ Z ≤ z + h) = lim_{h→0} (1/h) ∫_K (2π)^(−n/2) e^(−(x₁² + ⋯ + xₙ²)/2) dx₁ ⋯ dxₙ

with K = {x : z ≤ x₁² + ⋯ + xₙ² ≤ z + h}.

In the limit, the sum in the argument of the exponential equals z; the exponential can therefore be taken outside the integral and the limit:

f(z) = (2π)^(−n/2) e^(−z/2) lim_{h→0} (1/h) ∫_K dx₁ ⋯ dxₙ.

The remaining integral

∫_K dx₁ ⋯ dxₙ = V_n(√(z + h)) − V_n(√z)

corresponds to the volume of the shell between the sphere with radius √(z + h) and the sphere with radius √z, where

V_n(R) = π^(n/2) R^n / Γ(n/2 + 1)

denotes the volume of the n-dimensional sphere with radius R.

It follows that

lim_{h→0} (1/h) [V_n(√(z + h)) − V_n(√z)] = d/dz V_n(√z) = π^(n/2) z^(n/2 − 1) / Γ(n/2),

and after insertion into the expression for the required density:

f(z) = z^(n/2 − 1) e^(−z/2) / (2^(n/2) Γ(n/2)).

Quantile

The p-quantile x_p of the χ² distribution is the solution of the equation p = F(x_p), calculated in principle via the inverse function. Specifically, here

x_p = 2 P^(−1)(n/2, p),

with P^(−1) the inverse of the regularized incomplete gamma function. The value x_p is entered in the quantile table under the coordinates p and n.

Quantile function for small sample size

For a few values of n (1, 2, 4), the quantile function can alternatively be specified in closed form:

n = 1: x_p = 2 (erf^(−1)(p))²,
n = 2: x_p = −2 ln(1 − p),
n = 4: x_p = −2 (1 + W_{−1}((p − 1)/e)),

where erf^(−1) denotes the inverse error function, W_{−1} the lower branch of the Lambert W function, and e Euler's number.
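
These closed forms can be checked by substituting them back into the elementary distribution functions. In the sketch below (helper names are mine), the inverse error function and the lower Lambert W branch are obtained by simple bisection, since the Python standard library provides neither.

```python
from math import erf, exp, log, sqrt

def erfinv(p):
    """Inverse error function on (0, 1), by bisection."""
    lo, hi = 0.0, 6.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if erf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def lambert_w_lower(y):
    """Lower branch W_{-1}(y) for y in (-1/e, 0): solves w*exp(w) = y, w <= -1.
    w*exp(w) is monotone decreasing on (-inf, -1], so bisection works."""
    lo, hi = -700.0, -1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid * exp(mid) > y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def ppf_n1(p):
    return 2.0 * erfinv(p) ** 2

def ppf_n2(p):
    return -2.0 * log(1.0 - p)

def ppf_n4(p):
    return -2.0 * (1.0 + lambert_w_lower((p - 1.0) / exp(1.0)))

# Substituting back into the elementary distribution functions:
p = 0.3
check1 = erf(sqrt(ppf_n1(p) / 2))              # F for n = 1
check2 = 1.0 - exp(-ppf_n2(p) / 2)             # F for n = 2
x4 = ppf_n4(p)
check4 = 1.0 - exp(-x4 / 2) * (1.0 + x4 / 2)   # F for n = 4
```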

Approximation of quantile function for fixed probabilities

For certain fixed probabilities p, the corresponding quantiles x_p can be approximated by a simple function of the sample size n, with the parameters taken from the table (sgn denotes the signum function, which simply represents the sign of its argument). The comparison with a quantile table shows a relative error of less than 0.4 %, dropping below 0.1 % for larger n. Since the χ² distribution for large n passes into a normal distribution with standard deviation √(2n), the parameter a from the table, which was fitted freely here, is at the corresponding probability p approximately √2 times the p-quantile of the normal distribution, √2 erf^(−1)(2p − 1), where erf^(−1) denotes the inverse function of the error function.

The 95 % confidence interval for the variance from the Example section can thus, for instance, be represented graphically in a simple way as a function of n by means of the two functions from the corresponding rows of the table.

The median is found in the column of the table with probability p = 1/2.
