Chi-squared distribution
The chi-square distribution (χ² distribution) is a continuous probability distribution on the set of non-negative real numbers. Usually, "chi-square distribution" refers to the central chi-square distribution. Its only parameter n must be a natural number and is called the number of degrees of freedom.
It is one of the distributions that can be derived from the normal distribution: if one has n random variables that are independent and standard normally distributed, the chi-square distribution with n degrees of freedom is defined as the distribution of the sum of the squared random variables. Such sums of squared random variables occur when estimating the variance of a sample. The chi-square distribution therefore allows, among other things, a judgment on the compatibility of a presumed functional relationship (as a function of time, temperature, pressure, etc.) with empirically determined measuring points. Can a straight line explain the data, for example, or is a parabola or perhaps a logarithm needed? One chooses among different models, and the one with the best goodness of fit, i.e. the smallest χ² value, offers the best explanation of the data. The χ² distribution thus puts the quantification of random fluctuations, and hence the selection among different explanatory models, on a numerical basis. It also allows, once the sample variance has been determined, the estimation of the confidence interval that contains the (unknown) value of the variance of the population with a specified probability. These and other applications are described below and in the article Chi-square test.
It was introduced in 1876 by Friedrich Robert Helmert; the name goes back to Karl Pearson (1900).
Definition
The chi-square distribution with n degrees of freedom describes the distribution of the sum of n squared, stochastically independent, standard normally distributed random variables:

χ² = Z₁² + Z₂² + … + Zₙ² ~ χ²ₙ with Zᵢ ~ N(0, 1).

The sign ~ is shorthand for "is distributed as". The sum of squared variables cannot assume negative values.

In contrast, the simple sum Z₁ + Z₂ + … + Zₙ is normally distributed with mean 0 and variance n, a distribution that is symmetric about the zero point.
Density
The density of the χ² distribution with n degrees of freedom has the form:

f(x) = x^(n/2 − 1) e^(−x/2) / (2^(n/2) Γ(n/2)) for x > 0, and f(x) = 0 otherwise.

Here Γ stands for the gamma function. The values of Γ(n/2) can also be calculated with Γ(1/2) = √π, Γ(1) = 1 and the recurrence Γ(r + 1) = r Γ(r).
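As a quick sketch, the density formula above can be implemented directly with the standard-library gamma function (a minimal illustration, not a library-grade implementation; the function name is chosen here for clarity):

```python
import math

def chi2_pdf(x, n):
    """Density of the chi-square distribution with n degrees of freedom:
    f_n(x) = x^(n/2 - 1) * exp(-x/2) / (2^(n/2) * Gamma(n/2)) for x > 0."""
    if x <= 0:
        return 0.0
    return (x ** (n / 2 - 1) * math.exp(-x / 2)) / (2 ** (n / 2) * math.gamma(n / 2))

# For n = 2 the density reduces to the exponential density exp(-x/2)/2:
print(chi2_pdf(2.0, 2))   # equals exp(-1)/2
```

For n = 1 the formula likewise reduces to the folded standard normal density e^(−x/2)/√(2πx), which provides a second easy sanity check.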
Distribution function
The distribution function can be written with the help of the regularized incomplete gamma function P:

F(x) = P(n/2, x/2).

When n is a natural number, the distribution function can be represented (more or less) elementarily. For even n:

F(x) = 1 − e^(−x/2) Σ_{k=0}^{n/2 − 1} (x/2)^k / k!,

and for odd n:

F(x) = erf(√(x/2)) − e^(−x/2) Σ_{k=1}^{(n−1)/2} (x/2)^(k − 1/2) / Γ(k + 1/2),

where erf denotes the error function. The distribution function describes the probability that χ² lies in the interval [0, x].
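The elementary closed forms for integer n can be sketched as follows (a minimal illustration; the function name is chosen here for clarity):

```python
import math

def chi2_cdf(x, n):
    """CDF of the chi-square distribution for integer n, using the
    elementary closed forms: a finite exponential sum for even n,
    and the error function plus half-integer gamma terms for odd n."""
    if x <= 0:
        return 0.0
    if n % 2 == 0:
        m = n // 2
        s = sum((x / 2) ** k / math.factorial(k) for k in range(m))
        return 1.0 - math.exp(-x / 2) * s
    s = sum((x / 2) ** (k - 0.5) / math.gamma(k + 0.5)
            for k in range(1, (n - 1) // 2 + 1))
    return math.erf(math.sqrt(x / 2)) - math.exp(-x / 2) * s

# n = 2 reduces to 1 - exp(-x/2); n = 1 reduces to erf(sqrt(x/2)):
print(chi2_cdf(2.0, 2), chi2_cdf(1.0, 1))
```

The even-n branch is exactly the Erlang distribution function, consistent with the relationship to the Erlang distribution described below.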
Properties
Expected value
The expected value of the chi-square distribution with n degrees of freedom is

E(χ²ₙ) = n.

Assuming a standard normally distributed population, the value χ²/n should therefore lie near 1 if the population variance is estimated correctly.
Variance
The variance of the chi-square distribution with n degrees of freedom is

Var(χ²ₙ) = 2n.
Mode
The mode of the chi-square distribution with n degrees of freedom is n − 2 for n ≥ 2.
Skew
The skewness of the chi-square distribution with n degrees of freedom is

γ = √(8/n).

The chi-square distribution has positive skewness, i.e. it is left-steep or, equivalently, skewed to the right. The higher the number of degrees of freedom n, the less skewed the distribution is.
Kurtosis
The kurtosis (curvature) of the chi-square distribution with n degrees of freedom is given by

β₂ = 3 + 12/n.

The excess over the normal distribution is therefore γ₂ = 12/n. The higher the number of degrees of freedom, the lower the excess.
Characteristic function
The characteristic function has the form:

φ(t) = (1 − 2it)^(−n/2).
Entropy
The entropy of the chi-square distribution (expressed in nats) is

H = n/2 + ln(2 Γ(n/2)) + (1 − n/2) ψ(n/2),

where ψ(p) denotes the digamma function.
Sum of χ²-distributed random variables

If X₁, …, X_k are independent χ²-distributed random variables with Xᵢ ~ χ²_{nᵢ}, then:

X₁ + … + X_k ~ χ²_{n₁ + … + n_k}.

This follows because each Xᵢ is itself a sum of independent squared standard normal random variables, so the sum is again such a sum, with n₁ + … + n_k terms, and therefore χ²-distributed. The chi-square distribution is thus reproductive.
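The reproductive property can be illustrated with a small Monte Carlo sketch (seed and sample sizes chosen arbitrarily for this illustration):

```python
import random

random.seed(42)

def chi2_sample(n):
    """One draw from chi^2_n as a sum of n squared standard normals."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))

# X ~ chi^2_3 and Y ~ chi^2_5 independent, so X + Y ~ chi^2_8:
# the sum should have mean close to 8 and variance close to 16.
draws = [chi2_sample(3) + chi2_sample(5) for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(round(mean, 2), round(var, 2))
```

The empirical mean and variance match the values n = 8 and 2n = 16 from the Properties section, as the reproductivity predicts.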
Non-central chi-square distribution

If the normal random variables are not centered with respect to their expected value (i.e., if not all μᵢ = 0), one obtains the non-central chi-square distribution. Besides n, it has the non-centrality parameter λ as a second parameter.

If Zᵢ ~ N(μᵢ, 1), then

Z₁² + … + Zₙ² ~ χ²ₙ(λ) with λ = μ₁² + … + μₙ².

In particular, it follows from this that the central chi-square distribution is the special case χ²ₙ = χ²ₙ(0).

A second way to create a non-central chi-square distribution is as a mixture of central chi-square distributions: a draw from

χ²_{n + 2m}

is χ²ₙ(λ)-distributed if m is drawn from a Poisson distribution with parameter λ/2.
Density function
The density function of the non-central chi-square distribution is

f(x) = Σ_{m=0}^∞ [e^(−λ/2) (λ/2)^m / m!] f_{n + 2m}(x) for x ≥ 0, and f(x) = 0 for x < 0,

where f_{n + 2m} denotes the density of the central chi-square distribution with n + 2m degrees of freedom.

The density function can alternatively be represented with the help of the modified Bessel function of the first kind I_q:

f(x) = (1/2) e^(−(x + λ)/2) (x/λ)^((n − 2)/4) I_{(n−2)/2}(√(λx)).

The expected value n + λ and the variance 2n + 4λ of the non-central chi-square distribution, like the density, pass over into the corresponding expressions of the central chi-square distribution as λ → 0.
Distribution function
The distribution function of the non-central chi-square distribution can be expressed using the Marcum Q-function.
Example
Suppose one makes n measurements of a quantity x that come from a normally distributed population. Let x̄ be the mean of the measured values and

s² = (1/(n − 1)) Σ (xᵢ − x̄)²

the sample variance. Then, for example, the 95 % confidence interval for the population variance σ² can be specified:

(n − 1)s²/χ²₂ ≤ σ² ≤ (n − 1)s²/χ²₁,

where χ²₁ is determined by P(χ² ≤ χ²₁) = 0.025 and χ²₂ by P(χ² ≤ χ²₂) = 0.975, and therefore P(χ²₁ ≤ χ² ≤ χ²₂) = 0.95. The limits follow from the fact that (n − 1)s²/σ² is distributed as χ²_{n−1}.
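A minimal sketch of this interval, assuming a series evaluation of the regularized incomplete gamma function and a simple bisection for the quantiles (illustrative numerics, not a production-grade routine; all function names are chosen here):

```python
import math

def chi2_cdf(x, n):
    """Chi-square CDF via the power series for the regularized
    lower incomplete gamma function P(n/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = n / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while term > total * 1e-15 and k < 10000:
        k += 1
        term *= z / (a + k)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_ppf(p, n):
    """Quantile by bisection on the CDF."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def variance_ci(s2, n, level=0.95):
    """Two-sided confidence interval for sigma^2 from the sample variance
    s2 of n normal observations, using (n-1)*s2/sigma^2 ~ chi^2_{n-1}."""
    alpha = 1.0 - level
    upper_q = chi2_ppf(1 - alpha / 2, n - 1)   # upper quantile -> lower limit
    lower_q = chi2_ppf(alpha / 2, n - 1)       # lower quantile -> upper limit
    return (n - 1) * s2 / upper_q, (n - 1) * s2 / lower_q

# Example: n = 10 measurements with sample variance s^2 = 4.
print(variance_ci(4.0, 10))
```

For n = 10 the relevant quantiles of χ²₉ are about 2.70 and 19.02, so the interval is roughly (1.89, 13.33): with few measurements, the variance is pinned down only very loosely.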
Derivation of the distribution of the sample variance
Let x₁, …, xₙ be a sample of n readings, drawn from a normally distributed random variable, with the arithmetic mean

x̄ = (1/n) Σ xᵢ

and the sample variance

s² = (1/(n − 1)) Σ (xᵢ − x̄)²

as estimators for the expected value μ and the variance σ² of the population.

Then it can be shown that (n − 1)s²/σ² is distributed as χ²_{n−1}.

To show this, the xᵢ are transformed into new variables yₖ by means of an orthonormal linear combination according to Helmert. The transformation is:

y₁ = (1/√n)(x₁ + … + xₙ) = √n x̄,
yₖ = (x₁ + … + x_{k−1} − (k − 1)xₖ)/√(k(k − 1)) for k = 2, …, n.

The new variables yₖ are, like the xᵢ, independent and normally distributed with the same variance σ², but with expected values E(y₁) = √n μ and E(yₖ) = 0 for k ≥ 2, both because of the convolution invariance of the normal distribution.

Moreover, the coefficients a_{ki} in yₖ = Σᵢ a_{ki} xᵢ satisfy, because of the orthonormality, Σᵢ a_{ki} a_{li} = δ_{kl} (Kronecker delta), and thus

Σ_{k=1}^n yₖ² = Σ_{i=1}^n xᵢ².

Therefore one now obtains

Σ_{k=2}^n yₖ² = Σ_{i=1}^n xᵢ² − n x̄² = Σ_{i=1}^n (xᵢ − x̄)² = (n − 1)s²

and finally, after division by σ²,

(n − 1)s²/σ² = Σ_{k=2}^n (yₖ/σ)².

The expression on the left is evidently distributed as a sum of squared standard normal variables with n − 1 independent summands, as required for χ²_{n−1}.

Thus (n − 1)s²/σ² ~ χ²_{n−1} by definition, where (n − 1)s²/σ² is called the chi-square sum. One degree of freedom is 'consumed' here because, in contrast to the population mean μ, the calculated mean x̄ depends on the xᵢ.
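The orthonormality of the Helmert transformation, and hence the identity Σ yₖ² = Σ xᵢ², can be verified numerically with a small sketch (the helper name and the sample values are chosen here for illustration):

```python
import math

def helmert_matrix(n):
    """Orthonormal Helmert matrix: row 1 is (1/sqrt(n), ..., 1/sqrt(n));
    row k (k >= 2) is (1, ..., 1, -(k-1), 0, ..., 0) / sqrt(k*(k-1))."""
    rows = [[1.0 / math.sqrt(n)] * n]
    for k in range(2, n + 1):
        norm = math.sqrt(k * (k - 1))
        rows.append([1.0 / norm] * (k - 1) + [-(k - 1) / norm] + [0.0] * (n - k))
    return rows

n = 4
A = helmert_matrix(n)
x = [1.0, 2.0, 3.0, 4.0]
y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]

# Orthonormality implies sum y_k^2 = sum x_i^2, and since y_1^2 = n*xbar^2,
# the remaining sum over k >= 2 equals sum (x_i - xbar)^2 = (n-1)*s^2.
print(sum(v * v for v in y), sum(v * v for v in x), sum(v * v for v in y[1:]))
```

For the sample (1, 2, 3, 4) one finds Σ yₖ² = Σ xᵢ² = 30 and Σ_{k≥2} yₖ² = 5, which is exactly Σ (xᵢ − x̄)² for x̄ = 2.5.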
Relationship to other distributions
Relationship to the gamma distribution
The chi-square distribution is a special case of the gamma distribution: with shape parameter n/2 and scale parameter 2 (equivalently, rate parameter 1/2), it holds that

χ²ₙ = Gamma(n/2, 2).
Relation to the normal distribution
- The sum of n squared independent standard normal random variables satisfies a chi-square distribution with n degrees of freedom.
- For large n, √(2χ²) − √(2n − 1) is approximately standard normally distributed.
- For still larger n, the random variable χ² itself is approximately normally distributed, with mean n and standard deviation √(2n), or, in the case of a non-central chi-square distribution, with expected value n + λ and standard deviation √(2n + 4λ).
Relationship to the exponential distribution
A chi-square distribution with 2 degrees of freedom is an exponential distribution with parameter λ = 1/2.
Relationship with the Erlang distribution
A chi-square distribution with 2n degrees of freedom is identical with an Erlang distribution with n degrees of freedom and parameter λ = 1/2.
Relationship to the F-distribution
If X₁ ~ χ²_m and X₂ ~ χ²ₙ are independent chi-square distributed random variables with m and n degrees of freedom, then the quotient

F = (X₁/m)/(X₂/n)

is a random variable that satisfies the F-distribution with (m, n) degrees of freedom.
Relationship to the Poisson distribution
The distribution functions of the Poisson distribution and the χ² distribution are related in the following manner:

The probability of finding n or more events in an interval within which λ events are expected on average equals the probability that χ²_{2n} ≤ 2λ. Namely,

1 − F_Poisson(n − 1; λ) = P(n, λ) = F_{χ²}(2λ; 2n),

where P and Q = 1 − P denote the regularized lower and upper incomplete gamma functions.
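The identity can be checked numerically; since 2n degrees of freedom is always even, the chi-square CDF takes the Erlang closed form, which makes the comparison elementary (function names and the chosen values of n and λ are illustrative):

```python
import math

def poisson_tail(n, lam):
    """P(N >= n) for N ~ Poisson(lam)."""
    return 1.0 - sum(math.exp(-lam) * lam ** k / math.factorial(k)
                     for k in range(n))

def chi2_cdf_even(x, n):
    """Chi-square CDF for even n via the Erlang closed form."""
    m = n // 2
    return 1.0 - math.exp(-x / 2) * sum((x / 2) ** k / math.factorial(k)
                                        for k in range(m))

# P(N >= n) with N ~ Poisson(lam) equals P(chi^2_{2n} <= 2*lam):
n, lam = 4, 2.5
print(poisson_tail(n, lam), chi2_cdf_even(2 * lam, 2 * n))
```

Substituting x = 2λ into the even-n closed form shows that the two expressions are term-by-term identical, so the agreement here is exact rather than approximate.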
Relationship to the continuous uniform distribution
For even n one can form the χ² distribution as an (n/2)-fold convolution with the help of the continuous uniform density:

χ²ₙ = −2 ln(u₁ u₂ ⋯ u_{n/2}) = −2 Σ_{i=1}^{n/2} ln uᵢ,

where the uᵢ are independent random variables uniformly distributed on (0, 1).

For odd n, on the other hand,

χ²ₙ = χ²_{n−1} + [N(0, 1)]²

applies, i.e. one additional squared standard normal variable is added.
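This yields a simple sampler sketch: the even part comes from logarithms of uniforms, and an odd degree of freedom adds one squared standard normal (seed, sample size, and the function name are chosen arbitrarily for this illustration):

```python
import math
import random

random.seed(1)

def chi2_from_uniforms(n):
    """One chi^2_n draw: -2 * sum(log u_i) over n//2 uniforms for the
    even part, plus one squared standard normal if n is odd."""
    x = -2.0 * sum(math.log(random.random()) for _ in range(n // 2))
    if n % 2 == 1:
        x += random.gauss(0.0, 1.0) ** 2
    return x

# The sample mean should be close to the expected value n = 5.
draws = [chi2_from_uniforms(5) for _ in range(20000)]
print(round(sum(draws) / len(draws), 2))
```

Each factor −2 ln uᵢ is exponentially distributed with parameter 1/2, i.e. χ²₂-distributed, so the construction is just the reproductivity of the χ² distribution in disguise.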
Derivation of the density function
The density of the random variable χ²ₙ = Z₁² + … + Zₙ², with the Zᵢ independent and standard normally distributed, results from the joint density of the random variables Z₁, …, Zₙ. This joint density is the n-fold product of the standard normal density:

f(z₁, …, zₙ) = (2π)^(−n/2) e^(−(z₁² + … + zₙ²)/2).

For the desired density one has:

fₙ(z) = lim_{h→0} (1/h) P(z ≤ χ²ₙ ≤ z + h) = lim_{h→0} (1/h) ∫_K (2π)^(−n/2) e^(−(z₁² + … + zₙ²)/2) dz₁ ⋯ dzₙ

with the integration region

K = {(z₁, …, zₙ) : z ≤ z₁² + … + zₙ² ≤ z + h}.

In the limit, the sum in the argument of the exponential function equals z; the exponential factor can therefore be pulled in front of the integral and the limit:

fₙ(z) = e^(−z/2) (2π)^(−n/2) lim_{h→0} (1/h) ∫_K dz₁ ⋯ dzₙ.

The remaining integral

∫_K dz₁ ⋯ dzₙ = Vₙ(√(z + h)) − Vₙ(√z)

corresponds to the volume of the shell between the sphere with radius √(z + h) and the sphere with radius √z, where

Vₙ(R) = π^(n/2) Rⁿ / Γ(n/2 + 1)

indicates the volume of the n-dimensional sphere with radius R.

It follows:

lim_{h→0} (1/h) [Vₙ(√(z + h)) − Vₙ(√z)] = dVₙ(√z)/dz = π^(n/2) (n/2) z^(n/2 − 1) / Γ(n/2 + 1)

and, after insertion into the expression for the required density:

fₙ(z) = z^(n/2 − 1) e^(−z/2) / (2^(n/2) Γ(n/2)).
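The limit argument can be checked numerically by replacing the limit with a small finite h (the values of n, z, and h are chosen arbitrarily for this illustration):

```python
import math

def sphere_volume(n, r):
    """Volume of the n-dimensional ball of radius r."""
    return math.pi ** (n / 2) * r ** n / math.gamma(n / 2 + 1)

def chi2_pdf(z, n):
    """Closed-form chi-square density derived above."""
    return z ** (n / 2 - 1) * math.exp(-z / 2) / (2 ** (n / 2) * math.gamma(n / 2))

# Approximate the density as exp(-z/2) * (2*pi)^(-n/2) times the shell
# volume between radii sqrt(z) and sqrt(z + h), divided by h:
n, z, h = 3, 2.0, 1e-6
shell = sphere_volume(n, math.sqrt(z + h)) - sphere_volume(n, math.sqrt(z))
approx = math.exp(-z / 2) * (2 * math.pi) ** (-n / 2) * shell / h
print(approx, chi2_pdf(z, n))
```

The finite-h shell estimate and the closed-form density agree to many digits, confirming the volume-derivative step of the derivation.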
Quantile
The quantile x_p of the χ² distribution is the solution of the equation p = F(x_p) and is thus in principle calculated via the inverse function. Specifically, here

x_p = 2 P⁻¹(n/2, p)

applies, with P⁻¹ the inverse of the regularized incomplete gamma function. The value x_p is registered in the quantile table under the coordinates p and n.
Quantile function for small sample size
For a few values of n (1, 2, 4), the quantile function can also be specified alternatively:

n = 1: x_p = 2 [erf⁻¹(p)]²,
n = 2: x_p = −2 ln(1 − p),
n = 4: x_p = −2 [1 + W₋₁(−(1 − p)/e)],

where erf⁻¹ denotes the inverse error function, W₋₁ the lower branch of the Lambert W function, and e Euler's number.
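The closed forms for n = 1 and n = 2 can be sketched with the standard library (erf⁻¹ is obtained here by bisection, since Python's math module lacks it; the n = 4 case needs a Lambert W implementation and is omitted; all names are illustrative):

```python
import math

def erfinv(y):
    """Inverse error function by bisection (for 0 <= y < 1)."""
    lo, hi = 0.0, 6.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if math.erf(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def chi2_quantile_small(p, n):
    """Closed-form chi-square quantiles for n = 1 and n = 2."""
    if n == 1:
        return 2.0 * erfinv(p) ** 2       # inverts F(x) = erf(sqrt(x/2))
    if n == 2:
        return -2.0 * math.log(1.0 - p)   # inverts F(x) = 1 - exp(-x/2)
    raise ValueError("closed form implemented only for n = 1, 2")

# Familiar 95 % quantiles: about 3.84 for n = 1 and 5.99 for n = 2.
print(chi2_quantile_small(0.95, 1), chi2_quantile_small(0.95, 2))
```

A round trip through the distribution function, e.g. erf(√(x_p/2)) = p for n = 1, recovers the probability and confirms the inversion.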
Approximation of quantile function for fixed probabilities
For certain fixed probabilities p, the corresponding quantiles x_p can be approximated by a simple function of the sample size n, with the parameters a, b, c, d taken from the table, where sgn denotes the signum function, which simply represents the sign of its argument:

The comparison with a quantile table shows a relative error of less than 0.4 % for moderate n, falling below 0.1 % for larger n. Since the χ² distribution merges for large n into a normal distribution with standard deviation √(2n), the parameter a from the table, which was fitted freely here, is at the corresponding probability p approximately √2 times as large as the corresponding quantile of the normal distribution, where erf⁻¹ denotes the inverse function of the error function.
The 95 % confidence interval for the variance from the example section can thus be represented graphically in a simple way as a function of n, using the two approximating functions from the rows for p = 0.025 and p = 0.975.

The median is found in the column of the table with p = 0.5.