Probability density function
In probability theory, the probability density function (often shortened to density function, probability density, or just density; abbreviated pdf, or WDF in German-language texts) is a tool for describing a continuous probability distribution. Integrating the probability density function over an interval $[a, b]$ gives the probability that a random variable with this density takes a value between $a$ and $b$. The probability density can assume values greater than 1 and should not be confused with the probability itself.
Formally, it is a density with respect to Lebesgue measure.
In the discrete case, probabilities of events can be calculated by summing the probabilities of the individual elementary events (an ideal die, for example, shows each number with probability $1/6$). This no longer works in the continuous case. For example, two people are hardly ever exactly the same height, but only the same up to a hair's breadth or less. In such cases, probability density functions are useful. With their help, the probability of an arbitrary interval, for example a height between 1.80 m and 1.81 m, can be determined, although the interval contains infinitely many values, each of which individually has probability 0.
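The height example can be made concrete with a small sketch. It assumes, purely for illustration, that heights follow a normal distribution; the parameters `MU` and `SIGMA` are hypothetical and not taken from any data. The interval receives a positive probability, while a single exact height receives probability 0.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Hypothetical parameters (in metres), chosen only for illustration.
MU, SIGMA = 1.78, 0.07

def interval_probability(a, b):
    """P(a <= X <= b) under the assumed normal height model."""
    return normal_cdf(b, MU, SIGMA) - normal_cdf(a, MU, SIGMA)

p_interval = interval_probability(1.80, 1.81)  # positive
p_point = interval_probability(1.80, 1.80)     # a single point has probability 0
print(p_interval, p_point)
```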
Let $P$ be a probability measure and $X$ a real-valued random variable. A function $f\colon \mathbb{R} \to [0, \infty)$ is called a probability density of the random variable $X$ (or more precisely, of the distribution of $X$) if
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$
for all real numbers $a < b$.
Normalization and uniqueness
The area under the density function $f$ equals 1, i.e.
$$\int_{-\infty}^{\infty} f(x)\,dx = 1.$$
The probability density is uniquely determined up to deviations on a set of Lebesgue measure zero. It is always non-negative and, unlike a probability, can assume arbitrarily large values, as the accompanying figure illustrates.
Conversely, any function $f\colon \mathbb{R} \to \mathbb{R}$ with $f(x) \ge 0$ for all $x$ and $\int_{-\infty}^{\infty} f(x)\,dx = 1$ is the density function of a uniquely determined probability distribution.
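A minimal numerical sketch of these two properties follows. It assumes a uniform distribution on the interval $[0, 0.25]$, chosen here only as an illustration: its density takes the value 4, which is larger than 1, yet the area under it is 1. The helper `midpoint_integral` is a hypothetical name for a simple midpoint-rule integrator.

```python
def uniform_density(x, a=0.0, b=0.25):
    """Density of the uniform distribution on [a, b] (illustrative choice)."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def midpoint_integral(f, lo, hi, n=100_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

peak = uniform_density(0.1)                              # density value above 1
total = midpoint_integral(uniform_density, 0.0, 0.25)    # area under the density
print(peak, total)
```

Outside $[0, 0.25]$ the density is 0, so integrating over this interval already covers the whole area.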
Calculation of probabilities
The probability of an interval can be calculated using the probability density $f$ or the associated cumulative distribution function $F$ as
$$P(a \le X \le b) = \int_a^b f(x)\,dx = F(b) - F(a).$$
This formula also holds for the intervals $(a, b)$, $(a, b]$ and $[a, b)$, because for random variables with a probability density every individual point has probability 0; the distribution function is continuous.
For more complicated sets, the probability can be determined analogously by integrating over subintervals. In general, the probability of a set $A$ takes the form
$$P(X \in A) = \int_A f(x)\,dx.$$
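The two ways of computing an interval probability can be compared numerically. The sketch below assumes an exponential distribution with a hypothetical rate `RATE`; `integrate` is an illustrative midpoint-rule helper, not a library function. The last lines treat a set made of two disjoint intervals by adding the integrals over the pieces.

```python
import math

RATE = 1.5  # hypothetical rate parameter of the assumed exponential distribution

def density(x):
    """Exponential density; zero for negative x."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0

def cdf(x):
    """Exponential distribution function."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0

def integrate(f, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

a, b = 0.5, 2.0
by_integral = integrate(density, a, b)  # integral of the density over [a, b]
by_cdf = cdf(b) - cdf(a)                # F(b) - F(a)

# A set consisting of two disjoint intervals: add the integrals over the pieces.
pieces = [(0.0, 0.2), (1.0, 1.5)]
p_union = sum(integrate(density, lo, hi) for lo, hi in pieces)
print(by_integral, by_cdf, p_union)
```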
Condition for the existence of a probability density
The random variable $X$ has a probability density if and only if the distribution $P_X$ of $X$ is absolutely continuous with respect to the Lebesgue measure, i.e., if
$$P(X \in N) = 0$$
for every Lebesgue null set $N$ (Radon-Nikodým theorem).
Relationship between distribution function and density function
The (cumulative) distribution function $F$ is obtained as an integral of the density function $f$:
$$F(x) = \int_{-\infty}^{x} f(t)\,dt.$$
Conversely, if the distribution function $F$ is differentiable, its derivative is a density function of the distribution:
$$f(x) = F'(x).$$
This remains true if $F$ is continuous and there are at most countably many points $x$ at which $F$ is not differentiable; which values are assigned to $f(x)$ at these points is irrelevant.
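As a rough numerical illustration, the derivative of a distribution function can be approximated by a central difference. The sketch assumes the exponential distribution with rate 1, whose distribution function is continuous everywhere but has a kink at 0; `numeric_derivative` is a hypothetical helper name.

```python
import math

def cdf(x):
    """Distribution function of the exponential distribution with rate 1."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def numeric_derivative(func, x, step=1e-5):
    """Central difference approximation of the derivative of func at x."""
    return (func(x + step) - func(x - step)) / (2.0 * step)

density_at_1 = numeric_derivative(cdf, 1.0)         # close to exp(-1)
density_at_minus_1 = numeric_derivative(cdf, -1.0)  # 0, to the left of the kink
print(density_at_1, density_at_minus_1)
```

At the kink itself the one-sided derivatives differ (0 from the left, 1 from the right), and either value may be used for the density there without changing any probability.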
In general, a density function exists if and only if the distribution function $F$ is absolutely continuous. This condition implies, among other things, that $F$ is continuous and has a derivative almost everywhere, which coincides with the density.
Note, however, that there are distributions, such as the Cantor distribution, that have a continuous, almost everywhere differentiable distribution function but nevertheless no probability density. Distribution functions are always differentiable almost everywhere, but the corresponding derivative in general captures only the absolutely continuous part of the distribution.
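The Cantor distribution can be explored with a short sketch. The function `cantor_cdf` below is an illustrative implementation that reads off the base-3 digits of its argument up to a fixed `depth`; it is an approximation for illustration only. The distribution function is constant on every removed middle third, so its derivative is 0 there, and yet it rises from 0 to 1.

```python
def cantor_cdf(x, depth=40):
    """Approximate the Cantor distribution function via base-3 digits of x."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    total, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)
        x -= digit
        if digit == 1:
            # x lies in a removed middle third: the function is constant here.
            return total + scale
        total += scale * (digit // 2)  # ternary digit 2 becomes binary digit 1
        scale /= 2.0
    return total

flat_left = cantor_cdf(0.4)   # inside the removed interval (1/3, 2/3)
flat_right = cantor_cdf(0.6)  # same plateau value
at_quarter = cantor_cdf(0.25) # 0.25 belongs to the Cantor set; value tends to 1/3
print(flat_left, flat_right, at_quarter)
```

Integrating the almost-everywhere derivative, which is 0, gives 0 rather than the distribution function, which is why no density exists in this case.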
Densities on subintervals
The probability density of a random variable that takes values only in a subinterval $I$ of the real numbers can be chosen so that it has the value 0 outside the interval. An example is the exponential distribution with $I = [0, \infty)$. Alternatively, the probability density may be regarded as a function $f\colon I \to \mathbb{R}$, i.e., as a density of the distribution with respect to the Lebesgue measure on $I$.
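Density functions can also be used to calculate expected values. If $X$ is a random variable with density $f$, then its expected value is
$$E(X) = \int_{-\infty}^{\infty} x\, f(x)\,dx,$$
provided the integral exists. More generally, for a measurable function $g$, if the integral exists,
$$E(g(X)) = \int_{-\infty}^{\infty} g(x)\, f(x)\,dx.$$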
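The expected-value integrals can be approximated numerically. The sketch below assumes an exponential density with a hypothetical rate `RATE` and truncates the integration at a finite upper limit, neglecting the tail beyond it; `expectation` is an illustrative helper, not a library function.

```python
import math

RATE = 2.0  # hypothetical rate parameter of the assumed exponential distribution

def density(x):
    """Exponential density; zero for negative x."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0

def expectation(g, upper=20.0, n=200_000):
    """Midpoint-rule approximation of the integral of g(x) * f(x) over
    [0, upper]; the contribution beyond `upper` is neglected."""
    h = upper / n
    midpoints = ((i + 0.5) * h for i in range(n))
    return sum(g(x) * density(x) for x in midpoints) * h

mean = expectation(lambda x: x)               # analytically 1 / RATE
second_moment = expectation(lambda x: x * x)  # analytically 2 / RATE**2
print(mean, second_moment)
```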
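For example, the function given by $f(x) = 1/x^2$ for $x \ge 1$ and $f(x) = 0$ otherwise is a density function, since $f$ is non-negative on all of $\mathbb{R}$ and
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_1^{\infty} \frac{1}{x^2}\,dx = \left[-\frac{1}{x}\right]_1^{\infty} = 1.$$
For $x \ge 1$ the corresponding distribution function is
$$F(x) = \int_1^{x} \frac{1}{t^2}\,dt = 1 - \frac{1}{x},$$
and $F(x) = 0$ for all $x < 1$. If $X$ is a random variable with density $f$, it follows for example that
$$P(2 \le X \le 4) = F(4) - F(2) = \frac{3}{4} - \frac{1}{2} = \frac{1}{4}.$$
For the expected value of $X$ one obtains
$$E(X) = \int_1^{\infty} x \cdot \frac{1}{x^2}\,dx = \int_1^{\infty} \frac{1}{x}\,dx = \infty,$$
so $X$ has no finite expected value.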
Other examples of probability densities are found in the list of univariate probability distributions.
Multidimensional random variable
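Probability densities can also be defined for multidimensional random variables, i.e., for random vectors. If $X$ is an $\mathbb{R}^n$-valued random variable, a function $f\colon \mathbb{R}^n \to [0, \infty)$ is called a probability density (with respect to the Lebesgue measure) of the random variable $X$ if
$$P(X \in A) = \int_A f(x)\,dx$$
for all Borel sets $A \subseteq \mathbb{R}^n$.
In particular, for $n$-dimensional intervals $I = [a_1, b_1] \times \dots \times [a_n, b_n]$ with real numbers $a_1 < b_1, \dots, a_n < b_n$:
$$P(X \in I) = \int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \dots, x_n)\,dx_n \cdots dx_1.$$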
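For two dimensions, the probability of a rectangle can be approximated with a double midpoint rule. The sketch assumes the joint density $f(x, y) = x + y$ on the unit square, a textbook-style choice made here only for illustration; `rectangle_probability` is a hypothetical helper name.

```python
def joint_density(x, y):
    """Assumed joint density f(x, y) = x + y on the unit square, 0 elsewhere."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def rectangle_probability(f, a1, b1, a2, b2, n=200):
    """Midpoint-rule approximation of the double integral of f over
    the rectangle [a1, b1] x [a2, b2]."""
    hx = (b1 - a1) / n
    hy = (b2 - a2) / n
    total = 0.0
    for i in range(n):
        x = a1 + (i + 0.5) * hx
        for j in range(n):
            y = a2 + (j + 0.5) * hy
            total += f(x, y)
    return total * hx * hy

p_quarter = rectangle_probability(joint_density, 0.0, 0.5, 0.0, 0.5)  # analytically 1/8
p_total = rectangle_probability(joint_density, 0.0, 1.0, 0.0, 1.0)    # analytically 1
print(p_quarter, p_total)
```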
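The notion of a distribution function can also be extended to multidimensional random variables. In the notation $F(x) = P(X \le x)$, $x$ is then a vector and the symbol $\le$ is to be read componentwise. So in this case $F$ is a map from $\mathbb{R}^n$ into the interval $[0, 1]$, and
$$F(x_1, \dots, x_n) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} f(t_1, \dots, t_n)\,dt_n \cdots dt_1.$$
If $F$ is $n$ times continuously differentiable, a probability density is obtained by partial differentiation:
$$f(x_1, \dots, x_n) = \frac{\partial^n F}{\partial x_1 \cdots \partial x_n}(x_1, \dots, x_n).$$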
The densities $f_i$ of the component variables $X_i$ can be calculated as densities of the marginal distributions by integrating over the other variables.
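A marginal density can be sketched numerically by integrating out the other variable. The example again assumes the illustrative joint density $f(x, y) = x + y$ on the unit square, for which the marginal density of the first component is $x + 1/2$ on $[0, 1]$; `marginal_x` is a hypothetical helper name.

```python
def joint_density(x, y):
    """Assumed joint density f(x, y) = x + y on the unit square, 0 elsewhere."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def marginal_x(x, n=1000):
    """Marginal density of the first component: integrate the joint density
    over y in [0, 1] (where it is non-zero) with the midpoint rule."""
    h = 1.0 / n
    return sum(joint_density(x, (j + 0.5) * h) for j in range(n)) * h

value = marginal_x(0.3)  # analytically 0.3 + 0.5
print(value)
```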
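Furthermore, if $X = (X_1, \dots, X_n)$ is an $\mathbb{R}^n$-valued random variable with density $f$, then the following are equivalent:
- $X$ has a density of the form $f(x_1, \dots, x_n) = f_1(x_1) \cdots f_n(x_n)$, where $f_i$ is a real probability density of $X_i$.
- The random variables $X_1, \dots, X_n$ are independent.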
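The factorization criterion can be probed numerically. The sketch below compares a joint density with the product of its numerically computed marginals at a few test points. Both densities on the unit square are assumptions made for illustration, and agreement at finitely many points is only a plausibility check, not a proof of independence.

```python
def integrate(f, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

def product_density(x, y):
    """Assumed density 4xy = (2x)(2y) on the unit square: independent components."""
    return 4.0 * x * y

def sum_density(x, y):
    """Assumed density x + y on the unit square: dependent components."""
    return x + y

def factorizes(joint, points, tol=1e-6):
    """Check joint(x, y) == f_X(x) * f_Y(y) at the given points of the unit square."""
    def f_x(x):
        return integrate(lambda y: joint(x, y), 0.0, 1.0)
    def f_y(y):
        return integrate(lambda x: joint(x, y), 0.0, 1.0)
    return all(abs(joint(x, y) - f_x(x) * f_y(y)) < tol for x, y in points)

points = [(0.2, 0.7), (0.5, 0.5), (0.9, 0.1)]
independent_case = factorizes(product_density, points)  # product form holds
dependent_case = factorizes(sum_density, points)        # product form fails
print(independent_case, dependent_case)
```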
Estimating a probability density from discrete data
Data that are recorded discretely but are actually continuous (for example, body height in centimeters) can be represented as a frequency density. The histogram thus obtained is a piecewise constant estimate of the density function. Alternatively, the density function can be estimated by a continuous function using so-called kernel density estimators. The kernel used for this purpose should match the expected measurement error.
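Both estimators can be sketched in a few lines of plain Python. The sample below is simulated and therefore hypothetical (heights drawn from a normal distribution and rounded to whole centimeters). The bin width, the Gaussian kernel and the bandwidth are likewise illustrative assumptions rather than recommended settings.

```python
import math
import random

random.seed(0)
# Hypothetical sample: heights recorded in whole centimeters.
heights = [round(random.gauss(178.0, 7.0)) for _ in range(500)]

def histogram_density(data, bin_width):
    """Piecewise constant density estimate as {bin_start: density}."""
    counts = {}
    for value in data:
        start = math.floor(value / bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    n = len(data)
    return {start: c / (n * bin_width) for start, c in sorted(counts.items())}

def kernel_density(data, bandwidth):
    """Gaussian kernel density estimate, returned as a function of x."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    def estimate(x):
        return norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in data)
    return estimate

BIN_WIDTH = 5
hist = histogram_density(heights, BIN_WIDTH)
hist_area = sum(d * BIN_WIDTH for d in hist.values())  # area under the histogram

kde = kernel_density(heights, bandwidth=2.5)
grid_step = 0.5
grid = [130.0 + i * grid_step for i in range(200)]     # 130.0 cm to 229.5 cm
kde_area = sum(kde(x) for x in grid) * grid_step       # area under the smooth estimate
print(hist_area, kde_area)
```

Both estimates are non-negative and have total area close to 1, as a density must.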