Conditional expectation

In probability theory and statistics, the conditional expectation describes the expected value of a random variable given that additional information about the outcome of the underlying random experiment is available. The condition may consist, for example, in knowing whether a certain event has occurred or which value a further random variable has taken; abstractly, the additional information can be modeled as a sub-σ-algebra of the underlying event space.

Abstract conditional expectations, and conditional probabilities as a special case, generalize the elementary concept of conditional probability in probability theory and statistics.

Conditional expectations play an important role in modern stochastics, for example in the study of stochastic processes, and are used, among other things, in the definition of martingales.



The formation of the conditional expectation is a kind of smoothing of a random variable on a sub-σ-algebra. σ-algebras model the available information, and a version of a random variable smoothed to a sub-σ-algebra, being already measurable with respect to it, contains less information about the outcome of the random experiment. Forming the conditional expectation thus goes along with a reduction of the depth of observation: it reduces the information about a random variable to a random variable that is simpler in terms of measurability, similarly to how, in the extreme case, the expected value reduces the information about a random variable to a single number.


This concept, in some respects very old (Laplace already computed conditional densities), was formalized by Andrey Kolmogorov in 1933 using the Radon–Nikodym theorem. In works by Paul Halmos (1950) and Joseph L. Doob (1953), conditional expectations were transferred to the now common form with respect to sub-σ-algebras on abstract spaces.


If A is an event with P(A) > 0, the conditional probability

P(B | A) = P(A ∩ B) / P(A)

indicates how likely the event B is, given the information that the event A has occurred. Correspondingly, the conditional expectation

E(Y | A) = E(Y · 1_A) / P(A)

indicates which value one expects on average for the random variable Y, given the information that the event A has occurred. Here 1_A is the indicator function of A, that is, the random variable that takes the value 1 when A occurs and the value 0 when it does not.

Example: Let X be the number rolled with a fair die and let A be the event of rolling a 5 or a 6. Then

E(X | A) = (5 · 1/6 + 6 · 1/6) / (1/3) = 11/2.
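This elementary computation can be reproduced in a short Python sketch using exact fractions (the variable names are illustrative, not part of the article):

```python
from fractions import Fraction

# Elementary conditional expectation E(X | A) = E(X * 1_A) / P(A)
# for a fair six-sided die X and the event A = "a 5 or 6 is rolled".
outcomes = range(1, 7)
p = Fraction(1, 6)                              # each face has probability 1/6

A = {5, 6}
P_A = sum(p for x in outcomes if x in A)        # P(A) = 1/3
E_X_1A = sum(x * p for x in outcomes if x in A) # E(X * 1_A) = 11/6

E_X_given_A = E_X_1A / P_A
print(E_X_given_A)  # 11/2
```

Using `Fraction` instead of floats keeps the arithmetic exact, so the result matches the formula literally.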

However, this elementary concept of conditional probabilities and expectations is often not sufficient. What is sought, rather, are conditional probabilities and conditional expectations of the form

(a) P(B | X = x) and E(Y | X = x),

(b) P(B | X) and E(Y | X),

(c) P(B | 𝓕) and E(Y | 𝓕).

The expressions in (b) and (c), in contrast to those in (a), are themselves random variables, since they still depend on the random variable X or on the realization of the events in 𝓕. One reads E(Y | X = x) as the expected value of Y under the condition X = x, E(Y | X) as the expected value of Y given X, and E(Y | 𝓕) as the expected value of Y given 𝓕.

The variants of conditional probabilities and expectations indicated above are all related to each other. In fact, it suffices to define only one variant, since they can all be derived from one another:

  • Conditional probabilities and conditional expectations comprise the same information: conditional expectations can, just like ordinary expectations, be computed as sums or integrals of conditional probabilities. Conversely, the conditional probability of an event is simply the conditional expectation of the indicator function of the event: P(B | X) = E(1_B | X).
  • The variants in (a) and (b) are equivalent. The random variable E(Y | X) takes, for an outcome with X = x, the value E(Y | X = x), that is, the value obtained by observing the value x of X. Conversely, given E(Y | X), one can always find an expression E(Y | X = x) depending on x such that this relation is satisfied. The same applies to conditional probabilities.
  • The variants in (b) and (c) are also equivalent, because one can choose 𝓕 as the set of events of the form {X ∈ B} (the σ-algebra generated by X), and conversely X as the family of indicator variables (1_A) for A ∈ 𝓕.

Discrete case

We consider here the case that P(X = x) > 0 holds for all values x of X. This case is particularly simple to treat, because the elementary definition is fully applicable:

P(B | X = x) = P(B ∩ {X = x}) / P(X = x).

The function B ↦ P(B | X = x) (the dot marking the argument B) has all the properties of a probability measure; it is a so-called regular conditional probability. The conditional distribution of a random variable Y given X = x is therefore an entirely ordinary probability distribution. The expected value of this distribution is the conditional expectation of Y given X = x:

E(Y | X = x) = E(Y · 1_{X = x}) / P(X = x).

If Y is also discrete, then

E(Y | X = x) = Σ_y y · P(Y = y | X = x),

where the sum runs over all y in the range of Y.


Let X and Y be the numbers rolled in two independent tosses of a fair die and Z = X + Y their sum. The distribution of Z is given by P(Z = z) = (6 − |z − 7|)/36 for z = 2, …, 12. If, however, we know the result X of the first throw, for example that we have rolled the value X = 5, we obtain the conditional distribution

P(Z = z | X = 5) = 1/6 for z = 6, …, 11.

The expected value of this distribution, the conditional expectation of Z given X = 5, is

E(Z | X = 5) = (6 + 7 + 8 + 9 + 10 + 11)/6 = 17/2.

More generally, for arbitrary values x of X,

E(Z | X = x) = x + 7/2.

If we substitute for the value x the random variable X itself, we obtain the conditional expectation of Z given X:

E(Z | X) = X + 7/2.

This expression is a random variable; if the outcome ω has occurred, E(Z | X) takes the value E(Z | X)(ω) = X(ω) + 7/2.
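The two-dice computation can be checked by exact enumeration of the finite probability space (a sketch; the helper name is illustrative):

```python
from fractions import Fraction
from itertools import product

# Enumerate two independent fair dice X, Y and the sum Z = X + Y,
# then compute E(Z | X = x) for each x by the elementary formula.
p = Fraction(1, 36)
joint = {(x, y): p for x, y in product(range(1, 7), repeat=2)}

def cond_exp_Z_given(x):
    # E(Z | X = x) = sum over outcomes with X = x of (x + y) * P / P(X = x)
    P_x = sum(q for (a, _), q in joint.items() if a == x)
    return sum((a + b) * q for (a, b), q in joint.items() if a == x) / P_x

for x in range(1, 7):
    assert cond_exp_Z_given(x) == x + Fraction(7, 2)
print(cond_exp_Z_given(5))  # 17/2
```

The loop confirms E(Z | X = x) = x + 7/2 for every value x, i.e. E(Z | X) = X + 7/2 as a random variable.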

Law of total probability

The probability of an event B can be calculated by decomposing it according to the values of X:

P(B) = Σ_x P(B | X = x) · P(X = x).

More generally, for each event A in the σ-algebra σ(X) generated by X, the formula

P(B ∩ A) = Σ_{x: {X = x} ⊆ A} P(B | X = x) · P(X = x)

holds. Using the transformation formula for the image measure, one obtains the equivalent formulation

P(B ∩ A) = ∫_A P(B | X) dP for all A ∈ σ(X).
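The decomposition can be verified exactly on the two-dice example (a sketch; the event B = {X + Y ≥ 10} is a hypothetical choice made for illustration):

```python
from fractions import Fraction
from itertools import product

# Check P(B) = sum_x P(B | X = x) * P(X = x) for two fair dice,
# with B = {X + Y >= 10}.
p = Fraction(1, 36)
outcomes = list(product(range(1, 7), repeat=2))

P_B = sum(p for (x, y) in outcomes if x + y >= 10)

total = Fraction(0)
for x0 in range(1, 7):
    P_x = Fraction(1, 6)
    # elementary conditional probability P(B | X = x0)
    P_B_given_x = sum(p for (x, y) in outcomes if x == x0 and x + y >= 10) / P_x
    total += P_B_given_x * P_x

assert total == P_B == Fraction(1, 6)
print(total)  # 1/6
```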

General case

In the general case, the definition is far less intuitive than in the discrete case, because one can no longer assume that the events on which one conditions have positive probability.

An example

We consider two independent standard normally distributed random variables X and Z, and set Y = X + Z. Without much reflection, one can specify here as well the conditional expectation of Y given X, that is, the value one expects on average for Y if one knows the value of X:

E(Y | X = x) = x,  that is,  E(Y | X) = X.

As before, E(Y | X) is itself a random variable, for whose value only the σ-algebra generated by X is decisive. (If one replaces X by, say, X′ = X³, one likewise obtains E(Y | X′) = X, since the bijection x ↦ x³ generates the same σ-algebra.)

The problem arises from the following consideration: the equation E(Y | X = x) = x assumes that Z has, for each value x of X, a standard normal distribution. In fact, one could just as well assume that in the case X = 0 the variable Z takes the constant value 1 and is standard normally distributed only in the other cases: since the event {X = 0} has probability 0, X and Z would in total still be independent and standard normally distributed. But one would then obtain E(Y | X = 0) = 1 instead of 0. This shows that the conditional expectation E(Y | X = x) is not uniquely determined for an individual value x, and that it only makes sense to define the conditional expectation for all values x simultaneously, because it can be modified freely on individual values x with P(X = x) = 0.
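A Monte Carlo sketch illustrates why only the "natural" version E(Y | X = x) = x is visible in practice: conditioning on X falling near a value x, the sample mean of Y approaches x, while a modification on the null set {X = 0} can never be observed in finitely many samples (names and the window width are illustrative choices):

```python
import random
random.seed(0)

# X, Z independent standard normal, Y = X + Z.  Averaging Y over samples
# with X near x approximates E(Y | X = x) for the natural version.
def cond_mean_Y(x, width=0.1, n=200_000):
    ys = []
    for _ in range(n):
        X = random.gauss(0.0, 1.0)
        Z = random.gauss(0.0, 1.0)
        if abs(X - x) < width:
            ys.append(X + Z)
    return sum(ys) / len(ys)

print(cond_mean_Y(1.0))  # close to 1.0
```

No simulated sample ever hits X = 0 exactly, which is precisely why modifying the conditional expectation there cannot contradict any observation.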

Kolmogorov's approach

Since the elementary definition cannot be transferred to the general case, the question arises which properties one wants to keep and which one is willing to give up. The now generally accepted approach, which goes back to Kolmogorov (1933) and has proven especially useful in the theory of stochastic processes, requires only two properties:

(1) x ↦ P(B | X = x) should be a measurable function. Transferred to the σ-algebra formulation, this means that P(B | 𝓕) should be an 𝓕-measurable random variable.

(2) In analogy to the law of total probability, for each A ∈ 𝓕 the equation

P(B ∩ A) = ∫_A P(B | 𝓕) dP

should be satisfied.

What one gives up, on the other hand, is:

  • that conditional probabilities are uniquely defined,
  • that B ↦ P(B | X = x) is always a probability measure,
  • the property P(B | X = x) = P(B ∩ {X = x}) / P(X = x).

For conditional expectations, (2) takes the form

∫_A Y dP = ∫_A E(Y | 𝓕) dP

for all sets A ∈ 𝓕 for which the integrals are defined. Using indicator functions, this equation can also be written as

E(Y · 1_A) = E(E(Y | 𝓕) · 1_A).

In this form, the equation is used in the following definition.
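The defining equation can be checked exactly on the two-dice example, where 𝓕 = σ(X) and the candidate version of E(Z | 𝓕) is X + 7/2 (a sketch; helper names are illustrative):

```python
from fractions import Fraction
from itertools import product

# Check E(Z * 1_A) = E(E(Z|F) * 1_A) for F = sigma(X) in the two-dice
# example, where Z = X + X2 and the candidate version is X + 7/2.
p = Fraction(1, 36)
omega = list(product(range(1, 7), repeat=2))   # outcomes (x, x2)

def E(f, A):
    # expectation of f * 1_A over the finite probability space
    return sum(f(w) * p for w in omega if w in A)

def Z(w):
    return w[0] + w[1]

def CE(w):
    return w[0] + Fraction(7, 2)               # candidate version of E(Z|F)

# events A in sigma(X) are exactly those of the form {X in S}
for S in [{1}, {2, 5}, {1, 2, 3, 4, 5, 6}]:
    A = {w for w in omega if w[0] in S}
    assert E(Z, A) == E(CE, A)
print("defining property holds")
```

Since every A ∈ σ(X) is a union of the columns {X = x}, checking these generating events suffices on this finite space.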

Formal definition

Given a probability space (Ω, 𝓐, P) and a sub-σ-algebra 𝓕 ⊆ 𝓐.

(1) Let Y be a random variable whose expected value exists. The conditional expectation of Y given 𝓕, written E(Y | 𝓕), is a random variable that satisfies the following two conditions:

  • E(Y | 𝓕) is 𝓕-measurable, and
  • E(Y · 1_A) = E(E(Y | 𝓕) · 1_A) holds for all A ∈ 𝓕.

Two different conditional expectations of Y given 𝓕 ("versions of the conditional expectation") differ at most on a null set (contained in 𝓕). This justifies the uniform notation E(Y | 𝓕) for any conditional expectation of Y given 𝓕.

The notation E(Y | X) denotes the conditional expectation of Y given σ(X), the σ-algebra generated by the random variable X.

(2) The conditional probability of the event B ∈ 𝓐 given 𝓕 is defined as the random variable

P(B | 𝓕) := E(1_B | 𝓕),

that is, as the conditional expectation of the indicator function of B.

Since the conditional probabilities P(B | 𝓕) of various events B are thus defined without reference to each other and are only determined up to null sets, B ↦ P(B | 𝓕)(ω) need not in general be a probability measure. If it is, however, that is, if the conditional probabilities can be combined into a stochastic kernel κ from (Ω, 𝓕) to (Ω, 𝓐) with

P(B | 𝓕)(ω) = κ(ω, B) for all ω ∈ Ω and B ∈ 𝓐,

then κ is called a regular conditional probability. A concrete version of the conditional expectation can then be computed as the integral

E(Y | 𝓕)(ω) = ∫ Y(ω′) κ(ω, dω′).


Factorization: The conditional expectation E(Y | X) (that is, E(Y | σ(X)), a function of ω) is defined as a random variable, but it can be represented as a function of X: there is a measurable function g such that

E(Y | X) = g(X).

In this way one can formally define conditional expectations for individual values:

E(Y | X = x) := g(x).

When using such expressions, extreme caution is required because of the lack of uniqueness in the general case.

Existence: The general existence of conditional expectations for integrable random variables (random variables with a finite expected value), and thus in particular of conditional probabilities, follows from the Radon–Nikodym theorem; the defining condition says nothing other than that E(Y | 𝓕) is a density of the signed measure A ↦ E(Y · 1_A) with respect to the measure P, both considered on the measurable space (Ω, 𝓕). The definition can be generalized slightly, so that cases such as E(X | X) = X for a Cauchy-distributed random variable X can also be covered.

Regular conditional probabilities, also in factorized form, exist in Polish spaces with the Borel σ-algebra. More generally: if Y is an arbitrary random variable with values in a Polish space, then there exists a version of the conditional distribution of Y given X in the form of a stochastic kernel κ:

P(Y ∈ B | X = x) = κ(x, B).

Special cases

(1) For the trivial σ-algebra 𝓕 = {∅, Ω}, one simply obtains ordinary expected values and probabilities:

E(Y | {∅, Ω}) = E(Y),  P(B | {∅, Ω}) = P(B).

Correspondingly, E(Y | X) = E(Y) and P(B | X) = P(B) hold when conditioning on a constant random variable X.

(2) Simple σ-algebras: If A ∈ 𝓕 is an event with P(A) > 0 that, apart from itself and the empty set, has no subsets belonging to 𝓕, then on A the value of P(B | 𝓕) equals the ordinary conditional probability:

P(B | 𝓕)(ω) = P(B | A) = P(B ∩ A) / P(A) for all ω ∈ A.

This shows that the calculations above in the discrete case are consistent with the general definition.

(3) Computing with densities: If f(x, y) is a density function of the joint distribution of the random variables X and Y, then

f(y | x) = f(x, y) / f_X(x),  where f_X(x) = ∫ f(x, y) dy,

is the density of a regular conditional distribution of Y given X = x in factorized form, and for the conditional expectation

E(Y | X = x) = ∫ y · f(y | x) dy

holds.
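As a numerical sketch of the density formula, consider a bivariate normal density with correlation ρ, for which E(Y | X = x) = ρ·x is known in closed form; a simple trapezoid integration recovers this value (names, grid, and ρ = 0.6 are illustrative assumptions):

```python
import math

# Bivariate standard normal density with correlation rho.
RHO = 0.6

def f_joint(x, y, rho=RHO):
    det = 1 - rho * rho
    q = (x * x - 2 * rho * x * y + y * y) / det
    return math.exp(-q / 2) / (2 * math.pi * math.sqrt(det))

def cond_exp(x, lo=-10.0, hi=10.0, n=4000):
    # E(Y | X = x) = (integral of y * f(x,y) dy) / (integral of f(x,y) dy);
    # the marginal density f_X(x) cancels in the ratio.
    h = (hi - lo) / n
    num = den = 0.0
    for i in range(n + 1):
        y = lo + i * h
        w = 0.5 if i in (0, n) else 1.0     # trapezoid weights
        num += w * y * f_joint(x, y)
        den += w * f_joint(x, y)
    return num / den

print(round(cond_exp(1.5), 3))  # ≈ 0.9 = RHO * 1.5
```

The ratio form means one never needs the normalizing marginal explicitly, mirroring the formula f(y | x) = f(x, y) / f_X(x).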

(4) In the following cases regular conditional distributions can be specified directly:

  • if Y is independent of X, in the form P(Y ∈ B | X = x) = P(Y ∈ B),
  • if Y = g(X) is σ(X)-measurable, in the form P(Y ∈ B | X = x) = δ_{g(x)}(B) (Dirac measure),
  • for the pair (Y, Z), if Z = h(X) is σ(X)-measurable, in the form P((Y, Z) ∈ B | X = x) = P((Y, h(x)) ∈ B | X = x), where a regular conditional distribution of Y is used to compute the expression on the right-hand side.

Calculation rules

All of the following statements hold only almost surely (P-almost everywhere) wherever they contain conditional expectations. Instead of the σ-algebra 𝓕, one can also write a random variable X, meaning the generated σ-algebra σ(X).

  • Pulling out independent factors: If Y is independent of 𝓕, then E(Y | 𝓕) = E(Y).
  • If Y is independent of σ(X) and of 𝓕, then E(XY | 𝓕) = E(Y) · E(X | 𝓕).
  • If σ(σ(Y) ∪ 𝓕) and σ(σ(Z) ∪ 𝓗) are independent, then E(YZ | σ(𝓕 ∪ 𝓗)) = E(Y | 𝓕) · E(Z | 𝓗).
  • Pulling out known factors: If X is 𝓕-measurable, then E(XY | 𝓕) = X · E(Y | 𝓕).
  • If X is 𝓕-measurable, then E(X | 𝓕) = X.
  • Law of total expectation: E(E(X | 𝓕)) = E(X).
  • Tower property: For sub-σ-algebras 𝓗 ⊆ 𝓕, E(E(X | 𝓕) | 𝓗) = E(X | 𝓗).
  • Linearity: E(aX + bY | 𝓕) = a · E(X | 𝓕) + b · E(Y | 𝓕) for all numbers a, b.
  • Monotonicity: X ≤ Y implies E(X | 𝓕) ≤ E(Y | 𝓕).
  • Monotone convergence: From 0 ≤ X_n and X_n ↑ X it follows that E(X_n | 𝓕) ↑ E(X | 𝓕).
  • Dominated convergence: From X_n → X and |X_n| ≤ Y with E(Y) < ∞ it follows that E(X_n | 𝓕) → E(X | 𝓕).
  • Fatou's lemma: From 0 ≤ X_n it follows that E(lim inf X_n | 𝓕) ≤ lim inf E(X_n | 𝓕).
  • Jensen's inequality: If f is a convex function, then f(E(X | 𝓕)) ≤ E(f(X) | 𝓕).
  • Conditional expectations as projections: The preceding properties (especially pulling out known factors and the tower property) imply, for 𝓕-measurable Z with suitable integrability, E(Z · (X − E(X | 𝓕))) = 0; that is, the difference X − E(X | 𝓕) is orthogonal to all 𝓕-measurable random variables, and E(X | 𝓕) is the orthogonal projection of X onto the space of square-integrable 𝓕-measurable random variables.
  • Conditional variance: Using conditional expectations one can, analogously to the definition of the variance as the mean squared deviation from the expected value, consider the conditional variance Var(X | 𝓕) = E((X − E(X | 𝓕))² | 𝓕). The shift theorem Var(X | 𝓕) = E(X² | 𝓕) − (E(X | 𝓕))² applies.
  • Martingale convergence: For a random variable X with finite expected value, E(X | 𝓕_n) → E(X | 𝓕) holds if either 𝓕_1 ⊆ 𝓕_2 ⊆ … is an ascending sequence of sub-σ-algebras and 𝓕 = σ(⋃_n 𝓕_n), or 𝓕_1 ⊇ 𝓕_2 ⊇ … is a descending sequence of sub-σ-algebras and 𝓕 = ⋂_n 𝓕_n.
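Two of these rules, the law of total expectation and pulling out known factors, can be verified exactly on the two-dice example (a sketch; helper names are illustrative):

```python
from fractions import Fraction
from itertools import product

# Check E(E(Z|X)) = E(Z) and E(X*Z | X) = X * E(Z | X) on two fair dice.
omega = list(product(range(1, 7), repeat=2))   # outcomes (x, y), each 1/36

def cond_exp(f, x):
    # E(f | X = x): average of f over the uniform block {X = x}
    block = [w for w in omega if w[0] == x]
    return sum(f(w) for w in block) * Fraction(1, len(block))

def Z(w):
    return w[0] + w[1]

# law of total expectation: E(E(Z|X)) = E(Z) = 7
lhs = sum(cond_exp(Z, x) * Fraction(1, 6) for x in range(1, 7))
assert lhs == Fraction(7)

# pulling out the known factor X, which is constant on each block {X = x}
for x in range(1, 7):
    assert cond_exp(lambda w: w[0] * Z(w), x) == x * cond_exp(Z, x)
print("rules verified")
```

On a finite space both rules reduce to identities between averages over the blocks {X = x}, which makes the check exact.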

Other examples

(1) We again consider the example from the discrete case above. Let X and Y be the numbers rolled in two independent tosses of a fair die and Z = X + Y their sum. The calculation of the conditional expectation of Z given X is simplified by the calculation rules; first, by linearity,

E(Z | X) = E(X + Y | X) = E(X | X) + E(Y | X).

X, as a function of X, is σ(X)-measurable, so E(X | X) = X, and Y is independent of X, so E(Y | X) = E(Y) = 7/2. Thus we obtain

E(Z | X) = X + 7/2.

(2) If X and Y are independent and Poisson distributed with parameters λ and μ, then the conditional distribution of X given X + Y = n is a binomial distribution with parameters n and p = λ/(λ + μ), that is,

P(X = k | X + Y = n) = (n choose k) · p^k · (1 − p)^(n−k) for k = 0, …, n.

It follows that E(X | X + Y = n) = n · λ/(λ + μ), and thus E(X | X + Y) = (X + Y) · λ/(λ + μ).
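This binomial identity can be checked numerically from the Poisson probabilities themselves (a sketch; the parameter values λ = 2, μ = 3 and n = 7 are illustrative):

```python
import math

# For independent X ~ Poi(lam), Y ~ Poi(mu):
# P(X = k | X + Y = n) should equal the Binomial(n, lam/(lam+mu)) pmf.
lam, mu = 2.0, 3.0

def poisson_pmf(k, rate):
    return math.exp(-rate) * rate ** k / math.factorial(k)

n = 7
p = lam / (lam + mu)
for k in range(n + 1):
    # P(X = k, Y = n - k) / P(X + Y = n), using X + Y ~ Poi(lam + mu)
    cond = (poisson_pmf(k, lam) * poisson_pmf(n - k, mu)
            / poisson_pmf(n, lam + mu))
    binom = math.comb(n, k) * p ** k * (1 - p) ** (n - k)
    assert abs(cond - binom) < 1e-12

# hence E(X | X + Y = n) = n * lam / (lam + mu)
mean = sum(k * math.comb(n, k) * p**k * (1-p)**(n-k) for k in range(n + 1))
print(mean)  # ≈ 2.8
```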

(3) We consider independent exponentially distributed random variables X_1, …, X_n (for example "waiting times") with rate parameters λ_1, …, λ_n. Then the minimum M = min(X_1, …, X_n) is exponentially distributed with parameter λ_1 + … + λ_n, and

P(X_i = M | M) = P(X_i = M) = λ_i / (λ_1 + … + λ_n)

holds: the index of the minimal variable is independent of the value of the minimum.
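Both facts can be checked by a Monte Carlo sketch (the rates 1, 2, 3 and the sample size are illustrative assumptions):

```python
import random
random.seed(1)

# Minimum of independent exponentials with rates lam_i is Exp(sum of rates),
# and X_i attains the minimum with probability lam_i / (sum of rates).
rates = [1.0, 2.0, 3.0]
total = sum(rates)          # 6.0, so E(M) = 1/6
n = 100_000

mins, wins = [], [0] * len(rates)
for _ in range(n):
    xs = [random.expovariate(r) for r in rates]
    m = min(xs)
    mins.append(m)
    wins[xs.index(m)] += 1

print(sum(mins) / n)          # ≈ 1/6
print([w / n for w in wins])  # ≈ [1/6, 2/6, 3/6]
```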
