General linear model

The (general) linear model is one of the most widely studied mathematical models in statistics. Many statistical methods, such as comparisons of means, analysis of variance, correlation analysis, and regression analysis, can be seen as special cases of linear models.

Model Description

A prerequisite for applying such models in statistical practice is the assumption that there is a linear relationship between the observed data and the known predictors. Statistical methods, most prominently the method of least squares, then yield purely quantitative results about the specific relationship between observations and influences.

For such models to be treated statistically at all, it is also assumed that the data cannot be observed directly but are subject to error. Formally, general linear models can be written as matrix equations of the form

$$y = X\beta + \varepsilon$$

where

$y$ is the vector of the dependent variables,

$X$ is the matrix of independent variables,

$\beta$ is the vector of regression coefficients of the variables described by $X$, and

$\varepsilon$ is the vector of disturbances.
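The model equation can be illustrated with a small simulation. The following is a minimal sketch, not a reference implementation: it assumes the notation y = Xβ + ε, uses made-up coefficients and noise level, and estimates β by least squares through the normal equations X'Xb = X'y. The helper names `solve_linear_system` and `fit_least_squares` are hypothetical and chosen only for this example.

```python
import random


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular: columns of X are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_least_squares(X, y):
    """Least squares estimate of beta from the normal equations X'X b = X'y."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve_linear_system(XtX, Xty)


# Illustrative data (all values are assumptions made for this sketch).
rng = random.Random(0)
beta_true = [1.0, 2.0, -0.5]          # intercept and two slopes
n = 200
# Design matrix: a column of ones plus two regressors.
X = [[1.0, rng.uniform(0, 10), rng.uniform(0, 10)] for _ in range(n)]
# Observations: linear part plus a disturbance with expected value 0.
y = [sum(b * x for b, x in zip(beta_true, row)) + rng.gauss(0.0, 0.5) for row in X]

beta_hat = fit_least_squares(X, y)
print(beta_hat)  # should lie close to beta_true
```

Solving the normal equations directly keeps the sketch short; for badly conditioned design matrices a QR-based solver from a numerical library would usually be preferred.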

Requirements

The only requirement on the linear model is that it describes the "real" model up to the error term. As a rule, the kind of error is not specified precisely; for example, it can result from additional factors or from measurement errors. However, it is assumed as a prerequisite that the expected value of the error is 0 in every component. This assumption means that the model is considered correct in principle and that the observed deviations are regarded as random or as the result of negligible external influences.

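Beyond this fundamental assumption, essentially any distributional assumption on the error vector $\varepsilon$ is allowed. A typical assumption is that the components of $\varepsilon$ are uncorrelated and have the same variance $\sigma^2$, in which case classical methods such as the method of least squares yield simple estimators for the coefficient vector $\beta$ and for $\sigma^2$ (by the Gauss-Markov theorem, the least squares estimator of $\beta$ is then the best linear unbiased estimator). If it is additionally assumed that $\varepsilon$ is multivariate normally distributed, it can further be shown that the least squares estimator of $\beta$ solves the maximum likelihood equations, while the maximum likelihood estimator of $\sigma^2$ differs from the usual unbiased estimator only in its normalizing constant. In this model, independence of the errors is then equivalent to independence of the components of the dependent variable.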
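For reference, the estimators mentioned above can be written out as follows. This is a sketch under stated assumptions that are not spelled out in the text: the model is written as y = Xβ + ε, X has n rows and p linearly independent columns, and the errors are uncorrelated with common variance σ².

```latex
\hat{\beta} = (X^{\top} X)^{-1} X^{\top} y,
\qquad
\hat{\sigma}^{2} = \frac{1}{n - p}\,(y - X\hat{\beta})^{\top}(y - X\hat{\beta}),
\qquad
\hat{\sigma}^{2}_{\mathrm{ML}} = \frac{1}{n}\,(y - X\hat{\beta})^{\top}(y - X\hat{\beta})
```

The first two are the least squares estimator and the usual unbiased variance estimator; the third is the variance estimator obtained from the maximum likelihood equations under normally distributed errors.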

In practice, there are often situations in which the assumption of independent, identically normally distributed errors is not tenable. This is the case, for example, when some of the independent variables, and hence the errors, are partially correlated. This necessary departure from the independence assumption causes considerable methodological problems, since some of the common estimation methods are then no longer applicable.

Objective

Using methods of regression analysis, meaningful estimates and limit theorems for the coefficient vector $\beta$ can in many cases be derived from the data. Whether there is in fact a linear relationship between the dependent variable $y$ and the matrix $X$ is not examined. Linear models can always be "written down"; whether they are really suitable for the particular case must first be clarified theoretically. In most cases this check is not carried out: in some situations no information about the structure of the relationship is available beforehand, and in others a linear model is chosen because of its relatively simple mathematical treatment.

The question of how well the linear relationship fits the observed data and the regressors is usually answered with the help of the (adjusted) coefficient of determination. It indicates what proportion of the variability can be explained by the regressors chosen for the model. If this measure is small, more regressors are usually added.
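The two measures can be computed as in the following minimal sketch. It assumes a model with an intercept, defines the coefficient of determination as one minus the ratio of the residual sum of squares to the total sum of squares, and uses the common adjustment with n observations and k regressors (not counting the intercept). The function names and the data are hypothetical.

```python
def r_squared(y, y_fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k regressors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Made-up observations and fitted values from some linear model.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
y_fitted = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = r_squared(y, y_fitted)
r2_adj = adjusted_r_squared(r2, n=len(y), k=1)
print(r2, r2_adj)
```

Because the adjustment penalizes every additional regressor, the adjusted value can decrease when a regressor is added that explains little, whereas the unadjusted value never decreases for a least squares fit.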

Variants

Linear models can be extended so that the design matrix under study is no longer fixed but is itself subject to random influences. The methods of analysis do not change substantially in this case, but they become considerably more complicated and therefore computationally more expensive.

Miscellaneous

With a suitable transformation, linear statistical models can be represented in the framework of a general regression equation. The corresponding specific linear methods can then be derived anew from the general form.
