Robust statistics

Robust estimation is a concept from inferential statistics. An estimation method or test procedure is called robust if it is not sensitive to outliers (values that lie outside the range expected under the assumed distribution).

The classical estimation methods developed in the first half of the 20th century often yield misleading results when the sample contains outliers. A robust estimation method therefore relies on the bulk of the data and incorporates an outlier analysis in order to reduce the influence of deviations from the model, letting that influence tend to zero as the deviation grows.
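The idea that an observation's influence should shrink to zero as its deviation grows can be sketched with a weight function. The example below is only an illustration under stated assumptions: it uses Tukey's biweight, which is one common choice and is not named in the text above, together with a scale based on the median absolute deviation. The function names, the tuning constant 4.685 and the sample values are all chosen for this sketch.

```python
from statistics import median

def biweight_weight(residual, cutoff):
    """Tukey biweight: weight 1 at residual 0, falling to exactly 0 at the cutoff and beyond."""
    if abs(residual) >= cutoff:
        return 0.0
    return (1.0 - (residual / cutoff) ** 2) ** 2

def robust_location(values, tuning=4.685, iterations=20):
    """Iteratively reweighted mean that starts from the median (assumed setup, not from the text)."""
    center = median(values)
    # Median absolute deviation, rescaled so it is comparable to a standard deviation
    # when the data are normally distributed.
    scale = 1.4826 * median(abs(v - center) for v in values)
    if scale == 0:
        return center
    for _ in range(iterations):
        weights = [biweight_weight((v - center) / scale, tuning) for v in values]
        total = sum(weights)
        if total == 0:
            break
        center = sum(w * v for w, v in zip(weights, values)) / total
    return center

# Invented readings; the last value is a gross outlier.
sample = [9.79, 9.80, 9.81, 9.82, 98.3]
estimate = robust_location(sample)
```

In this sketch the outlier lies far beyond the cutoff, so it receives weight zero and the estimate is determined by the remaining four readings.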

Since the 1980s, the development of robust estimators that improve the efficiency of estimation methods has been an important research topic in mathematical statistics. Robust methods include, for example, the RANSAC algorithm and methods with a high breakdown point.

A simple robust estimator is the (empirical) median, which can be used instead of the arithmetic mean to estimate the expected value of a symmetric distribution. The empirical median is obtained by sorting the observations by size and taking the observation in the middle of that ordering as the estimate.

An example: suppose a number of measurements are carried out to determine a physical quantity (such as the gravitational constant) experimentally. The measurement errors are assumed to be unsystematic and to go in either direction, so that the readings are sometimes too large and sometimes too small; stated formally, the observations are independent and identically distributed with a symmetric distribution whose expected value is the true value of the quantity. Occasionally, individual readings deviate markedly from the rest ("outliers", the model deviations described above); these are usually attributed to mistakes in carrying out the experiment (a "jolt" to the apparatus, a "slip of the pen" when recording a value, or similar).

Extreme deviations point to an error, so such observations should have less influence on the result; yet they affect the arithmetic mean strongly, and the larger the deviation, the greater the effect. The median, by contrast, is insensitive to such outliers, that is, "robust". If there are no outliers, however, the median generally gives a less precise estimate than the mean for the same number of measurements, because locally the estimate is determined by a single observation only, namely the middle one.
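The contrast between mean and median can be made concrete with a small sketch. The readings below are invented for illustration, and the function name `empirical_median` is chosen here. For an even number of observations the sketch averages the two middle values, which is a common convention and an assumption beyond the description above.

```python
from statistics import mean

def empirical_median(values):
    """Sort the observations and return the middle one.

    For an even count, return the average of the two middle observations
    (a common convention, assumed here).
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Invented readings of some physical quantity.
clean = [9.79, 9.80, 9.81, 9.82, 9.83]
# The same series with the last reading replaced by a gross error,
# for example a misplaced decimal point when writing the value down.
contaminated = [9.79, 9.80, 9.81, 9.82, 98.3]

mean_clean = mean(clean)
mean_contaminated = mean(contaminated)
median_clean = empirical_median(clean)
median_contaminated = empirical_median(contaminated)
```

With these values the single bad reading pulls the arithmetic mean far away from the other observations, while the median stays at the same middle reading in both series.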
