Frequency distribution

A frequency distribution is a method for the statistical description of data (measured values, characteristic values). Mathematically, a frequency distribution is a function that specifies, for each value that occurred, how often it occurred. Such a distribution can be described as a table, as a graph, or as a model given by a functional equation.

The frequency distribution is to descriptive statistics what the probability distribution is to probability theory. Probability theory offers a range of mathematical functions (such as the Gaussian distribution) that are used to fit and analyse frequency distributions.

Method

The set of data (measured values, survey data) is at first an unordered original list. As a first step, it is ordered, that is, sorted. From the resulting ordered list (the ranked list), the median, the range (a measure of statistical dispersion), the quantiles and the interquartile range can already be read off, and the standard deviation can be estimated.
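The quantities read off the sorted list can be sketched in a few lines of Python. This is only an illustration: the sample values are hypothetical, and the quartiles use the default interpolation method of the standard library's statistics.quantiles, which is one of several conventions in use.

```python
import statistics

# Hypothetical measured values (the unordered original list).
data = [7, 3, 5, 3, 9, 5, 5, 1]

# Step 1: sort the original list to obtain the ranked list.
ranked = sorted(data)

# Median and range follow directly from the ranked list.
median = statistics.median(ranked)
value_range = ranked[-1] - ranked[0]

# Quartiles (quantiles with n=4) and the interquartile range.
q1, q2, q3 = statistics.quantiles(ranked, n=4)
iqr = q3 - q1

# Sample standard deviation as an estimate of the dispersion.
stdev = statistics.stdev(ranked)

print(ranked, median, value_range, iqr, stdev)
```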

Next, identical values are combined, and for each value we record how often it occurs, that is, its absolute frequency. Dividing the absolute frequencies by the total number of values (the sample size) gives the relative frequencies. We now have an ordered set of value pairs (characteristic value and associated relative frequency), a so-called ranked list.
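As a minimal sketch of this counting step, assuming the same kind of hypothetical sample data, the absolute and relative frequencies can be tabulated with collections.Counter from the Python standard library:

```python
from collections import Counter

# Hypothetical measured values.
data = [7, 3, 5, 3, 9, 5, 5, 1]

# Absolute frequency: how often each distinct value occurs.
absolute = Counter(data)

# Sample size: total number of values.
n = len(data)

# Ordered table of (value, absolute frequency, relative frequency).
table = [(value, absolute[value], absolute[value] / n)
         for value in sorted(absolute)]

for value, abs_freq, rel_freq in table:
    print(value, abs_freq, rel_freq)
```

The relative frequencies in the last column add up to 1 by construction.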

If we add up the relative frequencies, starting with the smallest characteristic value, and assign to each characteristic value the sum reached up to that point (including its own contribution), we obtain the cumulative distribution. For each characteristic value it indicates the proportion of values that are less than or equal to that value. The cumulative value rises from 0 to 1 (100 percent). Plotting the table gives a monotonically non-decreasing curve, usually in the form of a stretched S. There have been numerous attempts to approximate real cumulative distributions by functional equations. The cumulative distribution as a function of the characteristic value is the simplest way of representing a frequency distribution.
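The running sum described here can be sketched as follows. The sample values are again hypothetical; counts are accumulated first and divided by the sample size afterwards, which is a choice made in this sketch to avoid accumulating floating-point rounding errors.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical measured values.
data = [7, 3, 5, 3, 9, 5, 5, 1]

counts = Counter(data)
values = sorted(counts)   # characteristic values, smallest first
n = len(data)             # sample size

# Running total of absolute frequencies, each including its own contribution.
cumulative_counts = list(accumulate(counts[v] for v in values))

# Cumulative distribution: proportion of values <= each characteristic value.
cumulative = [(v, c / n) for v, c in zip(values, cumulative_counts)]

for value, proportion in cumulative:
    print(value, proportion)
```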

Further analysis requires grouping the characteristic values into classes. For this purpose, the range of occurring values is divided into, for example, 10 or 20 classes, usually of equal width (the rare values at the edges, see "outliers", are sometimes grouped together in wider classes). This leads to density functions, which in the case of a continuous distribution are the derivative of the cumulative distribution function with respect to the characteristic value. Furthermore, the frequency can be determined not only by counting but also, for example, by weighing. The result is then a mass distribution instead of a number distribution. In principle, any additive quantity is suitable for measuring frequency.
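A hedged sketch of the classification step follows. The function name class_densities, its parameters and the sample data are illustrative assumptions, not part of any library. It uses equal-width classes only, assumes the values are not all identical, and approximates the density in each class as the relative frequency divided by the class width. Passing weights (for example masses) instead of counting yields a mass distribution.

```python
def class_densities(values, num_classes, weights=None):
    """Group values into equal-width classes and return
    (lower bound, upper bound, density) for each class.

    Hypothetical helper; assumes max(values) > min(values).
    With weights=None every value counts as 1 (number distribution);
    otherwise the weights are summed per class (e.g. mass distribution).
    """
    if weights is None:
        weights = [1.0] * len(values)
    low, high = min(values), max(values)
    width = (high - low) / num_classes
    totals = [0.0] * num_classes
    for value, weight in zip(values, weights):
        # The largest value falls on the upper edge; put it in the last class.
        index = min(int((value - low) / width), num_classes - 1)
        totals[index] += weight
    total = sum(totals)
    # Density = share of the total in the class, divided by the class width.
    return [(low + i * width, low + (i + 1) * width, t / total / width)
            for i, t in enumerate(totals)]


# Hypothetical sizes and masses of eight particles.
sizes = [1.0, 2.0, 2.5, 3.0, 4.5, 5.0, 7.0, 9.0]
masses = [0.5, 1.0, 1.5, 2.0, 4.0, 5.0, 9.0, 17.0]

number_density = class_densities(sizes, 4)
mass_density = class_densities(sizes, 4, weights=masses)
print(number_density)
print(mass_density)
```

In both cases the densities multiplied by the class widths sum to 1, which mirrors the cumulative distribution reaching 100 percent.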

If a random sample differs greatly from the normal distribution (bell curve), the data may be distorted by undetected influences, selection effects or a trend. Statistical tests or analysis of variance offer ways of dealing with this. If the sample is a superposition of several subpopulations (age distribution, occupations, groups), the frequency distribution may have two or more peaks (bimodal or multimodal) instead of a single maximum.
