Lorenz curve

The Lorenz curve was developed in 1905 by the American statistician and economist Max Otto Lorenz (1876-1959). It represents statistical distributions graphically and, as a measure of unequal distribution, illustrates the degree of disparity (inequality) or relative concentration within the distribution; for this reason it is also called the disparity curve. Official statistics use the Lorenz curve to illustrate the distribution of income in a country.

Structure and Explanation

The Lorenz curve is a function in the unit square of the first quadrant. It shows which share of the total feature sum is attributable to which share of the underlying population of feature carriers. The x-axis (abscissa) carries the cumulative share of all feature carriers (for example, the population), and the y-axis (ordinate) carries the cumulative share of the total feature sum (for example, income). To construct the curve, the data are first sorted in ascending order, starting with the smallest share of the feature sum, and then cumulated ("summed up"). This produces the characteristic "belly" of the Lorenz curve below the diagonal, which reflects the degree of inequality. Each point on the Lorenz curve represents a statement such as "the bottom 20% of households receive 10% of total income" (see: Pareto principle).

A perfectly equal income distribution would be one in which all individuals have the same income. In this case, the bottom N% of society would always have N% of the income. This can be represented graphically by the straight line y = x, called the line of perfect equality. In contrast, perfect inequality would be a distribution in which one person has all the income and all other persons have none. In this case, the curve would be y = 0 for all x < 100% and y = 100% at x = 100%; this curve is called the line of perfect inequality.
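The construction described above (sort in ascending order, then cumulate) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `lorenz_points` and the income figures are hypothetical.

```python
def lorenz_points(values):
    """Return the vertices of the Lorenz curve for non-negative values."""
    ordered = sorted(values)              # ascending: smallest share first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("the feature sum must be positive")
    n = len(ordered)
    points = [(0.0, 0.0)]                 # the curve starts at the origin
    running = 0.0
    for i, value in enumerate(ordered, start=1):
        running += value                  # cumulate the feature sum
        points.append((i / n, running / total))
    return points


# Hypothetical incomes of four households (assumed data).
incomes = [40, 10, 30, 20]
points = lorenz_points(incomes)

# With identical incomes every vertex lies on the diagonal y = x.
equal_points = lorenz_points([25, 25, 25, 25])
```

For the assumed incomes, the vertex at x = 0.5 has y = 0.3, which reads as "the bottom 50% of households receive 30% of total income".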

The Gini coefficient is the ratio of the area between the line of perfect equality and the observed Lorenz curve to the area between the line of perfect equality and the line of perfect inequality. The Gini coefficient is thus a number between 0 and 1; the higher it is, the more unequal the distribution.
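The ratio of areas can be computed directly from the vertices of a piecewise linear Lorenz curve. The following sketch uses the trapezoid rule; the function name `gini_from_points` and the vertex values are assumptions made for illustration.

```python
def gini_from_points(points):
    """Gini coefficient of a piecewise linear Lorenz curve given by its vertices."""
    area_under = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area_under += (x1 - x0) * (y0 + y1) / 2.0   # trapezoid under one segment
    area_between = 0.5 - area_under   # area between the diagonal and the curve
    return area_between / 0.5         # divide by the area under the diagonal


# Hypothetical Lorenz curve vertices (assumed data): cumulative population
# share on the x-axis, cumulative income share on the y-axis.
curve = [(0.0, 0.0), (0.25, 0.1), (0.5, 0.3), (0.75, 0.6), (1.0, 1.0)]
gini = gini_from_points(curve)

# The line of perfect equality encloses no area with itself.
diagonal = [(0.0, 0.0), (1.0, 1.0)]
gini_equal = gini_from_points(diagonal)
```

For the assumed vertices the sketch yields a Gini coefficient of 0.25, and 0 for the diagonal.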

Calculation

Discrete case

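In the discrete case, the Lorenz curve is a piecewise linear curve (that is, a polygonal line) through the points $(u_0, v_0) = (0, 0), (u_1, v_1), \dots, (u_n, v_n) = (1, 1)$. If $x_1, \dots, x_n$ are the shares of all feature carriers and $y_1, \dots, y_n$ the shares of the total feature sum (for these terms see "Structure and Explanation"), the coordinates of the points are defined by:

$$u_k = \sum_{i=1}^{k} x_i$$

and

$$v_k = \sum_{i=1}^{k} y_i$$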

Continuous case

General

The Lorenz curve can often be represented by a function $L(F)$, with $F$ plotted on the abscissa and $L$ on the ordinate.

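For a population of size $n$ with a sequence of values $y_i$, $i = 1, \dots, n$, indexed in ascending order ($y_i \le y_{i+1}$), the Lorenz curve is the continuous, piecewise linear function connecting the points $(F_i, L_i)$, $i = 0, \dots, n$, where $F_0 = 0$, $L_0 = 0$, and for $i = 1, \dots, n$:

$$F_i = \frac{i}{n}, \qquad S_i = \sum_{j=1}^{i} y_j, \qquad L_i = \frac{S_i}{S_n}$$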

The related Lorenz asymmetry coefficient is a summary statistic derived from the Lorenz curve.

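For a discrete probability function $f(y)$, let $y_i$, $i = 1, \dots, n$, be the points with non-zero probability, indexed in ascending order ($y_i < y_{i+1}$). The Lorenz curve is the continuous, piecewise linear function connecting the points $(F_i, L_i)$, $i = 0, \dots, n$, where $F_0 = 0$, $L_0 = 0$, and for $i = 1, \dots, n$:

$$F_i = \sum_{j=1}^{i} f(y_j), \qquad S_i = \sum_{j=1}^{i} f(y_j)\, y_j, \qquad L_i = \frac{S_i}{S_n}$$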

For the discrete uniform (Laplace) distribution, that is, $f(y_j) = 1/n$ for all $j$, we obtain exactly the formulas given above for $F_i$ and $L_i$.
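The construction for a discrete probability function can be sketched as follows: sort the support points, then accumulate the probabilities (the abscissa $F_i$) and the probability-weighted values (the numerator $S_i$ of the ordinate $L_i = S_i / S_n$). The function name `lorenz_from_pmf` and both example distributions are assumptions made for illustration.

```python
def lorenz_from_pmf(pmf):
    """Vertices (F_i, L_i) for a dict mapping value -> probability."""
    support = sorted((y, p) for y, p in pmf.items() if p > 0)
    s_n = sum(p * y for y, p in support)   # S_n, the mean of the distribution
    if s_n <= 0:
        raise ValueError("the mean must be positive")
    points = [(0.0, 0.0)]
    f_i = 0.0
    s_i = 0.0
    for y, p in support:
        f_i += p            # F_i: cumulative probability
        s_i += p * y        # S_i: cumulative probability-weighted value
        points.append((f_i, s_i / s_n))
    return points


# Hypothetical two-point distribution (assumed data).
two_point = lorenz_from_pmf({1: 0.5, 3: 0.5})

# Equal probabilities 1/n reproduce the population formula F_i = i/n.
uniform = lorenz_from_pmf({10: 0.25, 20: 0.25, 30: 0.25, 40: 0.25})
```

With equal probabilities the vertices coincide with those obtained from the plain list of values 10, 20, 30, 40, which is the special case mentioned in the text.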

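For a probability density function $f(x)$ with cumulative distribution function $F(x)$, the Lorenz curve $L(F(x))$ is defined by:

$$L(F(x)) = \frac{\int_{-\infty}^{x} t\, f(t)\, dt}{\int_{-\infty}^{\infty} t\, f(t)\, dt}$$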

For a cumulative distribution function $F(x)$ with inverse $x(F)$, the Lorenz curve $L(F)$ is given by:

$$L(F) = \frac{\int_{0}^{F} x(F_1)\, dF_1}{\int_{0}^{1} x(F_1)\, dF_1}$$

The inverse function $x(F)$ may not exist, because the cumulative distribution function may have jump discontinuities or intervals of constant values. The previous equation remains valid if $x(F)$ is defined more generally by the following formula:

$$x(F_1) = \inf \{ y : F(y) \ge F_1 \}$$
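As a worked illustration of the inverse-function form, the Lorenz curve can be approximated numerically by integrating a quantile function. The sketch below assumes an exponential distribution as the example, for which the closed form $L(F) = F + (1 - F)\ln(1 - F)$ follows from integrating the quantile function $-\ln(1 - F)/\lambda$. All function names are hypothetical.

```python
import math


def lorenz_from_quantile(quantile, upper, steps=20000):
    """Approximate L(F) at F = upper from a quantile function x(F)."""

    def integral(b):
        h = b / steps
        # Midpoint rule: never evaluates the quantile exactly at 0 or 1.
        return h * sum(quantile((k + 0.5) * h) for k in range(steps))

    return integral(upper) / integral(1.0)


def exponential_quantile(p, rate=1.0):
    """Inverse CDF of the exponential distribution (assumed example)."""
    return -math.log(1.0 - p) / rate


def exponential_lorenz(F):
    """Closed form L(F) = F + (1 - F) ln(1 - F) for the exponential case."""
    return F + (1.0 - F) * math.log(1.0 - F)


numeric = lorenz_from_quantile(exponential_quantile, 0.5)
exact = exponential_lorenz(0.5)
```

The midpoint rule is chosen here only because it avoids the logarithmic singularity of the quantile function at 1; any other quadrature that handles the endpoint would serve equally well.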

Gastwirth's definition

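Consider a non-negative random variable $X$ with mean $\mu$ and the corresponding normalized quantile function $x(p)/\mu$. Following Joseph Lewis Gastwirth, the map

$$L : [0, 1] \to [0, 1], \quad u \mapsto \frac{1}{\mu} \int_{0}^{u} x(p)\, dp$$

is called the (continuous) Lorenz curve of $X$, or of the distribution of $X$.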

Properties

The Lorenz curve has the following properties:

  • It always starts at the origin $(0, 0)$ and ends at the point $(1, 1)$.
  • The derivative of the curve is monotonically increasing; the curve itself is therefore convex and lies below the diagonal.
  • The Lorenz curve is continuous on the open interval (0,1), and in the discrete case even piecewise linear.

The Lorenz curve is not defined if the mean of the probability distribution is zero or infinite.

The Lorenz curve of a probability distribution is a continuous function. However, discontinuous curves, such as the line of perfect inequality, can be obtained as the limit of Lorenz curves of probability distributions.

The data of a Lorenz curve can be summarized by the Gini coefficient and the Lorenz asymmetry coefficient.

The Lorenz curve is invariant under positive scaling. If $X$ is a random variable, then for any positive number $c$ the random variable $cX$ has the same Lorenz curve as $X$, where the Lorenz curve of a random variable is understood to be that of its distribution.

The Lorenz curve is not invariant under translations, that is, under a constant shift of the values. If $X$ is a random variable with Lorenz curve $L_X(F)$ and mean $\mu_X$, then the Lorenz curve $L_{X+a}(F)$ of the shifted random variable $X + a$, where $a$ is a fixed constant, satisfies the following formula:

$$F - L_{X+a}(F) = \frac{\mu_X}{\mu_X + a} \bigl( F - L_X(F) \bigr)$$
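Both statements can be checked numerically on a small sample. The sketch below compares the Lorenz ordinates of a sample with those of its scaled and shifted versions; for a shift by a constant $a$ it tests that the gap between the diagonal and the curve shrinks by the factor $\mu / (\mu + a)$, where $\mu$ is the sample mean. The sample, the constants `c` and `a`, and the function name are assumptions made for illustration.

```python
def lorenz_ordinates(values):
    """Ordinates L_0, ..., L_n of the Lorenz curve at F_i = i/n."""
    ordered = sorted(values)
    total = sum(ordered)
    ordinates = [0.0]
    running = 0.0
    for value in ordered:
        running += value
        ordinates.append(running / total)
    return ordinates


# Hypothetical sample and constants (assumed data).
sample = [1, 2, 3, 10]
c, a = 7.0, 5.0
n = len(sample)
mean = sum(sample) / n

base = lorenz_ordinates(sample)
scaled = lorenz_ordinates([c * v for v in sample])
shifted = lorenz_ordinates([v + a for v in sample])

# Scaling: identical ordinates. Translation: the gap to the diagonal shrinks
# by the factor mean / (mean + a).
scaling_gap = max(abs(s - b) for s, b in zip(scaled, base))
translation_gap = max(
    abs((i / n - shifted[i]) - mean / (mean + a) * (i / n - base[i]))
    for i in range(n + 1)
)
```

Both gaps vanish up to floating-point rounding, so a positive shift moves the curve toward the diagonal while a rescaling leaves it unchanged.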

For a cumulative distribution function $F(x)$ with mean $\mu$ and (generalized) inverse $x(F)$, the following holds for every $F$ with $0 < F < 1$:

  • If the Lorenz curve is differentiable, then: $\frac{dL(F)}{dF} = \frac{x(F)}{\mu}$
  • If the Lorenz curve is twice differentiable, the probability density function $f(x)$ exists at this point and: $\frac{d^2 L(F)}{dF^2} = \frac{1}{\mu\, f(x(F))}$
  • If $L(F)$ is continuously differentiable, then the tangent to $L(F)$ is parallel to the line of perfect equality exactly at the point $F(\mu)$. This is also the point at which the equality discrepancy $F - L(F)$, the vertical distance between the Lorenz curve and the line of perfect equality, is greatest. The size of the discrepancy equals half the relative mean deviation: $F(\mu) - L(F(\mu)) = \frac{\text{mean absolute deviation}}{2\mu}$
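The last property has a direct analogue for a finite sample, where the largest vertical gap between the diagonal and the polygonal Lorenz curve is attained at a vertex. The following sketch checks that this gap equals the mean absolute deviation divided by twice the mean; the sample and the function names are assumptions made for illustration, and the check covers the discrete analogue only, not the differentiable case stated above.

```python
def max_discrepancy(values):
    """Largest vertical gap F_i - L_i over the vertices of the sample curve."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    best = 0.0
    running = 0.0
    for i, value in enumerate(ordered, start=1):
        running += value
        best = max(best, i / n - running / total)
    return best


def half_relative_mean_deviation(values):
    """Mean absolute deviation from the mean, divided by twice the mean."""
    n = len(values)
    mean = sum(values) / n
    mad = sum(abs(v - mean) for v in values) / n
    return mad / (2.0 * mean)


sample = [1, 2, 3, 10]   # hypothetical data
gap = max_discrepancy(sample)
half_rmd = half_relative_mean_deviation(sample)
```

For the assumed sample both quantities equal 0.375, and the maximum is reached at the last value that does not exceed the mean.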

The Lorenz curve of a random variable $X$ is reflected about the point $(1/2, 1/2)$ when passing from $X$ to $-X$; that is, in the notation introduced above:

$$L_{-X}(F) = 1 - L_X(1 - F)$$

Extreme cases

The more evenly the feature sum is distributed among the feature carriers, the more closely the Lorenz curve approaches the diagonal from $(0, 0)$ to $(1, 1)$. In the extreme case of an equal distribution (statistically, a one-point distribution), it coincides with the diagonal.

With greater disparity, the curve shifts downward toward the abscissa. In the extreme case of maximum inequality (a single feature carrier holds the entire feature sum), the Lorenz curve runs as a polygonal line along the abscissa to the point $(1 - 1/n, 0)$, where $n$ is the number of feature carriers, and from there to the point $(1, 1)$.

Continuous and discrete binned data

The exact shape of the Lorenz curve depends on the nature of the feature data. In principle, continuous data must be distinguished from discrete data. In the second case, the Lorenz curve is a polygonal line through the points $(u_k, v_k)$.

Measurement of the relative concentration ( disparity )

The Lorenz curve provides a graphical way of assessing the extent of disparity within a distribution. The more the curve bulges downward, the greater the disparity (see "Extreme cases"). If two Lorenz curves intersect, however, the graph alone no longer shows unambiguously which distribution has the greater disparity; graphical measurement is then too imprecise. Precise values are provided by summary measures such as the Gini coefficient and the coefficient of variation. The Gini coefficient is directly related to the Lorenz curve: it is twice the area between the Lorenz curve and the diagonal in the unit square.

Example table for discrete binned data

Suppose a data collection yields, for 5 classes indexed by $k$, the relative frequencies $x_k$ (the share of the feature carriers in class $k$ among all feature carriers) and the shares $y_k$ of the feature sum attributable to class $k$, as in the table below. From these we determine:

  • the cumulative relative frequency $u_k$
  • the cumulative share of the feature sum $v_k$ (disparity).

Explanation:

The Lorenz curve is created by plotting $u_k$ on the abscissa and $v_k$ on the ordinate, and connecting the points by a polygonal line.
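The calculation for binned data can be sketched with running sums. Here `x` holds the share of feature carriers per class and `y` the share of the feature sum per class; their cumulative sums `u` and `v` are the coordinates of the polygon vertices. The five class shares are hypothetical values chosen for illustration and are not taken from the original table.

```python
from itertools import accumulate

# Hypothetical shares for 5 classes (assumed data, not the original table).
x = [0.2, 0.2, 0.2, 0.2, 0.2]        # x_k: share of feature carriers per class
y = [0.05, 0.10, 0.15, 0.25, 0.45]   # y_k: share of the feature sum per class

# Classes must be ordered by ascending y_k / x_k for the curve to be convex.
assert all(
    y[k] / x[k] <= y[k + 1] / x[k + 1] for k in range(len(x) - 1)
)

u = [0.0] + list(accumulate(x))      # u_k: cumulative relative frequency
v = [0.0] + list(accumulate(y))      # v_k: cumulative share of the feature sum
vertices = list(zip(u, v))           # points of the polygonal line
```

Plotting `vertices` with any charting tool and joining consecutive points by straight segments gives the Lorenz curve for the binned data.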

The article on the Pareto distribution contains a further example of a Lorenz curve.

Theorem of Rothschild and Stiglitz

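Given two distributions $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ with $\sum_i x_i = \sum_i y_i$, the Lorenz curve of $x$ lies above the Lorenz curve of $y$ if and only if, for every symmetric and quasiconvex function $\Phi : \mathbb{R}^n \to \mathbb{R}$:

$$\Phi(x) \le \Phi(y)$$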

Conclusion: If two Lorenz curves intersect, it depends on the choice of the symmetric and quasiconvex function which of the two curves is to be regarded as the one with the greater inequality.
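The conclusion can be illustrated with two distributions whose Lorenz curves cross. In the sketch below, two symmetric and convex (hence quasiconvex) functions, the largest value and the negative of the smallest value, rank the two distributions in opposite order. The data and the choice of these two functions are assumptions made for illustration.

```python
def lorenz_ordinates(values):
    """Ordinates of the Lorenz curve at F_i = i/n, for i = 0, ..., n."""
    ordered = sorted(values)
    total = sum(ordered)
    ordinates = [0.0]
    running = 0.0
    for value in ordered:
        running += value
        ordinates.append(running / total)
    return ordinates


# Two hypothetical distributions with the same total (assumed data).
x = [1, 5, 5, 9]
y = [3, 3, 3, 11]
assert sum(x) == sum(y)

differences = [lx - ly for lx, ly in zip(lorenz_ordinates(x), lorenz_ordinates(y))]
curves_cross = min(differences) < 0 < max(differences)


# Two symmetric, convex (hence quasiconvex) functions of the distribution.
def largest_value(values):
    return max(values)


def negative_smallest_value(values):
    return -min(values)


ranks_y_more_unequal = largest_value(x) < largest_value(y)
ranks_x_more_unequal = negative_smallest_value(x) > negative_smallest_value(y)
```

The curve of `x` lies below that of `y` for the poorest quarter and above it for the richest quarter, so neither curve dominates and the two functions disagree about which distribution is more unequal.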

Length

The length of the Lorenz curve can also be used as a measure of disparity (a measure of relative concentration). On the domain $[0, 1]$, the length takes values in the range $[\sqrt{2}, 2]$.

Discrete case

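As the name suggests, the length can be derived from the discrete Lorenz curve by summing the lengths of its line segments. The length $\ell$ of the discrete Lorenz curve through the points $(u_k, v_k)$, $k = 0, \dots, n$, is:

$$\ell = \sum_{k=1}^{n} \sqrt{(u_k - u_{k-1})^2 + (v_k - v_{k-1})^2}$$

For an equal distribution, $\ell = \sqrt{2}$; in the case of absolute concentration on a single feature carrier, $\ell$ approaches 2, the length of the line of perfect inequality.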
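The segment sum is straightforward to compute from the vertices of the polygonal curve. The sketch below evaluates it for the two extreme cases; the function name `lorenz_length` and the choice of 1000 carriers are assumptions made for illustration.

```python
import math


def lorenz_length(points):
    """Sum of the segment lengths of a polygonal Lorenz curve."""
    return sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Equal distribution: the curve is the diagonal.
length_equal = lorenz_length([(0.0, 0.0), (1.0, 1.0)])

# One of n = 1000 carriers holds the entire feature sum (assumed example).
n = 1000
length_concentrated = lorenz_length(
    [(0.0, 0.0), (1.0 - 1.0 / n, 0.0), (1.0, 1.0)]
)
```

For a finite number of carriers the concentrated case stays slightly below 2 and approaches it as the number of carriers grows.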

Continuous case

The length $\ell$ of a continuous, differentiable Lorenz curve between the points $(0, 0)$ and $(1, 1)$ is calculated from the first derivative of the Lorenz curve function as follows:

$$\ell = \int_{0}^{1} \sqrt{1 + \bigl( L'(F) \bigr)^2}\, dF$$

with $L'(F) = \frac{dL(F)}{dF}$.
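The arc-length integral can be approximated numerically. The sketch below assumes the example curve $L(F) = F^2$, which is the Lorenz curve of a uniform distribution on an interval starting at zero, and compares a midpoint-rule approximation with the closed-form arc length of that parabola. The function name `curve_length` is hypothetical.

```python
import math


def curve_length(derivative, steps=10000):
    """Midpoint-rule approximation of the arc length on [0, 1]."""
    h = 1.0 / steps
    return h * sum(
        math.sqrt(1.0 + derivative((k + 0.5) * h) ** 2) for k in range(steps)
    )


# Assumed example curve L(F) = F**2, whose derivative is L'(F) = 2F.
length = curve_length(lambda F: 2.0 * F)

# Closed form for this parabola, used only as a cross-check.
exact = math.sqrt(5.0) / 2.0 + math.asinh(2.0) / 4.0
```

The result lies between the lengths of the diagonal and of the line of perfect inequality, as expected for any Lorenz curve.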

Applications

Economics

In economics, the Lorenz curve is used as a graphical representation of the cumulative distribution function of the empirical probability distribution of assets; it is a graph showing the share of the total that is accounted for by the bottom x% of the values. It is often used to represent an income distribution, showing for the bottom x% of households what percentage y% of total income they hold. The percentage of households is plotted on the abscissa, the percentage of income on the ordinate. It can also be used to represent the distribution of assets. In this sense, many economists regard the Lorenz curve as a measure of social inequality. It was developed by Max O. Lorenz in 1905 to represent the inequality of income distribution.

In addition to illustrating the distribution of income, the Lorenz curve is also used to represent market power or spatial distributions (see: segregation).

Another use of the Lorenz curve is in logistics, in ABC analysis, where the Lorenz curve illustrates the distribution of goods ordered by a classification property (for example, value) and by quantity consumed.

The Lorenz curve can also be used in business modelling, for example in consumer finance, to measure the actual share of payment delinquencies attributable to the consumers with the worst predicted risk or credit scores.

Ecology

The concept of the Lorenz curve is useful in ecology for describing the disparity between the numbers of individuals. It is used in biodiversity studies by plotting the cumulative proportions of individuals against the cumulative proportions of species.

Concentration and disparity

Disparity (Lorenz curve) and absolute concentration (concentration curve) are related measures, but they describe different things. While the Lorenz curve shows which share of the feature sum (ordinate) is attributable to which share of the group of feature carriers (abscissa), the concentration curve shows which share of the feature sum (ordinate) is attributable to which number of feature carriers (abscissa). In other words, the Lorenz curve compares shares with shares, whereas the concentration curve compares shares with absolute numbers (abscissa). Thus high disparity can occur together with low concentration, and high concentration together with low disparity. The following example illustrates the issue:

Suppose a number of firms share a market. The table works through the cases of high and low disparity and concentration with (fictitious) absolute numbers:
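The contrast between the two notions can be sketched with made-up figures (these are not the numbers from the table). The Gini coefficient serves here as a measure of disparity, and the combined share of the two largest firms as a simple measure of absolute concentration; both choices, the function names, and the turnover figures are assumptions made for illustration.

```python
def gini(values):
    """Gini coefficient from the polygonal Lorenz curve (relative measure)."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    area_under = 0.0
    previous = 0.0
    running = 0.0
    for value in ordered:
        running += value
        current = running / total
        area_under += (previous + current) / (2.0 * n)   # trapezoid of width 1/n
        previous = current
    return 1.0 - 2.0 * area_under


def top_share(values, m):
    """Share of the feature sum held by the m largest carriers (absolute measure)."""
    largest = sorted(values, reverse=True)[:m]
    return sum(largest) / sum(values)


# Fictitious turnover figures (assumed data).
duopoly = [50, 50]                          # two equally large firms
fragmented = [81] * 100 + [1] * 900         # 1000 firms, 100 of them large

duopoly_gini, duopoly_top2 = gini(duopoly), top_share(duopoly, 2)
fragmented_gini, fragmented_top2 = gini(fragmented), top_share(fragmented, 2)
```

The duopoly shows no disparity but full concentration (two firms hold the whole market), while the fragmented market shows high disparity (a Gini coefficient of about 0.8) although its two largest firms together hold less than 2% of the market.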
