Regression discontinuity design

Regression discontinuity analysis (also regression discontinuity design, RDD) is a method of statistical inference and econometrics that is used to identify the causal effect of a change in one variable on other variables. The basic idea is to exploit a discontinuity or irregularity in an observed control variable that results in an almost random allocation to the treatment or control group. Like the instrumental variables approach and difference-in-differences, regression discontinuity analysis belongs to the methods that exploit so-called "natural" or "quasi-experiments".

Idea

In many situations in which causal effects are to be investigated and quantified, the explanatory variable is correlated with the error term. This leads to endogeneity and thus to inconsistency of the least squares method: even in large samples, the least squares estimator remains biased. Regression discontinuity analysis can be used to overcome this problem.

The basic idea of regression discontinuity analysis is to find a discontinuity in an observable control variable that influences whether an individual receives the treatment or not. This is best illustrated with an example. In a study published in 1999, the economists Joshua Angrist and Victor Lavy examined the effect of class size on student achievement. They used "Maimonides' rule", which is still used in Israel today to regulate class size in public schools. According to this rule, a class may have at most 40 students. If there are more, a second class must be formed. This creates a sharp discontinuity between the number of students in a year group at a school and class size: if the year group has 40 students, there is one class of 40 students; if it has 41 students, there are two classes of 20 and 21 students. Whether a year group has 40 or 41 students is not completely under the control of the individuals involved, but is at least partly due to chance. For this reason it can be regarded as exogenous variation that allows a consistent estimate of the effect of class size on student performance.
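The jump in class size that the rule produces can be made concrete with a small sketch. It assumes a strict cap of 40 students per class and evenly divided classes; the function name `maimonides_class_size` and its parameters are illustrative and not taken from the study.

```python
def maimonides_class_size(enrollment: int, max_size: int = 40) -> float:
    """Average class size implied by a strict maximum-class-size rule.

    Assumption: a year group is split into the smallest number of
    classes such that no class exceeds `max_size`, and students are
    divided evenly across those classes.
    """
    if enrollment <= 0:
        raise ValueError("enrollment must be positive")
    # Smallest number of classes that keeps every class at or below the cap.
    n_classes = (enrollment - 1) // max_size + 1
    return enrollment / n_classes


# Class size drops sharply each time enrollment crosses a multiple of 40.
for enrollment in (39, 40, 41, 80, 81):
    print(enrollment, maimonides_class_size(enrollment))
```

Under these assumptions, year groups of 40 and 41 students are almost identical in size but end up with very different class sizes, which is the variation the design exploits.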

In RD analysis, a distinction must be made between the classic "sharp" RD analysis (sharp regression discontinuity design) and the "fuzzy" RD analysis (fuzzy regression discontinuity design). In the sharp RD analysis, the treatment is a deterministic function of the underlying control variable, i.e. the control variable determines the treatment perfectly (as in the example above). In the fuzzy RD analysis, the control variable does not determine the treatment perfectly, but affects its probability or its expected value.

Mathematical Background

Sharp RD analysis

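The underlying model is

$$Y = \alpha + \tau D + \beta X + \varepsilon$$

where D is an indicator variable that shows whether a person is "treated" or not. In the example above, D would be "is in a small class" and X would be "number of students in the year group at the school". X = c is the point at which the discontinuity lies, so c = 40, the maximum class size, in the example above. Then

$$D = \begin{cases} 1 & \text{if } X > c \\ 0 & \text{if } X \le c \end{cases}$$

Assuming that $E[\varepsilon \mid X = x]$ is continuous in x at c, the right-sided limit of the expected outcome is given by the first expression below, and the left-sided limit by the same expression without $\tau$:

$$\lim_{x \downarrow c} E[Y \mid X = x] = \alpha + \tau + \beta c + E[\varepsilon \mid X = c]$$

$$\lim_{x \uparrow c} E[Y \mid X = x] = \alpha + \beta c + E[\varepsilon \mid X = c]$$

(where $x \uparrow c$ denotes the limit from the left of the discontinuity and $x \downarrow c$ the limit from the right). Then

$$\tau = \lim_{x \downarrow c} E[Y \mid X = x] - \lim_{x \uparrow c} E[Y \mid X = x]$$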

The expected values on either side of the discontinuity can be estimated, for example, by rescaling the data so that c is the zero point and then running two separate least squares regressions, one to the left and one to the right of the discontinuity. The difference between the expected values can then be calculated as the difference between the intercepts of the two least squares estimates. Alternatively, the effect can be estimated with a single least squares regression that includes appropriate interaction terms.
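Both estimation routes can be sketched on simulated data. The following is a minimal illustration, not a reference implementation: the data-generating process, the bandwidth of 20, the treatment convention (treated when X lies above the cutoff) and the function names `rd_two_fits` and `rd_single_fit` are all assumptions made for the example.

```python
import numpy as np


def rd_two_fits(x, y, cutoff, bandwidth):
    """Sharp RD estimate as the difference of two intercepts."""
    xc = np.asarray(x, dtype=float) - cutoff  # rescale: cutoff becomes zero
    y = np.asarray(y, dtype=float)
    left = (xc <= 0) & (xc >= -bandwidth)     # untreated side
    right = (xc > 0) & (xc <= bandwidth)      # treated side
    # np.polyfit returns [slope, intercept] for a degree-1 fit.
    intercept_left = np.polyfit(xc[left], y[left], 1)[1]
    intercept_right = np.polyfit(xc[right], y[right], 1)[1]
    return intercept_right - intercept_left


def rd_single_fit(x, y, cutoff, bandwidth):
    """Same estimate from one regression with an interaction term."""
    xc = np.asarray(x, dtype=float) - cutoff
    y = np.asarray(y, dtype=float)
    keep = np.abs(xc) <= bandwidth
    xc, y = xc[keep], y[keep]
    d = (xc > 0).astype(float)
    # Columns: constant, treatment dummy, running variable, interaction.
    design = np.column_stack([np.ones_like(xc), d, xc, d * xc])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]  # coefficient on the treatment dummy


# Simulated data with a known jump of 2.0 at the cutoff (assumed values).
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(0, 80, n)
d = (x > 40).astype(float)
true_tau = 2.0
y = 1.0 + true_tau * d + 0.05 * x + rng.normal(0, 1, n)

tau_two = rd_two_fits(x, y, cutoff=40, bandwidth=20)
tau_single = rd_single_fit(x, y, cutoff=40, bandwidth=20)
print(tau_two, tau_single)
```

Because the single regression lets both the intercept and the slope differ across the cutoff, its coefficient on the treatment dummy coincides with the difference of the two separately estimated intercepts, up to numerical error.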

If the effect of the treatment differs between individuals, it can be shown that the sharp RD analysis identifies the average treatment effect for individuals at the discontinuity.

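Fuzzy RD analysis

The underlying model is again

$$Y = \alpha + \tau D + \beta X + \varepsilon$$

However, the treatment is now only partly determined by the discontinuity:

$$D = \gamma + \delta Z + \nu, \qquad Z = \begin{cases} 1 & \text{if } X > c \\ 0 & \text{if } X \le c \end{cases}$$

where $\delta \neq 0$ and Z has no direct effect on Y. Assuming that $E[\varepsilon \mid X = x]$ and $E[\nu \mid X = x]$ are continuous in x at c, one can then calculate

$$\lim_{x \downarrow c} E[Y \mid X = x] - \lim_{x \uparrow c} E[Y \mid X = x] = \tau \left( \lim_{x \downarrow c} E[D \mid X = x] - \lim_{x \uparrow c} E[D \mid X = x] \right) = \tau \delta$$

and consequently

$$\tau = \frac{\lim_{x \downarrow c} E[Y \mid X = x] - \lim_{x \uparrow c} E[Y \mid X = x]}{\lim_{x \downarrow c} E[D \mid X = x] - \lim_{x \uparrow c} E[D \mid X = x]}$$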

The fuzzy RD analysis can be estimated as an instrumental variable estimation, with Z as an instrument for D. First, D is regressed on Z. The fitted values obtained in this way are then used in a second regression as the explanatory variable for Y (see also the mathematical background of instrumental variables).
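The two-step procedure can be sketched as follows. This is an illustration under assumed conditions: the simulated data, the unobserved confounder `u`, the treatment convention (Z equals one above the cutoff) and the helper names `ols` and `fuzzy_rd_2sls` are hypothetical. The running variable is included as a control in both stages.

```python
import numpy as np


def ols(design, target):
    """Least squares coefficients for target regressed on design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def fuzzy_rd_2sls(x, d, y, cutoff):
    """Fuzzy RD point estimate via manual two-stage least squares.

    Only the point estimate is meaningful here; standard errors from
    the second-stage regression would not be valid as they stand.
    """
    xc = np.asarray(x, dtype=float) - cutoff
    z = (xc > 0).astype(float)          # instrument: above the cutoff
    ones = np.ones_like(xc)
    first = np.column_stack([ones, z, xc])
    d_hat = first @ ols(first, d)       # first stage: fitted treatment
    second = np.column_stack([ones, d_hat, xc])
    return ols(second, y)[1]            # coefficient on fitted treatment


# Simulated data (assumed): u affects both treatment take-up and outcome.
rng = np.random.default_rng(1)
n = 20_000
x = rng.uniform(-1, 1, n)
u = rng.normal(0, 1, n)
z = (x > 0).astype(float)
d = (1.5 * z + u > 0.75).astype(float)  # crossing the cutoff raises take-up
true_tau = 2.0
y = 1.0 + true_tau * d + 0.5 * x + u + rng.normal(0, 0.5, n)

tau_iv = fuzzy_rd_2sls(x, d, y, cutoff=0.0)
tau_naive = ols(np.column_stack([np.ones(n), d, x]), y)[1]
print(tau_iv, tau_naive)
```

In this setup the naive least squares estimate is pushed upward because `u` raises both the probability of treatment and the outcome, while the instrumental variable estimate uses only the variation in D that is induced by crossing the cutoff.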

Benefits

The regression discontinuity approach has numerous advantages. If the observed individuals cannot influence the allocation variable (X in the example above), the allocation to treatment and control groups is random, which allows a procedure analogous to a genuinely randomized experiment without one having been carried out. In fact, it is sufficient that the individuals do not have perfect control over the allocation variable: even if the subjects can determine X to some extent, the eventual allocation around the point of discontinuity is random. This is a particular advantage of RDD over other quasi-experimental research approaches, in which the quasi-random allocation often has to be assumed and defended by verbal arguments.

RDD is also an important part of a broader quasi-experimental research agenda, which is known in applied economics as the "credibility revolution". Proponents of this agenda emphasize that the increased use of experimental and quasi-experimental research approaches has led to more credible research results.

Disadvantages

One potential problem with RDD estimators is the risk of misspecifying the underlying functional form. If the "true" underlying model does not follow a linear relationship, for example, an estimate obtained as described above would generally be biased. Possible remedies are to include higher-order polynomial terms (for example, quadratic and cubic terms in X) or to resort to nonparametric estimation.
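The effect of misspecification can be seen in a small noise-free sketch. The cubic relationship, the jump of 1.0 and the function name `rd_jump` are assumptions chosen for illustration; with real, noisy data a higher polynomial order also increases the variance of the estimate.

```python
import numpy as np


def rd_jump(x, y, degree):
    """Difference of the intercepts of two polynomial fits around zero."""
    left, right = x < 0, x > 0
    # np.polyfit lists coefficients from highest degree down; the last
    # entry is the intercept, i.e. the fitted value at the cutoff.
    intercept_left = np.polyfit(x[left], y[left], degree)[-1]
    intercept_right = np.polyfit(x[right], y[right], degree)[-1]
    return intercept_right - intercept_left


# Noise-free data: a smooth cubic trend plus a true jump of 1.0 at zero.
x = np.linspace(-1, 1, 400)  # even number of points, so zero is excluded
true_tau = 1.0
y = 4 * x**3 + true_tau * (x > 0)

jump_linear = rd_jump(x, y, degree=1)  # misspecified functional form
jump_cubic = rd_jump(x, y, degree=3)   # matches the true functional form
print(jump_linear, jump_cubic)
```

Here the linear specification attributes part of the curvature to the discontinuity and misses the true jump by a wide margin, whereas the cubic specification recovers it.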

As part of the "quasi-experimental" research agenda, RDD is also subject to the criticism directed at that agenda. Christopher Sims regards RDD and related research approaches as "useful, but [...] no panacea", while Angus Deaton fears that the attention of researchers could shift away from the relevance of a study and toward its feasibility.

History

Regression discontinuity analysis was first used in 1960 by the psychologists Donald L. Thistlethwaite and Donald T. Campbell. In economics and econometrics, however, it did not find wider application until much later, in the late 1990s and early 2000s. The first major studies included the previously mentioned article by Angrist and Lavy and an article by Wilbert van der Klaauw from 2002. Since then, RD analysis has become a widely used tool in empirical economics.
