Wilcoxon signed-rank test

The Wilcoxon signed-rank test is a nonparametric statistical test. Using two paired samples, it tests whether the central tendencies of the underlying (paired) populations are equal. In practice it complements the sign test, because it takes into account not only the direction (i.e. the sign) of the differences between the two paired samples but also their magnitude. The test is an alternative to Student's t-test when a normal distribution cannot be assumed for the underlying population.

The Wilcoxon signed-rank test was proposed by the chemist and statistician Frank Wilcoxon (1892-1965) and became popular through Sidney Siegel's textbook Nonparametric Statistics for the Behavioural Sciences.

Hypotheses and conditions

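For the test concerning the two medians there are three possible pairs of hypotheses:

  • Two-sided: H0: the two medians are equal, against H1: the two medians differ.
  • One-sided: H0: the first median is at least as large as the second, against H1: it is smaller.
  • One-sided: H0: the first median is at most as large as the second, against H1: it is larger.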

A prerequisite is that the sampling variables $D_i = X_i - Y_i$ (the differences within each pair) are independent, identically distributed and symmetrically distributed. The last condition, however, is often neglected.

Test statistic

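First, the rank of each absolute difference is calculated for the test statistic:

$R_i = \operatorname{rank}(|D_i|)$, where $D_i$ is the difference within the i-th pair.

The test statistic $W$ is calculated as the minimum of the negative and positive rank sums:

$W = \min(W^{+}, W^{-})$, where $W^{+} = \sum_{D_i > 0} R_i$ is the sum of the ranks of the positive differences and $W^{-} = \sum_{D_i < 0} R_i$ the sum of the ranks of the negative differences.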
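The calculation of the rank sums can be sketched in a few lines of Python. This is only a minimal illustration: the function names `average_ranks` and `signed_rank_sums` and the sample differences are hypothetical, tied absolute differences share their mean rank, and zero differences are simply rejected rather than handled.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share their mean rank."""
    positions = {}
    for rank, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(rank)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def signed_rank_sums(diffs):
    """Return (W+, W-, W) for a list of nonzero paired differences."""
    if any(d == 0 for d in diffs):
        # Assumption of this sketch: zero differences are not handled here.
        raise ValueError("zero differences are not handled in this sketch")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus, min(w_plus, w_minus)


# Hypothetical differences, chosen only for illustration.
print(signed_rank_sums([3, -1, 2, -4, 6]))  # (10.0, 5.0, 5.0)
```

Here the absolute differences 3, 1, 2, 4, 6 receive the ranks 3, 1, 2, 4, 5, so the positive differences contribute 3 + 2 + 5 = 10 and the negative ones 1 + 4 = 5.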

If one or more differences are equal to zero, there are two possible ways of handling them.

For sufficiently large n, the test statistic is approximately normally distributed:

$W \approx N\!\left(\frac{n(n+1)}{4},\ \frac{n(n+1)(2n+1)}{24}\right)$

In addition, a continuity correction should be applied.

For values of n less than or equal to 50, the critical values are also tabulated.
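The normal approximation can be sketched as follows, assuming the usual mean n(n+1)/4 and variance n(n+1)(2n+1)/24 of the rank sum under the null hypothesis. The function name `normal_approx_z` is hypothetical, and the continuity correction shown (moving the deviation 0.5 toward zero) is one common form, not necessarily the one the original formula used.

```python
import math


def normal_approx_z(w, n, continuity=False):
    """Approximate z-value for a signed-rank sum w from n nonzero differences."""
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    deviation = w - mean
    if continuity:
        # Assumed form of the correction: shrink |deviation| by 0.5, never past zero.
        deviation = math.copysign(max(abs(deviation) - 0.5, 0.0), deviation)
    return deviation / sd


# Hypothetical values: n = 30 pairs with rank sum W = 120.
print(normal_approx_z(120, 30))
print(normal_approx_z(120, 30, continuity=True))
```

The resulting value is compared with the quantiles of the standard normal distribution; the continuity correction moves it slightly toward zero.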

Ties in the ranks

If ties occur in the ranks (i.e. several absolute differences would receive the same rank), each of the tied differences is assigned the mean of the corresponding ranks (see the example below).

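Let $t_i$ denote the number of observations with the same rank as observation pair i. The variance of the test statistic is then

$\operatorname{Var}(W) = \frac{n(n+1)(2n+1)}{24} - \frac{1}{48}\sum_{i=1}^{n}\left(t_i^2 - 1\right)$

and this corrected variance is used in the normal approximation.

If the correction factor is omitted, the test is too conservative, i.e. it decides too often in favour of the null hypothesis.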
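A small sketch of the tie correction, assuming the standard form in which each observation contributes (t_i^2 - 1)/48, where t_i is the size of its tie group. The function name `tie_corrected_variance` and the sample values are hypothetical.

```python
from collections import Counter


def tie_corrected_variance(abs_diffs):
    """Variance of the signed-rank sum under H0, reduced for tied absolute differences."""
    n = len(abs_diffs)
    group_size = Counter(abs_diffs)
    # t_i for an observation is the number of observations sharing its absolute difference.
    correction = sum(group_size[a] ** 2 - 1 for a in abs_diffs) / 48
    return n * (n + 1) * (2 * n + 1) / 24 - correction


# Hypothetical absolute differences: one tie group of size 2.
print(tie_corrected_variance([2.0, 2.0, 3.5, 7.0]))  # 7.375
print(tie_corrected_variance([1.0, 2.0, 3.0, 4.0]))  # 7.5 (no ties, no correction)
```

Because the correction only ever reduces the variance, leaving it out yields a z-value that is too close to zero, which is why the uncorrected test is conservative.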

Example

An example of its use: A statistically adept farmer wants to determine whether cattle prefer hay or straw. He divides an area into two parts, between which the animals can move freely back and forth. In one part he offers his five cattle straw, in the other hay. Every half hour he notes how many animals are in each part; he obtains n = 6 pairs of observations.

He records his observations in a table and also calculates the differences between the paired values.

Rank: The three differences of absolute value 1 would receive the ranks 1 to 3, but since they are equal, the average of their ranks is entered, i.e. (1 + 2 + 3) / 3 = 2. Likewise for the two differences of absolute value 5: (5 + 6) / 2 = 5.5.

The differences are then ordered by magnitude (the sign is ignored), and each difference is assigned a rank; the largest difference receives the highest rank. If several differences are equal, each of them is assigned the average rank.

The rank sum of the positive differences is denoted $W^{+}$ and the rank sum of the negative differences $W^{-}$; the test statistic is the smaller of the two.

Two-sided test

In the two-sided test, the null hypothesis cannot be rejected at either of the significance levels considered. In each case, the critical value for n = 6 is read from the table of critical values, and since the test value is not less than that critical value, the null hypothesis cannot be rejected.

One-sided tests

Even in the one-sided test, the null hypothesis cannot be rejected at either of the significance levels considered. In each case, the critical value for n = 6 is read from the table of critical values, and since the test value is not less than that critical value, the null hypothesis cannot be rejected.
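The exact test decisions can be cross-checked by enumerating all sign patterns, since under the null hypothesis each of the 2^n assignments of signs to the ranks is equally likely. The sketch below rests on assumed data: the magnitudes 1, 1, 1 and 5, 5 follow the description of the ties, but the sixth magnitude (3) and the position of the single negative sign are assumptions, because the original table of observations is not available here.

```python
from itertools import product

# Assumed differences (see the note above): five positive, one negative.
diffs = [1, 1, -1, 3, 5, 5]


def average_ranks(values):
    """Rank values from smallest to largest; tied values share their mean rank."""
    positions = {}
    for rank, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(rank)
    return [sum(positions[v]) / len(positions[v]) for v in values]


ranks = average_ranks([abs(d) for d in diffs])
w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
w = min(w_plus, w_minus)
rank_total = sum(ranks)

patterns = 0
one_sided_hits = 0  # patterns whose negative rank sum is at most the observed W-
two_sided_hits = 0  # patterns whose smaller rank sum is at most the observed W
for signs in product((0, 1), repeat=len(ranks)):
    neg = sum(r for s, r in zip(signs, ranks) if s)
    patterns += 1
    if neg <= w_minus:
        one_sided_hits += 1
    if min(neg, rank_total - neg) <= w:
        two_sided_hits += 1

p_one_sided = one_sided_hits / patterns
p_two_sided = two_sided_hits / patterns
print(w_plus, w_minus, w)        # 19.0 2.0 2.0
print(p_one_sided, p_two_sided)  # 0.0625 0.125
```

Under these assumptions the exact two-sided p-value is above 0.10, which is in line with the null hypothesis not being rejected by the exact test.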

Approximation to the normal distribution with the two-sided test

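As an approximation, one calculates the z-value, which is treated as standard normally distributed:

$z = \frac{W - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$

From the table of the standard normal distribution, one obtains for the two-sided test:

  • At the stricter significance level, the test value lies within the interval bounded by the critical values, so the null hypothesis cannot be rejected.
  • At the 10% significance level, the test value lies outside the interval bounded by the critical values (approximately ±1.645), so the null hypothesis can be rejected.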

Thus, at the 10% significance level, the cattle show a preference for one of the two kinds of feed.

This appears to contradict the result of the exact two-sided test. However, the z-value calculated with the given formula is only an approximation and is reliable only for sufficiently large sample sizes.

For the approximation in the two-sided test, it does not matter whether $W^{+}$, $W^{-}$ or the minimum of the two is inserted into the formula: the resulting z-values differ only in sign, so the test decision is the same.
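The reasoning behind this can be written out in a short derivation. It assumes that there are no zero differences, so that the two rank sums together contain all ranks from 1 to n, and it uses the uncorrected z-statistic with mean n(n+1)/4 and standard deviation $\sigma_W$.

```latex
% Assumption: no zero differences, so every rank 1..n is counted exactly once.
W^{+} + W^{-} = \sum_{i=1}^{n} i = \frac{n(n+1)}{2}
\quad\Longrightarrow\quad
W^{-} - \frac{n(n+1)}{4} = -\left(W^{+} - \frac{n(n+1)}{4}\right)
\quad\Longrightarrow\quad
z(W^{-}) = \frac{W^{-} - \frac{n(n+1)}{4}}{\sigma_W} = -\,z(W^{+})
```

Since the two-sided test compares only the absolute value of z with the critical value, both choices lead to the same decision. Averaged ranks for ties do not change the total of the ranks, so the identity also holds when ties are present.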

Comparison with the sign test

Five differences carry a positive sign (+) and one a negative sign (-). According to the table of critical values (MacKinnon, 1964), in this example one can only conclude p < 0.5 (i.e. a probability of error of less than 50 percent). Even if all six differences had the same sign, p would lie between 0.02 and 0.1. This clearly shows that Wilcoxon's method delivers useful results, especially for smaller sample sizes.
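The ranges quoted from the table can be compared with an exact binomial calculation. This is a sketch, not the tabulated procedure of MacKinnon: it assumes a two-sided sign test with success probability 1/2 under the null hypothesis, and the function name `sign_test_p_two_sided` is hypothetical.

```python
from math import comb


def sign_test_p_two_sided(n_pos, n_neg):
    """Exact two-sided sign-test p-value for the given counts of + and - signs."""
    n = n_pos + n_neg
    k = min(n_pos, n_neg)
    # Probability of at most k signs of the rarer kind under Binomial(n, 1/2).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


print(sign_test_p_two_sided(5, 1))  # 0.21875
print(sign_test_p_two_sided(6, 0))  # 0.03125
```

Both values fall inside the ranges given in the text: about 0.22 for five positive signs and one negative sign, and about 0.03 if all six signs agree.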
