Sign test

The sign test is a nonparametric statistical test and a special case of the binomial test. It can be used to test hypotheses about distributions in one-sample and two-sample problems. It remains applicable when the data are measured only on an ordinal scale.

One-sample problem

Test for median

The sign test can be used to test hypotheses about the median of a distribution.

Test for symmetry

The sign test can also serve as a test for the symmetry of a distribution. If the true arithmetic mean of the population is known, or an estimate is taken to be the true value, one can check whether the mean coincides with the median, i.e. whether 50% of the possible values lie to the left and 50% to the right of the mean. In a symmetric distribution the two coincide, so a significant deviation indicates that the distribution is not symmetric.

Test for mean

Conversely, if the distribution is symmetric, the population mean equals the population median, so the sign test can also be used to test hypotheses about the arithmetic mean of the population.

Assumptions

  • The observations are independent.
  • The underlying random variable is distributed continuously in the population.
  • Because each observation is compared in size with the hypothesized median, the variable under study must be measured on at least an ordinal scale.

Hypotheses

In the two-sided test, the null hypothesis is that the population median equals a hypothesized value. If that value really is the median, the probability that an observation exceeds it is 0.5. In a one-sided test, the question is whether the median is greater (or smaller) than the hypothesized value, i.e. whether the probability that an observation exceeds the hypothesized value is greater (or smaller) than 0.5.

Other, equivalent formulations of the hypotheses are possible. The test principle extends to arbitrary quantiles by adapting the hypotheses and the parameter of the test statistic's distribution: when testing a quantile other than the median, the hypothesized probability (1/2 for the median) is adjusted accordingly (see binomial test).
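The quantile extension can be illustrated with a minimal sketch. It assumes that the hypothesized value is tested as the τ-quantile, so that under the null hypothesis each observation exceeds it with probability 1 − τ. The function name `quantile_sign_test`, its parameters, and the sample data are hypothetical and serve only as an illustration.

```python
from math import comb

def quantile_sign_test(values, q0, tau):
    """Upper-tail sign test for H0: the tau-quantile is at most q0.

    Returns (plus, n, p_value): plus counts the values above q0,
    n is the number of values not equal to q0, and
    p_value = P(V >= plus) for V ~ Binomial(n, 1 - tau).
    """
    plus = sum(1 for v in values if v > q0)
    minus = sum(1 for v in values if v < q0)
    n = plus + minus              # values equal to q0 are dropped (assumption)
    p = 1.0 - tau                 # P(X > q0) if q0 really is the tau-quantile
    p_value = sum(comb(n, j) * p**j * (1 - p)**(n - j)
                  for j in range(plus, n + 1))
    return plus, n, p_value

# Hypothetical data, invented for illustration.
data = [3, 9, 12, 14, 4, 15, 11, 2, 13, 10]
plus, n, p_value = quantile_sign_test(data, q0=9.5, tau=0.75)
print(plus, n, round(p_value, 4))
```

With `tau=0.5` the sketch reduces to the ordinary one-sided sign test for the median.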

Procedure

Sample values that are greater than the hypothesized median are assigned a "+"; values that are smaller are assigned a "-". In other words, the sample is dichotomized at the hypothesized median. The number of positive signs is counted and used as the test statistic.
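The counting step can be sketched as follows. The helper name `sign_counts` and the sample values are hypothetical; observations equal to the hypothesized median are reported separately as ties.

```python
def sign_counts(values, hypothesized_median):
    """Count observations above ("+"), below ("-"), and equal to the median."""
    plus = sum(1 for v in values if v > hypothesized_median)
    minus = sum(1 for v in values if v < hypothesized_median)
    ties = len(values) - plus - minus
    return plus, minus, ties

# Hypothetical sample, invented for illustration.
sample = [4.1, 5.3, 6.0, 7.2, 5.0, 8.4, 3.9, 6.6]
plus, minus, ties = sign_counts(sample, hypothesized_median=5.0)
print(plus, minus, ties)
```

The first returned value, the number of "+" signs, is the test statistic.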

Two-sample problem

The sign test is also used to compare two related (paired) samples. Samples are paired when the observations in the two groups depend on each other pair by pair, for example when the health of the same person is examined before and after a treatment. The signs ("+" or "-") result from comparing the two values within each pair.

In this setting the sign test checks whether two random variables from related, continuously distributed populations have the same distribution function. If the medians of the samples differ significantly, the distributions in the populations are taken to differ.

Assumptions

  • The pairs of observations must be independent of one another, i.e. the pair $(x_{i1}, x_{i2})$ must be independent of the pair $(x_{j1}, x_{j2})$ for $i \neq j$.
  • The underlying random variables are distributed continuously in the population.
  • Because the observations are compared in size, the variable under study must be measured on at least an ordinal scale.

Hypotheses

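If both populations have the same median, then $P(X_{i1} > X_{i2}) = P(X_{i1} < X_{i2})$. The following hypotheses can be tested with the sign test:

  • Two-sided: $H_0: P(X_{i1} > X_{i2}) = P(X_{i1} < X_{i2})$ against $H_1: P(X_{i1} > X_{i2}) \neq P(X_{i1} < X_{i2})$.
  • Right-sided: $H_0: P(X_{i1} > X_{i2}) \le P(X_{i1} < X_{i2})$ against $H_1: P(X_{i1} > X_{i2}) > P(X_{i1} < X_{i2})$.
  • Left-sided: $H_0: P(X_{i1} > X_{i2}) \ge P(X_{i1} < X_{i2})$ against $H_1: P(X_{i1} > X_{i2}) < P(X_{i1} < X_{i2})$.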

Procedure

Pairs in which the first value is greater than the second, $x_{i1} > x_{i2}$, are assigned a "+"; pairs for which $x_{i1} < x_{i2}$ holds are assigned a "-". The number of positive signs is counted and used as the test statistic.
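A minimal sketch of the paired comparison follows. It assumes the convention that a pair receives "+" when its first value exceeds its second; the function name `paired_sign_counts` and the before/after scores are hypothetical.

```python
def paired_sign_counts(first, second):
    """Count pairs with first > second ("+"), first < second ("-"), and ties."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    plus = sum(1 for a, b in zip(first, second) if a > b)
    minus = sum(1 for a, b in zip(first, second) if a < b)
    ties = len(first) - plus - minus
    return plus, minus, ties

# Hypothetical scores of the same six subjects, invented for illustration.
before = [12, 15, 9, 14, 11, 10]
after = [14, 15, 11, 13, 15, 12]

# Passing `after` first makes "+" mean "score went up".
plus, minus, ties = paired_sign_counts(after, before)
print(plus, minus, ties)
```

Swapping the two arguments swaps the roles of "+" and "-", so the direction of a one-sided hypothesis has to match the argument order.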

Test statistic

Exact distribution

The test statistic is the number of positive comparisons (differences of the values or ranks):

$$V = \sum_{i=1}^{n'} v_i$$

with

$$v_i = \begin{cases} 1 & \text{if } x_{i1} - x_{i2} > 0 \\ 0 & \text{otherwise.} \end{cases}$$

For the one-sample problem, the values of the second sample are replaced by the hypothesized median. Under the null hypothesis, the number of positive differences follows a binomial distribution, $V \sim B(n', 1/2)$, because the median is the 50% quantile. Here $n'$ denotes the sample size that remains after the treatment of ties (zero differences). Under the null hypothesis the distribution of the test statistic is symmetric.
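The exact p-value can be computed directly from the binomial distribution with success probability 1/2. The following is a sketch using only the standard library; the function names and the `alternative` labels are assumptions made for this illustration, and ties are assumed to have been handled beforehand.

```python
from math import comb

def binom_upper_tail(k, n, p=0.5):
    """Return P(V >= k) for V ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def sign_test_p_value(plus, minus, alternative="two-sided"):
    """Exact sign-test p-value from the counts of "+" and "-" signs."""
    n = plus + minus  # effective sample size after removing ties
    if alternative == "greater":
        return binom_upper_tail(plus, n)
    if alternative == "less":
        # By symmetry of Binomial(n, 1/2): P(V <= plus) = P(V >= minus).
        return binom_upper_tail(minus, n)
    if alternative == "two-sided":
        # Doubling the smaller tail is valid because the null distribution is symmetric.
        return min(1.0, 2.0 * binom_upper_tail(max(plus, minus), n))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

# Hypothetical counts: 8 plus signs and 2 minus signs.
print(sign_test_p_value(8, 2, "greater"))
print(sign_test_p_value(8, 2, "two-sided"))
```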

Approximation by the normal distribution

As $n'$ grows, the binomial distribution $B(n', p)$ approaches a normal distribution with mean $n'p$ and variance $n'p(1-p)$. A common rule of thumb for a usable approximation is $n'p(1-p) \ge 9$. Under the null hypothesis, $p = 1/2$.

So if $n'/4 \ge 9$, i.e. $n' \ge 36$, the standardized statistic

$$Z = \frac{V - n'/2}{\sqrt{n'/4}},$$

where $V$ is the number of positive signs, is approximately standard normal, and the critical values for the test decision can be read from a table of the standard normal distribution.
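A sketch of the normal approximation, using `statistics.NormalDist` from the Python standard library in place of a printed table. No continuity correction is applied, which is an assumption of this sketch; the function name and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def sign_test_normal_approx(plus, n_eff):
    """Return (z, upper-tail p-value) for the number of "+" signs.

    Under H0 the count has mean n_eff / 2 and standard deviation sqrt(n_eff) / 2.
    """
    mean = n_eff / 2
    sd = sqrt(n_eff) / 2
    z = (plus - mean) / sd
    p_upper = 1.0 - NormalDist().cdf(z)
    return z, p_upper

# Hypothetical counts: 60 plus signs among 100 non-tied observations.
z, p_upper = sign_test_normal_approx(plus=60, n_eff=100)
critical = NormalDist().inv_cdf(0.95)  # one-sided critical value for alpha = 0.05
print(round(z, 2), round(p_upper, 4), round(critical, 4))
```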

Ties (zero differences)

Because continuous random variables are usually recorded only on a discrete scale, ties can occur. In the two-sample problem, a tie arises when a pair's value is unchanged from the first sample to the second; in the one-sample problem, when an observation equals the hypothesized median. In both cases the difference is zero. A binomial test, however, can handle only two categories (here "+" and "-"). This raises the question of how ties should be treated. Possible methods are:

  • Observations with ties are discarded, i.e. the sample size is reduced.
  • The tied observations are split equally between the two groups. With an odd number of ties, one observation (pair) is discarded.
  • Each tied observation is assigned to one of the two groups ("+" or "-") at random, with probability 0.5 each.
  • Zero differences are given the rarer sign (a very conservative approach).
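The four options can be compared in a small sketch. The function name `apply_tie_rule`, the rule labels, and the counts are hypothetical. When both signs are equally frequent, the "conservative" branch assigns the ties to "-", which is an arbitrary choice of this sketch.

```python
import random

def apply_tie_rule(plus, minus, ties, rule, rng=None):
    """Return adjusted (plus, minus) counts after handling the ties."""
    if rule == "drop":
        # Discard ties: the sample size shrinks to plus + minus.
        return plus, minus
    if rule == "split":
        # Split ties evenly; with an odd number one tie is discarded.
        half = ties // 2
        return plus + half, minus + half
    if rule == "random":
        # Assign each tie to "+" or "-" with probability 0.5.
        rng = rng or random.Random()
        to_plus = sum(1 for _ in range(ties) if rng.random() < 0.5)
        return plus + to_plus, minus + (ties - to_plus)
    if rule == "conservative":
        # Give all ties the rarer sign.
        if plus < minus:
            return plus + ties, minus
        return plus, minus + ties
    raise ValueError("unknown rule: " + rule)

# Hypothetical counts: 12 plus signs, 5 minus signs, 3 ties.
for rule in ("drop", "split", "random", "conservative"):
    print(rule, apply_tie_rule(12, 5, 3, rule, rng=random.Random(0)))
```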

Example of a two-sample problem

A school board wants to investigate whether a new learning method (e.g. e-learning) has improved students' performance. The performance of a random sample of 43 students is measured with a suitable test. The students are then taught with the new learning method, after which the performance of the same students is measured again. The school board evaluates the observations with a right-sided sign test, with the null hypothesis that there is no improvement.

For the evaluation, the frequencies of the signs ("+", "-", "=") of the differences are determined: the performance of 25 students improved, that of eleven students got worse, and that of seven stayed the same. Can we conclude from this result that the new learning method has a positive effect in the population?

Ties

The sample size is reduced by the number of ties: $n' = 43 - 7 = 36$.

Binomial distribution

Using the binomial distribution as the distribution of the test statistic, a (maximum) significance level of 0.05 gives a critical value of 23 (the 0.95 quantile of the binomial distribution); the p-value is 0.01441. Since 25 > 23, the null hypothesis (no improvement) is rejected. From such a result the school board can conclude that e-learning has a positive effect on school performance.
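These figures can be checked with a few lines of Python. The sketch assumes that the seven ties are dropped, leaving 36 pairs, and that the test statistic is binomial with parameters 36 and 1/2 under the null hypothesis; the variable names are chosen for this illustration.

```python
from math import comb

n_eff = 43 - 7   # sample size after dropping the seven ties
plus = 25        # number of students who improved
alpha = 0.05

# Probability mass function of Binomial(n_eff, 1/2).
pmf = [comb(n_eff, k) * 0.5**n_eff for k in range(n_eff + 1)]

# Critical value: smallest c with P(V <= c) >= 1 - alpha (the 0.95 quantile).
cumulative = 0.0
critical = n_eff
for c, prob in enumerate(pmf):
    cumulative += prob
    if cumulative >= 1 - alpha:
        critical = c
        break

p_value = sum(pmf[plus:])   # P(V >= 25) under the null hypothesis
reject = plus > critical

print(critical, round(p_value, 5), reject)
```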

Normal approximation

The critical value of the standard normal distribution for α = 0.05 is 1.6449 (the 0.95 quantile of the standard normal distribution).

Approximating the distribution of the test statistic by the normal distribution gives

$$z = \frac{25 - 36 \cdot 0.5}{\sqrt{36 \cdot 0.5 \cdot 0.5}} = \frac{7}{3} \approx 2.33$$

with an associated p-value of about 0.0098, i.e. the probability under the null hypothesis of obtaining the observed test value or a greater one. Since 2.33 > 1.6449, the school board can conclude at a significance level of 5% here too that e-learning has a positive effect on school performance.
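The approximate test can be recomputed as a sketch under two assumptions: the seven ties are dropped, leaving 36 pairs, and no continuity correction is applied. The p-value it prints is therefore the uncorrected normal-approximation value.

```python
from math import sqrt
from statistics import NormalDist

n_eff = 36   # 43 students minus 7 ties
plus = 25    # number of students who improved

# Standardize the count with mean n_eff / 2 and variance n_eff / 4.
z = (plus - n_eff * 0.5) / sqrt(n_eff * 0.25)
critical = NormalDist().inv_cdf(0.95)   # 0.95 quantile of the standard normal
p_value = 1.0 - NormalDist().cdf(z)     # upper-tail probability
reject = z > critical

print(round(z, 4), round(critical, 4), round(p_value, 4), reject)
```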
