Sign test
The sign test is a nonparametric statistical test and a special case of the binomial test. It can be used to test hypotheses about distributions in one- and two-sample problems. The sign test can also be used when the data are measured only on an ordinal scale.
One-sample problem
Test for median
The sign test can be used to test hypotheses about the median of a distribution.
Test for symmetry
The sign test can also be used as a test for symmetry of a distribution: if the true arithmetic mean of the population is known, or a value is assumed as an estimate of it, the test can check whether the arithmetic mean coincides with the median, i.e. whether 50% of the possible values lie to the right and 50% to the left of the arithmetic mean, as a symmetric distribution requires.
Test for mean
Conversely, if the distribution is assumed to be symmetric, the population mean equals the population median, and the sign test can be used to test hypotheses about the arithmetic mean of the population.
Assumptions
- The observations are independent.
- The underlying random variable is distributed continuously in the population.
- Since the observations are compared in size with the hypothesized median, the feature being examined must be measured on at least an ordinal scale.
Hypotheses
In a two-sided test, the null hypothesis is that the median of the population is equal to a hypothesized value. The probability that an observation is greater than the hypothesized value should then be 0.5 if that value really is the median. In a one-sided test, the question is whether the median is greater than or less than the hypothesized value, i.e. whether the probability that an observation exceeds the hypothesized value is greater than or less than 0.5.
Other equivalent formulations of the hypotheses are possible. The test principle can be extended to arbitrary quantiles by adapting the hypotheses and the parameters of the distribution of the test statistic. When a quantile other than the median is tested, the hypothesized probability (1/2 here) is adjusted accordingly (see binomial test).
Procedure
Sample values that are greater than the hypothesized median are assigned a "+"; values that are smaller are assigned a "-". In other words, the sample variable is dichotomized at the median. The number of positive signs is counted and used as the test statistic.
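The counting step in the one-sample procedure can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `sign_counts` and the sample data are hypothetical, and values equal to the hypothesized median are only counted here, not resolved.

```python
def sign_counts(sample, median0):
    """Count observations above (+), below (-), and equal to median0."""
    plus = sum(1 for x in sample if x > median0)
    minus = sum(1 for x in sample if x < median0)
    ties = len(sample) - plus - minus
    return plus, minus, ties


# Hypothetical data, used only for illustration.
sample = [4.1, 5.3, 6.0, 7.2, 5.0, 8.4, 3.9, 6.6]
plus, minus, ties = sign_counts(sample, median0=5.0)
print(plus, minus, ties)  # number of "+" signs is the test statistic
```

The first returned value is the test statistic; the tie count shows how many observations need separate handling.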
Two-sample problem
The sign test is used when two related (paired) samples are to be investigated. Samples are paired if the observations of the two groups depend on each other in pairs, for example if the health of the same person is examined before and after a treatment. The signs ("+" or "-") result from comparing the two values of each pair.
The sign test then tests whether the distribution functions of two continuously distributed random variables from paired populations are equal. If the medians of the samples differ significantly, the distributions in the populations differ.
Assumptions
- The observation pairs must not depend on each other, i.e. the pair of values for one observation unit must be independent of the pair of values for any other unit.
- The underlying random variables are distributed continuously in the population.
- Since the observations are compared in size, the feature being examined must be measured on at least an ordinal scale.
Hypotheses
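If both populations have the same median, P(X11 > X12) = P(X11 < X12). The sign test can test this equality against a two-sided alternative (the two probabilities differ) or against a one-sided alternative (one probability is greater than the other).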
Procedure
Each pair of values is assigned a "+" if the difference between its two values is positive and a "-" if it is negative, with the difference taken in the same order for every pair. The number of positive signs is counted and used as the test statistic.
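For paired samples, the sign assignment could look like the sketch below. The direction of the difference (second value minus first) is an assumption made here for illustration, as are the function name `paired_sign_counts` and the data.

```python
def paired_sign_counts(first, second):
    """Count positive, negative, and zero differences (second - first)."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # Assumption: the difference is taken as second minus first.
    diffs = [b - a for a, b in zip(first, second)]
    plus = sum(1 for d in diffs if d > 0)
    minus = sum(1 for d in diffs if d < 0)
    zeros = len(diffs) - plus - minus
    return plus, minus, zeros


# Hypothetical before/after measurements on the same six units.
before = [12, 15, 9, 20, 14, 11]
after = [14, 15, 8, 23, 17, 13]
print(paired_sign_counts(before, after))
```

Only the sign of each difference enters the statistic, which is why ordinal data are sufficient.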
Test statistic
Exact distribution
The test statistic is the number of positive comparisons (differences of the values or ranks):
$$T = \sum_{i=1}^{n'} \mathbf{1}(d_i > 0),$$
where $d_i$ is the difference for the $i$-th observation and $\mathbf{1}(\cdot)$ equals 1 if the condition holds and 0 otherwise.
For the one-sample problem, the values of the second sample must be replaced by the hypothesized median. Under the null hypothesis, the number of positive differences $T$ follows a binomial distribution $B(n', 0.5)$, because the median is the 50% quantile. Here $n'$ is the sample size that remains after ties (zero differences) have been handled. If the null hypothesis holds, the distribution of the test statistic is symmetric.
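The exact null distribution can be evaluated directly from binomial probabilities. The following is a sketch using only the Python standard library; the function names and the `alternative` parameter are illustrative choices, and the two-sided p-value is obtained by doubling the smaller tail, which relies on the symmetry of the binomial distribution with success probability 0.5.

```python
from math import comb


def binom_pmf(k, n, p=0.5):
    """P(T = k) for T ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def sign_test_pvalue(t, n, alternative="greater"):
    """Exact sign-test p-value for t positive signs among n non-tied observations."""
    upper = sum(binom_pmf(k, n) for k in range(t, n + 1))  # P(T >= t)
    lower = sum(binom_pmf(k, n) for k in range(0, t + 1))  # P(T <= t)
    if alternative == "greater":
        return upper
    if alternative == "less":
        return lower
    if alternative == "two-sided":
        return min(1.0, 2 * min(upper, lower))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Hypothetical example: 8 positive signs among 10 non-tied observations.
print(sign_test_pvalue(8, 10), sign_test_pvalue(8, 10, "two-sided"))
```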
Approximation by the normal distribution
As $n'$ grows, the binomial distribution $B(n', p)$ approaches a normal distribution with mean $n'p$ and variance $n'p(1-p)$. A commonly cited rule of thumb for a useful approximation is $n'p(1-p) \geq 9$. Under the null hypothesis, $p = 0.5$.
So if $0.25\,n' \geq 9$, i.e. $n' \geq 36$, the z-standardized quantity
$$Z = \frac{T - 0.5\,n'}{\sqrt{0.25\,n'}}$$
approximately follows a standard normal distribution, and the critical values for the test decision can be read from a table of the standard normal distribution.
Ties (zero differences)
Since continuous random variables are usually recorded only in discrete steps, ties can occur. Zero differences (ties) arise in the two-sample problem when an observation's value is unchanged between the first and the second sample, and in the one-sample problem when one or more values are equal to the hypothesized median. A binomial test, however, can handle only two categories (here + and -). This raises the question of how ties should be treated. Possible methods are:
- Observations with ties are eliminated, i.e. the sample size is reduced.
- The tied observations are assigned to the two groups in equal numbers. With an odd number of ties, one observation pair is eliminated.
- Each tied observation is assigned to one of the two groups (+ or -) with probability 0.5.
- Zero differences are given the rarer sign (a very conservative approach).
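The four tie-handling options listed above can be compared on the same counts. This is a sketch only: the function name `resolve_ties` and the method labels are hypothetical, and when both signs are equally frequent the "rarer" option assigns the ties to "+" as an arbitrary choice.

```python
import random


def resolve_ties(plus, minus, ties, method="drop", rng=None):
    """Return (plus, minus) after handling ties with the chosen method."""
    if method == "drop":
        # Eliminate tied observations; the sample size shrinks.
        return plus, minus
    if method == "split":
        # Assign ties equally; with an odd count one tie is eliminated.
        half = ties // 2
        return plus + half, minus + half
    if method == "random":
        # Assign each tie to "+" or "-" with probability 0.5.
        rng = rng or random.Random()
        to_plus = sum(1 for _ in range(ties) if rng.random() < 0.5)
        return plus + to_plus, minus + (ties - to_plus)
    if method == "rarer":
        # Give all ties the rarer sign (arbitrarily "+" if both are equal).
        if plus <= minus:
            return plus + ties, minus
        return plus, minus + ties
    raise ValueError("unknown method")


# Counts in the style of a before/after study: 25 "+", 11 "-", 7 ties.
for method in ("drop", "split", "rarer"):
    print(method, resolve_ties(25, 11, 7, method))
print("random", resolve_ties(25, 11, 7, "random", rng=random.Random(0)))
```

The choice matters because it changes both the test statistic and the remaining sample size that enters the binomial distribution.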
Example of a two-sample problem
A school board would like to investigate whether the school performance of students has improved with a new learning method (e.g. e-learning). The performance of a random sample of 43 students is measured with a suitable test. The students are then taught with the new learning method, after which the performance of the same students is measured again. The school board evaluates the observations with a right-sided sign test, i.e. the alternative hypothesis is that performance has improved.
To carry out the test, the frequencies of the signs (+, -, =) of the differences are determined.
For 25 students, performance improved (+). Eleven students did worse (-), and for seven it stayed the same (=). Can we conclude from this result that the new learning method has a positive effect in the population?
Ties
The sample size is reduced by the number of ties, so n' = 43 - 7 = 36.
Binomial distribution
Using the binomial distribution for the test at a (maximum) significance level of 0.05 gives a critical value of 23 (the 0.95 quantile of the binomial distribution); the observed value has a p-value of 0.01441. Since 25 > 23, the null hypothesis (no improvement) is rejected. From such a result, the school board can therefore conclude that e-learning has a positive effect on school performance.
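The critical value and p-value quoted for this example can be checked with a few lines of standard-library Python. This is a sketch under the assumptions that the seven ties are dropped (so 36 observations remain) and that the critical value is taken as the smallest c with P(T <= c) >= 0.95.

```python
from math import comb

n = 36  # 43 students minus 7 ties
t = 25  # number of "+" signs (improvements)

# Probabilities P(T = k) for T ~ Binomial(36, 0.5), k = 0..36.
pmf = [comb(n, k) / 2**n for k in range(n + 1)]

# 0.95 quantile: smallest c whose cumulative probability reaches 0.95.
cdf = 0.0
critical = None
for c, prob in enumerate(pmf):
    cdf += prob
    if cdf >= 0.95:
        critical = c
        break

# Right-sided p-value: P(T >= 25) under the null hypothesis.
p_value = sum(pmf[t:])

print(critical, round(p_value, 5))  # prints: 23 0.01441
```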
Approximation by the normal distribution
The critical value of the standard normal distribution for α = 0.05 is 1.6449 (the 0.95 quantile of the standard normal distribution).
Approximating the distribution of the test statistic by the normal distribution, with 25 positive signs among the n' = 36 remaining observations, gives
$$z = \frac{25 - 0.5 \cdot 36}{\sqrt{0.25 \cdot 36}} = \frac{7}{3} \approx 2.33$$
with an associated p-value, i.e. the probability that the observed test value or a greater one occurs under the null hypothesis, of about 0.0098. Since 2.33 > 1.6449, the school board can conclude here as well, at a significance level of 5%, that e-learning has a positive effect on school performance.
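The normal approximation for this example can be reproduced as follows. This is a sketch that assumes the statistic is standardized without a continuity correction (the text does not say whether one is applied) and that the seven ties have been dropped; `normal_cdf` is a small helper built on `math.erf`.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


n = 36  # remaining sample size after dropping 7 ties
t = 25  # number of "+" signs

# Standardize with mean n/2 and variance n/4 (no continuity correction).
z = (t - 0.5 * n) / sqrt(0.25 * n)
p_value = 1.0 - normal_cdf(z)

print(round(z, 3), round(p_value, 4))  # prints: 2.333 0.0098
print(z > 1.6449)  # reject at the 5% level if True
```

The approximate p-value is somewhat smaller than the exact binomial value of 0.01441, which is typical when no continuity correction is used.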