Sampling (statistics)

A random sample (also random selection or probability sample) is a sample from the population that is drawn using a special selection process. In such a random selection process, each element of the population has a specifiable probability (greater than zero) of entering the sample. Strictly speaking, the methods of inductive statistics are applicable only to random samples.

Mathematical definition

A sample is first and foremost a subset of a population. For a random sample, additional conditions are imposed:

  • The elements are drawn randomly from the population, and
  • The probability with which an element is drawn from the population can be specified.

Furthermore, a distinction is made between an unrestricted and a simple random sample:

  • Unrestricted random sample: each element of the population has the same probability of entering the sample.
  • Simple random sample: each element of the population has the same probability of entering the sample, and the draws from the population take place independently of one another.

An unrestricted random sample is obtained, for example, by sampling without replacement, and a simple random sample, for example, by sampling with replacement.
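The difference between the two drawing schemes can be illustrated with a minimal Python sketch. The function names and the ten-element population are hypothetical and chosen only for illustration; the standard library calls `random.sample` (without replacement) and `random.choices` (with replacement) do the actual drawing.

```python
import random

def unrestricted_sample(population, n, rng):
    """Draw n elements without replacement (no element can appear twice)."""
    return rng.sample(population, n)

def simple_sample(population, n, rng):
    """Draw n elements with replacement (independent draws, repeats possible)."""
    return rng.choices(population, k=n)

rng = random.Random(0)  # fixed seed only to make the sketch reproducible
population = list(range(1, 11))  # hypothetical population of ten labelled elements

print(unrestricted_sample(population, 5, rng))
print(simple_sample(population, 5, rng))
```

Without replacement, the sample can never be larger than the population; with replacement, any sample size is possible and repeated elements are expected.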

Examples

Literary Digest disaster

The Literary Digest disaster of 1936 shows what can happen when no random sample is drawn from the population. A biased sample led to a completely wrong election forecast.

Election survey

A survey of voters about their voting behavior, conducted as they leave the voting booth, is an unrestricted random sample (provided no respondent refuses to answer) with respect to those who voted. However, it is not an (unrestricted) random sample with respect to all eligible voters.

Bag Check

Retailers repeatedly complain that theft of goods by their own staff causes great damage. For this reason, larger supermarkets, among others, carry out bag checks when employees leave the store. Because a full bag check of all employees would be too expensive (and would probably have to be paid as working time), employees pass a lamp at the staff exit when leaving the supermarket. Under computer control, it shows either a green light (the employee is not checked) or a red light (the employee is checked). This selection is then a simple random sample.
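How such a lamp might decide can be sketched in a few lines of Python. The text does not describe the actual mechanism, so this sketch assumes that the computer makes an independent decision for each employee with the same fixed probability; the function name and the checking rate of 0.1 are hypothetical.

```python
import random

def lamp_signals(n_employees, check_probability, rng):
    """Return one independent red/green decision per employee leaving the store."""
    return [
        "red" if rng.random() < check_probability else "green"
        for _ in range(n_employees)
    ]

rng = random.Random(42)  # fixed seed only to make the sketch reproducible
signals = lamp_signals(20, 0.1, rng)  # 0.1 is a hypothetical checking rate
print(signals.count("red"), "of", len(signals), "employees are checked")
```

Because every decision is made independently and with the same probability, the number of checks per day varies, which is the price paid for not fixing the sample size in advance.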

Random sampling in mathematical statistics

In mathematical statistics, random samples are the basis for inference from the sample to characteristics of the population. A concrete sample $x_1, \dots, x_n$ is then viewed as a realization of random variables $X_1, \dots, X_n$. These random variables are referred to as sampling variables; $X_i$ indicates the probability with which a certain element of the population can be drawn at the $i$-th draw under a specific selection procedure.

If a simple random sample is drawn, it can be shown that the sampling variables are independent and identically distributed (abbreviated i.i.d.). That is, the distribution type and the distribution parameters of all sampling variables are equal to those of the distribution in the population (identically distributed), and because the draws are independent, the sampling variables are independent of each other (independent).

Many problems in inductive statistics assume that the sampling variables are i.i.d.
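For a very small population, the i.i.d. property can be checked exactly by enumerating all equally likely outcomes of two draws. The following Python sketch does this for a hypothetical population of four distinct values; all function names are illustrative. It compares drawing with replacement against drawing without replacement.

```python
from fractions import Fraction
from itertools import permutations, product

population = [1, 2, 3, 4]  # hypothetical population of four distinct values

def joint_distribution(outcomes):
    """Map each equally likely ordered pair (x1, x2) to its probability."""
    outcomes = list(outcomes)
    weight = Fraction(1, len(outcomes))
    joint = {}
    for pair in outcomes:
        joint[pair] = joint.get(pair, 0) + weight
    return joint

def marginal(joint, position):
    """Distribution of the draw at the given position (0 or 1)."""
    dist = {}
    for pair, prob in joint.items():
        dist[pair[position]] = dist.get(pair[position], 0) + prob
    return dist

def is_independent(joint):
    """Check P(X1=a, X2=b) == P(X1=a) * P(X2=b) for all a, b."""
    first, second = marginal(joint, 0), marginal(joint, 1)
    return all(
        joint.get((a, b), 0) == first[a] * second[b]
        for a in first
        for b in second
    )

with_replacement = joint_distribution(product(population, repeat=2))
without_replacement = joint_distribution(permutations(population, 2))

print(is_independent(with_replacement))     # draws with replacement
print(is_independent(without_replacement))  # draws without replacement
```

In both schemes each draw on its own has the population distribution, but only with replacement does the joint distribution factor into the product of the marginals.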

Dependent and independent samples

For analyses with more than one sample, it is necessary to distinguish between dependent and independent samples. Dependent samples are also called related samples or paired samples.

Dependent samples usually arise from repeated measurements on the same object of investigation. For example, the first sample consists of persons before treatment with a particular drug and the second sample of the same persons after treatment; that is, the elements of the two (or more) samples can be assigned to each other in pairs.

With independent samples, there is no relationship between the elements of the samples. This is the case, for example, if the elements of the samples come from different populations (the first sample consists of women, the second of men), or if people are randomly divided into two or more groups.

Formally, this means for the sampling variables $X_{ij}$ (with $i$ denoting the $i$-th object of investigation and $j$ the $j$-th measurement):

  • Independent samples: all sampling variables $X_{ij}$ are independent of each other.
  • Dependent samples: the sampling variables of the first sample, $X_{i1}$, are independent of each other, but there is a dependency between the sampling variables $X_{i1}, X_{i2}, \dots$, since they are collected on the same object of investigation.
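The two situations lead to different data layouts, which a short Python sketch can make concrete. The function names are hypothetical and the measurements are made-up illustrative values, not real data: paired samples are matched element by element, whereas independent groups can be formed by randomly dividing the units.

```python
import random

def paired_differences(before, after):
    """Dependent samples: element i of both lists belongs to the same person."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    return [a - b for b, a in zip(before, after)]

def random_groups(units, n_groups, rng):
    """Independent samples: shuffle the units and deal them into groups."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    return [shuffled[i::n_groups] for i in range(n_groups)]

# Made-up illustrative measurements, not real data.
before = [140, 152, 138, 147]
after = [132, 150, 135, 141]
print(paired_differences(before, after))  # after minus before, per person

print(random_groups(["A", "B", "C", "D", "E", "F"], 2, random.Random(3)))
```

Analyses of dependent samples typically work on the per-object differences, while analyses of independent samples compare the groups as wholes.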

Single-stage random sampling

A pure (also: simple) or unrestricted random sample can be described by means of an urn model. For this purpose, a fictional vessel is filled with balls, which are then drawn at random: sampling with replacement yields a simple random sample, sampling without replacement an unrestricted random sample. An urn model can thus simulate various random experiments, such as a lottery draw.
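A minimal Python sketch of such an urn is given below. The class name `Urn` and the 6-out-of-49 lottery format are assumptions made for illustration; the point is only to show the mechanism of putting a ball back or leaving it out.

```python
import random

class Urn:
    """A vessel of balls from which draws are made at random."""

    def __init__(self, balls, rng):
        self.balls = list(balls)
        self.rng = rng

    def draw(self, replace):
        """Draw one ball; put it back only if replace is True."""
        index = self.rng.randrange(len(self.balls))
        if replace:
            return self.balls[index]
        return self.balls.pop(index)

# Lottery-style draw: 6 balls out of 49 without replacement.
urn = Urn(range(1, 50), random.Random(7))
numbers = sorted(urn.draw(replace=False) for _ in range(6))
print(numbers)
```

With `replace=False` the urn shrinks after every draw, so no number can occur twice; with `replace=True` its contents stay unchanged and the draws are independent.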

Sample size

The sample size is the number of specimens from a population that are needed for a test in order to determine statistical parameters by estimation with a given accuracy. However, the sample size is often also set by standards or empirical values.

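If $\theta$ is the unknown parameter in the population, an estimator $\Theta$ is constructed as a function of the sampling variables. The expected value of the random variable $\Theta$ is usually $E(\Theta) = \theta$, and we have:

$$P(\theta - e \le \Theta \le \theta + e) = 1 - \alpha$$

Here a realization $\hat{\theta}$ of $\Theta$ is a point estimate of the unknown parameter, $e$ is the absolute error, and $1 - \alpha$ is the probability that a realization falls in the central fluctuation interval $[\theta - e, \theta + e]$.

The absolute error is half the width of this interval, i.e.

$$e = z_{1-\alpha/2}\,\sigma_\Theta$$

where the quantile $z_{1-\alpha/2}$ usually depends on the distribution type of $\Theta$, and the variance $\sigma_\Theta^2$ is proportional to $1/n$. The following table gives an estimate of the sample size for the unknown population mean and the unknown proportion.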
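As a sketch of what such estimates typically look like, the usual normal-approximation bounds for the two cases are given here. This is the standard textbook form, stated under assumptions: $z_{1-\alpha/2}$ is taken to be the standard normal quantile for the chosen confidence level $1 - \alpha$, $e$ the absolute error, $\sigma$ the population standard deviation and $p$ the population proportion. Solving $e = z_{1-\alpha/2}\,\sigma_\Theta$ for $n$, with $\sigma_\Theta^2 = \sigma^2 / n$ for the sample mean and $\sigma_\Theta^2 = p(1-p)/n$ for the sample proportion, gives

$$n \ge \frac{z_{1-\alpha/2}^2\,\sigma^2}{e^2} \quad \text{(unknown mean)}$$

$$n \ge \frac{z_{1-\alpha/2}^2\,p(1-p)}{e^2} \quad \text{(unknown proportion)}$$

Since $p(1-p) \le 1/4$ for every $p$, the second bound can always be replaced by the more conservative $n \ge z_{1-\alpha/2}^2 / (4e^2)$, which requires no prior guess of $p$.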

Example (election)

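A party reached 6% in a poll shortly before the election. How large must a voter survey on election day be so that the true proportion can be determined with 95% confidence to an accuracy of $e = 1\%$? With the standard normal quantile $z_{1-\alpha/2} \approx 1.96$, a rough estimate using the worst case $p = 0.5$ gives

$$n \ge \frac{z_{1-\alpha/2}^2}{4e^2} = \frac{1.96^2}{4 \cdot 0.01^2} = 9604$$

or, more precisely, using the poll result $p = 0.06$,

$$n \ge \frac{z_{1-\alpha/2}^2\,p(1-p)}{e^2} = \frac{1.96^2 \cdot 0.06 \cdot 0.94}{0.01^2} \approx 2167$$

That is, even with the somewhat more precise estimate of the sample size for the proportion, 2167 voters still need to be surveyed to obtain the election result with an accuracy of 1%. The chart on the right shows the sample sizes needed for a given estimated proportion and a given confidence level.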
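The calculation can be checked with a short Python sketch. It assumes a confidence level of 95% and the normal approximation for a proportion, which together reproduce the figure of 2167 quoted above; the function names are hypothetical, and the quantile comes from `statistics.NormalDist` in the standard library.

```python
import math
from statistics import NormalDist

def z_quantile(confidence):
    """Two-sided standard normal quantile, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sample_size_proportion(error, confidence, p=None):
    """Smallest integer n with n >= z^2 * p * (1 - p) / error^2.

    If p is None, the worst case p = 0.5 is used.
    """
    z = z_quantile(confidence)
    if p is None:
        p = 0.5
    return math.ceil(z**2 * p * (1 - p) / error**2)

print(sample_size_proportion(0.01, 0.95))          # worst case, no prior guess
print(sample_size_proportion(0.01, 0.95, p=0.06))  # using the 6% poll result
```

A prior estimate of the proportion far from 50% reduces the required sample considerably, which is why the poll result is worth using here.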

Example (materials testing)

In materials testing, a sample size of 10 per 1,000 parts produced is quite common. It depends, among other things, on the safety criticality of the component or material. In destructive tests such as the tensile test, the aim is to keep the testing effort and thus the sample as small as possible. In non-destructive testing, for example completeness checks by image-processing systems, a 100% check is common in order to detect production errors as quickly as possible.

Multi-stage random sample (also complex random selection)

The following selection procedures are of particular importance; the first two are referred to as two-stage selection procedures:

  • Stratified random sample (stratified sample): The elements are classified into groups (subsets) according to a particular characteristic. Within each of these groups, a pure random sample is then drawn. Sampling thus takes place on at least two levels. For example, school classes are drawn at the first stage according to a previously established procedure. Then the objects of investigation (here: students) are drawn at the second stage. Both pure random sampling and a weighted procedure can be used.
  • Cluster sample: First, a (relatively small) pure random sample of clusters is drawn. Then all elements contained in the drawn units are included in the sample. A classic example is the survey of whole housing blocks or school classes: the school classes to be surveyed are first determined by random selection, and then all students present in those classes are questioned. Cluster sampling is subject to the so-called cluster effect. It is the greater, the more homogeneous the elements within the groups and the more heterogeneous the groups are among each other.
  • Staged random sample (staged sample): It is often preferred to stratification for reasons of cost and time savings. Staging is also advisable when no listing of all cases (research objects, units, etc.) of the population exists, so that a simple random sample cannot be carried out. An example is a study based on texts: since not all texts are recorded or available electronically, searching the respective archives would incur high costs, which staging can avoid. In essence, the staging procedure follows that of stratification.
  • Random route method
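The first three procedures in the list above can be sketched in Python as follows. This is a minimal illustration under assumptions: the function names, the four school classes and the sample sizes are hypothetical, every stratum receives the same sample size, and the staged variant uses pure random sampling at both stages.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a pure random sample of fixed size within every stratum."""
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }

def cluster_sample(clusters, n_clusters, rng):
    """Draw whole clusters at random and include all of their elements."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [element for name in chosen for element in clusters[name]]

def staged_sample(clusters, n_clusters, n_per_cluster, rng):
    """Draw clusters first, then a random sample within each drawn cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [
        element
        for name in chosen
        for element in rng.sample(clusters[name], n_per_cluster)
    ]

# Hypothetical school with four classes of twenty students each.
classes = {
    c: [f"{c}-{i:02d}" for i in range(1, 21)]
    for c in ["1a", "1b", "2a", "2b"]
}
rng = random.Random(11)  # fixed seed only to make the sketch reproducible

print(stratified_sample(classes, 3, rng))   # 3 students from every class
print(len(cluster_sample(classes, 2, rng)))  # all students of 2 drawn classes
print(staged_sample(classes, 2, 5, rng))    # 5 students from each of 2 drawn classes
```

Stratification needs a complete list of every group, whereas the cluster and staged variants only need a list of the groups plus the members of the groups actually drawn, which is where the cost savings come from.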

Application models

  • ADM design as a combination of stratification and staging