Base rate fallacy

The base rate fallacy (also called prevalence error, base rate error, or base rate neglect) is the error that occurs when the conditional probability of a statistical variable A given a condition B is determined without regard to the prevalence, or a priori probability, of A. The prevalence of A describes how A is distributed in the population in question and is also called the base rate, hence the alternative names. A and B may be events, but also properties. The error is common in the interpretation of statistical correlations.

Calculation example

Following Lindsey / Hertwig / Gigerenzer.

The assumption that a DNA test can identify a person beyond doubt from trace evidence rests on two base rate errors.

Assumptions

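At a crime scene, a DNA trace is found on the victim. Beyond that, only a few clues are available. The figures used below assume a population of 10 million people, 10 of whom carry the DNA profile of the trace, and a test that is always positive for a carrier of the profile.

How should the result be evaluated if a person picked at random from the 10 million tests positive in a DNA dragnet?

Conclusions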

  • If all 10 million people are tested, the test yields a false positive result in 100 cases on average. Writing $T$ for a positive test and $D$ for carrying the DNA profile of the trace, the conditional probability of a false positive is therefore $P(T \mid \neg D) \approx 100 / 10^7 = 10^{-5}$.
  • In a DNA dragnet of all 10 million people, a total of 110 would test positive: the 10 carriers of the profile plus the 100 false positives. The prevalence of a positive test is thus greater than the prevalence of the DNA profile.
  • The result would, however, be correct in only about one eleventh of the cases. Although the probability that a carrier of the genetic fingerprint tests positive is $P(T \mid D) = 1$, the probability that a person with a positive test also carries the DNA profile is only $P(D \mid T) = 10/110 \approx 0.09$, according to Bayes' formula and the computed relative frequencies. Anyone who assumes that a positive test on a randomly selected person is a good predictor of a match with the DNA profile of the trace commits a first base rate error, in which $P(T \mid D)$ and $P(D \mid T)$ are confused.
  • The author of the trace is one of the 10 carriers of the DNA profile. Writing $A$ for being the author of the trace, $P(D \mid A) = 1$ but at the same time $P(A \mid D) = 1/10$. Although the author certainly has the DNA profile and would therefore test positive with certainty, the probability under the given conditions that any particular positive test case is the author is quite small. Of 110 positive test cases, only 10 have the DNA profile, and only one of these is the author: $P(A \mid T) = 1/110 \approx 0.009$.
  • By comparison, the probability that a randomly selected person who tests positive is not the author of the trace is very large. Of the 110 positive test cases, 100 are not carriers of the profile and 9 are carriers but not the author: $P(\neg A \mid T) = 109/110 \approx 0.991$.
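The arithmetic in the list above can be checked with a short Python sketch. The variable names and the helper `bayes` are illustrative and not part of the source; the figures (10 million people, 10 carriers, 100 expected false positives, one author) are those of the example, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Figures taken from the example; the names are illustrative.
population = 10_000_000   # people who could be tested
carriers = 10             # people carrying the DNA profile of the trace
false_positives = 100     # expected false positives if everyone is tested
authors = 1               # exactly one carrier left the trace

# Every carrier tests positive, so positives = carriers + false positives.
positives = carriers + false_positives

p_profile_given_positive = Fraction(carriers, positives)
p_author_given_positive = Fraction(authors, positives)
p_not_author_given_positive = 1 - p_author_given_positive


def bayes(prior, likelihood, false_positive_rate):
    """Return P(H | T) from P(H), P(T | H) and P(T | not H)."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# The same value for P(profile | positive), this time via Bayes' formula.
p_profile_bayes = bayes(
    prior=Fraction(carriers, population),
    likelihood=Fraction(1),
    false_positive_rate=Fraction(false_positives, population - carriers),
)

print(p_profile_given_positive)     # 1/11
print(p_author_given_positive)      # 1/110
print(p_not_author_given_positive)  # 109/110
print(p_profile_bayes)              # 1/11
```

Counting relative frequencies and applying Bayes' formula give the same result, which is the point of the frequency-based presentation used in the example.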

Evaluation

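If it is claimed that a person who tested positive without any initial suspicion must also be the originator of the DNA trace, a double base rate error is committed, because both the prevalence of the DNA profile and the prevalence of the characteristics that lead to positive test results are overlooked. The probability of a positive test given the profile, $P(T \mid D)$, is confused with the probability of the profile given a positive test, $P(D \mid T)$, and the probability of the profile given authorship, $P(D \mid A)$, is confused with the probability of authorship given the profile, $P(A \mid D)$. Although it is almost certain that the author of the trace tests positive, it is unlikely that any particular positive test case is the author. Through this interchange, the high probabilities $P(T \mid D)$ and $P(D \mid A)$ are substituted for the low probabilities $P(D \mid T)$ and $P(A \mid D)$.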

In the example, the DNA test is unsuitable for incriminating a person who is otherwise not under suspicion. If there is already a suspicion based on other circumstances that are independent of the trace, however, the test can confirm or dispel it. The smaller the population of possible authors, the greater the test's evidential value (in the example this population is very large, at 10 million), but only as long as it is certain that the author of the trace is still contained in the population. Before a match can be taken to indicate authorship, a group of people who objectively come into question must therefore first be identified. It must also be checked whether the circle of possible offenders includes people with DNA markers that produce the same comparison result. To increase certainty further, one can try to reduce the error rate of the test.
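How the evidential value grows as the pool of possible authors shrinks can be illustrated with a rough Python sketch. It rests on assumptions that are not in the source: the author is always in the pool, and the rates implied by the example (9 other carriers and 100 false positives among 10 million people) stay the same for smaller pools. The function name `author_share_of_positives` is hypothetical, and the result is a ratio of expected counts, not a full probabilistic model.

```python
from fractions import Fraction

# Rates implied by the example (10 carriers, 100 false positives in 10 million).
# Holding them fixed for smaller pools is an assumption of this sketch.
OTHER_CARRIER_RATE = Fraction(9, 9_999_999)     # carriers among people other than the author
FALSE_POSITIVE_RATE = Fraction(100, 9_999_990)  # positive tests among non-carriers


def author_share_of_positives(pool_size: int) -> Fraction:
    """Expected share of positive tests that belong to the author.

    Assumes the author is one of the `pool_size` people tested.
    """
    others = pool_size - 1
    other_carriers = others * OTHER_CARRIER_RATE
    false_positives = (others - other_carriers) * FALSE_POSITIVE_RATE
    return 1 / (1 + other_carriers + false_positives)


for size in (10_000_000, 100_000, 1_000, 10):
    print(size, float(author_share_of_positives(size)))
```

For the full population of 10 million the function reproduces the value 1/110 from the example, and the share rises toward 1 as the pool gets smaller. If the author is not in the pool, the assumption behind the sketch fails and every positive result is necessarily a wrong attribution.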

Psychological results

Psychological experiments have shown that the estimate of the probability of a particular statistical variable A becomes strongly distorted and deviates from the prevalence as soon as other properties of the person being assessed are known beforehand, even if these properties have no predictive or explanatory value for the occurrence of A.

According to Daniel Kahneman and Amos Tversky, this finding can be explained by the representativeness heuristic. Richard Nisbett has argued that attribution errors such as the correspondence bias are based on the base rate error: the complex base rate probability of a behavior in a given situation is ignored in favor of the simpler dispositional attribution.
