Birthday problem

The birthday paradox, sometimes called the birthday problem is an example that certain probabilities (and accidents ) are intuitively often incorrectly estimated:

To estimate the probability of false results because is asked in the birthday paradox according to how likely it is that two people from a group on any day of the year have a birthday. Erroneously, the problem is often interpreted as " how likely it is that a person from a group on a particular day in the year 's birthday " (eg compliance with the birthday of another, additional person), and this probability is actually significantly smaller.

The paradox is often attributed to Richard von Mises. According to Donald E. Knuth, this origin is not certain: The birthday paradox has been discussed informally among mathematicians as early as the 1930s, but a more accurate copyright can not be determined.

  • 2.1 probability that at least two people in a day Birthday
  • 2.2 probability for a particular day

Containment

The results of the question:

Is amazing for most and is therefore perceived as a paradox. So most people estimate the probability an order of magnitude one wrong. It is not between one and five per cent ( as is usually estimated), but over 50 %; 50 persons, even over 97 %.

In contrast to this is the likelihood that someone has at a certain day (excluding consideration of the vintage ) Birthday: If, for example, takes one of the 23 people and asks that anyone with exactly this same day his birthday. Thus, if by the birthday of one of the persons present set of specific day ( a total of 254 persons so ) are further 231 people are required to achieve a probability of 50 % (see binomial distribution ).

The reason for this large difference is that there are different pairs in people who may have birthday on the same day. Therefore, the probability of meeting or colliding of two birthdays increases for small values ​​of approximately with the square of the number.

Unevenly distributed Birthdays

In reality, all birth dates are not equally likely, then, for example, more children born in the summer than in winter. This increases the probability that two people have the same birthday, easy to. However, simulations show that exceeds even for real data, the probability that two people have the same birthday, still at 23 people 50%. Even taking into account the neglected in the derivation leap day does not change anything.

Importance in cryptography

This effect has a meaning in cryptographic hash functions, which should have an unambiguous test value from a text. It is much easier to find two random texts that have the same test value, as to find a given text another, the same test value has (see birthday attack ).

Mathematical derivations

In the following February 29 is neglected and assumed that the birthdays of the people are independent, identically distributed random variables from the discrete uniform distribution on the 365 - element set { 1 January, January 2, ..., December 31 } are. This assumption is not met then, for example, if there are among the people present twins.

In the urn model, this assumption corresponds to a contraction of balls with replacement from an urn, the "365 balls labeled 1 January, "" 2 January, " etc. to " 31 December contains ".

Probability that at least two people in a day Birthday

The number of all possible cases for n persons m = 365n, where all cases are equally likely. For example arise for two people 3652 = 133225 possible cases of birth variations.

Include these possible cases

Only different birthdays. For the first person 's birthday can be freely chosen, for the second, there are then 364 days, at which the first has not birthday etc.

This results in the formula of Laplace, the probability of

That all n people on different days of your birthday.

The probability of at least two birthdays in the course of a year is thus

For n = 23 we have:

After the pigeonhole principle is (ignoring February 29 ) for all n> 365, the probability is equal to 1, so there are certainly two people with the same birthday. If February 29 is not neglected as a birthday, then this is valid only on n> 366

An approximation

The expression for P can be transformed:

Using the Stirling formula, this can be too good approach

Which can be evaluated easily with a calculator.

Probability for a particular day

Another question is when one does not look for any matches of birthdays, but according accordance with a procedure selected day of the year.

Ignoring as before February 29, then the probability for a person to have on such a specific day birthday, equal 1/365 ≈ 0.27 %.

The probability of the opposite, ie the probability on any given day not to have a birthday, making it

When two people is the probability that neither has the day previously selected by both birthday, same (as before, we assume that the birthdays of the people are dependent).

To have at least one goal ( at least one person of two has on a specific day of birth ), is again the counter- probability, ie

Continuing in this way for larger numbers of people to obtain the probability that at least one person of persons present on a given day 's birthday is

This makes it possible to calculate how many people you need to reach a certain probability P that at least one person on a particular day 's birthday:

For a probability of 50 % is required

Finally, calculated for the case that one of the persons present 's birthday, the probability that at least one of the other people has the same birthday, to

In contrast to the probability that at least two people in one day have a birthday (see above), there is no n for which one can make a reliable statement: for any number of persons there is the possibility that the selected date is not present as a birthday ( the pigeonhole principle is not applicable ). For all n we have

Related Questions

In the game memory pairs are 2N maps reveal (consisting of N pairs ). At the beginning of the game are all the cards face down, and as long as several cards are revealed, players have the opportunity to find a pair just happened. This raises the question - similar to the birthday paradox - how many cards you have to uncover in order (eg 50 %) to obtain a certain probability of at least one pair.

The number N of different motives here corresponds to the number of days in the year (365 ) in the birthday paradox. Usually, memory is played with 32 pairs, but there are also other variants, so it makes sense to keep the number N variable.

If, for the probability of n cards by revealing only uncover different cards, then:

As a result, one gets for N = 32: 10 Detection of cards in the probability is greater than 50% to obtain at least one pair (1 - P32 ( 10) = 56.4 %). For N = 50, the limit is 12 cards. In a hypothetical memory with 183 pairs have to discover 23 cards, with 365 pairs of 32 cards are necessary.

This result has important practical implications for the game because the players would lose interest if it takes too long to get the first pair is revealed.

363342
de