[RC5] Birthday Paradox

Greg Hewgill gregh at lightspeed.net
Fri Mar 20 20:01:37 EST 1998


>My goodness, is this something you learn in stats?  I personally wouldn't
>mind a quick little overview of why 23 equals 50% likelihood of a
duplicate
>of 365 possibilites...


The easiest way to explain this is to investigate the probability that
everybody has a *different* birthdate.

Let P(n) be the probability that n people have different birthdates. The
first person to enter the room obviously has a different birthday than
everybody else, so P(1) = 1. That is, there is a 100% chance of this
happening.

In order for the second person who enters the room to have a different
birthdate as the first person, they have 364 choices out of a possible 365
(ignoring the leap year phenomenon). So P(2) = P(1) * (364/365) = 0.99726,
or a 99.726% chance. We multiply the probabilities together because this is
how you compute the probability of event A and event B happening
simultaneously.

The third person has 363 dates to choose from, so P(3) = P(2) * (363/365) =
0.9918. Expanding this out, you get:

    P(3) = P(2) * (363/365)
    P(3) = 1 * (364/365) * (363/365)

Continuing like this, in general:

    P(n) = n! / ((365-n)! 365^n)

where ! is factorial and ^ is exponentiation. Here is a table of results
through n=29:

n       P(n)
1       1
2       0.997260273972603
3       0.991795834115219
4       0.98364408753345
5       0.972864426300207
6       0.959537516350889
7       0.943764296904025
8       0.925664707648331
9       0.905376166110833
10      0.883051822288922
11      0.858858621678267
12      0.832975211161936
13      0.805589724767571
14      0.776897487995027
15      0.747098680236314
16      0.71639599474715
17      0.684992334703439
18      0.65308858212821
19      0.620881473968463
20      0.58856161641942
21      0.556311664834794
22      0.52430469233745
23      0.492702765676014
24      0.461655742085471
25      0.431300296030536
26      0.401759179864061
27      0.373140717736758
28      0.345538527657601
29      0.319031462522223

At 23 people, the probability of everybody having a different birthday
drops below 50%. So the probability of some two people having the *same*
birthday (the opposite of the above case) rises above 50%. The probability
of some pair of people sharing the same birthday in a group of 23 people is
about 50.73%.

For the ECCp-97 challenge, the number of DPs is 1073741824. My calculations
give n=38582 when the 50% mark is crossed. TimC recently said their 50%
mark was around 450000. Did I screw up somewhere? Here's the perl program I
used to calculate my results:

#! /usr/bin/perl -w

# $N = 365;
$N = 1073741824;

$p = 1;
$i = 1;
while (1) {
  $p *= ($N-($i-1))/$N;
  print "$i\t$p\n";
  last if $p < 0.5;
  $i++;
}

Can anybody tell me where I've gone wrong?
--
To unsubscribe, send 'unsubscribe rc5' to majordomo at lists.distributed.net
rc5-digest subscribers replace rc5 with rc5-digest



More information about the rc5 mailing list