*Your birthday party has 23 guests. Two of them probably share a birthday. But wait, there’s a bizarre twist where you might need to let in 25 on average!*

How many people need to be at a party on average, so that at least two of them share a birthday? This is the kind of problem that Python shines at when you tackle it with a Monte Carlo simulation. Sure, you can find a probability formula (look in Wikipedia to find it) but simulating the situation can be fun and enlightening.

But hold on to your party hat! When I programmed this, I discovered a very strange result that I’m still trying to wrap my head around. Let me explain.

First, here’s a short Python program that agrees with all the experts, and it makes sense.

import randomparties = 10000

hits = 0

days = list(range(365))for i in range(parties):

guest_birthdays = random.choices(days, k=23)

if len(set(guest_birthdays)) < 23:

hits += 1print(hits / parties)

The variable `days`

is a list of the integers 0 to 364, each representing one day of a typical 365-day year. For each party, a set of 23 days are selected randomly from the list of days, to create a list of 23 `guest_birthdays`

. The `choices()`

function uses “with replacement”, which means that the days selected can possibly repeat in the list that’s returned.

By converting this list of 23 guest birthdays into a `set`

, any repeats are removed, making the set’s count less than 23 if any two guests share a birthday. The fraction of times this happens during all the 10,000 parties is calculated, and it averages out to just slightly more than half the time. That in itself is pretty amazing.

Instead of letting 23 people in the door, and then comparing birthdays, let’s let the party-goers in one at a time.

As each guest arrives, we add their birthday number to a list, checking to see if they match any of the guest’s birthdays that are already on the list.

As soon as we find a match we stop letting in any more guests and tally how many it took to get two guests with the same birthday.

After ten thousand parties you’d think the average number of guests for finding a match should be just less than 23. After all, as shown above, with 23 guests the odds are greater than 50–50 that there’s a match.

*But it turns out that you need to let in 25 guests, on average, before you find any two guests with shared birthdays!*

Here’s a short Python program that demonstrates this action.

import randomparties = 10000

total_guests = 0for i in range(parties):

guest_birthdays = []

while True:

bd = random.randrange(365)

guest_birthdays.append(bd)

if bd in guest_birthdays[:-1]:

break party_guests = len(guest_birthdays)

total_guests += party_guestsprint(total_guests / parties)

For each of the 10,000 parties, we let guests in (append their birthday number to the list) until we finally have two guests with matching birthdays. Notice that we check the just-appended birthday to see if it exists anywhere earlier in the list using this code… `if bd in guest_birthdays[:-1]:`

The number of guests lets in the door at each party is tallied and used to calculate the average number of guests per party before matching birthdays were found.

I totally expected the average number of guests to be just under 23, based on the standard puzzle. But the number of guests always averages to about 24.6, meaning that it takes 25 one-at-a-time guests, on average, to make sure there’s a match over half the time. Bizarre.

Whenever I encounter challenges like this I tend to run the situation to its extremes to see if it simplifies the logic.

In this case, instead of 365 possible birthdays, I set up the code for a binary situation. If party-goers are randomly male or female, how many do you need to let in the door to get two of the same sex? Looked the other way, if there are two people at a party, what are the odds they are of the same sex?

*Hint: the answers are not the same!*

I’m humble enough to admit I still don’t have a clear explanation for the difference between the two programs above, and I welcome any forthcoming explanations that are, hopefully, clear and concise. *Do you have one?*