import random
random.seed(12)
Introduction
Recently, I saw a Twitter user who said he had collected an entire deck of playing cards from the street during his daily walks. He claimed that it took around 6 months to complete the deck. I wondered: is the story fabricated? So I ran a simulation to estimate how long it takes to complete the deck under several assumptions.
Before we move on to the simulation, note that completing the deck can be modeled as the coupon collector's problem, which asks the following question:
Given n coupons, how many coupons do you expect you need to draw with replacement before having drawn each coupon at least once?
If we put the statement into our context, the question would be:
Given 52 unique playing cards, how many cards do you expect you need to pick before having picked each card at least once?
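The wait for each new card can be modeled by a geometric distribution, which gives the probability of k failures before the first success. In our case, a success is picking a card that we have not collected yet. Since we hold no cards on our first try, that probability equals $p_1 = \frac{52}{52} = 1$.
Generally, the probability that the next pick is a new card, once we already hold $i - 1$ distinct cards out of $n$, is $p_i = \frac{n - i + 1}{n}$, so the expected number of picks needed to get the $i$-th new card is $\frac{1}{p_i} = \frac{n}{n - i + 1}$. Summing over all $n$ cards gives the expected total number of picks: $E[T] = \sum_{i=1}^{n} \frac{n}{n - i + 1} = n \sum_{k=1}^{n} \frac{1}{k}$.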
That is a lot of equations. Let’s calculate the expected days using Python.
Experiment
def analytic(n):
    return sum(n/i for i in range(n, 0, -1))
analytic(52)
235.97828543626736
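To see where this number comes from, here is a small sketch that splits the sum into one term per new card, using exact fractions. The helper name `stage_waits` is my own and only for illustration.

```python
from fractions import Fraction

def stage_waits(n):
    """Expected number of picks needed for each new card, as exact fractions."""
    waits = []
    for held in range(n):              # held = distinct cards already collected
        p_new = Fraction(n - held, n)  # chance that the next pick is a new card
        waits.append(1 / p_new)        # mean number of picks, counting the successful one
    return waits

waits = stage_waits(52)
print(float(waits[0]))    # first card: every pick is new
print(float(waits[-1]))   # last card: the slowest stage
print(float(sum(waits)))  # total expected picks
```

The last missing card alone takes 52 picks on average, which is why the tail of the collection dominates the total.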
Or, we could actually calculate the expected days by simulation with the following code:
from random import random, randint
from statistics import mean
def expected_days(n_cards):
    cards = set()
    days = 0
    while len(cards) < n_cards:
        cards.add(randint(1, n_cards))
        days += 1
    return days
days = [expected_days(52) for _ in range(10000)]
mean(days)
236.0855
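The mean hides how spread out the individual runs are. As a rough sketch, the same one-card-a-day simulation can also report the median and the share of runs that finish within 6 months. The 180-day cutoff, the run count, and the names `days_to_complete` and `within_six_months` are my own assumptions, not part of the original experiment.

```python
from random import Random
from statistics import mean, median

def days_to_complete(n_cards, rng):
    """Simulate one run, picking 1 card per day with replacement."""
    cards = set()
    days = 0
    while len(cards) < n_cards:
        cards.add(rng.randint(1, n_cards))
        days += 1
    return days

rng = Random(12)  # separate generator so the result is reproducible on its own
runs = [days_to_complete(52, rng) for _ in range(5000)]

# Share of simulated runs that complete the deck in at most 180 days (assumed "6 months")
within_six_months = sum(d <= 180 for d in runs) / len(runs)
print(mean(runs), median(runs), within_six_months)
```

If a noticeable share of runs finishes within the cutoff, then 6 months at one card a day is faster than average but not impossible.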
As expected, the two numbers are close. So, given that he picks 1 card every day, the expected number of days until the deck is complete is about 236, or roughly 8 months. If he instead picks 2 cards every day:
def expected_days(n_cards, n_pick=2):
    cards = set()
    days = 0
    while len(cards) < n_cards:
        cards.update(randint(1, n_cards) for _ in range(n_pick))
        days += 1
    return days
days = [expected_days(52, n_pick=2) for _ in range(10000)]
mean(days)
118.1981
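As a rough sanity check on this result, the expected number of picks can simply be divided by the number of picks per day. The helper `approx_days` is hypothetical. It ignores that the final day counts as a whole day even when the deck is completed on its first pick, so it should come out slightly low.

```python
def analytic(n):
    # Expected number of picks to collect all n cards
    return sum(n / i for i in range(n, 0, -1))

def approx_days(n_cards, n_pick):
    # Rough estimate: total expected picks divided by picks per day
    return analytic(n_cards) / n_pick

print(approx_days(52, 2))  # roughly 118
```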
About 4 months. Or, to be more realistic, suppose he picks either 1 or 2 cards a day. Assuming both rates are equally likely, with the rate drawn once per simulated run, the expected number of days is:
def expected_days(n_cards):
    cards = set()
    days = 0
    # drawn once, so the same number of picks per day is used for the whole run
    n_pick = randint(1, 2)
    while len(cards) < n_cards:
        cards.update(randint(1, n_cards) for _ in range(n_pick))
        days += 1
    return days
days = [expected_days(52) for _ in range(10000)]
mean(days)
176.9557
Under these assumptions, it takes around 6 months.
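Note that the simulation above draws the number of picks once and keeps it for the whole run. A variant that redraws it every day is sketched below. This is a different assumption from the one used above, the name `expected_days_daily` is my own, and I would expect it to finish somewhat sooner, since every run then averages about 1.5 picks a day.

```python
from random import Random
from statistics import mean

def expected_days_daily(n_cards, rng):
    """Simulate one run where the number of picks is redrawn every day."""
    cards = set()
    days = 0
    while len(cards) < n_cards:
        n_pick = rng.randint(1, 2)  # assumption: 1 or 2 picks, equally likely, each day
        cards.update(rng.randint(1, n_cards) for _ in range(n_pick))
        days += 1
    return days

rng = Random(12)  # separate generator so the result is reproducible on its own
runs = [expected_days_daily(52, rng) for _ in range(2000)]
print(mean(runs))
```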
Conclusion
Well, if you ask me again whether the story is fabricated, my answer is that I do not know. I only ran the simulations to see how the result changes under different assumptions. Still, it has been fun to model this problem.