I recently experienced three consecutive days that are major anniversaries for me: June 12 is my baptismal anniversary, June 13 is my birthday, and June 14 is my wedding anniversary (my daughter also turned 15 months old that day). These events brought to the forefront of my mind a famous problem in probability: the birthday problem.

— The Uniform Probability Model for a Finite Sample Space —

There are many different probability models. We’ll start with the simplest of these models, the uniform probability model for a finite sample space.

Let ${S}$ be a finite sample space. The uniform probability model ${P}$ for ${S}$ is defined by

$\displaystyle P(E)=\frac{|E|}{|S|}.$

${P}$ as defined satisfies the axioms for a probability measure. By adopting this model, we are predicting that in a sequence of experiments with sample space ${S}$, the observed relative frequency of an event ${E}$ will be approximately equal to the relative frequency, ${|E|/|S|}$, of ${E}$ in the set ${S}$. In particular, for each outcome ${s\in S}$, we have ${P(\{s\})=1/|S|}$, or as one says, “All outcomes in ${S}$ are equally likely (i.e., equally probable).” Incidentally, events ${E=\{s\}}$ consisting of a single outcome are sometimes called elementary events.
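As a small illustration of the definition, here is a minimal Python sketch of the uniform model. The function name `uniform_probability` and the fair-die sample space are hypothetical choices made for this example only; exact fractions are used so that ${|E|/|S|}$ is not rounded.

```python
from fractions import Fraction

def uniform_probability(event, sample_space):
    """Return P(E) = |E| / |S| under the uniform model on a finite S."""
    sample_space = set(sample_space)
    # Only outcomes that actually belong to S can contribute to the event.
    event = set(event) & sample_space
    return Fraction(len(event), len(sample_space))

# Hypothetical example: one roll of a fair six-sided die.
die = range(1, 7)
print(uniform_probability({2, 4, 6}, die))  # event "the roll is even"
print(uniform_probability({3}, die))        # an elementary event
```

The second call shows that every elementary event receives probability ${1/|S|}$, as noted above.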

Problem 1 (The birthday problem) In a room of ${n}$ people, what is the probability ${p_{n}}$ that at least two have the same birthday (i.e., same month and day of the year)? Find the smallest ${n}$ such that ${p_{n}\geq\frac{1}{2}}$. Neglect February 29 and assume a year of ${365}$ days.

Solution: Let ${S}$ be the sample space that consists of the ${365^{n}}$ ${n-}$tuples ${(b_{1},b_{2},\ldots,b_{n})}$, where ${b_{i}}$ is the birthday of the ${i^{th}}$ person in the room. The desired probability is

$\displaystyle \begin{array}{rcl} p_{n} & = & 1-P(\mbox{all }n\mbox{ individuals have different birthdays})\\ & = & 1-\frac{365^{\underline{n}}}{365^{n}}, \end{array}$

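assuming all ${n-}$tuples ${(b_{1},b_{2},\ldots,b_{n})}$ are equally likely. Here ${365^{\underline{n}}=365\cdot364\cdots(365-n+1)}$ is the falling factorial, which counts the ${n-}$tuples with no repeated birthday. Below are some values of ${p_{n}}$: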

$\begin{array}{|c|c|} \hline n & p_{n} \\ \hline 5 & 0.027 \\ 10 & 0.117 \\ 20 & 0.411 \\ 23 & 0.507 \\ 30 & 0.706 \\ 40 & 0.891 \\ 60 & 0.994 \\ \hline \end{array}$

Remarkably, in a room with just ${23}$ people, the probability is greater than ${1/2}$ that two or more people have the same birthday! $\Box$
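The values of ${p_{n}}$ can be reproduced with a few lines of Python. This is only a sketch under the same assumptions as the solution (${365}$ equally likely days, no February 29); the function names are made up for this example.

```python
from math import prod

def birthday_collision_probability(n, days=365):
    """Probability that at least two of n people share a birthday,
    assuming all days**n birthday tuples are equally likely."""
    if n > days:
        return 1.0  # more people than days forces a repeat
    # P(all different) = (days/days) * ((days-1)/days) * ... * ((days-n+1)/days)
    p_all_different = prod((days - i) / days for i in range(n))
    return 1.0 - p_all_different

def smallest_n_reaching(threshold=0.5, days=365):
    """Smallest n with collision probability >= threshold."""
    n = 1
    while birthday_collision_probability(n, days) < threshold:
        n += 1
    return n

for n in (5, 10, 20, 23, 30, 40, 60):
    print(n, round(birthday_collision_probability(n), 3))

print("smallest n with p_n >= 1/2:", smallest_n_reaching())
```

Multiplying the ratios ${(365-i)/365}$ one at a time avoids forming the very large integers ${365^{\underline{n}}}$ and ${365^{n}}$ explicitly.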

— Binomial Random Variables and Poisson Approximations —

In a binomial experiment there are ${n}$ independent trials, with the result of each trial labeled “success” or “failure.” On each trial the probability of success is ${p}$ and (hence) the probability of failure is ${1-p}$. The sample space ${S}$ consists of the ${2^{n}}$ words of length ${n}$ in the alphabet ${\{s,f\}}$, and the random variable ${X}$ on ${S}$ is defined by ${X(w)=}$ the number of ${s}$'s in the word ${w}$. The probability density function (pdf) ${f_{X}}$ for ${k=0,1,\ldots,n}$ is given by

$\displaystyle f_{X}(k)=P(X=k)=\binom{n}{k}p^{k}(1-p)^{n-k}.$

We say that such a random variable ${X}$ is a binomial random variable with parameters ${n}$ and ${p}$, abbreviating this with the notation ${X\sim binomial(n,p)}$. We also say that ${X}$ has a binomial distribution with parameters ${n}$ and ${p}$, and that ${f_{X}}$ is a binomial density function with parameters ${n}$ and ${p}$. Note that, in general, if ${X\sim binomial(n,p)}$, then ${P(X\geq1)=\sum_{k=1}^{n}P(X=k)=1-P(X=0)=1-(1-p)^{n}}$.
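The binomial density is easy to check numerically. The sketch below uses hypothetical parameters ${n=10}$ and ${p=0.3}$, chosen only for illustration, and confirms that the probabilities sum to ${1}$ and that ${P(X\geq1)}$ agrees with ${1-P(X=0)=1-(1-p)^{n}}$.

```python
from math import comb

def binomial_pmf(k, n, p):
    """f_X(k) = C(n, k) * p**k * (1 - p)**(n - k) for X ~ binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical parameters, chosen only for illustration.
n, p = 10, 0.3

total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
at_least_one = sum(binomial_pmf(k, n, p) for k in range(1, n + 1))
complement = 1 - (1 - p)**n  # 1 - P(X = 0)

print("sum of f_X(k) over k = 0..n:", total)
print("P(X >= 1) by summation:     ", at_least_one)
print("1 - (1 - p)**n:             ", complement)
```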

If ${X\sim binomial(n,p)}$, where ${n}$ is “large” and ${p}$ is “small,” then for ${k=0,\ldots,n}$

$\displaystyle P(X=k)\approx\frac{\lambda^{k}}{k!}e^{-\lambda},$

where ${\lambda=np}$. This is called the Poisson approximation to the binomial distribution, after Siméon Denis Poisson (1781-1840).

Remark 1 It can in fact be shown that the accuracy of this approximation depends largely on the value of ${p}$, and hardly at all on the value of ${n}$. The errors in using this approximation are of the same order of magnitude as ${p}$, roughly speaking.
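One way to get a feel for Remark 1 is to tabulate the largest pointwise gap between the binomial and Poisson probabilities for a few parameter choices. The sketch below is an informal numerical check, not a proof; the pairs ${(n,p)}$ are hypothetical values picked for illustration, and both formulas are evaluated in log space (via `math.lgamma`) so that large ${n}$ does not cause overflow.

```python
from math import exp, lgamma, log

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ binomial(n, p), in log space (requires 0 < p < 1)."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))

def poisson_pmf(k, lam):
    """Poisson probability lam**k / k! * exp(-lam), in log space."""
    return exp(k * log(lam) - lgamma(k + 1) - lam)

def max_pointwise_error(n, p):
    """Largest |binomial - Poisson| over k = 0..n, with lam = n * p."""
    lam = n * p
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
               for k in range(n + 1))

# Hypothetical (n, p) pairs, chosen only for illustration.
for p in (0.05, 0.005):
    for n in (20, 200, 1000):
        print(f"n={n:5d}  p={p:<6}  max error={max_pointwise_error(n, p):.6f}")
```

Comparing each printed error with the corresponding value of ${p}$ shows how the gap behaves as ${n}$ and ${p}$ vary.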

Problem 2 Given ${400}$ people, estimate the probability that 3 or more will have a birthday on June 13.

Solution: Assuming a year of ${365}$ days, each equally likely to be the birthday of a randomly chosen individual, if ${X}$ denotes the number of people with a birthday on June 13 among ${400}$ randomly chosen individuals, then ${X\sim binomial(400,\frac{1}{365})}$. The exact answer to this question is

$\displaystyle \begin{array}{rcl} P(X\geq3)& = & 1-P(X\leq2)\\ & = & 1-\sum_{k=0}^{2}\binom{400}{k}\left(\frac{1}{365}\right)^{k}\left(\frac{364}{365}\right){}^{400-k}\\ & = & 0.09850825486213655 \end{array}$

The Poisson approximation of this quantity is

$\displaystyle \begin{array}{rcl} 1-\sum_{k=0}^{2}\frac{(400/365)^{k}}{k!}e^{-400/365} & \approx & 1-e^{-1.096}\left(1+1.096+\frac{(1.096)^{2}}{2}\right)\\ & \approx & 0.099. \end{array}$ $\Box$
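Both numbers in Problem 2 can be recomputed directly. The following is a minimal sketch using only the Python standard library, under the same assumptions as the solution (${n=400}$, ${p=1/365}$, ${\lambda=np}$); the variable names are chosen for this example.

```python
from math import comb, exp, factorial

n, p = 400, 1 / 365
lam = n * p  # Poisson parameter lambda = np

# Exact binomial value: 1 - P(X <= 2).
exact = 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3))

# Poisson approximation: 1 - e^{-lambda} * (1 + lambda + lambda^2 / 2).
approx = 1 - exp(-lam) * sum(lam**k / factorial(k) for k in range(3))

print("exact binomial:       ", exact)
print("Poisson approximation:", approx)
print("absolute difference:  ", abs(exact - approx))
```

The absolute difference can be compared with ${p=1/365\approx0.0027}$, in the spirit of Remark 1.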