Recently, I was reading a book called The Cartoon Guide to Statistics, written by Larry Gonick and Woollcott Smith. This book explains the concepts of Statistics and Probability in a delightful manner using only cartoons. Of the twelve chapters in this book, one is dedicated to Probability, and it is from this chapter that I learnt, or rather understood, the basic concepts of probability.
While there are several illustrative problems in this book, I would like to highlight one particularly interesting problem, which gave me an introduction to some of the basic concepts of probability. In this article, I'd like to give that problem statement, give an apparent solution (which is a wrong one), and then show the correct solution. Alongside, I will also provide a Python program which simulates this problem. It is hoped that this problem and solution will motivate you to go through this book, and encounter a different way of learning and understanding statistical concepts.
2. Problem Statement
In seventeenth-century France there lived a man known as the Chevalier de Mere, who is said to have been a gambler and to have played a great deal with dice. He had an interesting problem, which he posed to his friend, the mathematician and physicist Blaise Pascal. It is said that the Theory of Probability evolved from the exchange of a few letters between Pascal and his fellow mathematician Pierre de Fermat.
The problem statement posed by Chevalier de Mere is as follows. Given these two random experiments, which one has a higher probability of occurrence?
- In four rolls of a single dice (a usual six-sided dice), the probability of getting at least one six.
- In twenty four rolls of a pair of dice, the probability of getting at least one double-six.
We next examine the apparent solution, and then the correct solution.
3. The Prima Facie Solution, or Apparent Solution
The prima facie, or apparent solution is that both events above have the same probability. The "proof" of this is as follows:
In what follows, we denote the probability of occurrence of an event as P(event).
For the first problem posed above:
P(one six in one roll of a dice) = 1/6
Therefore, P(one six in four rolls of a dice) = 4 * 1/6 = 2/3
For the second problem posed above:
P(one double-six in one roll of a pair of dice) = 1/36
Therefore, P(one double-six in twenty four rolls of a pair of dice) = 24 * 1/36 = 2/3
This "proves" that both events have the same probability. This was the conclusion drawn by Chevalier de Mere, at that time. It is to be noted that the theory of probability was not yet developed, and he felt that his conclusion was correct.
However, de Mere not only gambled a lot but also kept careful records of his wins and losses. His observation was that he was winning more often in the first gamble. In the next section, we run these experiments using a Python program.
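Even before running any experiment, one way to see that the additive reasoning cannot be right is to push it a little further: applied to seven rolls of a single dice, the same rule gives a value above 1. The following is a minimal sketch of that check. The helper name `naive_probability` is hypothetical and is not part of the original script.

```python
from fractions import Fraction


def naive_probability(p_single, rolls):
    """Apply the (incorrect) rule: single-roll probability times number of rolls."""
    return Fraction(p_single) * rolls


p_six = Fraction(1, 6)
for rolls in (4, 6, 7):
    # For 7 rolls the result is 7/6, which cannot be a probability.
    print(rolls, naive_probability(p_six, rolls))
```

What the product really measures is the expected number of sixes in the given number of rolls, which is why it can exceed 1.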
4. Python Simulation
A Python script is written to simulate both the above events mentioned in the Problem Statement. The main elements of this small program are:
4a. Function to Roll a Dice Once
This function is very simple, and is shown below:
import random

def roll_dice():
    lo = 1
    hi = 6
    return random.randint(lo, hi)  # uniform integer N with lo <= N <= hi
An important thing to note here is that the function random.randint() returns random integers from the uniform distribution for the limits specified, such that lo <= N <= hi. We should not use any weighted distribution here, for that would simulate a weighted dice instead.
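The uniformity claim can be checked empirically by rolling many times and looking at how often each face comes up. This is only a sketch: `face_frequencies` is a hypothetical helper, and the seed and the number of rolls are arbitrary choices made for illustration.

```python
import random
from collections import Counter


def roll_dice():
    """Return a uniformly distributed integer from 1 to 6, inclusive."""
    return random.randint(1, 6)


def face_frequencies(rolls):
    """Roll the dice `rolls` times and return the observed share of each face."""
    counts = Counter(roll_dice() for _ in range(rolls))
    return {face: counts[face] / rolls for face in range(1, 7)}


random.seed(2018)  # fixed seed only so that repeated runs print the same numbers
for face, share in face_frequencies(60_000).items():
    print(face, round(share, 4))
```

Each share should come out close to 1/6, or about 0.1667, with small deviations that shrink as the number of rolls grows.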
4b. Function to Roll a Single Dice 4 Times
This is nothing but the above in a for loop. One important thing to note here is that as soon as a dice roll returns 6, this function returns True without doing any further rolls.
for _ in range(4):
    if roll_dice() == 6:
        return True  # stop at the first six
return False  # no six in any of the four rolls
4c. Function to Roll a Pair of Dice 24 Times
This is similar to the above function, but here, we check whether the sum of two dice rolls is 12, before returning True.
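for _ in range(24):
    if roll_dice() + roll_dice() == 12:
        return True  # a sum of 12 is only possible with a double-six
return False  # no double-six in any of the 24 rolls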
4d. Calculating the Probabilities
There are two other functions named problem1() and problem2() where the calculation of probability is done by performing a large number (ten million) of experiments and recording the results.
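The pieces described in this section can be put together roughly as follows. This is a sketch under assumptions, not the author's original script: the helper names `six_in_four_rolls`, `double_six_in_24_rolls` and `estimate`, as well as the `trials` parameter of `problem1()` and `problem2()`, are hypothetical, and the default number of trials is far below ten million so that the sketch runs quickly.

```python
import random


def roll_dice():
    """Return a uniformly distributed integer from 1 to 6, inclusive."""
    return random.randint(1, 6)


def six_in_four_rolls():
    """True if at least one six appears in four rolls of a single dice."""
    for _ in range(4):
        if roll_dice() == 6:
            return True
    return False


def double_six_in_24_rolls():
    """True if at least one double-six appears in 24 rolls of a pair of dice."""
    for _ in range(24):
        if roll_dice() + roll_dice() == 12:
            return True
    return False


def estimate(experiment, trials):
    """Fraction of trials in which experiment() returned True."""
    wins = sum(1 for _ in range(trials) if experiment())
    return wins / trials


def problem1(trials=50_000):
    return estimate(six_in_four_rolls, trials)


def problem2(trials=50_000):
    return estimate(double_six_in_24_rolls, trials)


if __name__ == "__main__":
    print("Problem 1 estimate:", problem1())
    print("Problem 2 estimate:", problem2())
```

With so few trials the estimates are noisier than those of a ten-million-experiment run. The listing that follows in the article comes from the author's original script, not from this sketch.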
The output of the Python script is as follows:
Problem 1 - At least one six in four rolls of a single dice
Computed Probability = 0.5178528
Actual Probability from formula = 0.5177469135802468
Problem 2 - At least one double-six in twenty four rolls of a pair of dice
Computed Probability = 0.4914268
Actual Probability from formula = 0.4914038761309034
It is seen from the output that the actual probability is also calculated in the Python script. It is also seen that the probabilities computed from the experiments and those computed from formula match closely, thus validating the program. It is to be noted that the results of each run will be slightly different as these are simulating random events. The formulas for the actual probability calculation are given in the next section.
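How close is "closely"? One rough yardstick is the standard error of a proportion estimated from n independent experiments, sqrt(p * (1 - p) / n). The sketch below applies it to the figures printed above. The helper name `standard_error` is hypothetical, and treating each experiment as an independent yes/no trial is an assumption of this sketch.

```python
import math


def standard_error(p, trials):
    """Standard error of a proportion estimated from independent trials."""
    return math.sqrt(p * (1 - p) / trials)


TRIALS = 10_000_000  # number of experiments stated in the article

# (simulated value, formula value) pairs copied from the output above
results = {
    "Problem 1": (0.5178528, 0.5177469135802468),
    "Problem 2": (0.4914268, 0.4914038761309034),
}

for name, (simulated, exact) in results.items():
    gap = abs(simulated - exact)
    se = standard_error(exact, TRIALS)
    print(f"{name}: gap = {gap:.7f}, standard error = {se:.7f}")
```

In both cases the gap between simulation and formula is smaller than one standard error (roughly 0.00016), which is what one would expect from a correct simulation.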
5. Correct Calculation of the Probabilities
In Section 3 above, we saw the prima facie view which gave erroneous results. The key phrase in the problem statement is the phrase "at least". This phrase makes a key distinction in the probability computation. In this section, we compute the correct probabilities.
5a. Probability of Getting at least One Six in Four Rolls of a Single Dice
We start with the probability of getting no six in a single roll of a dice. This probability is 5/6. In other words,
P(no six in one roll of a dice) = 5/6
Therefore, P(no six in four rolls of a dice) = (5/6)^4 = 0.4823
The above uses the Multiplication Formula, P(E and F) = P(E) * P(F), which holds when the events E and F are independent. In our case, one roll of a dice is independent of any other roll of the same dice: the outcome of one roll has no effect on the next.
Therefore, P(at least one six in four rolls of a dice) = 1 - P(no six in four rolls of a dice) = 1 - 0.4823 = 0.5177
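Because four rolls give only 6^4 = 1296 equally likely sequences, this result can also be confirmed by listing every sequence and counting. The following is a brute-force sketch with variable names chosen for illustration only.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
outcomes = list(product(faces, repeat=4))  # all 6**4 = 1296 equally likely sequences

no_six = sum(1 for rolls in outcomes if 6 not in rolls)
at_least_one_six = len(outcomes) - no_six

p_no_six = Fraction(no_six, len(outcomes))
p_at_least_one_six = Fraction(at_least_one_six, len(outcomes))

print(p_no_six, float(p_no_six))
print(p_at_least_one_six, float(p_at_least_one_six))
```

The count gives 625 sequences without a six and 671 with at least one six, that is, 671/1296, in agreement with the value 0.5177 obtained above.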
5b. Probability of Getting at least One Double-six in Twenty Four Rolls of a Pair of Dice
As above, we examine the probability of getting no double-six in a roll of a pair of dice. This probability is 35/36. In other words:
P(no double-six in one roll of a pair of dice) = 35/36
Therefore, P(no double-six in twenty four rolls of a pair of dice) = (35/36)^24 = 0.5086
Therefore, P(at least one double-six in twenty four rolls of a pair of dice) = 1 - P(no double-six in twenty four rolls of a pair of dice) = 1 - 0.5086 = 0.4914
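Both calculations follow the same pattern, 1 - (1 - p)^n, so they can be written as one small function and compared directly. This is a sketch, and the function name `p_at_least_once` is hypothetical. Enumerating all sequences is not practical here, since 24 rolls of a pair of dice give 36^24 of them.

```python
from fractions import Fraction


def p_at_least_once(p_single, trials):
    """P(event occurs at least once in `trials` independent trials)."""
    return 1 - (1 - p_single) ** trials


p_first = p_at_least_once(Fraction(1, 6), 4)     # at least one six in 4 rolls
p_second = p_at_least_once(Fraction(1, 36), 24)  # at least one double-six in 24 rolls

print(float(p_first))
print(float(p_second))
print(p_first > p_second)
```

Using exact fractions avoids rounding until the final conversion to a decimal value.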
From the above, it is seen that these probabilities are indeed different, and the second probability is smaller. This is the reason why Chevalier de Mere was losing more on the second gamble than on the first one. As noted above, this shows that he not only gambled a lot, but also kept meticulous records of his wins and losses.
As briefly mentioned above, the key phrase is at least. For the first event, this means that the following probabilities have to be added to arrive at the correct probability: that of getting exactly one six in four rolls, that of getting exactly two sixes, that of getting exactly three sixes, and that of getting four sixes. This sum equals 1 minus the probability of getting no six at all. For the second event, the equivalent probabilities have to be added. Thanks to CodeProject member Marius Bancila for suggesting that this be incorporated in the article.
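The addition described above can be checked term by term using the binomial formula for exactly k successes in n independent trials, C(n, k) * p^k * (1 - p)^(n - k). The following is a sketch in which the helper name `p_exactly` is hypothetical. It needs Python 3.8 or later for `math.comb`.

```python
from fractions import Fraction
from math import comb


def p_exactly(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_six = Fraction(1, 6)
terms = [p_exactly(k, 4, p_six) for k in range(1, 5)]  # k = 1, 2, 3, 4 sixes
total = sum(terms)

print(total)
print(total == 1 - (1 - p_six) ** 4)  # same value as the complement shortcut
```

The four terms are 500/1296, 150/1296, 20/1296 and 1/1296, which add up to 671/1296. The same function with n = 24 and p = 1/36 handles the second event, where the sum has 24 terms and the complement shortcut saves considerably more work.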
In this article, an interesting problem in probability was considered - a problem involving the rolling of dice. Two events were considered and their probabilities were compared. In the first event, a single dice was rolled four times, and probability of getting at least one six was examined. In the second event, a pair of dice were rolled twenty four times, and the probability of getting at least one double-six was examined. We first saw a prima facie solution which suggested that both these probabilities were the same. We then went through a Python program which simulates both these events, and found that the first event had a higher probability. We then showed how these two probabilities are different, and justified the conclusion drawn by the Python program.
It is sincerely hoped that you found this problem interesting to analyze. It is also hoped that this article encourages you to read The Cartoon Guide to Statistics mentioned at the top of this article, and to discover statistics in an entirely new manner.
- 7th June, 2018: Version 1.0