I ran across an interesting anecdote this morning while reading *Think Stats* by Allen Downey:

Henri Poincaré was a French mathematician who taught at the Sorbonne around 1900.The following anecdote about him is probably fabricated, but it makes an interesting probability problem.

Supposedly, Poincaré suspected that his local bakery was selling loaves of bread that were lighter than the advertised weight of 1 kg, so every day for a year he bought a loaf of bread, brought it home and weighed it. At the end of the year, he plotted the distribution of his measurements and showed that it fit a normal distribution with mean 950 g and standard deviation 50 g. He brought this evidence to the bread police, who gave the baker a warning.

For the next year, Poincaré continued the practice of weighing his bread every day. At the end of the year, he found that the average weight was 1,000 g, just as it should be, but again he complained to the bread police, and this time they fined the baker.

Why? Because the shape of the distribution was asymmetric. Unlike the normal distribution, it was skewed to the right, which is consistent with the hypothesis that the baker was still making 950 g loaves, but deliberately giving Poincaré the heavier ones.

Along with the following problem:

Write a program that simulates a baker who chooses n loaves from a distribution with mean 950 g and standard deviation 50 g, and gives the heaviest one to Poincaré. What value of n yields a distribution with mean 1000 g? What is the standard deviation?

Compare this distribution to a normal distribution with the same mean and the same standard deviation. Is the difference in the shape of the distribution big enough to convince the bread police?

For those of you who are curious, Poincaré was a 19th Century mathematician, regarded as one of the greatest of all time (you might recognize his name from the Poincaré conjecture, which has made headlines in recent years after it was finally proved by Grigori Perelman). Anyway, I approached this problem using VBA and it wasn’t too hard, taking me about an hour to solve. The first part of the problem involves estimating *n* (the number of loaves baked at the bakery in a day), while the second part of the problem involves a visual comparison between the distribution of Poincaré’s bread loaves and that of a normal distribution.

**Estimating n**

To estimate *n*, I assumed that the weights of the loaves of bread followed a normal distribution with mean 950g and standard deviation 50g. I then simulated the weights using the inversion method, which allows one to simulate normal values by using random numbers drawn from the uniform distribution from 0 to 1. I wrote the following script to simulate 365 trips to the bakery in which the baker gave Poincaré the largest loaf baked that day, and ran 500 trials:

Sub estimate_n() 'script uses the inversion method to generate random weights for loaves of bread 'from a normal distribution with mean 950 and sd 50 Dim unif As Double 'random uniform between 0 and 1 Dim ninverse As Double 'inversion method to get the weight Dim currentsum As Double 'current sum of all the loaves given to poincare Dim ploaf As Double 'weight of loaf given to poincare 'looping variables Dim x As Long 'runs the simulation for the current value of n Dim n As Long 'number of loaves baked at the bakery in a day Dim z As Long 'number of days to run the simulation Dim currentloaf As Long Dim finalaverage As Double For t = 1 To 500 For n = 1 To 100 currentsum = 0 currentloaf = 1 For z = 1 To 365 ploaf = 0 For x = 1 To n unif = Rnd() ninverse = WorksheetFunction.NormInv(unif, 950, 50) If ninverse > ploaf Then ploaf = ninverse Next x currentsum = currentsum + ploaf Next z finalaverage = currentsum / 365 If finalaverage > 1000 Then Range("A" & t + 1).Value = n Exit For End If Next n Next t End Sub

Out of the 500 trials, 388 resulted in 4 loaves of bread yielding an average of about 1000g, and 112 trials resulted in 5 loaves of bread yielding an average of about 1000g. Thus, from these trials I estimated that *n*=4. For the next step, I adjusted the code and set n = 4, and ran 100 trials to simulate yearly averages of the loaf weights.

Sub breadloaf() 'script uses the inversion method to generate random weights for loaves of bread 'from a normal distribution with mean 950 and sd 50 Dim unif As Double 'random uniform between 0 and 1 Dim ninverse As Double 'inversion method to get the weight Dim currentsum As Double 'current sum of all the loaves given to poincare Dim ploaf As Double 'weight of loaf given to poincare 'looping variables Dim x As Long 'runs the simulation for the current value of n Dim n As Long 'number of loaves baked at the bakery in a day Dim z As Long 'number of days to run the simulation Dim t As Long 'run several trials to estimate skew Application.Calculation = xlCalculationManual Dim currentloaf As Long Dim finalaverage As Double For t = 1 To 100 For n = 1 To 4 currentsum = 0 currentloaf = 1 For z = 1 To 365 ploaf = 0 For x = 1 To n unif = Rnd() ninverse = WorksheetFunction.NormInv(unif, 950, 50) If ninverse > ploaf Then ploaf = ninverse Next x Range("A" & z + 1).Value = ploaf currentsum = currentsum + ploaf Next z finalaverage = currentsum / 365 Next n Range("E" & t + 1).Value = WorksheetFunction.Skew(Range("A2:A366")) Next t Application.Calculation = xlCalculationAutomatic End Sub

Now, the tricky part is to compare the resulting distribution, here’s a scatter plot of one of the trials:

Now, looking at the graph, it *sort of* looks skewed to the right but you can’t quite really tell. It might be the case that another trial could be skewed to the left. After running 100 trials, I calculated the skewness for each one and the average skewness was about .25, indicating that the distribution was skewed to the right. Had the baker been honest, we would have expected a skewness of 0 (since the normal distribution is symmetric) instead of .25. Therefore, after this exercise we can conclude that the bakery was indeed cheating its customers!