The Student Puzzle Corner contains one or two problems in statistics or probability. Sometimes, solving the problems may require a literature search.
Current student members of the IMS are invited to submit solutions electronically (to bulletin@imstat.org with subject “Student Puzzle Corner”). Deadline March 1, 2014.
The names and affiliations of the first 10 student members to submit correct solutions, and the answer(s) to the problem(s), will be published in the next issue of the Bulletin. The Editor’s decision is final.

Student Puzzle Corner 2

In a house, there are six cuckoo clocks. They show the following times: 9:44, 9:46, 9:34, 9:45, 8:57, and 9:44. We want to guess the correct time $μ$. We have to model the problem. Here is how the six data values were generated. A subset of the six observations was generated from a normal distribution with mean $μ$ and standard deviation $\frac{1}{30}$ hour (that is, 2 minutes); the remaining observations were generated from a Cauchy distribution with parameters $μ$ and 1. Thus, we think that some of the cuckoo clocks have become a bit inaccurate, and the others have gone completely erratic. You are not told how many of the data values, or which ones, came from the normal distribution. Can you guess $μ$? Give your answer in hours and minutes; e.g., 10.5 will mean 10:30.

We will publish the names and affiliations of the first 10 respondents who match the true $μ$ exactly.
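
To make the data-generating model in the puzzle concrete, here is a minimal simulation sketch in Python. It is only an illustration of the model, not a solution method. It assumes that the Cauchy "parameters $μ$ and 1" are the location and scale, and that all readings are measured in hours. The function names `simulate_clocks` and `to_clock`, the value 9.75 for $μ$, and the choice of four normal clocks are hypothetical.

```python
import math
import random


def simulate_clocks(mu, n_normal, n_total=6, sigma=1 / 30, seed=None):
    """Draw n_total clock readings (in hours) under the puzzle's model.

    n_normal readings come from Normal(mu, sigma); the rest come from a
    Cauchy distribution, assumed here to have location mu and scale 1.
    """
    rng = random.Random(seed)
    readings = [rng.gauss(mu, sigma) for _ in range(n_normal)]
    # Inverse-CDF sampling for the Cauchy: mu + tan(pi * (U - 1/2)).
    readings += [
        mu + math.tan(math.pi * (rng.random() - 0.5))
        for _ in range(n_total - n_normal)
    ]
    rng.shuffle(readings)  # hide which readings are the "normal" ones
    return readings


def to_clock(hours):
    """Convert decimal hours to an h:mm string, e.g. 10.5 -> '10:30'."""
    total_minutes = round(hours * 60)
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


if __name__ == "__main__":
    # Hypothetical true time 9:45 with four slightly inaccurate clocks.
    for reading in simulate_clocks(mu=9.75, n_normal=4, seed=2014):
        print(to_clock(reading))
```

Repeated runs with different seeds show the pattern the puzzle relies on: the normal readings cluster within a few minutes of $μ$, while the Cauchy readings can land hours away.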

Solution to previous problem

Anirban DasGupta, IMS Bulletin Editor, writes:

A correct numerical answer to the Puzzle Corner problem in the January/February issue was sent by Rico Blaser, London School of Economics. The solution uses the elegant technique of Poissonization. Suppose $p_n$ is the probability that each child gets 2 or more cookies when the number of cookies to be distributed is a fixed number $n$. If the number of cookies to be distributed is random, having a Poisson distribution with mean $λ$, then $X_1, X_2, \ldots, X_{10}$, the numbers of cookies received by the ten children, become iid Poisson with mean $\frac{λ}{10}$. This technique of Poissonization turns the dependent multinomial cell frequencies into independent Poissons. So the probability that each child receives 2 cookies or more after Poissonization is $(1 - e^{-λ/10} - \frac{λ}{10} e^{-λ/10})^{10}$. On the other hand, conditioning on the number of cookies shows that this probability also equals
$\sum\limits_{n=0}^\infty \frac{e^{-λ}λ^n}{n!}p_n.$

Multiplying both expressions by $e^{λ}$ and using the fact that two convergent power series must have identical coefficients in order to coincide on a non-empty interval, we recover $p_{43}$ as 43! times the coefficient of $λ^{43}$ in
$e^{λ} [1 - e^{-λ/10} - \frac{λ}{10} e^{-λ/10}]^{10}$; this gives the needed probability as
$\frac{38360235213946776318553037176114920309}{78125 \times 10^{33}} \approx 0.4910110107385$.
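
The coefficient extraction can be checked with exact rational arithmetic. The following Python sketch is one possible way to do it, not necessarily how the submitted solution was computed. It relies on the identity $e^{λ}[1 - e^{-λ/k} - \frac{λ}{k} e^{-λ/k}]^{k} = (e^{λ/k} - 1 - \frac{λ}{k})^{k}$, so each child contributes the series $\sum_{m \ge 2} (λ/k)^m / m!$, truncated at degree $n$. The function name `prob_all_at_least_two` is hypothetical.

```python
from fractions import Fraction
from math import factorial


def prob_all_at_least_two(n_cookies, n_children):
    """Exact probability that every child gets at least 2 cookies.

    Computes n! times the coefficient of x^n in
    (sum_{m>=2} (x/k)^m / m!)^k, with k = n_children, using
    truncated polynomial multiplication over the rationals.
    """
    n, k = n_cookies, n_children
    # Per-child series: coefficient of x^m is 1 / (m! * k^m) for m >= 2.
    base = [Fraction(0)] * (n + 1)
    for m in range(2, n + 1):
        base[m] = Fraction(1, factorial(m) * k**m)

    # Raise the series to the k-th power, discarding degrees above n.
    poly = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(k):
        product = [Fraction(0)] * (n + 1)
        for i, coeff in enumerate(poly):
            if coeff == 0:
                continue
            for j in range(2, n + 1 - i):
                product[i + j] += coeff * base[j]
        poly = product

    return factorial(n) * poly[n]


if __name__ == "__main__":
    p43 = prob_all_at_least_two(43, 10)
    print(p43)         # exact fraction in lowest terms
    print(float(p43))  # decimal value, to compare with the one quoted above
```

Two quick sanity checks: with fewer than 20 cookies the function returns 0, and with exactly 20 cookies it returns $20!/(2^{10} \cdot 10^{20})$, the probability that every child gets exactly two.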

Poissonization is a blessing for the applied statistician or the data miner trying to cope with seemingly impossible multinomial problems. It is clever, yet graceful. I would recommend Feller and also Sidney Port’s lovely book Theoretical Probability for Applications. You can see a few relatively simple applications of Poissonization in Springer’s Probability for Statistics and Machine Learning. Poissonization has been used in developing Stein approximations in complex multinomial problems; the by now classic book by Barbour, Holst, and Janson (Oxford, 1992) will give you a first glimpse. You can see later advances in the works of many researchers, too many to list here.