Deadline: September 15, 2020.

Student members of IMS are invited to submit solutions (with subject line “Student Puzzle Corner”). The names of student members who submit correct solutions, and the answer itself, will be published in the issue following the deadline. The Puzzle Editor is Anirban DasGupta. His decision is final.


We pose a simple but pretty problem this time; anyone can understand it. Suppose we simulate the values of an integer-valued random variable on a computer. Empirical experience tells us that after some time it becomes really hard to see a new value, that is, a number we have not already seen in our simulation. Similarly, if we simulate an integer-valued variable that is almost a point mass, it takes a long time before we see two or more distinct values in our simulation. We want to understand these empirical experiences. So, here is our exact problem for this month:
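The phenomenon is easy to observe on a computer. Here is a minimal sketch (our own illustrative choices: a Geometric(1/2) distribution on $\{0,1,2,\cdots\}$, 10,000 draws, and a comparison of new-value rates in the first and last 500 draws):

```python
import random

def new_value_rates(n_draws, sampler, window):
    """Simulate n_draws values; return the fraction of draws among the
    first `window` and the last `window` that were new (not seen before)."""
    seen = set()
    new_flags = []
    for _ in range(n_draws):
        x = sampler()
        new_flags.append(x not in seen)
        seen.add(x)
    early = sum(new_flags[:window]) / window
    late = sum(new_flags[-window:]) / window
    return early, late

def geom_draw():
    # Geometric(1/2) on {0, 1, 2, ...}: count tails before the first head
    x = 0
    while random.random() < 0.5:
        x += 1
    return x

random.seed(0)
early, late = new_value_rates(10000, geom_draw, 500)
```

Late in the run, `late` is essentially zero: almost every draw repeats a value already seen, which is exactly the empirical experience described above.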

(a) Suppose $X_1, X_2, \cdots $ are iid random variables with the PMF $P(X_1 = j) = f(j),\ j = 0, 1, 2, \cdots $. Let $p_n = P(X_n \ \text{is a new value, distinct from each of}\ X_1, \cdots , X_{n-1})$. Prove or disprove: for any mass function $f(\cdot)$, $p_n \to 0$ as $n \to \infty $.

(b) Suppose again that $X_1, X_2, \cdots $ are iid random variables with the PMF $P(X_1 = j) = f(j),\ j = 0, 1, 2, \cdots $. Given $k \geq 1$, let $\tau_k = \inf\{n: |\{X_1, X_2, \cdots , X_n\}| = k\}$; that is, $\tau_k$ is the first time that the number of distinct values in our simulation reaches $k$. Derive formulas for $E(\tau_2)$ and $E(\tau_3)$.

(c) Now suppose the underlying sequence is an iid Poisson sequence; that is, for some $\lambda > 0$, $f(j) = \frac{e^{-\lambda }\,\lambda ^j}{j!},\ j = 0, 1, 2, \cdots $. Find $E(\tau_2)$.

(d) For part (c), find the limit of $P(\tau_2 = 2)$ (i) as $\lambda \to 0$, (ii) as $\lambda \to \infty $.

(e) Prove that $P(\tau_2 = 2)$ is strictly monotone in $\lambda $.
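A Monte Carlo run is a good way to sanity-check a derived formula before submitting. The sketch below estimates $E(\tau_2)$ for the Poisson case of part (c); the function names, $\lambda = 1$, and the sampling method are our own illustrative choices, and no closed form is assumed:

```python
import random

def poisson_draw(lam):
    # Poisson(lam) by counting Exp(lam) interarrival times before time 1
    n, t = 0, random.expovariate(lam)
    while t < 1.0:
        n += 1
        t += random.expovariate(lam)
    return n

def tau_k(sampler, k):
    """First n at which {X_1, ..., X_n} contains k distinct values."""
    seen = set()
    n = 0
    while len(seen) < k:
        n += 1
        seen.add(sampler())
    return n

random.seed(1)
lam = 1.0
reps = 20000
est = sum(tau_k(lambda: poisson_draw(lam), 2) for _ in range(reps)) / reps
```

For $\lambda = 1$ the estimate lands between 2 and 3; comparing it with your answer to part (c) catches most algebra slips.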


Bonus Puzzle: The Lonely Runner Conjecture

An entertaining problem with many applications has $n$ runners running indefinitely around a circular track of length 1. They have distinct, constant speeds $v_1, v_2, \cdots , v_n$. They start at time $t = 0$ from the same point. At a given time $t$, runner $i$ is called a lonely runner if the nearest neighbor of $i$ is at least $\frac{1}{n}$ away from $i$ along the track. The conjecture, still unproved in its full generality, is that every runner $i$ is a lonely runner at some time $t$, depending on $i$.

Of numerous things you can try, here is one. Take speeds $v_k = X_1 + X_2 \cdots +X_k +k$, where $X_i$ are iid Poisson with mean 1. What does simulation suggest about the first time that a lonely runner is spotted, and which runner is that first lonely runner?
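One possible simulation sketch follows; the time grid, the horizon, and all function names are our own choices, and the grid scan can only suggest (not prove) when a runner first becomes lonely:

```python
import random

def circ_dist(a, b):
    # Distance along a circular track of length 1
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)

def is_lonely(positions, i, n):
    # Runner i is lonely if every other runner is at least 1/n away
    return all(circ_dist(positions[i], positions[j]) >= 1.0 / n
               for j in range(n) if j != i)

def first_lonely(speeds, t_max=50.0, dt=0.001):
    """Scan a time grid; return (t, runner index) for the first runner
    spotted lonely, or None if none is seen on the grid."""
    n = len(speeds)
    t = dt
    while t <= t_max:
        pos = [(v * t) % 1.0 for v in speeds]
        for i in range(n):
            if is_lonely(pos, i, n):
                return t, i
        t += dt
    return None

def poisson1():
    # Poisson(1) by counting Exp(1) interarrival times before time 1
    n, t = 0, random.expovariate(1.0)
    while t < 1.0:
        n += 1
        t += random.expovariate(1.0)
    return n

random.seed(2)
xs = [poisson1() for _ in range(5)]
speeds = [sum(xs[:k]) + k for k in range(1, 6)]  # v_k = X_1+...+X_k + k
result = first_lonely(speeds)
```

As a deterministic check, with speeds $(1, 2, 3)$ the middle runner is the first one spotted lonely, just past $t = 1/3$, where it sits $1/3$ away from both neighbors.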


Solution to Puzzle 29

Contributing Editor Anirban DasGupta provides the solution to the previous problem, which was about epidemiology. He writes:

By the method of indicator variables,
$E(Y\,|X = x) = m\,\frac{\binom{k(m-1)}{x}}{\binom{km}{x}}$.
One may use the conditional expectation $E(Y\,|X = x)$, evaluated at the observed $x$, as a statistical estimate of $Y$.

The conditional PMF of $Y$ is
$ P(Y = y\,|X = x) = \frac{\binom{m}{y}}{\binom{km}{x}}
\,\times \,\left[{\binom{ky}{x}} - {\binom{y}{1}}\,{\binom{k(y-1)}{x}}
+ {\binom{y}{2}}\,{\binom{k(y-2)}{x}} - \cdots \right], $
the binomial series terminating when $k(y-i)$ becomes less than $x$. One way to see this is to just use the standard De Moivre formula for the probability that exactly $r$ of $n$ events occur.
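The indicator-variable computation can be checked by exhaustive enumeration in a small case. The sketch below assumes the setup suggested by the displayed expectation: $x$ items are drawn without replacement from $km$ items partitioned into $m$ groups of size $k$, and $Y$ counts the groups containing no drawn item. The values $m = 3$, $k = 2$, $x = 2$ are illustrative:

```python
from itertools import combinations
from math import comb

def untouched_groups(sample, m, k):
    """Number of groups (items 0..km-1, group j holds items jk..jk+k-1)
    that contain no sampled item."""
    hit = {item // k for item in sample}
    return m - len(hit)

m, k, x = 3, 2, 2
items = range(k * m)

# Average Y over every equally likely sample of x items
vals = [untouched_groups(s, m, k) for s in combinations(items, x)]
avg = sum(vals) / len(vals)

# The displayed expectation m * C(k(m-1), x) / C(km, x)
formula = m * comb(k * (m - 1), x) / comb(k * m, x)
```

The enumeration average and the formula agree exactly, since every one of the $\binom{km}{x}$ samples is counted once.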