Bulletin Editor Anirban DasGupta sets this problem. Student members of the IMS are invited to submit solutions (to bulletin@imstat.org with subject “Student Puzzle Corner”). The deadline is now September 8, 2016.


 

It is the turn of a problem on probability this time. We will consider a problem that looks like a problem on analysis. Many of you know that analysis and probability share a strong synergistic relationship; there are a number of classic texts on how analysis and probability feed into each other. The problem will be left slightly open ended to whet your imagination. Here is the exact problem of this issue:

(a) Let $f$ be a given function on the unit interval $[0,1]$. Define now a sequence of functions $f_n$ by the rule $f_n(x) = \, \mbox{The average value of f over the interval} \,\, [\frac{k}{2^n}, \frac{k+1}{2^n}]$, if $x \in [\frac{k}{2^n}, \frac{k+1}{2^n}]$.
What is the weakest sufficient condition you can provide under which $f_n(x) \to f(x)$ for almost all $x$? Give a proof of your claim.

(b) For extra credit only: Fix an $\epsilon > 0$. Is it true that on some set $B$ with $\int_B\, dx < \epsilon , \,\,\sup_{x \not\in B}\, |f_n(x) - f(x)| \to 0$? That is, is it true that in fact outside of a set of arbitrarily small measure, $f_n$ will converge uniformly to $f$?


 

Solution to Puzzle 14

We had Tom Berrett at Oxford University, Promit Ghosal at Columbia University, and Haozhe Zhang at the Iowa State University [below, left–right] send us correct answers to both parts of the previous puzzle; congratulations to all of them.

Tom Berrett Promit Ghosal Haozhe Zhang
Tom Berrett Promit Ghosal Haozhe Zhang

 

Let us recall the problem. Suppose given $p, X_1, X_2, \cdots $ are iid Bernoulli with parameter $p$. Let $p$ have a prior distribution with a Lebesgue density $g$. Suppose $\theta_n = P_g(X_{n+1} = \cdots = X_{2n} = 1\,|X_1 = \cdots = X_n = 1)$ denote the Bayesian predictive probability that the next $n$ trials will all be successes if the previous $n$ have all been so. Then, find the limit of the sequence $\theta_n$ when $g$ is a general Beta density with parameters $\alpha , \beta $, and for a general $g$ which is infinitely smooth
at $p = 1$.

First consider the case of a Beta prior. In that case, the posterior distribution of $p$ given that $X_1 = \cdots = X_n = 1$ is Beta with parameters $n+\alpha $ and $\beta $. Therefore,

\[ \theta_n = E_{p\,|X_1 = \cdots = X_n = 1}[P(X_{n+1} = \cdots = X_{2n} = 1\,|X_1 = \cdots = X_n = 1, p)] \]
\[ = E_{p\,|X_1 = \cdots = X_n = 1}\,(p^n) \]
\[ = \frac{\Gamma (\alpha + \beta + n)}{\Gamma (\alpha + n)\Gamma (\beta )}\,
\int_0^1\, p^n p^{n+\alpha -1}(1-p)^{\beta -1}dp \]
\[ = \frac{\Gamma (\alpha +2n)\Gamma (\alpha + \beta + n)}
{\Gamma (\alpha + n)\Gamma (\alpha + \beta +2n)} \]

By Stirling’s approximation to $\Gamma (x)$ as $x \to \infty $, and using the notation $\sim $ to mean that the ratio of the two sequences converges to $1$, we now get

\[ \theta_n \sim \frac{e^{-\alpha – 2n}\,(1+\frac{\alpha }{2n})^{2n}\,(2n)^{2n+\alpha }
\,e^{-\alpha – \beta -n}\,(1+\frac{\alpha + \beta }{n})^n\, n^{n+\alpha + \beta }}
{e^{-\alpha – \beta -2n}\, (1+\frac{\alpha + \beta }{2n})^{2n}\,(2n)^{2n+\alpha + \beta }\,e^{-\alpha -n}\,(1+\frac{\alpha }{n})^n\,n^{n+\alpha }} \]
\[ \sim 2^{-\beta }. \]

Notice that the limit of $\theta _n$ does not depend on $\alpha $; this is because only the local behavior of $g$
near $p = 1$ matters for the limiting behavior of $\theta _n$. In particular, if $p$ has a uniform prior, then the limit of $\theta _n $ is $\frac{1}{2}$: you ought to be $50-50$ about the question {\it will our Sun last for the next 4.5 billion years}? Very interesting!

Consider now the case of a general prior density $g$ with infinitely smooth local behavior at $p = 1$. Let $k$ be defined as the unique nonnegative integer such that $g(1) = \cdots =g^{(k-1)}(1) = 0, g^{(k)}(1) \neq 0$. Then, as in the case of the special Beta prior,

\[ \theta _n = \frac{\int_0^1\,p^{2n}\,g(p)dp}{\int_0^1\,p^n\,g(p)dp}. \]

Using {\it Watson’s Lemma}, and once again, Stirling’s approximation,

\[ \int_0^1\,p^n\,g(p)dp \sim \frac{(-1)^k\,g^{(k)}(1)}{k!}\,\frac{\Gamma (n+1)\,
\Gamma (k+1)}{\Gamma (n+k+2)} \sim \frac{(-1)^k\,g^{(k)}(1)}{n^{k+1}}. \]

Hence,

\[ \theta_n \sim \frac{\frac{1}{(2n)^{k+1}}}{\frac{1}{n^{k+1}}}
= 2^{-k-1}. \]