Contributing Editor Xiao-Li Meng writes:

A Nobel Prize in Statistics? Well, almost. The launch of the International Prize in Statistics (IPS), with its explicit references to the Nobel Prize (NP) and other major awards [see this link], aims to establish the IPS as “the highest honor in the field of Statistics.” And its inaugural winner, Sir David Cox, is inarguably one of the two living statisticians who can instantly confer this intended status on the IPS. However, many will argue about who the N statisticians deserving this inaugural IPS are, and indeed about the value of N itself. My own N=2, but I will not ruin your fun of imputing my other choice from publicly available data, in case you are bored with your own list.

The data came from Some Nobel-Prize (NP) Worthy i.i.d. Ideas in Statistics, a discussion I presented at JSM 2016. The “i.i.d.” criteria refer to “Ingenious, Influential, and Defiable.” The first two are obvious, and the third is necessary because any scientific idea must have demonstrable limitations, i.e., it can be “defied/defeated.” Fisher’s likelihood is an early example of an NP-worthy i.i.d. idea. An ingenious flipping, from probability space to parameter space, created an exceedingly influential paradigm for statistical inference. Yet it is not almighty. A likelihood inference can lead to inadmissible or inconsistent estimators, and the “flipping” idea itself can result in complications, as revealed by the common description of how Fisher created his fiducial distributions.
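
For readers who like to see the flip concretely, here is a minimal sketch with made-up numbers: the same binomial formula, read first as a probability of the data for a fixed parameter, then as a likelihood of the parameter for the fixed data.

```python
import numpy as np
from scipy.stats import binom

# Made-up data: x successes out of n trials.
n, x = 20, 7

# Probability view: the parameter p is fixed, the data vary.
p_fixed = 0.5
prob_of_x = binom.pmf(x, n, p_fixed)

# Likelihood view (the "flip"): the data are fixed, the parameter p varies.
p_grid = np.linspace(0.01, 0.99, 99)
likelihood = binom.pmf(x, n, p_grid)

# The maximizer of the likelihood is the MLE, which here is simply x/n.
p_mle = p_grid[np.argmax(likelihood)]
print(f"P(X = {x} | p = {p_fixed}) = {prob_of_x:.4f}")
print(f"grid MLE of p = {p_mle:.2f}; closed form x/n = {x / n:.2f}")
```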

The issue of inadmissibility naturally leads to Stein’s shrinkage estimation. The shrinkage phenomenon was considered paradoxical when Stein discovered it, and indeed a (statistically fluent) neurobiologist colleague recently told me that he just cannot comprehend how such a phenomenon could occur. Its impact, via the more encompassing framework of hierarchical modeling, is tremendous. Yet its occurrence depends on the choice of loss function.
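
A quick simulation conveys the phenomenon; the dimension, true means, and number of replications below are arbitrary choices for illustration, and the estimator shown is the positive-part James-Stein rule shrinking toward the origin.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n_rep = 10, 10_000              # dimension and number of replications (arbitrary)
theta = np.linspace(-1.0, 1.0, p)  # arbitrary true means

tse_mle, tse_js = 0.0, 0.0
for _ in range(n_rep):
    x = rng.normal(theta, 1.0)     # observe X ~ N(theta, I_p)
    # Positive-part James-Stein factor, shrinking toward the origin.
    shrink = max(0.0, 1.0 - (p - 2) / np.sum(x ** 2))
    js = shrink * x
    tse_mle += np.sum((x - theta) ** 2)
    tse_js += np.sum((js - theta) ** 2)

# Total squared error over the p coordinates, averaged over replications.
print(f"risk of the raw observations (MLE): {tse_mle / n_rep:.3f}")
print(f"risk of James-Stein:                {tse_js / n_rep:.3f}")
```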

Cox’s proportional hazards model is another unexpected finding: by using only the ranking information in the data, and hence a partial likelihood, one can entirely eliminate an infinite-dimensional nuisance parameter, i.e., the baseline hazard. It is this work that won Sir David the inaugural IPS, and the award is truly deserved by any measure. Practically, the model has been applied in virtually every field requiring quantitative investigation of the risk factors for survival time. Academically, it opened up a new area of theoretical and methodological research, including work on its limitations and generalizations (e.g., when the hazards are not proportional).
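
The key point, that only the ordering of the event times matters and the baseline hazard never appears, can be seen in a toy sketch like the following (simulated data, one binary covariate, no censoring and no ties; real analyses need much more care).

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)

# Simulated survival data: one binary covariate z, true log hazard ratio 0.7,
# baseline hazard equal to 1 (never used below), no censoring.
n, beta_true = 300, 0.7
z = rng.binomial(1, 0.5, size=n)
t = rng.exponential(scale=np.exp(-beta_true * z))

def neg_log_partial_likelihood(beta):
    # Only the ordering of the event times enters; the baseline hazard has cancelled.
    order = np.argsort(t)
    zs = z[order]
    ll = 0.0
    for i in range(n):
        risk_set = zs[i:]  # subjects still at risk at the i-th ordered event time
        ll += beta * zs[i] - np.log(np.sum(np.exp(beta * risk_set)))
    return -ll

beta_hat = minimize_scalar(neg_log_partial_likelihood, bounds=(-3, 3), method="bounded").x
print(f"true beta = {beta_true}, partial-likelihood estimate = {beta_hat:.3f}")
```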

The bootstrap “literally changed my life,” as declared by my neurobiologist colleague, and it has certainly made many researchers’ lives much easier. Yet those who attended Efron’s seminar at Stanford announcing it still recall how skeptical the audience was: “No one believed it, as it was just too good to be true,” as one of them told me. Such skepticism was, and still is, healthy, because the bootstrap does not always work. Indeed, Efron’s 1979 article on the bootstrap has generated an entire industry of research on proving when it works, when it doesn’t, and how to make it work when its vanilla version fails. Intriguingly, the topic became so popular that for a while my thesis adviser Donald Rubin was better known in some circles for his paper on the Bayesian bootstrap than for his far more influential (earlier) work on missing data, causal inference, etc.
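
For completeness, here is the vanilla recipe in a few lines, applied to a made-up sample and to a statistic, the median, for which textbook standard-error formulas are awkward; whether such an interval is trustworthy in any given problem is exactly the question that industry of research addresses.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.exponential(scale=2.0, size=50)   # a made-up sample; any data set would do

# The bootstrap: resample the data with replacement, recompute the statistic each time.
B = 5_000
boot_medians = np.array([
    np.median(rng.choice(data, size=data.size, replace=True)) for _ in range(B)
])

print(f"sample median            = {np.median(data):.3f}")
print(f"bootstrap standard error = {boot_medians.std(ddof=1):.3f}")
print("bootstrap 95% percentile interval =",
      np.round(np.percentile(boot_medians, [2.5, 97.5]), 3))
```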

Incidentally, as I conveyed to Don, among his many contributions I have always regarded his work with my eldest academic brother, Paul Rosenbaum, on propensity score matching (PSM) as the most unexpected. Controlling for confounding factors in observational and other studies is of paramount importance, and matching methods are both intuitive and easy to implement. A common challenge with matching methods is that one quickly runs out of sample size when trying to eliminate as many confounding factors as possible. The ingenuity of PSM is that we only need to match on a single index, the propensity score, which has led to its enormous popularity. Of course, there is no free lunch here. Not only does the method require modeling assumptions, but it also cannot (directly) control for unmeasured confounding factors.
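
A bare-bones sketch of the two steps, on simulated data with a known treatment effect, might look like the following; it matches with replacement on a logistic-regression score and skips the balance checks and diagnostics any real analysis would require. In this toy setup the matched comparison lands much closer to the truth than the naive difference in means.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

# Simulated observational data: five confounders X drive both treatment assignment
# and the outcome; the true treatment effect is 2 by construction.
n, true_effect = 2_000, 2.0
gamma = np.array([1.0, 1.0, 0.5, 0.0, 0.0])   # assignment coefficients (arbitrary)
beta = np.array([1.0, 1.0, 1.0, 0.5, 0.5])    # outcome coefficients (arbitrary)
X = rng.normal(size=(n, 5))
treat = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ gamma)))
y = true_effect * treat + X @ beta + rng.normal(size=n)

# Step 1: estimate the propensity score, here with a logistic regression.
ps = LogisticRegression(max_iter=1000).fit(X, treat).predict_proba(X)[:, 1]

# Step 2: match each treated unit to the nearest control on the single score
# (with replacement, no caliper).
treated = np.where(treat == 1)[0]
controls = np.where(treat == 0)[0]
nearest = np.abs(ps[controls][None, :] - ps[treated][:, None]).argmin(axis=1)
matched = controls[nearest]

print(f"naive difference in means = {y[treat == 1].mean() - y[treat == 0].mean():.2f}")
print(f"PSM estimate of effect    = {(y[treated] - y[matched]).mean():.2f}")
print(f"true effect               = {true_effect:.2f}")
```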

This leads to the most NP-worthy idea on my list: randomization. It controls for all confounding factors: known, unknown, and unknown-unknown. A simple random sample of 400 can easily produce the same mean squared error as a self-reported data set covering half of the US population, that is, about 160,000,000 people, when the latter carries a seemingly negligible self-selection bias; see the proof in my recent RSS presentation at https://www.youtube.com/watch?v=8YLdIDOMEZs (with apologies to those who hate self-referencing). Of course, the limitation of randomization is that it is often an unachievable dream.
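
For those who prefer simulation to proof, here is a scaled-down toy version of that comparison (a population of 1,000,000 rather than 160,000,000, and a self-selection tilt of my own choosing, roughly a 0.05 response/selection correlation); it is not the argument in the talk, only an illustration of its flavor.

```python
import numpy as np

rng = np.random.default_rng(4)

# A toy population of 1,000,000 with a binary response whose true mean is 0.5.
N = 1_000_000
y = rng.binomial(1, 0.5, size=N).astype(float)
pop_mean = y.mean()

# Self-selection: those with y = 1 respond with probability 0.525, those with
# y = 0 with probability 0.475, so roughly half the population responds and the
# response/selection correlation is only about 0.05.
incl_prob = np.where(y == 1, 0.525, 0.475)

n_rep, n_srs = 200, 400
se_srs = np.zeros(n_rep)
se_big = np.zeros(n_rep)
for r in range(n_rep):
    srs = rng.choice(y, size=n_srs, replace=False)   # simple random sample of 400
    big = y[rng.random(N) < incl_prob]               # self-selected ~500,000 respondents
    se_srs[r] = (srs.mean() - pop_mean) ** 2
    se_big[r] = (big.mean() - pop_mean) ** 2

print(f"MSE, simple random sample of {n_srs}:      {se_srs.mean():.2e}")
print(f"MSE, self-selected half of the population: {se_big.mean():.2e}")
```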

“Xiao-Li, your talk is dangerous,” said a friend who worried that I might have hurt many people’s egos by omitting their NP-worthy ideas. But I would summarize these six ideas with a different d-word: deceptive. At first glance, all six appear too good to be true or too simple to be useful. Yet years of research and applications have demonstrated that they are incredibly powerful statistical (IPS) ideas, ideas we all wish bore our names.

So what’s your IPS idea and/or IPS list?

Leave a comment below, or email bulletin@imstat.org.