Yoav Benjamini is the Nathan and Lily Silver Professor of Applied Statistics at the Department of Statistics and Operations Research at Tel Aviv University. He holds a BSc in Physics and a BSc and MSc in Mathematics from the Hebrew University (1976), and a PhD in Statistics from Princeton University (1981). He is a member of the Sagol School of Neuroscience and the Edmond Safra Bioinformatics Center, both at Tel Aviv University. He was a visiting professor at Wharton, UC Berkeley, Stanford and Columbia Universities. Prof. Benjamini is a co-developer of the widely used and cited False Discovery Rate concept and methodology. His research topics are selective and simultaneous inference, replicability and reproducibility in science, and data mining, with applications in Biostatistics, Bioinformatics, Animal Behavior, Brain Imaging and Health Informatics. He received the Israel Prize for research in Statistics and Economics, is a member of the Israel Academy of Sciences and Humanities, and has been elected to receive the Karl Pearson Prize of ISI this summer.
Yoav Benjamini’s Rietz Lecture will be given at JSM 2019 in Denver, USA (check the program at http://ww2.amstat.org/meetings/jsm/2019/onlineprogram/index.cfm).
Selective inference: The silent killer of replicability
Replicability of results by other scientists has been a vital gold standard in science and should remain so. Concerns about lack of replicability have increased in recent years, alongside the ‘industrialization’ of the scientific process with its generation of many potential discoveries. Transparency, good design, and reproducible computing and data analysis are prerequisites for replicability of results, and have already been identified as having this important role. The importance of adopting appropriate statistical methodology has also been identified, yet which methodologies can be used to enhance replicability of results from a single stand-alone study remains debated, with the ASA contributing formally to this debate.
I argue that addressing selective inference is essential for enhancing replicability, and demonstrate how ignoring it in current practices is harmful. This might be surprising, because selective inference is addressed in very complex problems in genomics, proteomics, functional imaging and other fields where the number of results screened for the few interesting ones is in the thousands. Yet this is not the case in much of pre-clinical and clinical Medical Research, Epidemiology, and Experimental Psychology. In these areas, though the number of potential discoveries that are evident in the published study is large, it is still not in the thousands, so apparently the potential harm is not apprehended. Unfortunately, many of the proposed solutions to the replicability problem, including those promoted by the ASA, similarly ignore this issue.
I shall then review available approaches for addressing selective inference, and devote some time to (i) a less trodden strategy, that of offering simultaneous inference on the selected; and to (ii) a recent methodology, that of addressing selection in a hierarchical system of inferences. The second will be used for the analysis of microbiome data and to address the emerging problem of selective inference when a medical database is probed by different investigators.
Finally, we have to face the fact that replicability in a single study can only be enhanced. Replicability, and its closely related concept of generalizability, can only be assessed by actual replication attempts. Therefore, making replication an integral part of regular scientific work becomes crucial. I shall spell out a way of doing so that requires the efforts and cooperation of all parties involved: scientists, statisticians, publishers, granting agencies and academic leaders.
Who was H.L. Rietz, and why do we have a named lecture in his honor?
Henry Lewis Rietz (1875–1943) was the first President, in 1935, of the Institute of Mathematical Statistics. He is credited with the early growth of interest in mathematical statistics.
Born in Gilmore, Ohio, Rietz received his BS degree from Ohio State University in 1899, then moved to Cornell University, first as a scholar, then fellow and assistant in mathematics. Following his PhD in 1902, he spent a year at Butler College, Indianapolis, then accepted an instructorship at the University of Illinois, where he stayed for 15 years. In 1918 he moved to the University of Iowa to head the Department of Mathematics, staying until his retirement in 1942. In his second year at Illinois, a demand arose for a course in statistics. Since nobody else wanted to teach it, Rietz was induced to try. So he offered a course, ‘Averages and Mathematics of Investment’, which led to his joint appointment as statistician at the College of Agriculture. From 1908 onwards, Rietz published 150 papers on statistical and actuarial topics – though it was difficult at first to find a place of publication for a mathematical statistics paper. His 1926 book Mathematical Statistics was the basis for many university courses for years afterwards. His many honors include fellowship of the IMS, the Royal Statistical Society and the American Association for the Advancement of Science. In appreciation of Rietz’s contributions to the IMS, the 1943 volume of the Annals of Mathematical Statistics was dedicated to him. So, of course, are the Rietz Lectures, which are intended to be of broad interest and to clarify the relationship of statistical methodology and analysis to other fields.