Jing Lei obtained his PhD in Statistics from UC Berkeley in 2010 and joined Carnegie Mellon University in 2011, where he is currently a professor of Statistics & Data Science. His research areas include nonparametric and predictive inference, such as conformal prediction and cross-validation; high-dimensional inference; network data analysis; data privacy; and applications in single-cell genomics. Jing received the ASA Gottfried E. Noether Early Career Award in 2016, was elected an IMS Fellow in 2021, and was one of the two recipients of the ASA Leo Breiman Junior Award in 2023. He currently serves as an associate editor for the Annals of Statistics, the Journal of the American Statistical Association, and the Journal of the Royal Statistical Society, Series B. This Medallion Lecture will be delivered at JSM in Portland, USA, August 3–8, 2024: https://ww2.amstat.org/meetings/jsm/2024/index.cfm

Uncertainty Quantification with Nonparametric and Black-Box Models

The rapidly increasing complexity and dimensionality of modern data analyses pose new challenges for nonparametric inference. Meanwhile, the emergence of powerful neural networks has delivered an unprecedented level of success in prediction. The intricate nature and diverse forms of neural networks make them quintessential examples of “black-box models,” prompting new theoretical and methodological questions: How do we understand the uncertainty in these models? How can we reliably use them in statistical inference?

In this lecture I will showcase two such inference problems, both stemming from conformal prediction. The first part focuses on understanding the uncertainty in the empirical performance of nonparametric and black-box models. We consider the problem of comparing many competing models using cross-validation, where the main challenges are the possibility of multiple good models and the uncertainty in the cross-validated risks. I will present some recent results on constructing model confidence sets using cross-validation under stability conditions. In the second part, we study the problem of two-sample testing for conditional distributions, where black-box predictors are often used to estimate various nuisance parameters. Following an approach inspired by conformal prediction, I will demonstrate that it effectively separates the nuisance parameters into two parts: one that must be estimated accurately to ensure the validity of the inference, and another whose estimation accuracy affects only the efficiency.
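
To give a flavor of the first problem, the following is a deliberately simplified sketch of the idea behind a cross-validated model confidence set: retain every candidate model whose cross-validated risk is not significantly worse than that of the apparent best model. The paired-test construction, the candidate models, and the data below are illustrative assumptions, not the stability-based procedure from the lecture.

```python
# A simplified illustration (not the lecture's exact procedure) of a
# cross-validated model confidence set: keep every candidate whose CV
# risk is not significantly worse than the apparent minimizer's.
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.model_selection import KFold

def cv_model_confidence_set(models, X, y, alpha=0.05, n_splits=5, seed=0):
    n = len(y)
    losses = np.empty((len(models), n))  # per-observation squared-error losses
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for k, model in enumerate(models):
        for train_idx, test_idx in kf.split(X):
            model.fit(X[train_idx], y[train_idx])
            losses[k, test_idx] = (y[test_idx] - model.predict(X[test_idx])) ** 2
    best = int(np.argmin(losses.mean(axis=1)))  # apparent CV-risk minimizer
    keep = [best]
    for k in range(len(models)):
        if k == best:
            continue
        d = losses[k] - losses[best]  # paired loss differences vs. the best model
        # one-sided normal test of "model k is no worse than the best";
        # include k whenever this hypothesis cannot be rejected
        if d.mean() <= norm.ppf(1 - alpha) * d.std(ddof=1) / np.sqrt(n):
            keep.append(k)
    return sorted(keep)

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=400)
models = [LinearRegression(), Lasso(alpha=0.1), Lasso(alpha=1.0)]
print(cv_model_confidence_set(models, X, y))  # indices of retained models
```

The returned set typically contains several models when the data cannot distinguish among near-optimal candidates, which is exactly the uncertainty the confidence-set formulation is meant to capture.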
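As background for the second part, here is a minimal sketch of split conformal prediction for regression, the basic construction underlying the conformal-inspired approach; the black-box regressor and the data are placeholders, and the conditional two-sample test itself is more involved than this.

```python
# A minimal sketch of split conformal prediction for regression; the
# regressor and data are placeholders, and this is background for (not
# an implementation of) the conformal-based two-sample test.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def split_conformal_interval(model, X_train, y_train, X_cal, y_cal, X_new, alpha=0.1):
    model.fit(X_train, y_train)                    # fit on the proper training split
    scores = np.abs(y_cal - model.predict(X_cal))  # absolute-residual conformity scores
    n = len(scores)
    # finite-sample-corrected empirical quantile of the calibration scores
    level = min(1.0, np.ceil((1 - alpha) * (n + 1)) / n)
    q = np.quantile(scores, level, method="higher")
    pred = model.predict(X_new)
    return pred - q, pred + q  # intervals with marginal coverage >= 1 - alpha

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = X[:, 0] + rng.normal(scale=0.5, size=600)
lo, hi = split_conformal_interval(
    RandomForestRegressor(random_state=0),
    X[:300], y[:300], X[300:500], y[300:500], X[500:],
)
```

Under exchangeability, the interval covers a new response with probability at least 1 − alpha no matter how poorly the black-box model fits, while a better fit only shortens the interval. This separation of what is needed for validity from what merely affects efficiency mirrors the treatment of nuisance parameters described above.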