Tilmann Gneiting serves as head of the Computational Statistics group at the Heidelberg Institute for Theoretical Studies (HITS) and Professor of Computational Statistics at Karlsruhe Institute of Technology (KIT) in Germany. Until 2024, he was a member of the KIT Institute of Stochastics, where he continues to teach; at the beginning of 2025 he moved to the newly established KIT Institute of Statistics. At HITS he held the position of Scientific Director in 2023 and 2024. Previously, he held academic positions at the University of Washington in Seattle, USA, and at Heidelberg University, Germany. From 2016 to 2018 Tilmann served as Editor-in-Chief of the Annals of Applied Statistics. In 2011, he received an ERC Advanced Grant in support of his research on probabilistic forecasts, and in 2024 he was awarded the Ulf Grenander Prize in Stochastic Theory and Modeling by the American Mathematical Society. Tilmann’s research has focused on two main areas: spatial and spatio-temporal statistics, and theory and methodology for forecasting, along with applications such as weather prediction.
The 2026 Wald Memorial Lectures
These two Wald Memorial Lectures will be delivered at the IMS 2026 meeting in Salzburg, July 6–9, 2026.
The first talk concerns the classical topic of quantifying monotone association between random variables. The second talk is on calibration: the statistical consistency between probabilistic forecasts and the respective outcomes.
Wald Lecture I:
Assessing Monotone Dependence: Area Under the Curve Meets Rank Correlation
The assessment of monotone dependence between two random variables is a classical problem in statistics and across a gamut of application domains. Consequently, researchers have sought measures of association that are invariant under strictly increasing transformations of the margins, yet the extant literature has remained splintered. Rank correlation coefficients, such as Spearman’s Rho and Kendall’s Tau, have been studied in the statistical literature, mostly under the assumption of continuous margins. In the case of a dichotomous outcome, receiver operating characteristic (ROC) analysis and the asymmetric area under the ROC curve (AUC) measure are used to assess the monotone dependence of a binary outcome on a covariate. The talk aims to unify and extend these two thus far disconnected strands of literature by developing common population-level theory, common estimators, and common tests that bridge the continuous and dichotomous settings and apply to all types of linearly ordered outcomes. In particular, we introduce the asymmetric grade correlation (AGC) and coefficient of monotone association (CMA) measures, which correspond to Spearman’s Rho in the continuous case and to AUC for a dichotomous outcome. We establish central limit theorems for their sample versions and develop associated tests. In case studies, we assess progress in data-driven weather prediction and evaluate methods of uncertainty quantification for large language models. Joint work with Eva-Maria Walz and Andreas Eberl.
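One classical entry point to the bridge sketched above is that, for a binary outcome, AUC is itself a rank statistic: it equals the normalized Mann–Whitney U, computed from midranks of the covariate. The sketch below (illustrative only; the function names are ours, not from the lecture or the authors' work) checks this identity against the pairwise definition AUC = P(X₁ > X₀) + ½ P(X₁ = X₀).

```python
def midranks(values):
    """Ranks 1..n, averaging within groups of tied values (midranks)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        # extend j over the block of values tied with values[order[i]]
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # midrank of positions i..j (1-based)
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def auc_via_ranks(x, y):
    """AUC of covariate x for binary outcome y, via the Mann-Whitney identity:
    AUC = (R1 - n1*(n1+1)/2) / (n0*n1), with R1 the midrank sum of positives."""
    ranks = midranks(x)
    n1 = sum(y)
    n0 = len(y) - n1
    r1 = sum(r for r, label in zip(ranks, y) if label == 1)
    return (r1 - n1 * (n1 + 1) / 2) / (n0 * n1)

def auc_pairwise(x, y):
    """Direct definition: P(X1 > X0) + 0.5 * P(X1 = X0) over all pairs."""
    pos = [xi for xi, yi in zip(x, y) if yi == 1]
    neg = [xi for xi, yi in zip(x, y) if yi == 0]
    total = sum(1.0 if p > n else 0.5 if p == n else 0.0
                for p in pos for n in neg)
    return total / (len(pos) * len(neg))
```

Because both sides of the identity are rank-based and invariant under strictly increasing transformations of the covariate, this is exactly the kind of object that a unified treatment of rank correlation and ROC analysis operates on.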
Wald Lecture II:
Hierarchies of Calibration: Classification and Regression
Concepts of calibration formalize the compatibility between probabilistic predictions and the respective outcomes. In a nutshell, the outcomes ought to be indistinguishable from random draws from the predictive distributions. The talk strives to review and extend notions of calibration that have been proposed for classification and regression tasks. Particular emphasis is given to hierarchical relations between the various notions, as they apply to general real-valued, continuous, nominal, and binary outcomes, respectively. Furthermore, we discuss concepts of calibration that are expressed in terms of properties or functionals of the predictive distribution, such as means, quantiles, or event probabilities. To illustrate the applied and methodological relevance of these notions, we revisit associated decompositions of proper scoring rules and consistent scoring functions into measures of mis-calibration, discrimination, and uncertainty. While calibration checks apply to (out-of-sample) assessments of predictive performance, they relate closely to (in-sample) model diagnostics, and we elucidate these connections in classification and regression settings. Joint work with Johannes Resin and Lu Yang.
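The baseline notion in such hierarchies for real-valued outcomes is probabilistic calibration: if each outcome is genuinely a draw from its predictive distribution, the probability integral transform (PIT) values F_i(y_i) are uniform on [0, 1]. A minimal sketch, assuming Gaussian forecasts given as (mean, sd) pairs (the setup and names are illustrative, not from the lecture), checks the first two moments of the PIT against the uniform benchmarks 1/2 and 1/12:

```python
import math
import random

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def pit_values(forecasts, outcomes):
    """PIT value F_i(y_i) for Gaussian forecasts given as (mean, sd) pairs."""
    return [normal_cdf((y - mu) / sd) for (mu, sd), y in zip(forecasts, outcomes)]

random.seed(1)
# A calibrated forecaster: each outcome is drawn from its own forecast.
forecasts = [(random.gauss(0.0, 2.0), 1.0) for _ in range(5000)]
outcomes = [random.gauss(mu, sd) for mu, sd in forecasts]
pit = pit_values(forecasts, outcomes)

mean_pit = sum(pit) / len(pit)
var_pit = sum((p - mean_pit) ** 2 for p in pit) / len(pit)
# For Uniform[0, 1]: mean = 1/2, variance = 1/12 (approx. 0.0833).
# Systematic deviations would flag mis-calibration, e.g. a variance well
# below 1/12 indicates overdispersed forecasts.
```

Stronger notions in the hierarchy condition the same requirement on information available to the forecaster; the moment check here is only the weakest, unconditional rung.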