Peter Bühlmann studied mathematics at ETH Zürich (ETHZ) and received his doctoral degree there in 1993. He was a Postdoctoral Fellow (1994–95) and a Neyman Assistant Professor (1995–97) at UC Berkeley before returning to ETHZ in 1997. From 2013 to 2017 he was Chair of the Department of Mathematics at ETHZ, and he is currently the Director of ETH Foundations of Data Science. He is a Fellow of the IMS and served as IMS President in 2022–23; he is also a Fellow of the ASA and was Co-Editor of the Annals of Statistics from 2010 to 2012. His honors include a Doctor Honoris Causa from UCLouvain in 2017, the 2018 Guy Medal in Silver from the Royal Statistical Society, and membership of the German National Academy of Sciences Leopoldina (since 2022).

This Wald Lecture will be delivered at the 11th World Congress in Probability and Statistics in Bochum, Germany, August 12–16, 2024.

 

Invariance in multiple data distributions

I decided to give only one Wald Lecture because I wanted to leave more time for other, and especially younger, attendees of the BS–IMS World Congress. My lecture addresses a theme at the intersection of statistics, machine learning, and interdisciplinary applications.

Invariance for multi-source data and relations to causality. Modern data science problems frequently center on the analysis of multi-source data, also known as multiple-environment or perturbation data. Such data collections typically lack the structure of designed experiments and instead exhibit unspecified perturbations. They are at the core of several problems that have been extensively studied in the literature, including covariate shift, domain adaptation, and transfer learning (see, e.g., [9]).

An interesting approach is given by invariance, which asks that certain distributional aspects remain invariant across different perturbations. A prime example is that the conditional distribution L(Y | X_S) remains invariant, where Y is a response, X are covariates, and S denotes a subset of the covariates: that is, conditioning on the covariates in S yields the same conditional distribution across all perturbations. Under certain assumptions on the perturbations, the set S corresponds to the causal variables of the response Y: this has been formulated by Haavelmo [3], and further characterizations of the relation between invariance and causality have been given in [5]. The latter work contributed to a new development of “causality inspired” machine learning (see, e.g., [1, 4, 6]). This leads to the next paragraph.
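
To make the invariance idea concrete before moving on, here is a minimal toy sketch in Python. It is a simplification of, not the methodology in, [5]: for every candidate subset S it fits one pooled linear regression of Y on X_S and checks, via standard location and scale tests on the residuals grouped by environment, whether the conditional distribution looks the same everywhere; the variables common to all accepted subsets serve as a crude proxy for the causal ones. The function name and the choice of tests are illustrative assumptions.

import itertools
import numpy as np
from scipy import stats

def invariant_subsets(X, y, env, alpha=0.05):
    # X: (n, p) covariates, y: (n,) response, env: (n,) environment labels.
    n, p = X.shape
    environments = np.unique(env)
    accepted = []
    for k in range(p + 1):
        for S in itertools.combinations(range(p), k):
            # Pooled least-squares fit of y on an intercept plus the covariates in S.
            XS = np.column_stack([np.ones(n)] + [X[:, j] for j in S])
            beta, *_ = np.linalg.lstsq(XS, y, rcond=None)
            resid = y - XS @ beta
            groups = [resid[env == e] for e in environments]
            # Accept S as invariant if neither the residual means (one-way ANOVA)
            # nor the residual scales (Levene's test) differ detectably across environments.
            if min(stats.f_oneway(*groups).pvalue,
                   stats.levene(*groups).pvalue) > alpha:
                accepted.append(set(S))
    # Variables contained in every accepted subset: a crude proxy for causal variables.
    common = set.intersection(*accepted) if accepted else set()
    return common, accepted

In practice one would use more careful tests and report confidence statements for the set of causal variables, as developed in [5].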

Improved generalization and distributional robustness for machine learning. Making machine learning (and “AI algorithms”) reliable and robust in new scenarios and settings is a topic of much recent activity. The concept of invariance can be used as a regularization scheme that provides distributional robustness [7]. The difference from more classical distributional robustness [2] is partially understood: there is much to be gained from invariance regularization when the new scenario is “spanned” by the richness of the observed multi-source data. Recent (see, e.g., [8]) and ongoing work will be discussed.
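
As an illustration of invariance-based regularization under a linear model, the following sketch is in the spirit of anchor regression [7], with environment indicators playing the role of anchors A; the function name and interface are illustrative, not taken from [7]. The objective ||(I - P_A)(y - Xb)||^2 + gamma ||P_A (y - Xb)||^2 reduces to ordinary least squares on data transformed by W = I + (sqrt(gamma) - 1) P_A, recovering OLS at gamma = 1 and an instrumental-variables-type estimator as gamma grows.

import numpy as np

def anchor_regression(X, y, A, gamma=5.0):
    # Projection onto the column space of the anchors (plus an intercept column).
    A1 = np.column_stack([np.ones(len(y)), A])
    P = A1 @ np.linalg.pinv(A1)
    # ||(I - P)(y - Xb)||^2 + gamma * ||P (y - Xb)||^2 equals ||W (y - Xb)||^2
    # with W = I + (sqrt(gamma) - 1) * P, so ordinary least squares on the
    # transformed data solves the anchor objective.
    W = np.eye(len(y)) + (np.sqrt(gamma) - 1.0) * P
    beta, *_ = np.linalg.lstsq(W @ X, W @ y, rcond=None)
    return beta

Larger values of gamma give protection against stronger perturbations in the directions spanned by the observed heterogeneity, which is the regime where the gains over classical distributional robustness mentioned above arise.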

Relevance for the sciences. Many scientific questions are rooted in causal questions. However, answering them is often very ambitious without access to randomized experiments. The invariance framework, with its enhanced domain generalization, particularly in combination with existing methods, models, and algorithms, provides a powerful paradigm for better interpretation and prediction in new contexts. We will highlight its importance for prediction in critical care medicine and for drug discovery based on proteomics data.

 

References

[1] M. Arjovsky, L. Bottou, I. Gulrajani, and D. Lopez-Paz, 2019. Invariant risk minimization. arXiv preprint arXiv:1907.02893.

[2] A. Ben-Tal, L. El Ghaoui, and A. Nemirovski, 2009. Robust Optimization. Princeton University Press.

[3] T. Haavelmo, 1943. The statistical implications of a system of simultaneous equations. Econometrica, 11:1–12.

[4] S. Magliacane, T. van Ommen, T. Claassen, S. Bongers, P. Versteeg, and J. M. Mooij, 2017. Domain adaptation by using causal inference to predict invariant conditional distributions. In Neural Information Processing Systems.

[5] J. Peters, P. Bühlmann, and N. Meinshausen, 2016. Causal inference using invariant prediction: identification and confidence intervals (with discussion). Journal of the Royal Statistical Society, Series B (Statistical Methodology), 78:947–1012.

[6] M. Rojas-Carulla, B. Schölkopf, R. E. Turner, and J. Peters, 2018. Invariant models for causal transfer learning. Journal of Machine Learning Research, 19(36):1–34.

[7] D. Rothenhäusler, N. Meinshausen, P. Bühlmann, and J. Peters, 2021. Anchor regression: Heterogeneous data meet causality. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 83(2):215–246.

[8] X. Shen, P. Bühlmann, and A. Taeb, 2023. Causality-oriented robustness: exploiting general additive interventions. arXiv preprint arXiv:2307.10299.

[9] K. Zhang, B. Schölkopf, K. Muandet, and Z. Wang, 2013. Domain adaptation under target and conditional shift. In International Conference on Machine Learning, pp. 819–827. PMLR.