Jianqing Fan is the Frederick L. Moore Professor at Princeton University, where he directs labs in financial econometrics and statistics. He earned his PhD from UC Berkeley and previously held academic positions at UNC–Chapel Hill, UCLA, and the Chinese University of Hong Kong. A former president of both the Institute of Mathematical Statistics and the International Chinese Statistical Association, he has served as editor of leading journals, including JASA, Annals of Statistics, Probability Theory and Related Fields, Journal of Business & Economic Statistics, and Journal of Econometrics. His research spans statistics, machine learning, financial economics, and computational biology, with over 300 highly cited publications and four books. Recognized with numerous honors, including the COPSS Presidents’ Award, a Guggenheim Fellowship, the Guy Medal, the Noether Distinguished Scholar Award, and the Le Cam Award and Lecture, he is a fellow of multiple scientific societies and an elected member of Academia Sinica and the Royal Academy of Belgium. His current interests include high-dimensional statistics and AI. These two Wald Lectures will be delivered at JSM in Nashville, USA, August 2–7, 2025: https://ww2.amstat.org/meetings/jsm/2025/index.cfm.
The 2025 Wald Memorial Lectures
The Wald Lectures this year consist of two talks. The first is on pursuing causality from data collected in heterogeneous environments using neural networks and adversarial learning. The second is on aggregating rankings from crowds of referees across thousands of papers to improve the quality of reviews.
Wald Lecture I:
Neural Causality Learning from Multiple Environments
Discovering causality algorithmically from data is critical in scientific research, treatment intervention, and transfer learning. This requires us to reliably identify a subset of variables that invariantly influence the outcome across the multiple environments in which data are collected. The problem is fundamentally challenging due to the presence of endogenous variables, which cause regression functions to change over heterogeneous environments, and the unknown form of those regression functions. Taking cow–camel image classification as an example, the background color is an endogenous variable that causes the prediction function to vary depending on the percentage of cows standing on green grass and camels standing on sand. Yet most statistical methods, such as Lasso and SCAD, deal only with exogenous spurious variables, such as the time or temperature at which the photos were taken. Spurious correlations, exacerbated by the “curse of endogeneity”, lead conventional regression techniques to produce unstable predictions and misleading causal attributions. Recognizing that causal mechanisms remain constant despite distributional shifts, this talk proposes a novel algorithmic framework that leverages this invariance principle to eliminate endogenous spurious variables so that conventional methods continue to apply.
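One common formalization of this invariance principle, stated here in generic notation rather than necessarily the lecture's exact formulation, posits a subset \(S^\star\) of candidate variables whose conditional mean relation to the outcome is the same in every environment \(e\):

\[
\mathbb{E}\left[\, Y \mid X_{S^\star} = x,\; e \,\right] \;=\; m^\star(x) \qquad \text{for all environments } e,
\]

whereas conditioning on endogenous spurious variables yields conditional means that shift with \(e\). Causality pursuit then amounts to finding \(S^\star\) and estimating \(m^\star\).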
The method introduced is the Focused Adversarial Invariance Regularization (FAIR) framework. FAIR reformulates the problem as a minimax optimization task whose goal is to drive the regression model toward solutions that yield invariant predictions across multiple environments. Specifically, the framework minimizes a risk loss over a class of prediction functions while simultaneously maximizing an adversarial penalty that tests for exogeneity of the variables used. This adversarial approach steers the regression function away from incorporating endogenous spurious variables. To realize the concept in a flexible and scalable manner, neural networks (FAIR-NN) serve as both the regression functions and the test functions, capitalizing on the expressive power of deep learning. The optimization is facilitated by a novel Gumbel approximation strategy, which eases the computational burden of handling the combinatorial “focused” constraint, and by a gradient descent-ascent algorithm that efficiently solves the resulting minimax problem.
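As a rough illustration of these ingredients, here is a minimal PyTorch sketch of one gradient descent-ascent step with a Gumbel-softmax relaxation of the variable-selection constraint. All names, the specific penalty form, and the hyperparameters below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch only: a gated regression network g, an adversarial
# test network f, and Gumbel-softmax variable selection. The penalty form
# and all hyperparameters are assumptions, not the lecture's exact recipe.
d, tau, gamma = 10, 0.5, 10.0       # dims, Gumbel temperature, penalty weight

g = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))  # regression net
f = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))  # adversarial test net
logits = nn.Parameter(torch.zeros(d))        # selection logits ("focused" constraint)

opt_min = torch.optim.Adam(list(g.parameters()) + [logits], lr=1e-3)
opt_max = torch.optim.Adam(f.parameters(), lr=1e-3)

def fair_loss(envs):
    """Risk plus adversarial exogeneity penalty, averaged over environments."""
    # Differentiable approximate variable selection via Gumbel-softmax.
    gate = F.gumbel_softmax(torch.stack([logits, -logits], -1), tau=tau)[..., 0]
    total = 0.0
    for X, y in envs:                        # one (X, y) pair per environment
        Xg = X * gate                        # keep only (softly) selected variables
        r = y - g(Xg).squeeze(-1)            # regression residuals
        t = f(Xg).squeeze(-1)                # test-function values
        # The penalty grows when residuals correlate with some test function,
        # i.e., when the selected variables fail the exogeneity check.
        total += (r ** 2).mean() + gamma * (r * t).mean() ** 2 / (t ** 2).mean()
    return total / len(envs)

def gda_step(envs):
    """One gradient descent-ascent step: ascent on f, descent on (g, logits)."""
    opt_max.zero_grad(); (-fair_loss(envs)).backward(); opt_max.step()
    opt_min.zero_grad(); fair_loss(envs).backward(); opt_min.step()

# Smoke test on synthetic data from two environments.
envs = [(torch.randn(128, d), torch.randn(128)) for _ in range(2)]
for _ in range(10):
    gda_step(envs)
```

The hard combinatorial search over variable subsets is what the Gumbel relaxation replaces: annealing the temperature tau toward zero pushes the soft gates toward binary selections.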
The theoretical analysis establishes that, under a minimal identification condition tied to the heterogeneity of environments, FAIR-NN not only recovers the invariant regression function with optimal sample complexity but also adapts to unknown low-dimensional structures in the regression function. In a non-asymptotic setting, the analysis provides oracle-type inequalities that quantify the estimation error and show that the estimator performs as well as if the true causal variable set were known in advance. Furthermore, within the framework of a structural causal model, the identified variables can be interpreted as “pragmatic direct causes” of the outcome.
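Schematically, and in illustrative notation rather than the lecture's exact statement, such an oracle-type inequality bounds the error of the estimator \(\widehat{g}\) by what would be achievable if the causal set \(S^\star\) were known:

\[
\big\| \widehat{g} - g^\star \big\|_{L_2}^2
\;\lesssim\;
\inf_{g \in \mathcal{G}(S^\star)} \big\| g - g^\star \big\|_{L_2}^2
\;+\;
\frac{\operatorname{comp}\!\big(\mathcal{G}(S^\star)\big)}{n},
\]

where \(\mathcal{G}(S^\star)\) is the network class restricted to the causal variables and \(\operatorname{comp}(\cdot)\) is a complexity measure; adaptation to unknown low-dimensional structure enters through this complexity term.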
Empirical evaluations on both simulated and real datasets support the theoretical findings, confirming that FAIR-NN effectively screens out spurious associations while retaining the causal signal. Overall, this work offers a unified, sample-efficient, and theoretically grounded approach for causality pursuit in complex, heterogeneous settings, bridging the gap between invariant risk minimization and classical causal discovery methods.
Wald Lecture II:
Improving Peer Review: Aggregation of Rankings from Crowds of Referees
The second talk is on aggregating rankings from sparse but general multiple comparisons. Machine learning and AI conferences now receive over ten thousand submissions each, which burdens the referee system significantly and degrades the quality of reviews through large individual noise. It has been reported that about half of the papers accepted at NeurIPS 2021 would have been rejected upon a second, independent round of reviews. This talk develops a statistical framework that aggregates the preferences of tens of thousands of reviewers to arrive at a better assessment of the quality of submitted papers. Specifically, each referee provides rankings among the papers she reviews. These rankings carry information about the preference scores, or quality, of the papers under comparison through the commonly used Bradley–Terry–Luce (BTL) type of models, and they can be aggregated through a spectral method.

Theoretically, we study the performance of the spectral method for estimation and uncertainty quantification of the unobserved preference scores in a general setup in which the comparison graph consists of hyper-edges of possibly heterogeneous sizes. In scenarios where the BTL or Plackett–Luce (PL) models are appropriate, we unravel the relationship between the spectral estimator and the maximum likelihood estimator (MLE) and discover that a two-step spectral method, which applies the optimal weighting estimated from the vanilla spectral method, achieves the same asymptotic efficiency as the MLE. Furthermore, we introduce a comprehensive framework for carrying out both one-sample and two-sample ranking inferences. Finally, we substantiate our findings via comprehensive numerical simulations and statistical inference for rankings of statistical journals and movies.
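For concreteness, under a BTL model with latent scores \(\theta_i\), a referee prefers paper \(i\) to paper \(j\) with probability \(e^{\theta_i}/(e^{\theta_i} + e^{\theta_j})\). The following minimal sketch, assuming the special case of pairwise comparisons and a Rank-Centrality-style construction rather than the lecture's general hyper-edge setup, shows how a spectral method recovers the scores up to a common shift from the stationary distribution of a comparison-driven Markov chain:

```python
import numpy as np

def spectral_scores(n_items, comparisons):
    """Estimate BTL scores (up to a shift) from (winner, loser) pairs."""
    A = np.zeros((n_items, n_items))
    for w, l in comparisons:
        A[l, w] += 1.0                      # mass flows from loser to winner
    d = max(A.sum(axis=1).max(), 1.0)       # uniform normalization constant
    P = A / d                               # sub-stochastic off-diagonal part
    P += np.diag(1.0 - P.sum(axis=1))       # lazy self-loops make rows sum to 1
    vals, vecs = np.linalg.eig(P.T)         # stationary dist = top left eigenvector
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    pi = np.abs(pi) / np.abs(pi).sum()
    return np.log(pi)                       # log stationary mass ~ BTL score + const

# Toy example: 4 "papers" with latent quality scores, 200 noisy referee pairs.
rng = np.random.default_rng(0)
theta = np.array([2.0, 1.0, 0.5, 0.0])
data = []
for _ in range(200):
    i, j = rng.choice(4, size=2, replace=False)
    p_win = 1.0 / (1.0 + np.exp(theta[j] - theta[i]))   # BTL win probability
    data.append((i, j) if rng.random() < p_win else (j, i))
print(np.argsort(-spectral_scores(4, data)))  # recovered ranking, best paper first
```

The lecture's general framework allows hyper-edges of heterogeneous sizes and adds a second, optimally weighted spectral step to match the asymptotic efficiency of the MLE; the sketch above illustrates only a vanilla first step.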