The IMS is pleased to announce the creation of a new award and lecture, to honor Grace Wahba’s monumental contributions to statistics and science. These include her pioneering and influential work in mathematical statistics, machine learning and optimization; broad and career-long interdisciplinary collaborations that have had a significant impact in epidemiology, bioinformatics and climate sciences; and outstanding mentoring during her 51-year career as an educator. The inaugural Wahba Award and Lecture is planned for the 2022 IMS annual meeting in London, and annually at JSM thereafter.
Grace Wahba is one of the outstanding statisticians of our age. She has transformed the fields of mathematical statistics and applied statistics. Wahba’s RKHS theory plays a central role in nonparametric smoothing and splines, and its importance is widely recognized. In addition, Wahba’s contributions straddle the boundary between statistics and optimization, and have led to fundamental breakthroughs in machine learning for solving problems arising in prediction, classification and cross-validation. She has laid a foundation for connecting the theory and practice of function estimation, and has developed, along with her students, unified estimation methods, scalable algorithms and standard software toolboxes that have made regularization approaches widely applicable to solving complex problems in modern scientific discovery and technological innovation. Wahba is listed as a Highly Cited Researcher in Mathematics by ISIHighlyCited.com; her work has received more than 50,000 citations, according to Google Scholar.
Over the years, Wahba pursued her passion for research driven by real problems, and let natural curiosity be her guide. This led to a groundbreaking career in statistics at a time when the odds of a woman earning international acclaim in the field were slim, to say the least.
In establishing the IMS Grace Wahba Award and Lecture, the IMS further affirms its commitment to honoring outstanding statisticians, regardless of gender, to supporting diversity in statistical science and in its membership, and to inspiring statisticians, mathematicians, computer scientists and scientific researchers in general. The lecture will be a highlight of IMS meetings: an intellectually inspiring presentation with broad appeal.
The committee, comprising Jianqing Fan, Sunduz Keles, Douglas Nychka, Bernard Silverman, Daniela Witten, Wing H. Wong, Bin Yu, Ming Yuan, and Hao Helen Zhang, is working on the details of the award, including raising the endowment fund (to which you can contribute via https://imstat.org/shop/donation/). Updates will be forthcoming.
In the meantime, you can read more about Grace Wahba’s work and life below.
Grace Wahba is an American statistician and emerita I.J. Schoenberg-Hilldale Professor of Statistics at the University of Wisconsin–Madison. She has made fundamental contributions to mathematical and applied statistics, optimization, and machine learning. In particular, she laid the foundations for smoothing noisy data and is regarded as “the mother of smoothing splines.” Her work ingeniously integrated advanced mathematics such as reproducing kernel Hilbert spaces (RKHS), state-of-the-art tuning criteria such as generalized cross-validation (GCV), and powerful computational methods based on the representer theorem into a broad class of regularization approaches. These approaches, now field standards, have profoundly impacted practical applications in science, engineering, and medicine.
Grace Wahba’s Career Overview
Grace Wahba obtained her bachelor’s degree in mathematics from Cornell University in 1956, her master’s in mathematics from the University of Maryland in 1962, and her PhD in statistics from Stanford University in 1966. She joined the University of Wisconsin–Madison in 1967 as the first female faculty member in the Department of Statistics, and remained on the faculty for 51 years, retiring in 2018.
Grace Wahba is an outstanding statistician. She has had a foundational influence on the fields of mathematical and applied statistics, optimization and numerical computation, and machine learning. Throughout her career, Wahba has collaborated actively and broadly in many interdisciplinary areas, emphasizing the importance of understanding the scientific context of data before developing and applying new statistical methods, and she has made seminal contributions to climatology, epidemiology and bioinformatics.
Contributions in Statistics
Grace’s early work focused on spline models for noisy observational data. When she joined UW–Madison, she had a half-time appointment at the Mathematics Research Center (MRC), which was known as a stronghold for splines and approximation theory, with research led by the world-leading spline experts Iso Schoenberg, Carl de Boor and Larry Schumaker. The group was interested in solving the spline smoothing problem, but lacked a unified framework for feasible and scalable optimization. In statistics, Parzen was already applying RKHS in time series analysis, but it was Grace who provided the breakthrough connecting RKHS to optimization and curve fitting on empirical data. Kimeldorf and Wahba (1971) were the first to formulate the cubic smoothing spline problem in the RKHS framework, and to derive the solution as the minimizer of a regularization problem. Kimeldorf and Wahba proved the famous “Representer Theorem” using Euclidean geometry, showing how to find a function in an infinite-dimensional RKHS given noisy values of a finite number of bounded linear functionals. The theorem has become the foundational principle for the theoretical investigation and practical implementation of cubic smoothing splines, as well as of a whole menagerie of penalized likelihood problems and regularization methods. Theoretically, this result sheds light on many corners of mathematical statistics, particularly the asymptotics of nonparametric estimation. Computationally, the theorem has had far-reaching applications: from the 1970s, when solving a 10×10 linear system was just barely doable, to today’s massive high-dimensional datasets. In recent decades, the focus of much theoretical research has shifted from parametric statistics (estimating finite-dimensional models) to nonparametric statistics (estimating infinite-dimensional models such as curves and surfaces), and Wahba’s RKHS theory continues to grow in importance as one of the most elegant, flexible and reliable theoretical tools for understanding efficient estimation of high-dimensional smooth functions.
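To make the setting concrete, the cubic smoothing spline problem that Kimeldorf and Wahba studied can be written (in standard modern notation, not verbatim from their paper) as
\[
\min_{f} \; \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - f(x_i)\bigr)^2 \;+\; \lambda \int \bigl(f''(x)\bigr)^2 \, dx,
\]
where \(\lambda > 0\) is the smoothing parameter. For the general penalized problem in an RKHS \(\mathcal{H}_K\) with kernel \(K\), the representer theorem states that the minimizer, despite living in an infinite-dimensional space, has the finite-dimensional form
\[
\hat f(x) \;=\; \sum_{\nu=1}^{m} d_\nu \, \phi_\nu(x) \;+\; \sum_{i=1}^{n} c_i \, K(x_i, x),
\]
where the \(\phi_\nu\) span the (finite-dimensional) null space of the penalty.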
Another important challenge in applying splines in practice or, more generally, in using regularization methods based on optimization in an RKHS, is the choice of one (or more) so-called smoothing parameters, which balance the fit to the data against the RKHS squared norm of the function being estimated. For a long time, without supercomputing power, this was done in an ad hoc way, and the results were less than satisfactory. With Svante Wold (1975), Wahba first proposed an automatic tuning procedure based on leave-one-out cross-validation. Later, Wahba provided a major breakthrough by realizing that “the cubic spline has the square integral of the second derivative as penalty and that is related to human perception of smoothness”, and by inventing generalized cross-validation (GCV; Golub, Heath and Wahba, Technometrics, 1979; Craven and Wahba, Numerische Mathematik, 1979). Wahba provided a deep analysis of GCV, proved its theoretical properties and developed efficient code for its implementation. GCV has become a standard method for tuning-parameter selection, with far-reaching applications in science and industry.
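For a linear smoother with influence matrix \(A(\lambda)\), so that the fitted values are \(\hat y = A(\lambda) y\), the GCV criterion takes the standard textbook form
\[
V(\lambda) \;=\; \frac{\frac{1}{n}\,\bigl\|(I - A(\lambda))\,y\bigr\|^2}{\Bigl[\frac{1}{n}\,\mathrm{tr}\bigl(I - A(\lambda)\bigr)\Bigr]^2},
\]
and one selects the \(\lambda\) that minimizes \(V(\lambda)\); the 1979 papers should be consulted for the precise setting and theoretical guarantees.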
This groundbreaking work led to a series of fundamental papers, published by Wahba in the 1970s and ’80s, which paved the way for smoothing splines and regularization methods across a broad range of topics, including likelihood estimation, classification and density estimation.
Wahba’s book Spline Models for Observational Data (SIAM, 1990) is a classic text on smoothing splines. The scope of the book is phenomenal, covering a broad range of topics including time series, spline smoothing, nonparametric regression, likelihood estimation and density estimation. Wahba labored for decades to polish her RKHS viewpoint on smoothing splines, which has proven exceptionally valuable and widely applicable. Her presentation provides the most ambitious construction of theoretical machinery for the RKHS in modern mathematical statistics. The book now has over 8,000 citations. Since the 1990s, with the ever-increasing size and complexity of data, Wahba’s seminal work on smoothing splines has only become more important. The links between RKHS theory and high-dimensional optimization are widely used in statistical machine learning, and they have led to an explosion of work on sparse optimization, along with other new approaches for tuning models.
During 1993–95, Wahba developed the Smoothing Spline ANOVA (SS-ANOVA) models, a unified and powerful framework that generalizes ordinary ANOVA (which can be characterized as projections in a tensor product of several Euclidean spaces) to projections in a tensor product of several RKHS, allowing for main effects and interactions between heterogeneous variables of all kinds; the decomposition is sketched below. More recently, LASSO-type models have been broadly used for building sparse and interpretable models in high-dimensional data analysis, and Wahba has made insightful contributions to these modern high-dimensional tools as well: Leng, Lin and Wahba (2006) showed that if the LASSO is tuned for prediction, it is in general not consistent for model selection.
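In standard notation, the SS-ANOVA decomposition of a function of \(d\) variables is
\[
f(x_1, \dots, x_d) \;=\; C \;+\; \sum_{j} f_j(x_j) \;+\; \sum_{j<k} f_{jk}(x_j, x_k) \;+\; \cdots,
\]
where the main effects \(f_j\) and interactions \(f_{jk}\) live in orthogonal subspaces of a tensor-product RKHS, each term is controlled by its own smoothing parameter, and higher-order interactions are typically truncated.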
Contributions in Machine Learning
In addition to being a great statistician, Wahba is a pioneer of the field of machine learning. In 2013, she contributed an essay, “Statistical Model Building, Machine Learning, and the Ah-Ha Moment,” to the COPSS 50th Anniversary book Past, Present, and Future of Statistical Science, showing her deep insight into, and influential contributions at, the intersection of statistics and machine learning. In the interview for her Pfizer Colloquium at the University of Connecticut in 2018, when asked for her perspective on the relationship between statistics and machine learning, Wahba answered, “I think there is a great deal of overlap there and we should join them, not beat them. I expect my PhD students to get a minor or even a masters in CS, and I see CS students doing the reverse. I claim my area is Statistical Machine Learning.”
Wahba’s work on RKHS, regularization and GCV has made a fundamental impact on the machine learning community. In machine learning, support vector machines (SVMs) are a popular class of large-margin classifiers with wide applications and strong empirical performance. For a long time, however, they were a black box to statisticians, as their large-sample properties were completely murky. Wahba’s work illuminated the connection between SVMs and RKHS. The magic moment occurred at a 1996 conference at Mt. Holyoke, where statisticians and computer scientists came together and it was realized that the SVM could be obtained as an optimization problem in the RKHS framework by simply replacing the square loss function with the hinge loss. This important connection immediately ignited intense interest in RKHS and the “Representer Theorem” within both the statistics and computer science communities, some 25 years after Wahba’s pioneering work in 1971. Since then, Wahba and her collaborators Lin, Lee, and Zhang have developed extensive theory uncovering the mystery of SVMs: the SVM estimates the sign of the log odds ratio, exactly what one wants for classification. Her group also extended the binary SVM framework to multiclass SVMs and unbalanced classification problems. These results provide a full and precise understanding of SVMs from a statistical perspective.
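In the RKHS formulation just described, the SVM solves (in standard notation)
\[
\min_{f = b + h,\; h \in \mathcal{H}_K} \; \frac{1}{n}\sum_{i=1}^{n}\bigl(1 - y_i f(x_i)\bigr)_{+} \;+\; \lambda \,\|h\|_{\mathcal{H}_K}^{2}, \qquad y_i \in \{-1, +1\},
\]
which is exactly the penalized regularization problem with the square loss replaced by the hinge loss \((1-t)_+ = \max(1-t, 0)\). The theory referred to above shows that the population minimizer of the hinge risk is \(\mathrm{sign}\bigl(p(x) - \tfrac{1}{2}\bigr)\), the sign of the log odds \(\log\{p(x)/(1-p(x))\}\), where \(p(x) = P(Y = +1 \mid X = x)\).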
One common and challenging feature of modern massive datasets is high dimensionality: “large p, small n”. This poses computational challenges for nonparametric statistical methods and flexible learning algorithms. Wahba’s RKHS theory and Representer Theorem overcome this obstacle by providing an elegant framework for powerful methods such as support vector machines and sparse modeling: the observed data are implicitly mapped into a high-dimensional (even infinite-dimensional) feature space before classification decision rules are built, yet the computation reduces to a finite, n-dimensional problem, as the sketch below illustrates. Her work has substantially advanced the fields of sparse modeling, model selection and high-dimensional data analysis, and her spline book introduced the RKHS machinery to the machine learning research community.
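As a minimal illustration of this finite-dimensional reduction, here is a sketch of kernel ridge regression in Python; the function names and the toy data are invented for illustration and are not from Wahba’s software:

```python
import numpy as np

def gaussian_kernel(X, Z, bandwidth=0.5):
    # K(x, z) = exp(-||x - z||^2 / (2 * bandwidth^2)), rows of X vs rows of Z
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def kernel_ridge_fit(X, y, lam):
    # Representer theorem: although the estimate lives in an
    # infinite-dimensional RKHS, it has the finite form
    # f(x) = sum_i c_i K(x_i, x), so fitting is an n x n linear system.
    n = len(X)
    K = gaussian_kernel(X, X)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def kernel_ridge_predict(X_train, c, X_new):
    return gaussian_kernel(X_new, X_train) @ c

# Toy usage: recover a smooth curve from noisy samples
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 3.0, size=(40, 1))
y = np.sin(2.0 * X[:, 0]) + 0.1 * rng.normal(size=40)
c = kernel_ridge_fit(X, y, lam=1e-3)
print(kernel_ridge_predict(X, c, np.array([[1.5]])))  # should be near sin(3.0)
```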
Wahba has had long-term, fruitful collaborations with machine learning researchers in computer science, developing state-of-the-art algorithms for analyzing heterogeneous, complex data sets. In particular, her collaborations with Steve Wright in the Madison CS department produced crucial algorithms in a series of joint papers, including work on regularized logistic regression as a tool for finding multiple dichotomous risk factors. Of particular interest is Lu, Keles, Wright and Wahba (PNAS, 2005), on protein clustering based on sequence data (see more below).
Contributions in Interdisciplinary Scientific Research
In parallel with her mathematical research, Wahba has had a major impact on applied problems, working on cutting-edge statistical problems motivated by applied science. Her research focuses on developing new and improved statistical models and machine learning methods to extract important information from demographic studies and clinical trials.
Wahba’s passion for solving real-world applications with mathematical and statistical tools can be traced back to 1965, when she was working at IBM. One key problem was how best to estimate the attitude of a satellite given star-sensor readings and an ephemeris, a table of the actual positions (direction cosines) of a set of stars. Mathematically, this amounts to finding the rotation matrix that best maps the observed direction cosines to the true direction cosines. Wahba gave the first mathematical formulation of the problem, now famous as “Wahba’s Problem” in applied mathematics. Since she posed it in her SIAM Review paper “A Least Squares Estimate of Satellite Attitude” (1965), it has generated thousands of scientific papers, including a number of solutions such as Davenport’s q-method, QUEST and SVD-based methods. Wahba’s Problem and its solutions have turned out to have important applications in satellite control using sensors such as multi-antenna GPS receivers and magnetometers.
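For concreteness, here is a minimal sketch (in Python, with an invented helper name solve_wahba and toy data) of the SVD-based solution to Wahba’s Problem mentioned above:

```python
import numpy as np

def solve_wahba(w, v, a=None):
    """Rotation R minimizing sum_i a_i * ||w_i - R v_i||^2 (SVD method).

    w: (n, 3) unit vectors observed in the body frame.
    v: (n, 3) the same unit vectors in the reference frame (ephemeris).
    a: optional (n,) non-negative weights.
    """
    if a is None:
        a = np.ones(len(w))
    # Attitude profile matrix B = sum_i a_i * w_i v_i^T
    B = (a[:, None] * w).T @ v
    U, _, Vt = np.linalg.svd(B)
    # Force det(R) = +1 so R is a proper rotation, not a reflection
    d = np.sign(np.linalg.det(U) * np.linalg.det(Vt))
    return U @ np.diag([1.0, 1.0, d]) @ Vt

# Toy usage: recover a known rotation from noiseless observations
rng = np.random.default_rng(0)
v = rng.normal(size=(5, 3))
v /= np.linalg.norm(v, axis=1, keepdims=True)
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
w = v @ R_true.T                                # w_i = R_true v_i
print(np.allclose(solve_wahba(w, v), R_true))   # True
```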
Throughout her career, Wahba successfully established long-term collaborations with other scientific researchers on real-world problems. Among them were collaborations in meteorology and ophthalmology.
Wahba’s meteorology collaboration began with Don Johnson, the editor of the Monthly Weather Review. At the time, numerical weather forecasting required taking observations from scattered weather stations and smoothing and interpolating them onto a computational grid, which was then fed into a gridded forecast model. Wahba and her then-student Jim Wendelberger significantly improved on existing methods using her thin-plate spline framework (sketched below). This work immediately ignited intense research interest and effort in numerical weather prediction, and Wahba and her collaborators went on to develop a long series of results related to splines and GCV with applications in numerical weather prediction. For her outstanding contributions to the application of statistics in atmospheric and climate science, Wahba received the 1998 Statistical Climatology Achievement Award.
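For a field \(f(x, y)\) over two spatial dimensions, the thin-plate penalty underlying this framework is the standard
\[
J(f) \;=\; \iint \Bigl( f_{xx}^2 + 2 f_{xy}^2 + f_{yy}^2 \Bigr) \, dx \, dy,
\]
so the analyzed field minimizes a residual sum of squares over the station observations plus \(\lambda J(f)\), with the smoothing parameter \(\lambda\) chosen by GCV.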
Wahba was also involved in a 24-year collaboration with two ophthalmologists, Drs. Ron and Barbara Klein of the Department of Ophthalmology and Visual Sciences at UW–Madison, on an epidemiological study of diabetic retinopathy. Beginning in the 1980s, Wahba and her students worked with the Kleins’ team to develop new statistical methods for discovering the underlying mechanisms of age-related eye disease and diabetes, with most of the effort focused on analyzing data from the Beaver Dam Eye Study. These fruitful collaborations resulted in important publications in ophthalmology and visual sciences. In one of their recent papers, they showed that mortality runs in families in parallel with modifiable risk factors such as smoking, BMI and socioeconomic variables. Wahba presented these results in various lectures, including her Neyman Memorial Lecture (1994) and her Wald Lectures (2003).
Later on, Wahba significantly expanded her research into bioinformatics and statistical genetics. Lu, Keles, Wright, and Wahba (PNAS, 2005) developed a novel and effective optimization framework for kernel learning, Regularized Kernel Estimation (RKE), and applied it to clustering proteins based on pairwise BLAST scores. The idea is to place each sequence in a low-dimensional space so that the distances between proteins in this representation reflect their similarity. They obtained a visualizable 3D sequence space of globins, in which a new protein’s function can be deduced from its cluster membership. Her group later also used RKE to examine the relative influence of familial, genetic and environmental covariates in flexible risk models.
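Roughly, and in simplified notation (the PNAS paper should be consulted for the exact formulation), RKE estimates a nonnegative definite kernel matrix \(K\) from noisy pairwise dissimilarities \(d_{ij}\) by solving
\[
\min_{K \succeq 0} \; \sum_{(i,j)} \bigl|\, d_{ij} - \hat d_{ij}(K) \,\bigr| \;+\; \lambda \, \mathrm{trace}(K), \qquad \hat d_{ij}(K) = K_{ii} + K_{jj} - 2 K_{ij},
\]
after which the leading eigenvectors of the estimated \(K\) supply the low-dimensional coordinates, much as in principal components analysis.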
Such long-term collaborations between biologists and quantitative scientists are unusual, and attest to the significant impact of Wahba’s work on scientific investigations.
Educational Achievements
Wahba has been a great mentor to many outstanding statisticians. She is best known as the mother of UW–Madison’s spline school, and as the primary driving force behind data smoothing methods, theory and applications. She led one of the most active and productive research groups on smoothing splines, sparse modeling and support vector machines. Wahba is passionate about working with students; as she said in the Pfizer Colloquium interview, “My biggest thrill as a professor is listening to a student explain their ideas.” She is renowned for her generosity in sharing ideas with students, collaborators and junior faculty, and for her unwavering support of junior faculty in promoting their careers and helping them through hardships.
During her spectacular 51-year career at UW–Madison, Wahba nurtured many talented statisticians, including 39 PhDs and nearly 370 academic descendants (according to the Mathematics Genealogy Project). Many of her PhD students have become successful researchers and senior leaders.
Awards and Honors
Wahba’s extraordinary scientific achievements have been recognized by many honors and awards. She is an elected member of the US National Academy of Sciences, the American Academy of Arts and Sciences, the American Association for the Advancement of Science, and the International Statistical Institute, and a Fellow of the IMS, SIAM and the ASA.
For her outstanding contributions to the field of statistics and to society, Wahba has received a long list of prestigious awards, including the first Parzen Prize for Statistical Innovation in 1994, the 1996 COPSS Elizabeth Scott Award, the 2009 Gottfried Noether Senior Scholar Award, and the inaugural Leo Breiman Award in 2017 for her contributions to statistical machine learning. In addition, Wahba delivered the 2003 Wald Lectures, the 2004 COPSS Fisher Lecture, the ASA President’s Invited Address at JSM 2007, and the 2018 Pfizer Colloquium.