Bin Yu is Chancellor’s Distinguished Professor and Class of 1936 Second Chair in Statistics, EECS, and Computational Biology at UC Berkeley. Her research focuses on the practice and theory of statistical machine learning and on interdisciplinary data problems in neuroscience, genomics, and precision medicine. She and her team developed iterative random forests (iRF), hierarchical shrinkage (HS) for decision trees, Fast Interpretable Greedy-Tree Sums (FIGS), stability-driven NMF (staNMF), and adaptive wavelet distillation (AWD) from deep learning models. She is a member of the National Academy of Sciences and the American Academy of Arts and Sciences. She was a Guggenheim Fellow, Tukey Memorial Lecturer of the Bernoulli Society, IMS Rietz Lecturer, and COPSS E. L. Scott Awardee, and she holds an honorary doctorate from the University of Lausanne. She served on the inaugural scientific advisory board of the UK’s Alan Turing Institute for data science and AI and is currently on the editorial board of PNAS.

These two IMS Wald Lectures will be given at the Joint Statistical Meetings in Toronto, August 5–10, 2023.

Wald Lecture 1:
Seeking Boolean interactions in biomedicine and proofs

Thresholding or Boolean behaviors of biomolecules underlie many biological processes. Decision trees capture such behaviors, and tree-based methods such as random forests have been shown to succeed in predictive tasks in genomics and medicine.
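As a minimal illustration of this point (our toy example, not the lecture’s), a depth-limited decision tree fit to data generated by a Boolean AND of two thresholds recovers splits near those thresholds:

```python
# Illustrative sketch: a decision tree recovering a Boolean AND-type
# thresholding rule, y = 1{x0 > 0.5 AND x1 > 0.5}, from noisy-free data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.uniform(size=(2000, 5))                       # 5 features; only 2 are active
y = ((X[:, 0] > 0.5) & (X[:, 1] > 0.5)).astype(int)   # Boolean interaction

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=[f"x{i}" for i in range(5)]))
# The printed rules split near 0.5 on x0 and x1, mirroring the interaction.
```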

In this talk, we use UK Biobank data and a stabilized version of random forests, iterative random forests (iRF), to recommend genes and gene–gene interactions that have data evidence for driving a heart disease, hypertrophic cardiomyopathy (HCM).

Four out of the five recommendations are shown to be causal for HCM in follow-up gene-silencing experiments. This and other empirical successes of iRF motivate a theoretical investigation of a tractable version of iRF under a new local sparse and spiky (LSS) model, in which the regression function is a linear combination of Boolean interactions of features. The tractable version of iRF is shown to be model-selection consistent under this new model, assuming feature independence and non-overlapping interactions.
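For concreteness, such an LSS regression function can be written schematically as a sparse sum of Boolean AND-interactions; the notation below (coefficients β_k, active feature sets S_k, thresholds τ_j) is ours, not the lecture’s, and the non-overlap condition corresponds to the disjointness of the sets S_k:

```latex
% Schematic LSS regression function (notation ours): a sparse linear
% combination of Boolean AND-interactions over disjoint feature sets.
f(x) = \sum_{k=1}^{K} \beta_k \prod_{j \in S_k} \mathbf{1}\{x_j > \tau_j\},
\qquad S_k \cap S_{k'} = \emptyset \quad \text{for } k \neq k'.
```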

Wald Lecture 2:
Sparse dictionary learning and deep learning in practice and theory

Sparse dictionary learning has a long history; fed with natural image patches, it produces wavelet-like filters resembling those found in the primary visual cortex (V1) of the human brain. Wavelets, as local Fourier transforms, are interpretable in the physical sciences and beyond. In this talk, we will first describe adaptive wavelet distillation (AWD), which makes black-box deep learning models interpretable in cosmology and cellular biology problems while improving predictive performance. We then present theoretical results showing that, under simple sparse dictionary models, gradient descent in auto-encoder fitting converges to a point on a manifold of global minima, and that which minimum is reached depends on the batch size. In particular, we show that a qualitatively different type of “feature selection” occurs when using a small batch size, as in stochastic gradient descent (SGD).
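The batch-size effect can be explored empirically. Below is a minimal, self-contained sketch (our construction, with assumed dimensions and learning rate, not the lecture’s theoretical setting): a tied-weight ReLU auto-encoder fit by mini-batch gradient descent to data drawn from a sparse dictionary model, with the batch size exposed as the knob of interest.

```python
# Illustrative sketch: fit a tied-weight auto-encoder x ~ W.T @ relu(W @ x)
# by mini-batch gradient descent on data from a sparse dictionary model
# x = D @ z with sparse codes z, varying only the batch size.
import numpy as np

rng = np.random.default_rng(0)
d, p, n = 20, 10, 5000                           # ambient dim, dictionary size, samples
D = np.linalg.qr(rng.normal(size=(d, p)))[0]     # ground-truth dictionary (orthonormal cols)
Z = rng.normal(size=(p, n)) * (rng.uniform(size=(p, n)) < 0.1)  # sparse codes
X = D @ Z                                        # observations

def train(batch_size, steps=3000, lr=0.05):
    W = 0.1 * rng.normal(size=(p, d))            # encoder weights; decoder is tied (W.T)
    for _ in range(steps):
        idx = rng.integers(0, n, size=batch_size)
        xb = X[:, idx]
        h = np.maximum(W @ xb, 0.0)              # ReLU codes
        r = W.T @ h - xb                         # reconstruction residual
        # gradient of 0.5 * ||W.T @ relu(W @ x) - x||^2 w.r.t. the tied W
        g = h @ r.T + ((h > 0) * (W @ r)) @ xb.T
        W -= lr * g / batch_size
    return W

for bs in (1, 256):                              # SGD-like vs. large-batch regime
    W = train(bs)
    # crude diagnostic: how well each learned row aligns with some true atom
    align = np.abs(W / np.linalg.norm(W, axis=1, keepdims=True) @ D).max(axis=1)
    print(f"batch_size={bs:4d}  mean max-alignment with true atoms: {align.mean():.2f}")
```

Comparing batch_size=1 with batch_size=256 gives two descents on the same objective that can settle at different points of the minimum set; the alignment printout is one rough way to see how the learned rows relate to the true dictionary atoms, not a verification of the theorem.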

[Editor’s note: Bin Yu will also be delivering the COPSS Distinguished Achievement Award Lecture at JSM this year. See article here.]