Bin Yu is Chancellor’s Professor in the Departments of Statistics and of Electrical Engineering & Computer Science at the University of California at Berkeley. Her current research focuses on solving high-dimensional data problems through the development of statistical and machine learning methodologies, algorithms, and theory. Her group is engaged in interdisciplinary research with scientists from genomics, neuroscience, and medicine. She is a Member of the National Academy of Sciences and a Fellow of the American Academy of Arts and Sciences. She was a Guggenheim Fellow in 2006 and President of IMS in 2013–14. In 2011 she was an ICIAM Invited Lecturer and the Tukey Memorial Lecturer of the Bernoulli Society. She is a Fellow of IMS, ASA, IEEE and AAAS. She has served or is serving on numerous journal editorial boards, including JMLR, AOS and JASA, and on committees of BMSA of NAS, SAMSI, IPAM and ICERM. Bin will deliver this Rietz Lecture at the World Congress on July 11.
Theory to gain insight and inform practice
Henry L. Rietz, the first president of IMS, published his book Mathematical Statistics in 1927. One reviewer wrote in 1928, “Professor Rietz has developed this theory so skillfully that the ‘workers in other fields’, provided only that they have a passing familiarity with the grammar of mathematics, can secure a satisfactory understanding of the points involved.”
In this lecture, I would like to promote the good tradition of mathematical statistics, as expressed in Rietz’s book, of using theory to gain insight and inform practice.
In particular, I will recount the beginning of our theoretical study of dictionary learning (DL) as part of a multi-disciplinary project to “map a cell’s destiny” in the Drosophila embryo. I will share insights gained regarding the local identifiability of the primal and dual formulations of DL. Furthermore, comparing the two formulations is leading us to seek confidence measures for the learned dictionary elements (which correspond to biologically meaningful regions in the Drosophila embryo).
Finally, I will present preliminary work using our confidence measures to identify potential knockout (or gene editing) experiments in an iterative interaction between biological and data sciences.