Institute of Mathematical Statistics | Radu’s Ride: Theoretically, we could be more practical

Radu’s Ride: Theoretically, we could be more practical

July 18, 2022

Radu V. Craiu writes another “Radu’s Ride”:

After more than twenty years of looking at the faces of puzzled students, I sometimes wonder where these (many!) undergraduates have ended up working, and whether they’re still confused about statistics. This is a more serious thought than you’d think, because we wrap our students in the heavy mantle of hopeful anticipation and send them into the world and… well, are they delivering? Clearly, society has data coming out of its metaphorical ears, and there is a high expectation for the data science “people” to do something about it. So, who are these people? And where do they work? In the topsy-turvy data science ecosystem, the tables between academia and industry have shifted, if not completely turned.

For instance, the sources of data-related problems have changed considerably. Starting in the late ’90s, there was a clear sense that tech companies were becoming an important source of statistical problems, while financial institutions, insurance, and pharmaceutical companies continued to do most of the industry hiring. This trend continued to accelerate at the beginning of the new millennium (remember the Netflix challenge?) and has reached gargantuan proportions today, to the point where a fair number of mid-career statisticians are leaving academia to work in the tech, bio, or finance sectors. More senior statisticians, usually prominent ones, do not leave academia but spend an increasing amount of time consulting with companies and solving their problems. Graduate students leave universities with their advanced degrees and, increasingly, choose industry positions over academic ones. The sparsity of the latter and the extravagant salaries of the former are often cited as the most likely reasons. Perhaps more subtly, the lifestyle gap between these two very broadly defined career paths has gotten smaller, especially when considering the tech sector. Flexible hours, interesting problems and great benefits used to be highly appreciated appurtenances of an academic career; many of them are now within reach for Google, Amazon or Uber employees. Unlike the investment bank or big pharma jobs of yesteryear, the technology ones offer a wide range of problems to work on as well, thus chipping away at one of the last defenses of an academic career: its independence.

Not unexpectedly, many university administrators look at these trends with a worried eye, concerned about the brain drain and the contamination of our intellectual pursuits with more pedestrian interests. Doomsayers are predicting the imminent end of higher education, of course. I believe the alarms have gone off way too early, and on the academic side of the equation the tendency to “dig in our heels” might turn counterproductive. Rather than panic, perhaps we could work more with our students to ensure that their contributions to the discipline and society are valuable, regardless of the path they choose to pursue. After all, the goal of an educational program should be to produce solid thinkers who can be adaptive and prolific in many environments, not just the academic one. And the alternatives to the latter are many, and very different in character and impact.

In many conversations about careers, the government path is often omitted altogether, particularly in North America. This is unfortunate, since the government remains one of the biggest producers and users of data in any society. If you do not believe me, maybe you will believe Michael Lewis, the author who made number crunchers look sexy in Moneyball, and who has written about the US government’s wide range of essential and life-preserving activities in his book, The Fifth Risk. Others understood the call early on. For instance, the newly created Data Science Institute at the University of Chicago has announced a partnership between the University of Chicago and City Colleges of Chicago (CCC), to germinate the “next generation of data science educators and broaden participation in this rapidly growing new field, building an inclusive, scalable model for expanding STEM education and careers nationwide.”

A bidirectional transfer of knowledge between universities and their partners in industry and government is beneficial to all. For one, academic research can be enriched with a stream of new problems and, perhaps, a fresh perspective on finding practical solutions. While it can be tempting to ignore contemporary trends (some might even call them fads), it does not mean that we must work only on problems that are invariant to time travel. Otherwise, we will continue to see low investment in our people and ideas, thus converting many a talented academic statistician from data science stakeholder into frustrated bystander.

I hasten to add that this reluctance is not at all mirrored by our biggest partner in the data science ecosystem, computer science. This large field of study has multiple research threads, some enmeshed firmly with the purely mathematical side of the discipline, and others that share a large swath of intellectual domain with statistics, such as data visualization and ethics, machine learning, optimization algorithms, data privacy, etc. It is among academics in the latter groups where one notices a stronger desire to engage with real-world applications, be they in genetics, medicine, entertainment, finance, real estate, or a large and diverse range of engineering companies. While we “resist” the corruption of our research agenda with problems that deviate from the classical paradigm, our esteemed colleagues roll up their sleeves and jump into the messiness of data that are dark or corrupted, using models that are heavily computational, somewhat empirical perhaps… oh well, you know the drill. It has been played on repeat for the last decade, since Data Science emerged as a field of interest to many.

I, for one, am planning to answer more of those emails asking for help from one intrepid company or another. The few times I answered such a call, I found myself in the company of people with whom I shared the frustration of dealing with a puzzle, be it an unyielding computational problem or data that abscond with the truth. I left each such encounter with a renewed sense of purpose and feelings of solidarity.

These days, that’s an unexpected gift.

—

Radu Craiu also reported on two recent meetings in Toronto:

Turning statisticians into BFF-ers

The month of May has been a happening one for the Department of Statistical Sciences (DSS) at the University of Toronto. We started strong by hosting, in our new space, the Seventh Bayesian, Fiducial, and Frequentist (BFF) conference on May 2–4, 2022. The event had originally been scheduled to take place in May 2020 and was delayed because of the COVID pandemic.

The BFF series has traditionally focused on the foundations of statistics, placing emphasis on the three paradigms that have historically been at the center of our discipline. As stated on the BFF official website, bff-stat.org, “The Bayesian, Fiducial, and Frequentist (BFF) community began in 2014 as a means to facilitate scientific exchange among statisticians and scholars in related fields that develop new methodologies linked to the foundational principles of statistical inference. The community encourages and promotes research activities to bridge foundations for statistical inferences, to facilitate objective and replicable scientific learning, and to develop analytic and computing methodologies for data analysis.”

This year’s edition kept with tradition but also added computational and philosophical considerations into the mix. Sessions spanned a wide range of topics that included foundational and philosophical debates, nonstandard analysis for advancing BFF, methods to address the reproducibility crisis, information fusion, likelihood-free computation and inference, and methods for large and complex data.

The full schedule is available at bit.ly/BFF7Conference

This year’s BFF was dedicated to the memory of Donald A. S. Fraser, and the program included a session, organized by Nancy Reid, where the speakers were Don’s former collaborators, Mylène Bédard and Ana-Maria Staicu, as well as Todd Kuffner, who presented an inspired review of Don’s research interests and their evolution over time.

The conference was organized in hybrid format, and the organizers were relieved to see that 40 out of 130 participants came to Toronto in person. This injected the proceedings with a great deal of energy, leading to vigorous question periods and productive coffee break discussions.

BFF7 was immediately followed on May 5 by “Statistics at Its Best,” a one-day conference organized by Radu Craiu and Grace Yi in honour of Nancy Reid’s 70th birthday. The speakers in the latter event included Alessandra Brazzale, Alicia Carriquiry, Anthony Davison, Christian Genest, Ed George, Rob Kass, Andrew McCormack, Mary Thompson and Jim Zidek. The conference banquet was attended by over 80 people and featured speeches from Jim Berger, Radu Craiu, Charmaine Dean, Don Estep (in absentia), Donelle and Ailie Fraser, Ed George, Peter McCullagh, James Stafford, Lisa Strug, Mary Thompson (in absentia), and Grace Yi.

Both events were organized in collaboration with Gravity Pull and funded in part by Toronto’s DSS and CANSSI. They were among this year’s scientific activities celebrating the 50th anniversary of the Statistical Society of Canada.

Please have a look at some photos of this year’s BFF (https://bit.ly/3HMHbuz) and “Statistics at Its Best” (https://bit.ly/3HTxevG).