Alison Etheridge gave this Presidential Address at the IMS annual meeting in Vilnius in July 2018.
So, how was I going to write a Presidential address? The best way to start seemed to be to see what my predecessors had said, and so I took to the internet. There I found a series of articulate and thoughtful pieces, each addressing issues of importance to our profession, with, understandably, a particular recent emphasis on data science. I was left more than a little intimidated. I toyed with trying to convince the authors that plagiarism was the highest form of flattery, but decided that this was not an appropriate solution, and so, instead, what follows is a somewhat more personal reflection on statistics, probability and, of course, the IMS.
Nonetheless, I shall draw upon the 2015 address of Erwin Bolthausen. Erwin opened with a question:
If one opens any scientific work about a topic where statistics plays a role, there are usually probabilistic concepts behind. How does it then come that probability theory and statistics, in research, have become more and more separated? The answer is to some extent evident:
Probability theory has nowadays many relations with other mathematical fields, and also with applied fields outside statistics.
For modern statistics, probability is just one crucial basis, but there are many more, often also non-mathematical ones. For instance, one has to decide which probabilistic models lead to computationally feasible procedures, and still mirror the reality close enough. This cannot be answered by probability theory.
He then goes on to remark on the lack of statistical training amongst probabilists, at least in Europe, and to say that:
I think that in the modern development of probability, the relations with pure mathematics and with mathematical physics have become stronger than those with statistics.
I would perhaps put a slightly different spin (a pun for those who have read Erwin’s excellent summary of some of the most spectacular developments of modern probability theory) on the situation. And, of course, given my own research interests, I have to mention the tremendous impact that biology has had on all of the mathematical (including statistical) sciences.
It is true that both statistics and probability have grown at an unprecedented rate in recent years—we are told that this is the golden age of statistics; equally, we see probabilistic approaches underpinning huge swathes of modern mathematics. And, conversely, statistics and probability are calling upon a broader and broader range of techniques from the rest of the mathematical sciences—for example, one needs a far greater understanding of abstract algebra than mine to really come to grips with rough paths (or more generally, Martin Hairer’s regularity structures). In turn, the father of rough paths, Terry Lyons, now spends a large part of his time at the Alan Turing Institute in London (the UK’s National Institute for Data Science and Artificial Intelligence), developing applications of rough paths to data science.
The importance of statistical and probabilistic techniques across science has never been more widely recognised. I thought that I’d share a quote, which I learned from Sebastian Schreiber, from the Department of Ecology at UC Davis. It is from the English poet and dramatist John Gay (1685–1732):
Lest men suspect your tale untrue, Keep probability in view.
I confess that including this here is a little gratuitous, but I rather like it. (I’m also trying to match Erwin’s earliest citation. I’m failing of course: Erwin referenced Ars Conjectandi, published in 1713, but this quote is from “The painter who pleased everybody and nobody”, Fable XVIII in Gay’s first book of fables, published in 1727.) Here’s the whole of the first verse:
Lest men suspect your tale untrue, Keep probability in view.
The traveller leaping o’er those bounds, The credit of his book confounds.
Who with his tongue hath armies routed, Makes even his real courage doubted:
But flattery never seems absurd; The flattered always take your word:
Impossibilities seem just; They take the strongest praise on trust.
It seems to me that, despite when he lived, Gay had the makings of a statistician as well as a politician. And perhaps there is a lesson for data science there. Gay tells us that if we make unrealistic claims, people won’t even believe in the results and applications that we really do have. Certainly a very large part of the UK Industrial Strategy seems to be founded upon the claim that data science (and perhaps more especially AI and Machine Learning) is the solution to all our economic and social woes; but if we are really to rely upon data science as an underpinning technology in our daily lives, so that those lives very literally depend upon it, then it must maintain its scientific integrity: “Keep probability in view”.
My point is that I don’t think that it is a problem that each and every one of us is part of a huge intellectual landscape, of which we only really understand some very small part. In fact, we have been in this boat for a long time. Science is a massive continuum; gone are the days of Renaissance (usually) man, mastering all of natural philosophy (and probably writing a few poems on the side). There is simply too much of it. As long as we can communicate with one another, we can combine our expertise, often to stunning effect.
When I was a young Research Fellow in Oxford, in the College where the great mathematician G.H. Hardy had been a Professor, other Fellows were a little too fond of reminding me that Hardy had believed that mathematicians did all their great work before the age of forty.
Young men should prove theorems, old men should write books.
I found this rather depressing—but a distinguished Professor, already considerably beyond his best-before date if we were to believe Hardy, explained to me that yes, mathematical scientists are supposed to be at the height of their computational powers when young, but there is also a “knowledge” distribution—as we grow older we know more and more (I confess that I am no longer convinced that this increases indefinitely); our power as scientists is a convolution of these distributions. In fact, these days many of us work in teams; we can combine the breathtaking speed and skill of our students with the knowledge and experience of scientists with complementary expertise. Some of our colleagues, especially at the more applied end of the field, are involved in huge collaborations. They are just one part of a complex picture—but a vital part.
We also have to be open to the idea that work of no immediate obvious practical benefit may still be of lasting importance. Hardy, of course, believed that mathematics would never be of any practical use and indeed he took pride in this:
No discovery of mine has made, or is likely to make, directly or indirectly, for good or ill, the least difference to the amenity of the world.
He was wrong, of course; his work has been of tremendous practical importance. But he also knew about teams (not just through his cricket team, “the Mathematicals”). He collaborated widely and, especially impressive for the time, internationally. His network included Pólya, Cramér, Wiener. I wonder how he would have felt to learn that probably more people know his name through the Hardy–Weinberg equilibrium than through his profound contributions to analysis and number theory.
Teamwork requires a team. That in turn requires the networks that Hardy so successfully cultivated. And I claim that here the IMS has two important roles to play. The first, most obvious, is to facilitate and cultivate those networks (at least within statistical science). This is the purpose of IMS groups. They have suffered varied fates, but what is clear is that the responsibility for running a group cannot rest on a single person’s shoulders indefinitely—without some sort of scaffold, the whole thing lasts for as long as the energy of its founder. So when Sofia Olhede and Patrick Wolfe offered to take over the Data Science Group last year, we agreed that we would put in place a minimal governance structure, borrowed to some extent from that of the highly successful New Researchers Group. If, as I hope, the group flourishes, then their terms of reference can provide a loose template for others.
Crucially, the Data Science Group has an executive committee that not only reflects the geographic diversity of the IMS, but also the immense range of interests overlapping data science, including high-dimensional statistics, biostatistics, algebraic statistics, Bayesian methods, big observational data, probabilistic methods, and post-secondary education in data science. We really want this to succeed, and I’d like to thank Sofia and Patrick for everything that they have done so far—anyone who has put together a scientific program committee will know just how hard it is to achieve subject, geographic, and gender diversity, but with their executive committee they have managed just that. It has taken a while for us to get this far, but the Data Science Group has an email address [datascience@imstat.org] and a website [http://groups.imstat.org/datascience/], and a session set up at the JSM in Vancouver; and much more is planned for next year.
Groups are only one way in which the IMS can help. The pressure on less established (and possibly also more established) researchers is to publish large numbers of papers, which in turn incentivizes a very narrow and specialized view. This can lead to a very distorted view of the world. Another distinguished Oxford mathematician had a very nice allegory of this. Imagine you are in a dark room. You turn on a lamp at your desk and the objects that are illuminated suddenly seem large and important. If I move the light and shine it somewhere else, other objects take on more importance. When someone turns on the overhead light, I see that everything is equally important (and that what I was looking for was not under my light at all). Although most of us will always be rather specialized, we need to be able to stand back and take a broader view.
Of course, the annual IMS New Researchers Conference is a great contribution here: people from very different backgrounds meet and exchange scientific ideas, as well as tips on how to navigate the treacherous waters of academia, in a relaxed environment. But this only reaches a relatively small number of people. Another extremely important contribution is through our scientific programmes and, in particular, the special lectures. The IMS Medallion and named lectures (some in collaboration with the Bernoulli Society) provide an opportunity to showcase, at an accessible level, some of the most exciting research from across our discipline. More thanks are certainly in order here, both to the New Researchers Group, for inviting me along to their meeting again this year, and to the Committee on Special Lectures, who have once again put together a phenomenal slate of lectures.
I want to say a little more about governance. I have spent much of my academic life in Oxford. If anyone wants to learn about the ways in which an Oxbridge [i.e. Oxford or Cambridge] College is governed, they should take a look at Microcosmographia academica, written by Francis Cornford, the husband of Charles Darwin’s daughter Frances. It is a satire on university politics which came out in 1908, but in places it still rings very true. Cornford is very good at drawing out the reasons for doing nothing. I particularly like the Principle of Unripe Time: the argument is that although a particular action should be taken, now is not quite the right moment to take it. He goes on to say,
Time, by the way, is like the medlar [a fruit]; it has the trick of going rotten before it is ripe.
The extraordinarily long institutional memories in Oxbridge Colleges, combined with the application of Cornford’s principles of academic government, can make it extremely difficult to elicit change. The IMS has almost the opposite problem. For us, institutional memory is rather short. We (rightly) have three-year terms of office on essentially all our committees, and because we are spread across the globe, new members of committees don’t routinely run into old-timers and gossip at the water cooler.
Here, I want to point to a specific issue. We have just announced the election of a list of twenty fantastic new Fellows. Every single one of them is an outstanding scientist and deserves their election. However, there are only two women among them. This is not the fault of the Committee on Fellows, who performed their charge admirably; only two women were in fact nominated. This prompted us to take a look at nominations over a longer period and we see a cyclical trend. It seems to be that numbers fall, action is taken (probably by an individual), numbers increase, things look fine, everyone relaxes, numbers fall.
I have never thought of myself as a diversity champion, and don’t find it to be a particularly comfortable role. Indeed, I find the issues to be extremely difficult. But in 1867 the British philosopher and political theorist John Stuart Mill delivered an inaugural address at the University of St Andrews in which he said,
Let not any one pacify his conscience by the delusion that he can do no harm if he takes no part, and forms no opinion.
He goes on to say something similar to a dictum often (incorrectly it seems) attributed to the British statesman Edmund Burke around 1800:
All that is required for the triumph of evil is that good men do nothing.
Evil is a strong word. Even Cornford only regarded women as the second most dangerous threat to the young academic; young men in a hurry (those data scientists not paying heed to the statistical underpinnings, perhaps?) came in at number one—but it is certainly wrong that more women are not being nominated. Burke did say something relevant:
No man, who is not inflamed by vain-glory into enthusiasm, can flatter himself that his single, unsupported, desultory, unsystematic endeavours are of power to defeat [evil].
So I am certainly, rather belatedly, willing to step forward and do my bit. But I can’t do it alone and I have been trying to get some structures in place to help us to break the cycle. Council has agreed guidelines on unconscious bias and conflicts of interest for all our committees, and to the creation of a diversity committee. But we all need to be much more proactive, especially in seeking nominations.
I hope that an allegory of my own will prove appropriate. There is a very famous bridge [pictured above] in Cambridge, England, often called “the mathematical bridge.” It was designed not, as many believe, by Newton, but by William Etheridge (a distant relative) in 1748, and built by James Essex in 1749. The design is based on that of the wooden structures used at the time as supports upon which to build stone bridges. Etheridge himself was a carpenter who worked on the building of the first bridge to cross the River Thames at Westminster in London. He is credited with inventing an underwater saw to cut through the piles that ran down into the riverbed, so that once the stone bridge was complete, the wooden supporting structure could be lowered into the water and floated away. Maybe one day we’ll be able to allow the diversity bridge to float down the river…
Diversity means much more than gender. Diversity is ethnicity, geography, discipline, … . Our strength as a society comes from that diversity. It is not without its challenges, but equally there is no doubting the rewards. It is also not just about diversity among the Fellows and across our scientific programme; we need diversity across our membership—the broadening of the membership is lagging behind the broadening of our journals. Part of the issue in Europe is the division between mathematics and statistics in our degree structures and training. Erwin pointed to the lack of statistical training among European probabilists and I think this leads to a lack of proper appreciation and respect. I’m sure that when I was a graduate student I would have taken much less pride in my own intellectual “family tree” than I do now: I was supervised by David Edwards, who led the functional analysis group in Oxford. He was supervised by David Kendall, who was supervised by Maurice Bartlett, and continuing backwards we have John Wishart, Karl Pearson and Sir Francis Galton. Not a bad line-up. In fact, I only discovered this fairly recently and realised that, far from being rebellious and striking out on my own when I left Oxford and functional analysis behind, in this company I was merely reverting to type. I have heard some question the relevance of the IMS to probabilists (although I notice that those asking routinely publish in our journals). I am still far from being a statistician—most of my work is either concerned with infinite-dimensional stochastic processes or population genetics (or more likely both) and I am still acutely aware of my lamentable lack of knowledge of statistics—but that doesn’t reduce the relevance of the IMS to me. 
In fact, I had a quick scan of the papers I’ve written in the last couple of years and realised that I had used results of at least six past-Presidents, not to mention the person who talked me into accepting the role of President, and several of our new Fellows. And, yes, I try to publish in our journals, because they are simply the best.
I’d like to reiterate to non-members that by joining the IMS, one is in a position to help shape the scientific programme, the future of our journals, and the contribution to the profession of this great society.
Fortunately, I didn’t set out with grand aims when I accepted the gavel from Jon just under a year ago. I hope that some of the things that have been initiated will reach fruition over the next year. Of course, Presidents don’t do the real work; all they do is set up an ad hoc committee and issue it with a charge, and I would like to say a huge thank you to all those colleagues who responded so generously to my requests for input, especially the very large number whom I have only ever ‘e-met’, and so have never been able to thank in person.
As I come to the end of my term as President, I still have an immense sense of pride in the IMS. And so I’ll end by plagiarising my own piece in the Bulletin a year ago: The IMS is a badge of academic quality; we publish outstanding journals and our Committee on Special Lectures have once again excelled themselves in their contribution to the scientific programme. But most of all, the IMS is a community of scholars that supports and nourishes talent from right across the spectrum of our discipline. Long may it thrive!