We introduce the first two in a series of IMS special lecture previews for 2018. Richard Samworth and Thomas Mikosch are two of this year’s Medallion Lecturers. Both will deliver their lectures at the IMS Annual Meeting in Vilnius, Lithuania, July 2–6, 2018. The program will be announced soon. We’ll bring you more lecture previews in the next issues.


Medallion preview: Richard Samworth

Richard Samworth is the Professor of Statistical Science and Director of the Statistical Laboratory at the University of Cambridge. He obtained his PhD in Statistics, also from the University of Cambridge, in 2004. His main research interests are in nonparametric and high-dimensional statistical inference. Particular topics include nonparametric function estimation problems (including under shape constraints), nonparametric classification, high-dimensional variable selection and dimension reduction. Richard serves as an Associate Editor for the Annals of Statistics and Statistical Science, as well as the Journal of the American Statistical Association. He has been awarded the Adams Prize (2017, joint with Graham Cormode), a Leverhulme prize (2014), the Royal Statistical Society’s Guy Medal in Bronze (2012) and Research prize (2008), and is an ASA Fellow (2015) and IMS Fellow (2014). Richard’s Medallion Lecture will be given at the IMS Vilnius meeting, July 2–6, 2018.

Efficient entropy estimation, independence testing and more… all with k-nearest neighbour distances

Nearest neighbour methods are most commonly associated with classification problems, but in fact they are very flexible and can be applied in a wide variety of statistical tasks. They are conceptually simple, can be computed easily even in multivariate problems, and we will argue in this talk that they can lead to methods with very attractive statistical properties. Our main focus is on entropy estimation [1] and independence testing [2], though if time permits, we may discuss other applications.

It was the founding father of information theory, Claude Shannon, who recognised the importance, as a measure of unpredictability, of the density functional

$H(f) = -\int f \log f.$

The polymath John von Neumann advised him to call it “entropy” for two reasons: “In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, no one really knows what entropy really is, so in a debate you will always have the advantage”! In statistical contexts, it is often the estimation of entropy from a random sample that is of main interest, e.g. in goodness-of-fit tests of normality or uniformity, independent component analysis and feature selection in classification.

Kozachenko and Leonenko [3] proposed an intriguing closed-form estimator of entropy based on kth nearest neighbour distances; it also involves both the volume of the unit ball in d dimensions and the digamma function. Remarkably, under appropriate conditions, it turns out that a weighted generalisation of this estimator is efficient in arbitrary dimensions.
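To make the ingredients just listed concrete, here is a minimal sketch of the basic (unweighted) Kozachenko–Leonenko estimator, written in the form $\hat H = \frac{1}{n}\sum_i \log\{\rho_{(k),i}^d V_d (n-1) / e^{\psi(k)}\}$, where $\rho_{(k),i}$ is the distance from the ith point to its kth nearest neighbour, $V_d$ is the volume of the unit ball and $\psi$ is the digamma function. The function names, the brute-force neighbour search and the use of Euclidean distance are illustrative assumptions; this is not the weighted estimator of [1], and it assumes the sample points are distinct.

```python
import math

# Euler-Mascheroni constant, needed for the digamma function at integers.
EULER_GAMMA = 0.5772156649015329


def digamma_int(k):
    """Digamma function at a positive integer: psi(k) = -gamma + sum_{j<k} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, k))


def unit_ball_volume(d):
    """Volume of the Euclidean unit ball in d dimensions."""
    return math.pi ** (d / 2) / math.gamma(1 + d / 2)


def kl_entropy(points, k=1):
    """Unweighted Kozachenko-Leonenko entropy estimate (in nats).

    points: sequence of equal-length tuples; k: which nearest neighbour to use.
    Brute-force O(n^2 log n) search, so only suitable for small samples.
    Assumes all points are distinct (a zero distance would give log(0)).
    """
    n = len(points)
    d = len(points[0])
    if not 1 <= k <= n - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    log_dist_sum = 0.0
    for i, x in enumerate(points):
        # Distances from x to every other sample point, in increasing order.
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        log_dist_sum += math.log(dists[k - 1])
    return (
        (d / n) * log_dist_sum
        + math.log(unit_ball_volume(d))
        + math.log(n - 1)
        - digamma_int(k)
    )
```

For example, calling `kl_entropy(sample, k=3)` on a few hundred draws from a standard normal distribution should return a value near $\frac{1}{2}\log(2\pi e)$, the entropy of that distribution.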

Testing independence and estimating dependence are well-established areas of statistics, with the related idea of correlation dating back to Francis Galton’s 19th century work, which was subsequently expanded upon by Karl Pearson. Mutual information, a close cousin of entropy, characterises the dependence between two random vectors X and Y in a particularly convenient way. We can therefore adapt our entropy estimator to propose a new test of independence, which we call MINT, short for Mutual INformation Test. As well as having guaranteed nominal size, our test is powerful in the sense that it can detect alternatives whose mutual information is surprisingly small. We will also show how modifications of these ideas can be used to provide a new goodness-of-fit test for normal linear models.
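The link between entropy and mutual information that makes this adaptation natural is a standard identity. Writing $f_X$, $f_Y$ and $f_{X,Y}$ for the marginal and joint densities (notation assumed here for illustration, and assuming all three entropies are finite), the mutual information $I(X;Y)$ satisfies

$I(X;Y) = H(f_X) + H(f_Y) - H(f_{X,Y}).$

Moreover, $I(X;Y) \geq 0$, with equality if and only if X and Y are independent, so testing independence amounts to testing whether $I(X;Y) = 0$, and each term on the right-hand side can in principle be estimated by an entropy estimator.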


[1] Berrett, T.B., Samworth, R.J. and Yuan, M. (2018) Efficient multivariate entropy estimation via k-nearest neighbour distances. Ann. Statist., to appear.

[2] Berrett, T.B. and Samworth, R.J. (2017) Nonparametric independence testing via mutual information.

[3] Kozachenko, L.F. and Leonenko, N.N. (1987) Sample estimate of the entropy of a random vector. Probl. Inform. Transm., 23, 95–101.



Medallion preview: Thomas Mikosch

Thomas Mikosch received his PhD in Probability Theory at the University of Leningrad (St. Petersburg) in 1984. He is Professor of Actuarial Science at the University of Copenhagen. His scientific interests are at the interface of applied probability and mathematical statistics. In particular, he is interested in heavy-tail phenomena, extreme value theory, time series analysis, and random matrix theory. He has published about 130 scientific articles and five books. Thomas is a member of the Bernoulli Society (BS), Danish Statistical Association, Danish Association of Actuaries, Danish Royal Society of Sciences and Letters, and is an IMS Fellow. He has (co-)organized numerous conferences, workshops and PhD schools. Currently, he is Associate Editor of various journals, Editor of Bernoulli and European Actuarial Journal, EiC of the Extremes Journal, and he was the EiC of Stochastic Processes and their Applications in 2009–2012. He is one of the editors of the Springer book series Operations Research and Financial Engineering. He has served on the Itô Prize Committee since 2009. In the BS he chairs the Publications Committee, is Publications Secretary and a member of the Executive Council. In 2018 he was awarded the Alexander von Humboldt Research Prize. Thomas will also deliver his Medallion Lecture at the IMS Vilnius meeting, July 2–6, 2018.

Regular variation and heavy-tail large deviations for time series

The goal of this lecture is to present some of the recent results on heavy-tail modeling for time series and the analysis of their extremes.

Over the last 10–15 years, research in extreme value theory has focused on the interplay between the serial extremal dependence structure and the tails of time series. In this context, heavy-tailed time series (as appearing in finance, climate research, hydrology, and telecommunications) have been studied in detail, leading to an elegant probabilistic theory and statistical applications.

Heavy tails of the finite-dimensional distributions are well described by multivariate regular variation: it combines power-law tails of the marginal distributions with a flexible dependence structure which describes the directions in which extremes are most likely to occur; see Resnick (2007) for an introduction to multivariate regular variation.
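For orientation, one common formulation of this notion (stated here in assumed notation, not necessarily that of the lecture) says that a random vector X in $\mathbb{R}^d$ is regularly varying with index $\alpha > 0$ if there is a probability measure $\Theta$ on the unit sphere such that, for every $t > 0$,

$\frac{P(|X| > tx,\ X/|X| \in \cdot)}{P(|X| > x)} \to t^{-\alpha}\,\Theta(\cdot), \quad x \to \infty,$

in the sense of weak convergence. The factor $t^{-\alpha}$ carries the power-law tail of the norm, while the spectral measure $\Theta$ describes the directions in which extremes are most likely to occur.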

A second line of research has continued through the years but attracted less attention: heavy-tail large deviations. In the 1960s and 1970s A.V. and S.V. Nagaev started studying the probability of the rare event that a random walk with iid heavy-tailed step sizes would exceed a very high threshold far beyond the normalization prescribed by the central limit theorem. In the case of subexponential (in particular regularly varying) distributions the tail of the random walk above high thresholds is essentially determined by the maximum step size. Later, related results were derived for time series models by Davis and Hsing (1995), Mikosch and Wintenberger (2014, 2016), among others. Here, the main difficulty is to take into account clustering effects of the random walk above high thresholds.
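The statement that the tail of the random walk is essentially determined by the maximum step size can be illustrated with a small simulation. The sketch below is only a toy example under assumptions that are not in the text: iid Pareto steps with tail $P(X > x) = x^{-\alpha}$ for $x \geq 1$, a fixed number of steps, and hypothetical parameter values. It counts how often the sum and the maximum exceed a high threshold; for heavy-tailed steps the two counts should be of comparable size.

```python
import random


def exceedance_counts(alpha=1.5, n_steps=10, threshold=500.0,
                      trials=100_000, seed=0):
    """Count exceedances of a high threshold by the sum and by the maximum.

    Steps are iid Pareto(alpha) with P(X > x) = x**(-alpha) for x >= 1.
    Returns (sum_exceeds, max_exceeds) over the simulated random walks.
    """
    rng = random.Random(seed)
    sum_exceeds = 0
    max_exceeds = 0
    for _ in range(trials):
        steps = [rng.paretovariate(alpha) for _ in range(n_steps)]
        if sum(steps) > threshold:
            sum_exceeds += 1
        if max(steps) > threshold:
            max_exceeds += 1
    return sum_exceeds, max_exceeds
```

Since the steps are positive, the sum exceeds the threshold whenever the maximum does, so the first count is never smaller than the second. Both counts can be compared with the rough approximation `trials * n_steps * threshold ** -alpha`, the expected number of walks containing one very large step.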

Regular variation and heavy-tail large deviations are two aspects of dependence modeling in an extreme world. They are similar in the sense that they are closely related to the weak convergence of suitable point processes. Indeed, both regular variation and heavy-tail large deviations are defined via the vague convergence of suitably scaled probability measures whose (infinite) limit measure can be interpreted as the intensity measure of a Poisson process. In the heavy-tailed time series world this relationship opens the door to the Poisson approximation of extreme objects such as the upper order statistics of a univariate sample, the largest eigenvalues of the sample covariance matrix of a very high-dimensional time series, and functionals acting on them.


[1] Davis, R. A. and Hsing, T. (1995) Point process and partial sum convergence for weakly dependent random variables with infinite variance. Ann. Probab., 23, 879–917.

[2] Mikosch, T. and Wintenberger, O. (2014) The cluster index of regularly varying sequences with applications to limit theory for functions of multivariate Markov chains. Probab. Theory Rel. Fields, 159, 157–196.

[3] Mikosch, T. and Wintenberger, O. (2016) A large deviations approach to limit theory for heavy-tailed time series. Probab. Theory Rel. Fields, 166, 233–269.

[4] Nagaev, A. V. (1969) Integral limit theorems for large deviations when Cramér’s condition is not fulfilled I, II. Theory Probab. Appl., 14, 51–64 and 193–208.

[5] Nagaev, S. V. (1979) Large deviations of sums of independent random variables. Ann. Probab., 7, 745–789.

[6] Resnick, S. I. (2007) Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer, New York.