A Commentary on “The Kids Are Alright: Divide by n when estimating variance,” by Jeffrey S. Rosenthal, IMS Bulletin (December 2015), Vol. 44, No. 8, Page 9

Dear Editor

Professor Rosenthal’s piece is persuasive and very clearly written. I thank Professor Rosenthal for taking us back to this old concern that never truly goes away. Indeed the basic issue under consideration appears and reappears when one teaches a cohort of new students.

With nearly 40 years of teaching experience now, I have a different, but easy, way to explain why the divisor in the customary sample variance is suddenly $n - 1$ instead of $n$. I suspect that some readers may like my simple argument, below, in favor of the traditional divisor $n - 1$.

Suppose that I have a random sample of $n$ observations $X_1, \ldots, X_n$ from a single population with a population mean $\mu$. Customarily, in many elementary courses, I propose that $\mu$ is estimated by the sample mean, $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. Here, the divisor is $n$ and no one really objects to that idea.

Then comes the idea of variation around $\mu$. First, I explain why no one considers $E[X - \mu]$ as a quantification of variation. The explanation is simple: $E[X - \mu] = E[X] - \mu = 0$ under the population distribution. In other words, the errors in over-estimation and under-estimation of $\mu$ by $X$ cancel out on average.

Thus, many proceed to the next step.

Define a population variation or variance as $\sigma^2$ given by $E[(X - \mu)^2]$, which will be positive unless all observations coincide with $\mu$ (with probability 1). After all, who wants to collect data where every data point is the same, and waste time and money!

So, how should one estimate $\sigma^2$? Well, I begin with $\sum_{i=1}^{n}(X_i - \bar{X})^2$. But I note that $\sum_{i=1}^{n}(X_i - \bar{X})$ is identically zero for any set of $n$ numbers. That is, among the $n$ numbers (residuals) $X_1 - \bar{X}, X_2 - \bar{X}, \ldots, X_n - \bar{X}$, we have exactly $n - 1$ free-riding numbers, since all $n$ residuals add up to zero: the remaining $n$th number is fully determined by the other $n - 1$ free-riding numbers. Thus, when one computes the sample variance, one divides $\sum_{i=1}^{n}(X_i - \bar{X})^2$ by $n - 1$ instead of $n$. In this sense, $n - 1$ is customarily called the “degrees of freedom,” that is, an indication of how many among the $n$ residuals are truly free-riding.
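
The zero-sum constraint on the residuals can be checked numerically. The following is only a minimal sketch: the four data values are hypothetical, chosen for illustration, and the variable names are not taken from the letter.

```python
# Hypothetical data, chosen only for illustration.
data = [5, 7, 8, 12]
n = len(data)
x_bar = sum(data) / n  # sample mean

# Residuals: deviations of each observation from the sample mean.
residuals = [x - x_bar for x in data]

# The residuals add up to zero (exactly here; in general, up to rounding).
total = sum(residuals)

# Hence the n-th residual is fixed once the other n - 1 are known.
implied_last = -sum(residuals[:-1])

print("residuals:", residuals)
print("sum of residuals:", total)
print("last residual:", residuals[-1], "implied by the others:", implied_last)
print("free residuals:", n - 1)
```

Only $n - 1$ of the residuals can be chosen freely here; the last one is forced by the constraint.
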

In a first-year pre-calculus course that is often mandatory for all (or a large majority of) undergraduate students, an argument based on mean squared error (MSE) never really convinces our first-year undergraduates, since they have never heard of MSE prior to taking Stat 100 or Stat 110.

Especially for them, in order to have a painless discourse, I take a very small set of numbers, say, 3, 4, 2, 4, 2 with $n = 5$. Obviously, $\bar{x} = 3$ and

$\sum_{i=1}^{n}(x_i - \bar{x}) = 0 + 1 - 1 + 1 - 1 = 0$

but

$\sum_{i=1}^{n}(x_i - \bar{x})^2 = 0 + 1 + 1 + 1 + 1 = 4$.

Thus, the sample variance should be the customary

$s^2 = \frac{1}{4} \sum_{i=1}^{n} (x_i - \bar{x})^2 = 1$.

The divisor is 4 instead of 5 because 4 is the number of “degrees of freedom,” as explained.
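
For readers who wish to verify the arithmetic, here is a short Python sketch of the same five-number example. The variable names are illustrative assumptions; `statistics.variance` (divisor $n - 1$) and `statistics.pvariance` (divisor $n$) are standard-library functions, used here only as a cross-check.

```python
import statistics

data = [3, 4, 2, 4, 2]
n = len(data)
x_bar = sum(data) / n  # sample mean

residuals = [x - x_bar for x in data]
sum_residuals = sum(residuals)              # residuals add up to zero
sum_squares = sum(r ** 2 for r in residuals)

s2 = sum_squares / (n - 1)        # customary sample variance, divisor n - 1
divide_by_n = sum_squares / n     # alternative with divisor n, for comparison

print("mean:", x_bar)
print("sum of residuals:", sum_residuals)
print("sum of squared residuals:", sum_squares)
print("divide by n - 1:", s2)
print("divide by n:", divide_by_n)

# Cross-check against the standard library.
print("statistics.variance:", statistics.variance(data))
print("statistics.pvariance:", statistics.pvariance(data))
```
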

Nitis Mukhopadhyay
Professor of Statistics
University of Connecticut, Storrs, USA
