Institute of Mathematical Statistics | The XL-Files: From t to T

July 17, 2013

Contributing Editor Xiao-Li Meng writes:

Giving a banquet speech after a whole conference day is never easy—few people long for yet another speech. My recent speech also faced an extra challenge. The speech was scheduled after as many karaoke performances as the number of Chinese dishes, except that the former had a much higher variance in quality than the latter. To an audience already saturated with “noise,” delivering a speech with sufficient “signal” requires both an attractive topic and an engaging style. Whereas the latter should be expected of any banquet speaker, I apparently had chosen a tough—and mysterious—topic for this occasion.

“What is t?” That, evidently, was an easy question for a room of statisticians. “But what’s so special about the t-statistic?” Far fewer raised their hands. Although knowing the t-statistic is a minimal requirement for statisticians, not all of us are taught to fully appreciate the conceptual quantum leap made by the discovery of Student’s t more than a century ago. It demonstrated the possibility of moving from “unknowable unknown” to “knowable unknown.” If no inferential pivotal quantities such as t existed, our inference of a normal mean would have to depend on the unknown variance. This dependence then requires a second inference for the variance, which in turn would need a third inference for a similar reason, leading us into an “infinite regress” trap.

Perhaps a reasonable analogy is to consider that in order to know which rank list (for example, who is the most opinionated statistician) to trust, we need to know which ranker is the most trustworthy. This would then require a rank list of the rankers. But then we need to know how trustworthy this ranker of the rankers is, leading to a “Catch-22” situation (at least, in theory).

The availability of pivotal quantities such as the t-statistic eliminates the infinite regress, permitting us to make precise probabilistic and completely knowable (e.g., verifiable via simulation) statements about the unknown.
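For readers who want to try the "verifiable via simulation" claim, here is a minimal sketch, not part of the speech. It assumes normal samples of size 5 and uses the tabulated 97.5% quantile of Student's t with 4 degrees of freedom (2.776). The function names, sample size, and choice of sigma values are illustrative assumptions. If t is truly pivotal, the interval based on it should cover the true mean about 95% of the time no matter what the unknown sigma is.

```python
import random
import statistics

# Tabulated 97.5% quantile of Student's t with 4 degrees of freedom (n = 5).
T_CRIT_DF4 = 2.776


def t_statistic(sample, mu):
    """Student's t: (sample mean - mu) / (sample sd / sqrt(n))."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # uses the n - 1 denominator
    return (xbar - mu) / (s / n ** 0.5)


def coverage(mu, sigma, n=5, reps=20000, crit=T_CRIT_DF4, seed=0):
    """Fraction of simulated samples with |t| <= crit, i.e. the share of
    95% t-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(t_statistic(sample, mu)) <= crit:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # Illustrative sigma values; a different seed is used for each run.
    for seed, sigma in enumerate((0.1, 1.0, 50.0), start=1):
        print(f"sigma = {sigma:>5}: coverage = {coverage(3.0, sigma, seed=seed):.3f}")
```

Each printed coverage should land close to 0.95 whatever sigma is used, which is the "knowable unknown" in action: the probability statement can be checked without ever knowing the variance.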

Hold on to your bewilderment about how this could be anything but an “attractive topic,” because I was not yet done generating more bewilderment for my audience!

“So what’s so special about the Behrens–Fisher problem for the two-sample t-test?” Silence settled in the room.

Time for a story then, now that I had the audience’s undivided attention. Years ago I was involved in interviewing a candidate for a lectureship, and we asked him to teach the two-sample t-test. After his sample lecture, I asked the following question: “Why do we still need the t-approximation or normal approximation, when we now can simulate almost any distribution?”

One audience member started to smile, as she saw this was a trap. The candidate did fall into the trap, by answering, “Oh, we need them just for convenience.” The real answer is far deeper. For the Behrens–Fisher problem, there is actually no mathematically meaningful target distribution to be approximated (at least not in the classical sense as with the t-statistic). Yet the usual two-sample t-approximation permits us to achieve “approximately knowable unknown” when it is impossible to achieve “knowable unknown” due to lack of inferential pivotal quantities.
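The contrast with the one-sample case can also be seen by simulation. The sketch below is an illustration under stated assumptions, not a proof: it takes the familiar unpooled two-sample statistic, with hypothetical sample sizes 3 and 15, and estimates the chance that it exceeds a fixed cutoff in absolute value. The function names and the sigma values are made up for the example. It shows only that this particular statistic fails to be pivotal; it does not by itself show that no pivotal quantity exists.

```python
import random
import statistics


def unpooled_t(x, y):
    """Two-sample statistic with separate variance estimates, computed
    under the null hypothesis that the two means are equal."""
    se2 = statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    return (statistics.fmean(x) - statistics.fmean(y)) / se2 ** 0.5


def tail_prob(sigma1, sigma2, n1=3, n2=15, cutoff=2.776, reps=20000, seed=0):
    """Estimate P(|T| > cutoff) when both populations have mean 0."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, sigma1) for _ in range(n1)]
        y = [rng.gauss(0.0, sigma2) for _ in range(n2)]
        if abs(unpooled_t(x, y)) > cutoff:
            exceed += 1
    return exceed / reps


if __name__ == "__main__":
    # Same cutoff, same sample sizes; only the unknown variance ratio changes.
    print("small sample is the noisy one:", tail_prob(10.0, 0.1, seed=1))
    print("large sample is the noisy one:", tail_prob(0.1, 10.0, seed=2))
```

When the small sample carries nearly all the noise, the statistic behaves roughly like a t with 2 degrees of freedom; when the large sample does, it behaves roughly like a t with 14. The two tail probabilities therefore differ by a wide margin, so the sampling distribution depends on the unknown variance ratio, and simulation has no single target distribution to reproduce.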

Of course that would be a dreadful punch line for any banquet speech! But it was a good moment to perplex the audience by giving them an even bigger puzzle: “So, what is T?”

What followed was perhaps the most active audience participation I have ever experienced as a speaker. The answers varied from many expected ones such as “Hotelling’s T” and “Teaching” to cute ones such as “Teasing” (and I was!) and creative ones such as, “‘t to T’ is ‘time to Tenure’!” (this answer won the book that I described in April’s XL-Files, with which I bribed the audience for their attention).

The answer I intended, which you might have guessed if you read the last XL-Files, is that T represents a T-shaped education.

“Much of the discussion we just had about the t-statistic is an example of depth of scholarship. Although the t-statistic is used every day by so many, those who appreciate its deep and broad implications are far fewer.” Then my punch line: “But depth of scholarship and expertise is only the vertical stroke of the T-shaped education. We also need the horizontal stroke, representing a broad set of knowledge and skills, especially communication skills.”

I then introduced the audience to the Harvard Horizons initiative launched by our graduate school this past May. This program selected 8 PhD candidates across our 57 degree programs, based solely on their research accomplishment. They then went through six weeks of professional training on how to deliver a TED-like talk, but in 5 minutes, to a general audience about their research. How did they do? Check out http://www.gsas.harvard.edu/harvardhorizons.

And how did I do as the banquet speaker? Did I get good laughs or was I laughed at? Well, it is “time to Tease”: come to my next banquet speech to collect your own data!
