Anirban DasGupta writes:

Sreekumar Nambiar and I graduated the same year from the ISI in Calcutta. A few months ago, as we were shooting the breeze, Sreekumar asked if tea is more of a common man’s drink than coffee. I didn’t know. But it reignited in me that folklore that tea and coffee both give some people some protection against some diseases. Naturally, I did not have easy access to controlled worldwide experiments on this matter. However, with some patience and effort, I could get metadata on it sitting at my desk. I decided to look at rates of disease incidence for heart disease and cancer of all types, and data on consumption per capita of tea, coffee, cigarettes, alcohol of all types, and GDP per capita for 40 countries scattered across the world. I was curious how consumption of tea and coffee relate to disease incidence. Do the metadata suggest any links? Of course, no causation is implied. But at some level, I was in for at least a mild dose of surprise. I thought I would share it with you.

First, I want to emphasize that metadata isn’t as informative as micro-level data from controlled experiments, but it isn’t useless. Second, no applied statistics methodology ever goes unquestioned; for instance, I don’t look at partial correlations here. And third, the topic is worth a peer-reviewed, full-length journal article.

A (non-alphabetical) list of the 40 countries I considered is: South Korea, Australia, Hungary, Croatia, Russia, Poland, Rwanda, Denmark, Kenya, The Netherlands, China, Sudan, Turkey, Syria, Cuba, Myanmar, South Africa, France, Ireland, Greece, UK, Colombia, Argentina, Germany, Chile, Egypt, Jamaica, USA, New Zealand, Italy, Austria, Canada, Israel, Brazil, Sweden, Nigeria, Japan, Iran, India and Mexico.

I looked at these seven variables: death rates (per 100,000) from [1] any type of cancer and [2] heart disease; consumption per capita per year of [3] cigarettes, [4] any type of alcohol (litres), [5] coffee and [6] tea (kg); and [7] per capita GDP in thousands of dollars.

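I calculated the (admittedly unsophisticated) correlation matrix, with rows and columns in the order [1] to [7]; since the matrix is symmetric, only the upper triangle is shown: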
$R = \left( {\begin{array}{ccccccc}
1 & .193 & .376 & .247 & -.136 & .028 & -.033 \\
& 1 & .248 & -.354 & -.484 & .040 & -.482 \\
& & 1 & .146 & .222 & .005 & .376 \\
& & & 1 & .354 & -.164 & .510 \\
&&&& 1 & -.313 & .712 \\
&&&&& 1 & -.080 \\
&&&&&& 1 \\
\end{array}} \right).$
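For readers who want to look up individual entries of R by name rather than by counting rows and columns, here is a minimal Python sketch. It only re-enters the numbers printed above; the short variable names and the helper `corr` are my own labels, and the row/column order is assumed to follow the list [1] to [7].

```python
import numpy as np

# Assumed order of rows/columns: variables [1]..[7] as listed in the text.
names = ["cancer", "heart", "cigarettes", "alcohol", "coffee", "tea", "gdp"]

# Upper-triangular entries copied from the matrix R above, row by row.
upper = [
    [1, .193, .376, .247, -.136, .028, -.033],
    [1, .248, -.354, -.484, .040, -.482],
    [1, .146, .222, .005, .376],
    [1, .354, -.164, .510],
    [1, -.313, .712],
    [1, -.080],
    [1],
]

n = len(names)
R = np.zeros((n, n))
for i, row in enumerate(upper):
    R[i, i:] = row  # fill row i from the diagonal rightwards

# Mirror the strict upper triangle into the lower triangle.
R = R + np.triu(R, k=1).T


def corr(a, b):
    """Look up the correlation between two named variables (hypothetical helper)."""
    return R[names.index(a), names.index(b)]


# Print every correlation involving heart-disease death rates.
for v in names:
    if v != "heart":
        print(f"heart vs {v}: {corr('heart', v):+.3f}")
```

Calling `corr("gdp", "coffee")` or `corr("cancer", "cigarettes")` in the same way retrieves any other pair discussed in the text.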

What struck me is the correlation between coffee consumption and death rates from heart disease: −.484, against only .040 for tea (the −.354 in the same row of R is the entry for alcohol). In countries with higher coffee consumption, death rates from heart disease appear to be somewhat lower. Turning to cancers, the correlation with coffee drinking is −.136, and that with tea is .028. Neither appears to go with noticeably lower cancer death rates, tea even less so than coffee. But, as one may surmise, cigarette smoking is visibly positively correlated with deaths from cancers (.376).

Now let us turn to alcohol and cigarettes against GDP. The correlation of GDP with alcohol consumption is .510, and with cigarette consumption it is .376: richer countries tend to drink more and smoke more. And to come back to my friend Sreekumar’s question of whether tea is more of a common man’s drink than coffee, look at that .712 correlation between GDP and coffee consumption, against −.080 for tea!

At some intuitive level, many of these correlations make sense, although not the exact values. But still this diminutive statistical adventure did spring a surprise or two on me when I weighed coffee against tea and saw the prowling silhouette of the villain that we were all told smoking is.

Now, where is my Gevalia?

To give an overview, here is that data for 20 of the 40 countries: