Institute of Mathematical Statistics | Anirban’s Angle: Midterm Mathematical Musings

Anirban’s Angle: Midterm Mathematical Musings

October 1, 2022

Armchair expert Anirban DasGupta has been thinking about political predictions in the US midterm elections. He writes:

With the midterm elections in the US just around the corner and the next general election not too far away, political analysts and also common people are thinking about which factors influence the outcome of an election. There have been numerous studies, some formal and others informal, on this matter. I know I am not writing here about an original question, but maybe I could say something that the Bulletin’s readers didn’t know.

Let me list a few factors that many others have mentioned as possibly influential: consumer confidence, unemployment rate, gas prices, inflation, the Dow Jones, cost of healthcare, a feel-good index, current involvement in a war, domestic crime rate, Congressional performance, scandals, and the sitting President’s charisma and popularity. Of course, there are many more. But somewhat arbitrarily, I decided to name 12 factors, and these 12.

Naturally, I looked at recent data. The party in the White House lost the following number of seats in the US House in the last 10 midterm elections: 26 (1982), 5 (1986), 8 (1990), 54 (1994), 4 gain (1998), 8 gain (2002), 32 (2006), 63 (2010), 13 (2014), 40 (2018). The sitting presidents at the time of these midterm elections were Ronald Reagan (1982 and 1986), George H.W. Bush (1990), Bill Clinton (1994 and 1998), George W. Bush (2002 and 2006), Barack Obama (2010 and 2014), and Donald Trump (2018). Thus, counting a gain as a negative loss, the conditional empirical expectation of the number of House seats lost given that the sitting President was Republican is about 17.2, and the same conditional expectation for a Democratic sitting President is 31.5. The difference is intriguing. It would be interesting to explain that very substantial difference. The four most prominent outliers in these data are the 54 losses in 1994, the 63 losses in 2010, the 40 losses in 2018, and the gain of 8 in 2002. Going back, Bill Clinton was not yet a popular President in 1994, perhaps Barack Obama had high negatives among a significant portion of Americans in 2010, perhaps there was still a rallying-behind-the-President national sentiment in 2002 after 9/11, and as regards the 40 losses in 2018, we are probably still studying the national perception of Donald Trump’s Presidency.
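
For readers who want to check the two conditional averages, here is a minimal sketch in Python. It uses only the seat counts and presidents listed above; coding a gain as a negative loss and labeling each year "R" or "D" are my own conventions for the illustration, as are the names `midterms` and `mean_loss_by_party`.

```python
from statistics import mean

# Seats lost in the US House by the President's party, as listed above.
# Assumption for this sketch: a gain is coded as a negative loss.
# "R"/"D" is the party of the sitting President in that midterm year.
midterms = {
    1982: (26, "R"), 1986: (5, "R"), 1990: (8, "R"), 1994: (54, "D"),
    1998: (-4, "D"), 2002: (-8, "R"), 2006: (32, "R"), 2010: (63, "D"),
    2014: (13, "D"), 2018: (40, "R"),
}

def mean_loss_by_party(data):
    """Average seats lost, grouped by the sitting President's party."""
    groups = {}
    for loss, party in data.values():
        groups.setdefault(party, []).append(loss)
    return {party: mean(losses) for party, losses in groups.items()}

averages = mean_loss_by_party(midterms)
print(averages)
```

With only six Republican and four Democratic midterms in the sample, each average rests on very few observations, so a single year such as 2010 moves it a lot.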

The lag-one correlation is only −0.017. I was surprised to see the lag-two correlation, a whopping −0.847. What explains that?

Contrary to my expectation, I found very little correlation between the number of seats lost by the party in the White House and inflation just before the election. The correlation with inflation was 0.027. What that probably means is that the true correlation is virtually zero and the calculated value is pure noise. Now as for gas prices, I was stunned to see that the correlation with the number of seats lost is −0.71. The wrong sign goes completely against common sense. Voters were being influenced by something that looms larger than prices at the pump. As an example, in the 2002 elections, during George W. Bush’s presidency, the Republicans actually gained 8 seats, even though the gas price in 2002 was the highest of the last 40 years (adjusted to 2022 dollars).

A plausible bigger factor explaining this is that on the eve of the 2002 midterm election, President Bush was sitting at a 63% approval rating, and it may have made everything else irrelevant. I also calculated the correlation of the number of seats lost with the President’s approval rating on the eve of the election. It was −0.84. This one is very high and the negative correlation is in the right direction.

Elections in any country are notoriously difficult to predict, even when polling data are used. (We didn’t use polling data here at all. We don’t have much at the micro scale at all right now. Also, we can leave that to the professionals who will be doing it.) It does look like predictions at a macro level could be made by using a regression model with unemployment rate, some sort of a quality-of-life index, whether the country is currently involved in a war, health care cost and Presidential approval rating, with an autoregressive error of order two (maybe). That’s quite a few variables. So, we may need to use data further back than 1982 to have enough degrees of freedom to estimate the variances.

Very (very!) naively, if we fit a standard linear model of the number of House seats lost by the President’s party on just the President’s approval rating X₁ and the unemployment rate X₂, then we get the least squares fit y = 89 − 1.6 X₁ + 2.55 X₂. With the current values x₁ = 44 and x₂ = 3.9 (i.e., 44% approval on September 13 and 3.9% unemployment on February 28), the naive predictor forecasts that Democrats will lose 28 seats this November.
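
The arithmetic behind that forecast is easy to reproduce. The sketch below simply plugs the two current values into the fitted equation quoted above; it does not refit the model, and the function name `naive_seat_loss` is just an illustrative choice. Since the printed coefficients are rounded, the result should be read as roughly 28 seats rather than an exact figure.

```python
def naive_seat_loss(approval, unemployment):
    """Evaluate the fitted equation y = 89 - 1.6*X1 + 2.55*X2 quoted above.

    approval     -- President's approval rating, in percent (X1)
    unemployment -- unemployment rate, in percent (X2)
    """
    return 89 - 1.6 * approval + 2.55 * unemployment

# Current values from the text: 44% approval, 3.9% unemployment.
forecast = naive_seat_loss(approval=44, unemployment=3.9)
print(round(forecast, 1))  # 89 - 70.4 + 9.945, i.e. about 28.5
```

By the same equation, each additional point of approval lowers the predicted loss by 1.6 seats, and each additional point of unemployment raises it by 2.55 seats.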

Now, let’s see what happens on November 8. I will be ready with my biryani, pakoras, gulab jamun and Darjeeling tea.