Contributing Editor David Hand, Imperial College London, explains how a US state lottery was gamed:
We statisticians all know that buying lottery tickets is a fool’s game. Unless, that is, you regard the warm glow from dreaming about what you would do if you did win as worth the cost of a ticket. Short of fraud, it is not possible to change the odds that any one ticket will win. But you can change the chance that you will win by changing the number of tickets that you buy. Buy enough tickets—all possible combinations of lottery numbers—and you are guaranteed to win (a share of) the jackpot. Since, in most lotteries, the jackpot is rolled over to the next week if no-one wins, and since sometimes this can happen for many weeks in a row, the jackpot can build up to such huge sums that the expected winnings exceed the cost of the tickets.
This requires considerable financial resources and impressive organisation (possibly buying many millions of tickets between two lottery draws), but several groups have attempted it, with some measure of success.
Other lotteries, however, have had a different system when the jackpot was not won. They accumulated the jackpot for a while, but when it exceeded a certain sum, they rolled it down and increased the sizes of the lesser prizes.
This was true of the Michigan Winfall game, for example. If the jackpot rolled up to over $5 million, the next draw would have no jackpot, but instead the money would be distributed across tickets matching fewer than six numbers. The brochure for the lottery helpfully gave the probabilities of three numbers winning a prize, four numbers winning, and five numbers winning. It was clear (at least to those who understood such things) that a roll-down meant the expected winnings exceeded the cost of the tickets. Indeed, on a roll-down, payouts for these lesser wins would be substantially greater than what they would normally have been without a roll-down.
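The arithmetic behind that judgement can be sketched in a few lines of Python. This is only an illustration: the pool size of 49, the $1 ticket price, and both prize tables are assumptions made up for the example, not figures taken from the brochure. Only the form of the calculation (match probabilities multiplied by prizes, compared with the ticket price) reflects the argument above.

```python
from math import comb

POOL = 49  # ASSUMED size of the number pool (not stated in the article)
PICK = 6   # numbers chosen per ticket


def match_probability(k, pool=POOL, pick=PICK):
    """Probability that one ticket matches exactly k of the drawn numbers."""
    return comb(pick, k) * comb(pool - pick, pick - k) / comb(pool, pick)


def expected_winnings(prizes):
    """Expected payout of one ticket, given a {matches: prize} table."""
    return sum(match_probability(k) * prize for k, prize in prizes.items())


TICKET_PRICE = 1.0  # ASSUMED ticket price in dollars

# HYPOTHETICAL prize tables, invented purely for illustration.
normal_prizes = {3: 5, 4: 50, 5: 2_500}
roll_down_prizes = {3: 50, 4: 1_000, 5: 50_000}

for k in (3, 4, 5):
    print(f"P(match {k}) = {match_probability(k):.6f}")
print(f"normal draw, lesser prizes only: {expected_winnings(normal_prizes):.2f}")
print(f"roll-down draw:                  {expected_winnings(roll_down_prizes):.2f}")
```

With these invented prizes, the expected payout per ticket is below the ticket price on a normal draw and above it on a roll-down draw, which is the pattern the brochure's probabilities would have revealed.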
This property was spotted by Jerry Selbee, a then recently retired convenience store owner, who took advantage of it. Starting small, he gradually built up an operation, creating a company, GS Investment Strategies LLC, with shares owned by family and friends, which won substantial sums of money. Good things don’t last forever, however, and in 2005 that lottery was shut down.
But then a similar lottery was launched in Massachusetts, the Cash Winfall, and Jerry organised a system to buy tickets for that. By 2009 he had won more than $20 million, making a profit of more than $5 million.
The roll-down property of these lotteries is simple enough, and Jerry wasn’t the only one to spot it. Others did too, including a group of MIT students and a biomedical researcher at Boston University. Competition between these groups meant that each won less than it had hoped, so the MIT group devised a strategy to tackle this. Spotting a draw where the jackpot was too small to trigger a roll-down, they suddenly bought $1.4 million worth of tickets, pushing the jackpot over the threshold before the operators had time to announce that a roll-down would take place. Not expecting a roll-down, Jerry Selbee did not bother to buy tickets, and lost out on a substantial win.
You can read more about this, and how it all ended, at https://highline.huffingtonpost.com/articles/en/lotto-winners/.
From a statistical perspective, the important point here is that the roll-down lotteries had a different structure from the rollover lotteries. To make valid inferences, or to take advantage of the structure, you need to be sure you understand that structure. This is just as true for statistical modelling as it is for lotteries.
A very familiar example is the paired-comparisons t-test. Ignoring the pairing, and hence the (typically positive) correlation within pairs, results in an overestimate of the variance of the difference between the group means, and so a lower probability of detecting a genuine difference.
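A small numerical sketch makes the point concrete. The six pairs of measurements below are invented for illustration, and the unpaired standard error is the simple unpooled form. The only thing the example is meant to show is how much larger the standard error of the mean difference becomes when the pairing is thrown away.

```python
from math import sqrt
from statistics import mean, variance

# HYPOTHETICAL paired measurements on six subjects (made-up numbers).
before = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
after = [11.0, 13.5, 15.0, 17.5, 19.0, 21.5]
n = len(before)

# Paired analysis: work with the within-pair differences.
diffs = [a - b for a, b in zip(after, before)]
mean_diff = mean(diffs)
se_paired = sqrt(variance(diffs) / n)

# Unpaired analysis: treat the two columns as independent samples,
# which ignores the strong positive correlation within pairs.
se_unpaired = sqrt(variance(before) / n + variance(after) / n)

t_paired = mean_diff / se_paired
t_unpaired = mean_diff / se_unpaired

print(f"mean difference  = {mean_diff:.2f}")
print(f"paired:   SE = {se_paired:.3f}, t = {t_paired:.2f}")
print(f"unpaired: SE = {se_unpaired:.3f}, t = {t_unpaired:.2f}")
```

The estimated difference is the same in both analyses; only its standard error changes, and with it the t statistic and the chance of detecting the effect.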
Another, less familiar, example is in supervised classification. Here, systems (e.g. in medical diagnosis, speech recognition, and so on) implicitly or explicitly compare an estimated probability with a threshold, assigning an object to one class if the estimate is above the threshold and to another class otherwise. But some situations have extra structure. In classifying human chromosomes, for example, rather than assigning each chromosome to a class individually, we can take advantage of the known fact that they come in twenty-two pairs, plus a couple of sex chromosomes. Looking at the overall allocation of chromosomes to classes results in better classification than assigning each one individually.
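The idea can be illustrated with a toy version of the problem. Everything here is an assumption made for the sketch: four objects, two classes known to contain exactly two objects each, and invented probability estimates. The brute-force search over allocations is just one simple way to impose the constraint, workable only for tiny problems, and is not a description of how chromosome classifiers are actually built.

```python
from itertools import permutations
from math import prod

# HYPOTHETICAL estimated class probabilities for four objects.
probs = [
    {"A": 0.90, "B": 0.10},
    {"A": 0.55, "B": 0.45},
    {"A": 0.60, "B": 0.40},
    {"A": 0.20, "B": 0.80},
]
# Known structure (assumed): each class contains exactly two objects.
class_sizes = {"A": 2, "B": 2}


def classify_individually(probs):
    """Assign each object to its own most probable class."""
    return [max(p, key=p.get) for p in probs]


def classify_jointly(probs, class_sizes):
    """Pick the allocation with the highest joint probability among
    those that respect the known class sizes (brute-force search)."""
    labels = [c for c, size in class_sizes.items() for _ in range(size)]
    candidates = set(permutations(labels))

    def joint_probability(assignment):
        return prod(p[c] for p, c in zip(probs, assignment))

    return list(max(candidates, key=joint_probability))


print(classify_individually(probs))          # ['A', 'A', 'A', 'B']
print(classify_jointly(probs, class_sizes))  # ['A', 'B', 'A', 'B']
```

Classified one at a time, three objects land in class A, which the known structure says is impossible. The joint allocation moves the least confident of the three into class B.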
The important thing is to look at the big picture, ensuring that you capture all relevant aspects of the phenomenon being modelled. The better your model, the better your understanding and actions.