(And I didn’t even know this until yesterday.)
Apparently the NFC has won the coin toss in all of the last 14 Super Bowls. As Sean Carroll points out, there’s a 1 in 8192 chance of 14 coin flips all coming out the same way, which via the usual recipe translates into a 3.8 sigma result. In the standard conventions of particle physics, you could get that published as “evidence for” the coin being unfair, but not as a “detection” of unfairness. (“Detection” generally means 5 sigmas. If I’ve done the math correctly, that requires 22 coin flips.)
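For anyone who wants to check those numbers, here is a rough sketch of the conversion. It assumes that "the usual recipe" means a two-sided Gaussian tail, which is my reading rather than something stated above. The function names are made up for illustration.

```python
from statistics import NormalDist


def same_side_probability(n_flips: int) -> float:
    """Chance that n fair flips all land the same way (all one side or all the other)."""
    return 2 * 0.5 ** n_flips


def two_sided_sigma(p_value: float) -> float:
    """Convert a p-value to a number of sigmas, assuming a two-sided Gaussian tail."""
    return NormalDist().inv_cdf(1 - p_value / 2)


def flips_needed(sigma: float) -> int:
    """Smallest run of identical flips whose probability falls below the sigma threshold."""
    # Two-sided tail probability corresponding to the requested number of sigmas.
    threshold = 2 * NormalDist().cdf(-sigma)
    n = 1
    while same_side_probability(n) > threshold:
        n += 1
    return n


if __name__ == "__main__":
    p = same_side_probability(14)
    print(f"p = 1/{round(1 / p)}, about {two_sided_sigma(p):.1f} sigma")
    print(f"flips needed for 5 sigma: {flips_needed(5)}")
```

With a one-sided convention the thresholds shift slightly, so treat the choice of tail as part of the assumption.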
But this fact isn’t really as surprising as that 1/8192 figure makes it sound. The problem is that we notice when strange things happen but not when they don’t. There’s a pretty long list of equally strange coin-flip coincidences that could have happened but didn’t:
- The coin comes up heads (or tails) every time
- The team that calls the toss wins (or loses) every time
- The team that wins the toss wins (or loses) the game every time
etc. (Yes, the last one’s not entirely fair: presumably winning the toss confers some small advantage, so you wouldn’t expect 50-50 probabilities. But the advantage is surely small, and I doubt it’d be big enough to have a dramatic effect over a mere 14 flips.)
So the probability that at least one anomaly of this kind would turn up is many times greater than 1/8192.
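To get a feel for how fast this adds up, here is a minimal sketch. It assumes the candidate coincidences are independent and each has the same 1/8192 chance, which is only an approximation (the patterns in the list are not truly independent of one another). The pattern counts are hypothetical, chosen for illustration.

```python
def prob_any_anomaly(p_single: float, n_patterns: int) -> float:
    """Chance that at least one of n independent patterns, each with probability p, occurs."""
    return 1 - (1 - p_single) ** n_patterns


if __name__ == "__main__":
    p = 1 / 8192
    # Hypothetical counts of "things we would have noticed".
    for n_patterns in (1, 4, 20, 100):
        chance = prob_any_anomaly(p, n_patterns)
        print(f"{n_patterns:>3} patterns: about 1 in {round(1 / chance)}")
```

For small probabilities the result is close to the number of patterns times 1/8192, so every extra coincidence we would have found remarkable dilutes the surprise by roughly that factor.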
Incidentally, this sort of problem is at the heart of one of my current research interests. The question is how — or indeed whether — to explain certain “anomalies” that people have noticed in maps of the cosmic microwave background radiation. The problem is that there’s no good way of telling whether these anomalies need explanation: they could just be chance occurrences that our brains, which have evolved to notice patterns, picked up on. Just think of all the anomalies that could have occurred in the maps but didn’t!
The 14-in-a-row statistic is problematic in another way: it involves going back into the past for precisely 14 years and then stopping. The decision to look at 14 years instead of some other number was made precisely because it yielded a surprising low-probability result. This sort of procedure can lead to very misleading conclusions. It’s more interesting to look at the whole history of the Super Bowl coin toss.
According to one Web site, the NFC has won 31 out of 45 tosses. (I’m too lazy to confirm this myself. I found lists of which team has won the toss over the years, but my ignorance of football prevents me from making use of these lists: I don’t know from memory which teams are NFC and which are AFC, and I didn’t feel like looking them all up.) That imbalance isn’t as unlikely as 14 in a row: you’d expect an imbalance at least this severe about 1.6% of the time. But that’s well below the 5% p-value that people often use to delimit “statistical significance.” So if you believe all of those newspaper articles that report a statistically significant benefit to such-and-such a medicine, you should believe that the Super Bowl coin toss is rigged.
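The 1.6% figure can be reproduced with an exact binomial tail. This is a sketch that assumes a fair coin and counts imbalances in either direction (31 or more wins for either conference). The function name is my own.

```python
from math import comb


def two_sided_binomial_p(wins: int, n: int) -> float:
    """Chance that a fair coin gives a split at least as lopsided as `wins` out of `n`."""
    k = max(wins, n - wins)  # size of the larger side of the split
    # Probability that one particular side gets k or more of the n tosses.
    upper = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double it to count the mirror-image outcome; cap at 1 for near-even splits.
    return min(1.0, 2 * upper)


if __name__ == "__main__":
    p = two_sided_binomial_p(31, 45)
    print(f"31 of 45: p = {p:.3f}")
```

The same function gives 1/8192 for 14 out of 14, which ties it back to the streak figure.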
Contrapositively (people always say “conversely” here, but Mr. Jorgensen, my high school math teacher, would never let me get away with such an error), if you don’t believe that the Super Bowl coin toss is rigged, you should be similarly skeptical about news reports urging you to take resveratrol, or anti-oxidants, or whatever everyone’s talking about these days. (Unless you’re the Tin Man — he definitely should have listened to the advice about anti-oxidants.)