Science isn’t about predicting one-off events like election results, but that doesn’t mean the way to make election forecasts scientific (which they should be) is to build “theories of voting.” A number of people have sent me articles on statistical aspects of the recent U.S. election, but I don’t have much to say and I like to keep my blog non-political. I won’t violate this rule in making a couple of comments on Faye Flam’s Nov. 11 article: “Why Science Couldn’t Predict a Trump Presidency”[i].
For many people, Donald Trump’s surprise election victory was a jolt to the very idea that humans are rational creatures. It tore away the comfort of believing that science has rendered our world predictable. The upset led two New York Times reporters to question whether data science could be trusted in medicine and business. A Guardian columnist declared that big data works for physics but breaks down in the realm of human behavior.
But the unexpected result wasn’t a failure of science. Yes, there were multiple, confident forecasts of a win for Clinton, but those emerged from a process that doesn’t qualify as science. And while social scientists weren’t equipped to see a Trump win coming, they have started to test theories of voting behavior that could shed light on why it happened…..
Not that these methods are pseudoscience; in fact, they employ some critical tools of science. The most prominent among those is Bayesian statistics, a way of calculating the probability that something is true or will come true.
Bayesian analysis is a core principle laid out in political forecaster Nate Silver’s book “The Signal and the Noise.” Though developed in the 1700s, Bayesian statistics had a resurgence in the science of the early 21st century. …
Why don’t Bayesian statistics work the same sort of consistent magic for political forecasts? In science, what matters isn’t the forecast but the nature of the models. Scientists are after explicit rules, patterns and insights that explain how the world works. Those give other scientists something to build on — allowing science to self-correct in a way that other intellectual ventures can’t…..
Now that it’s over, there’s still a chance for science to explain why so many people voted for Trump. There are all kinds of guesses and judgments being thrown around about Trump voters — that they’re racist or sexist, or responding to the call of tribalism. Those aren’t the least bit scientific, but they could be turned into testable hypotheses.
You can read the rest of her article here.
Anytime a purportedly scientific method fails, a defender can always maintain the failures weren’t really scientific applications of the method. I think we did see a failure of many of the polling methods as the basis for the best-known forecasts. Methods used by Trump’s internal polling alerted them to what was happening in the “rust belt states” (according to campaign manager and pollster Kellyanne Conway), but the other polls largely missed it. They didn’t really share those internal results, and the attention Trump gave to typically blue states perplexed many [For some other activities kept under wraps, see ii]. Bill Clinton, on the other hand, “had pleaded with Robby Mook, Mrs. Clinton’s campaign manager, to do more outreach with working-class white and rural voters. But his advice fell on deaf ears.” (Link is here.)
Flam suggests, on the basis of her interviews with social scientists, that the way to turn forecasts into science is to build theories of voting. My guess is that’s the wrong way to go (I don’t claim any expertise here). What’s called for is an understanding of the threats to the assumptions in the particular case, with all its idiosyncrasies. The only thing general might be the ways you can go wildly wrong. Do their theories include tunnel vision by pollsters? Perhaps they should have asked: “If you were a person planning to vote for Trump, would you be reluctant to tell me, if I asked?”[iii]. In Trump’s internal polling, they would deliberately ask a number of related questions to ferret out the truth. Of course pollsters are well aware of the “undercover” or “shy” voter who is too worried about giving an unacceptable answer to be frank. If there was ever a case where this would be likely, it’s this, yet it was downplayed. Ironically, one might expect that the more the “shy” voter should have been a concern, the less seriously a pollster would take it. (You can ponder why I say this.) It’s not enough to have a repertoire of errors if they’re not taken seriously.
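To make the “shy” voter worry concrete, here is a minimal arithmetic sketch. It is not anyone’s actual polling model: it assumes a two-candidate race in which a fraction of one candidate’s supporters (the hypothetical `shy_rate`) tell the pollster they back the opponent. The function names and the numbers are illustrative assumptions only.

```python
def observed_share(true_share: float, shy_rate: float) -> float:
    """Share a poll would report for a candidate, assuming (hypothetically)
    that a fraction `shy_rate` of that candidate's supporters claim to
    support the opponent in a two-candidate race."""
    if not (0.0 <= true_share <= 1.0 and 0.0 <= shy_rate <= 1.0):
        raise ValueError("true_share and shy_rate must lie in [0, 1]")
    return true_share * (1.0 - shy_rate)


def observed_margin(true_share: float, shy_rate: float) -> float:
    """Reported lead (candidate minus opponent) under the same assumption."""
    return 2.0 * observed_share(true_share, shy_rate) - 1.0


if __name__ == "__main__":
    # Illustrative numbers only: true support of 51% with 5% of supporters "shy".
    share = observed_share(0.51, 0.05)
    margin = observed_margin(0.51, 0.05)
    print(f"reported share:  {share:.4f}")
    print(f"reported margin: {margin:+.4f}")
```

Under these made-up numbers, a true 51% share is reported as 48.45%, so a real two-point lead shows up as a deficit of about three points. Note that the sample size appears nowhere in the calculation: this kind of bias does not shrink as more people are polled, which is why it has to be probed directly rather than averaged away.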
As for the “consistent magic” of Bayesianism: since in this case we’re talking about an event, frequentists, error statisticians, and Bayesians can talk Bayesianly if they so choose, but my understanding is that most polling is in the form of frequentist interval estimates (perhaps with various weights attached). Maybe, as Flam suggests, some formally combine prior beliefs with the statistical data, but that’s all the more reason to have been ultra-self-critical and to probe how capable the method is at disinterring fundamental flaws in the model. They should have been giving their assumptions a hard time, bending over backwards to disinter biases and self-sealing fallacies, not baking them into the analysis.
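For readers who want the contrast spelled out, here is a rough sketch of the two kinds of calculation mentioned above: a simple frequentist interval estimate for a poll proportion, and a Bayesian update that formally combines a prior with the same data. It assumes a simple random sample, uses the textbook normal-approximation (Wald) interval, and uses a Beta prior purely for convenience; real polls involve weighting and design effects that are ignored here, and the function names are my own.

```python
import math
from statistics import NormalDist


def wald_interval(successes: int, n: int, level: float = 0.95) -> tuple[float, float]:
    """Normal-approximation confidence interval for a proportion,
    assuming a simple random sample (no weighting)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level=0.95
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def beta_posterior_mean(successes: int, n: int,
                        prior_a: float = 1.0, prior_b: float = 1.0) -> float:
    """Posterior mean of the proportion under a Beta(prior_a, prior_b) prior
    and a binomial likelihood; the defaults give a uniform prior."""
    if n < 0 or not 0 <= successes <= n or prior_a <= 0 or prior_b <= 0:
        raise ValueError("invalid counts or prior parameters")
    return (prior_a + successes) / (prior_a + prior_b + n)


if __name__ == "__main__":
    # Illustrative data only: 520 of 1000 respondents favor candidate A.
    low, high = wald_interval(520, 1000)
    print(f"95% interval: ({low:.3f}, {high:.3f})")
    print(f"posterior mean, uniform prior: {beta_posterior_mean(520, 1000):.3f}")
    # A strong prior belief (here worth 1000 pseudo-respondents at 60%)
    # pulls the estimate well away from the observed 52%.
    print(f"posterior mean, strong prior:  {beta_posterior_mean(520, 1000, 600, 400):.3f}")
```

With these invented numbers the interval straddles 50%, while the strong prior moves the estimate from 0.52 to 0.56. That is the point about baking assumptions into the analysis: neither calculation, by itself, tests whether the respondents were answering frankly or whether the prior deserved that much weight.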
Share your thoughts.
[i] Flam is the one who interviewed me, Gelman, Simonsohn, Senn and others for that NYT article on Bayesian and frequentist methods discussed in this post.
[ii] Apparently they also kept hidden in the “Trump bunker” a fairly extensive use of data analytics (on the order of $70 million a month, according to the Bloomberg article “Inside the Trump Bunker”), encouraging people to think it was a fledgling effort. Their polls were in sync with Nate Silver’s, they say, except for the time lag in Silver’s, owing to his reliance on other polls, but their inferences about what voters really thought differed.
Trump’s data scientists, including some from the London firm Cambridge Analytica who worked on the “Leave” side of the Brexit initiative, think they’ve identified a small, fluctuating group of people who are reluctant to admit their support for Trump and may be throwing off public polls. (Inside the Trump Bunker)
The article admits they also worked toward selectively depressing the vote. The overall data analytic project was to be the basis of an enterprise to pursue after a potential loss!
[iii] This is reminiscent of the question that permits you to get at the truth (about the correct road to town) when confronted with people who either always lie or always tell the truth.
In the last few days Trump went to the correct states that could be won. That is a sign that his internal polling was on target.
I suggest that the voting system that is run, I believe, out of Las Vegas, with a monetary prize and continuous updating, would offer the best results currently available: a free input with no questions asked or interpreted.
Nate Silver did a pretty decent job – a lot of people were criticising him for giving Trump such a high chance (~30%) of winning.