Of course, there were other factors too: turnout was high in rural areas of Rust Belt states, undecided voters going into the polls largely broke for Trump, and turnout and enthusiasm in some Democratic areas were down compared to 2012 and 2008, among others. So, it is entirely mathematically sound for Silver to have made his predictions back in 2016 and today. Some would suggest that people responding to polls didn’t want to admit that they opposed Bradley, lest they seem to oppose him because of his race, which caused his support to look inflated in polling. Trump is not that different. Say you want to infer what percentage of American people want to vote for Trump. As long as your polls are unbiased, you can assume people won’t change their vote too much before the election (probably an easier assumption than unbiasedness), and you have polled enough people, basic probability theory gives a good guarantee that basic polling will give a good estimate. The result? But we obviously don't have anything close to that (which we explain more later). Little unpredictable factors could result in dramatically different outcomes. What plagued many polls was probably an issue of weighting. Because of the radical uncertainty we're facing, shouldn't anyone's forecast model seriously adjust according to these factors beyond the election itself? But the true beauty of Bayesian inference doesn’t stop here – it is that you keep updating your beliefs as you see more data. ... which can cause a less probable election outcome to occur. Nate Silver was NOT right – because he can never be wrong! There are currently many more algorithms available (and their number is likely to grow). We're not sure, but most likely not.
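The claim above that an unbiased poll with enough respondents gives a good estimate can be made concrete with the standard margin-of-error formula for a sample proportion. The following is only a minimal sketch: the function name `poll_margin_of_error`, the 50% support figure, and the sample sizes are illustrative assumptions, and the normal approximation it relies on says nothing about weighting problems or biased samples.

```python
import math


def poll_margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for a sample proportion.

    Uses the normal approximation z * sqrt(p(1-p)/n); z = 1.96 corresponds
    to a 95% confidence level. Assumes a simple random (unbiased) sample.
    """
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


# Hypothetical polls: 50% support measured with different sample sizes.
moe_1000 = poll_margin_of_error(0.5, 1000)
moe_4000 = poll_margin_of_error(0.5, 4000)

print(f"n=1000: +/- {moe_1000:.3f}")
print(f"n=4000: +/- {moe_4000:.3f}")
```

Quadrupling the sample size only halves the margin of error, which is one reason sampling error is rarely the whole story when polls miss.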
So here’s the dilemma proposed by Michael: You can use all the previous election results data, which gives your prediction less variance, but the prediction may very likely be skewed because those elections don’t accurately represent Trump. When we talk about statistical inference – the process that draws conclusions from sample data – two popular frameworks are the frequentist and Bayesian methods. In Bayesian statistics, you assign a probability distribution to all of your unknown parameters and predictions. Another way to estimate X is to go beyond polling, and perhaps use historical election data with some Bayesian inference method. If you see every American’s voting decision as a random variable, in total this could be a chaotic system. When Tiger first told his parents that he was writing a long article on the theories and applications of election forecasting, they said: “nobody cares about your math; just tell us who the winner will be.” This is what millions of voters truly want – clarity, simplicity, and accuracy. Furthermore, elections clearly aren’t a well-posed mathematical system. We hope the brief exploration below helps explain the foundational methodology that Silver uses to forecast. The issue is that the forecasters, through their complex probability models, have made this game easier for themselves. The random variable people really care about, let's call it X, is who is going to win the election, which largely depends on how many people vote for each candidate at some future date. This semester, Tiger, Jack, and Tom have been taking Princeton's 1st-year PhD econometrics sequence with Prof. Chris Sims, who won the Nobel Prize in Economics in 2011 for his work in macroeconomics – more specifically, his pathbreaking application of Bayesian inference to evaluate economic policies.
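To illustrate what it means to assign a probability distribution to an unknown quantity and then update it, here is a minimal sketch using a Beta prior on a candidate's vote share with binomial poll data. Everything specific in it is an assumption made for illustration: the Beta-Binomial model itself, the helper names `update_beta` and `beta_mean`, the prior parameters, and the poll counts. It is not a description of how Silver's model works.

```python
def update_beta(alpha: float, beta: float, votes_for: int, votes_against: int):
    """Conjugate update: Beta(alpha, beta) prior plus binomial poll data
    gives a Beta(alpha + votes_for, beta + votes_against) posterior."""
    return alpha + votes_for, beta + votes_against


def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical prior, e.g. loosely informed by past elections: centered at 50%.
prior = (50.0, 50.0)

# Hypothetical poll: 460 of 1000 respondents support the candidate.
posterior = update_beta(*prior, votes_for=460, votes_against=540)

# Updating in two batches gives the same posterior as updating all at once.
step1 = update_beta(*prior, votes_for=200, votes_against=300)
step2 = update_beta(*step1, votes_for=260, votes_against=240)

print(f"prior mean:     {beta_mean(*prior):.3f}")
print(f"posterior mean: {beta_mean(*posterior):.3f}")
```

The posterior mean sits between the prior mean and the raw poll share, and it moves closer to the data as more respondents are added.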
Those against Silver, however, would argue that Silver's forecast was misleading, and that expecting the public to understand the nuances of probability is unrealistic. When you do a Bayesian t-test instead of a frequentist one, the result you get is not a p-value but a number called a Bayes factor. Political scientists have used this type of effect to explain poll-result disparities before: in 1982, Democratic candidate Tom Bradley, a Black man, ran for governor of California; despite leading in the polls, he lost narrowly to the white Republican, George Deukmejian. Frequentists would say: I don't know what that percentage is, but I know that value is fixed, meaning that it is a number that is not random. This is something we cannot predict for sure, so a Bayesian would put some probability distribution on this number, and we might look at previous elections to come up with that probability distribution. All in all, it is much easier to look at a poll number and see if it’s a “good guess”. Bayesian inference is one of the more controversial approaches to statistics. The wage data should now be logged, to make interpretation of the regression easier, and … Good Bayesian analyses consider a wide range of models that vary in assumptions and flexibility in order to see how they affect substantive results. Every election, he just says that the Republican candidate has a 50% chance of winning, and the Democratic candidate has a 50% chance of winning. For any pollster or election forecaster to model these events would mean incurring serious risks to their reputations. This allows him to always explain in hindsight whether that’s in fact a really good or bad number. Clinton support was overstated. They say that since it’s a fixed value rather than a random one, there’s no point in giving it a probability.
But we think there’s an even deeper, more philosophical argument to be made here, which is that Nate Silver cannot really be right or wrong when there’s no strict standard to judge him. This was confirmed in thesis work carried out at the University of Minnesota by Robert Litterman under Sims’ direction (see Litterman, 1979, 1986a, 1986b). But instead, he gives a probability like 16% (whose true meaning and calculation few people understand). An intuitive way to get an estimate of X is to estimate how many people are voting for each candidate right now. Nate Silver was right – you just don’t understand statistics. Nate Silver is a Bayesian, and his forecasting isn’t just popular amongst the public, but also highly regarded by many seasoned econometricians we’ve talked to. Will the announcement of 7.4% record-level GDP growth one week before the election sway voters? Our co-author Michael is a math major at Princeton, and those who have contributed to this article through comments and informal conversations include professors and graduate students in economics, mathematics, and political science. (This part is Michael trying to show off his physics knowledge). I just love this piece by Chris Sims: "Bayesian Methods in Applied Econometrics, or, Why Econometrics Should Always and Everywhere Be Bayesian", from 2007. Do we have enough data to predict someone like Trump? This approach was further articulated and extended in a widely cited paper by Doan et al. (1984). Brilliant minds, these "revisionist statisticians." There are principled, practical procedures for doing this. If the latter is the case, then why does it really matter what polls say, and how can they even be useful for prediction? But with Bayesian statistics, you can actually find evidence for the null.
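As a rough illustration of how a Bayes factor can express evidence for a null hypothesis, the sketch below compares a point null (support is exactly 50%) against an alternative in which every level of support is equally plausible beforehand. The choice of a uniform prior for the alternative, the function name `bayes_factor_null_vs_uniform`, and the poll counts are all assumptions made for this example; other priors give different numbers.

```python
import math


def bayes_factor_null_vs_uniform(successes: int, trials: int, p_null: float = 0.5) -> float:
    """Bayes factor BF01 for binomial data.

    Numerator: probability of the data if the true proportion is p_null.
    Denominator: probability of the data averaged over a Uniform(0, 1)
    prior on the proportion, which equals 1 / (trials + 1) for any count.
    Values above 1 favor the null; values below 1 favor the alternative.
    """
    likelihood_null = (
        math.comb(trials, successes)
        * p_null**successes
        * (1.0 - p_null) ** (trials - successes)
    )
    marginal_alternative = 1.0 / (trials + 1)
    return likelihood_null / marginal_alternative


# Hypothetical samples of 100 respondents.
bf_close = bayes_factor_null_vs_uniform(52, 100)     # near an even split
bf_lopsided = bayes_factor_null_vs_uniform(70, 100)  # far from an even split

print(f"52 of 100: BF01 = {bf_close:.2f}")
print(f"70 of 100: BF01 = {bf_lopsided:.4f}")
```

With a near-even split the factor comes out above 1, which is positive evidence for the null rather than a mere failure to reject it; with a lopsided split it falls well below 1.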
The fact that forecasting has become so complex that it would take us pages to explore even the most fundamental concepts shows not only the progress made by political scientists, but also the unnecessary over-complication of simple ideas. The very last poll from PA showed Trump with a lead, but most others showed the Democratic nominee with a slight advantage. Thus we can explicitly exploit the factor structure of the data and the law of motion of the extracted factors. Silver is only looking at voter sentiments as they are and then making predictions based on these data, rather than incorporating possibilities like a coup. Chris Sims at Princeton has written extensively on this point over the last 15+ years: basically a flat / non-informative prior does not make sense if you expect that there are dynamics because you need to deal with the fact that the ML model addresses the conditional … For DSGE models, the library can solve models using Harald Uhlig's method of undetermined coefficients and Chris Sims' canonical decomposition. One of the objective things that Bayesian inference theory shows is how people update beliefs. So what can we conclude for this year? For example, if one believes that climate change isn't due to human factors, then the effect of new information on this person's posterior may heavily depend on whether it agrees with the existing prior – a fact about CO2 emissions will influence the person very little, while some fact about the "unpredictability of weather" may deeply reinforce this person's conviction that climate change is not due to human actions. BMR can estimate BVARs with time-varying parameters, as well as classical (non-Bayesian) VARs. The question remains: how do you call out Crackhead Jim for the fraud he is?
It is based on a derivative-based minimization routine. You don't need to believe there's a fixed truth; you just need to be willing to update your beliefs, and update them especially strongly when something unexpected happens. Sims (1980a) speculated that some sort of Bayesian approach might work better. In other words, theoretically, as long as they give some serious consideration to the other side's argument, they will eventually agree with each other. We should not forget that 16% is roughly the probability of rolling a six (or any other given number) with a fair die, which is actually quite high. Likewise, any verification of one's election prediction would involve having some reasonably good simulation of American voters and repeatedly running the simulation to see whether Trump or Biden would win. Consider the example of Crackhead Jim. So, is Jim much better than Silver? A late poll in NY-22 showed Trump with a 12-point lead over Clinton, despite Obama tying Romney there in 2012, 49-49. Well, the tricky thing is that you can never really test this hypothesis unless you literally go out there and ask every single American. If the likelihood surface displays discontinuities, it employs a simplex algorithm. For the estimation we choose a fully parametric Bayesian likelihood-based approach using MCMC methods. Nate Silver, widely considered the preeminent pollster, uses Bayesian methods in his forecasting. We cannot judge who's right and who's wrong in predicting an election, in the same way that the physics community says the famous "String Theory" cannot be right or wrong: there's no way to verify it. After describing the solvers, we turn to Bayesian estimation using a state-space and filter approach, and posterior simulation using a Markov Chain Monte Carlo algorithm.
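The die-roll comparison above, and the idea of checking a probability by running a simulation many times, can be sketched in a few lines. This is a toy illustration only: the function name `simulate_event_frequency`, the trial count, and the fixed seed are assumptions, and a single real election offers exactly one trial rather than many.

```python
import random


def simulate_event_frequency(probability: float, trials: int, seed: int = 0) -> float:
    """Fraction of independent trials in which an event with the given
    probability occurs. A fixed seed keeps the run reproducible."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if rng.random() < probability)
    return hits / trials


exact_six = 1 / 6  # chance of rolling a six with a fair die, about 16.7%

simulated_six = simulate_event_frequency(exact_six, trials=100_000)
simulated_16 = simulate_event_frequency(0.16, trials=100_000)

print(f"exact chance of a six:        {exact_six:.3f}")
print(f"simulated chance of a six:    {simulated_six:.3f}")
print(f"simulated 16% event frequency: {simulated_16:.3f}")
```

Over many repetitions a 16% event shows up about one time in six, which is why a single occurrence cannot by itself show that the stated probability was wrong.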
And it can show evidence for your effect, evidence against your effect, or it can say you don't have enough evidence to decide. Well, we first need to ask ourselves a question: what does it mean for a poll to be accurate? Has Nate Silver already done that by saying that maybe Trump has an 8% likelihood of winning because of that, and therefore here's the total probability of him winning the actual election? So, were the polls wrong in 2016? Also, Trump may not be that "ground-breaking" anyway, as many previous presidents like Nixon had their unique ways of appealing to their bases and upsetting the conventional wisdom of their time. So, the people who read a 16% likelihood as "oh Trump is definitely going to lose" simply didn't understand statistics. The only way for us to measure the consistency of our data is for the election to happen. Bayesian methods are good for combining information from different kinds of sensors (sensor fusion). The variance for this prediction is way too high, and it’s hard to say.