Analyzing the 2016 elections
Elections are a treat for data enthusiasts, and this one even more so! Nearly every major poll in the US predicted a comfortable win for Hillary Clinton. Below is an infographic from the New York Times:
The chance of Trump winning the presidency, just before results started coming in, was put at around 15%! Everyone got it wrong, and in the end he won 306 electoral votes. Being curious, I downloaded the data from AP and fired up a Jupyter notebook. The truth is this could have been a lot worse!
The election had two major candidates: Hillary Clinton from the Democratic Party and Donald Trump from the Republican Party. But there were many other candidates in the fray, two of whom took a notable share (>1%) of the vote: Jill Stein from the Green Party and Gary Johnson from the Libertarian Party. The Green Party is more liberal than the Democrats; the Libertarians are more conservative than the Republicans.
If we consolidate the votes as Clinton + Stein vs. Trump + Johnson, i.e. turn it into a two-candidate race, the Democrats would have lost 5 more states, worth 33 more electoral votes. They would have lost Colorado, Minnesota, New Hampshire, Nevada and Maine. Trump would have gotten 339 electoral votes, a landslide. The map at the top shows what it would have looked like if that had happened.
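The consolidation described above can be sketched in a few lines of Python. This is only an illustration, not the code from my notebook: the states and vote counts are made-up toy numbers (not real 2016 results), the names `BLOCS`, `consolidate` and `electoral_totals` are hypothetical, and it assumes every state is winner-take-all, which glosses over the way Maine and Nebraska split their electoral votes.

```python
# Toy numbers for made-up states; NOT real 2016 results.
# Each entry: state -> (electoral votes, {candidate: popular votes})
results = {
    "State A": (10, {"Clinton": 480, "Trump": 470, "Johnson": 40, "Stein": 10}),
    "State B": (6, {"Clinton": 450, "Trump": 500, "Johnson": 30, "Stein": 20}),
    "State C": (4, {"Clinton": 520, "Trump": 400, "Johnson": 50, "Stein": 30}),
}

# Assumption from the post: Stein voters move to Clinton, Johnson voters to Trump.
BLOCS = {"Clinton": "Clinton", "Stein": "Clinton",
         "Trump": "Trump", "Johnson": "Trump"}


def winner(votes):
    """Return the candidate with the most votes."""
    return max(votes, key=votes.get)


def consolidate(votes):
    """Fold each candidate's votes into their two-party bloc."""
    merged = {}
    for candidate, count in votes.items():
        bloc = BLOCS[candidate]
        merged[bloc] = merged.get(bloc, 0) + count
    return merged


def electoral_totals(results, two_way):
    """Sum electoral votes per winner, assuming winner-take-all everywhere."""
    totals = {}
    for state, (electoral_votes, votes) in results.items():
        tally = consolidate(votes) if two_way else votes
        w = winner(tally)
        totals[w] = totals.get(w, 0) + electoral_votes
    return totals


actual = electoral_totals(results, two_way=False)
merged = electoral_totals(results, two_way=True)

# States whose winner changes once the third-party votes are folded in.
flipped = [state for state, (_, votes) in results.items()
           if winner(votes) != winner(consolidate(votes))]

print(actual, merged, flipped)
```

In the toy data, State A flips because Johnson's votes outweigh Clinton's narrow lead plus Stein's votes, which is the same mechanism behind the five states listed above.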
This is truly surprising. Living at MIT in Boston, I have so far spoken to exactly 1 Trump supporter. It feels like I have been living in a bubble. In fact, a lot of places have very few supporters of one party or the other. From the data, here are the 10 places with the lowest vote percentage for Hillary and for Trump:
Boston came close to making the top 10. New York, SF and other urban centers are heavily Democratic, which is probably what made us feel Hillary was going to sweep. In fact, when we look only at places with more than 0.5 million votes cast, Hillary wins 33 of them compared to just 3 for Trump. The only ones Trump wins are:
One positive takeaway from this election is that it showed the person with more money does not always win. From a previous /r/dataisbeautiful post:
The Associated Press releases data for the elections. While the data is not free, I found that the JSON endpoints powering their election interactive are public and can be downloaded. All the data and the notebook with the code can be found here. Be warned: the data is for educational use only and may be used for commercial purposes only with AP’s permission.
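For anyone who wants to try the place-level numbers quoted earlier (the lowest vote shares and the winners in places with more than 0.5 million votes), here is a rough sketch of how that could be done once the JSON is downloaded. The schema below is a hypothetical stand-in, since the real AP feed is laid out differently, and the places and counts are toy values rather than actual results. The helper names `share`, `lowest_share` and `big_place_winners` are mine.

```python
import json

# Hypothetical schema and toy numbers; the real AP JSON looks different.
raw = """
[
  {"place": "Place A", "votes": {"Clinton": 600000, "Trump": 150000}},
  {"place": "Place B", "votes": {"Clinton": 200000, "Trump": 350000}},
  {"place": "Place C", "votes": {"Clinton": 20000, "Trump": 90000}},
  {"place": "Place D", "votes": {"Clinton": 300000, "Trump": 280000}}
]
"""
places = json.loads(raw)


def share(place, candidate):
    """Fraction of the place's total votes that went to the candidate."""
    total = sum(place["votes"].values())
    return place["votes"].get(candidate, 0) / total


def lowest_share(places, candidate, n=10):
    """Names of the n places where the candidate's vote share is lowest."""
    ranked = sorted(places, key=lambda p: share(p, candidate))
    return [p["place"] for p in ranked[:n]]


def big_place_winners(places, min_votes=500_000):
    """Count how many places above the vote threshold each candidate wins."""
    counts = {}
    for p in places:
        if sum(p["votes"].values()) > min_votes:
            w = max(p["votes"], key=p["votes"].get)
            counts[w] = counts.get(w, 0) + 1
    return counts


lowest_trump = lowest_share(places, "Trump", n=2)
lowest_clinton = lowest_share(places, "Clinton", n=2)
big = big_place_winners(places)

print(lowest_trump, lowest_clinton, big)
```

With real data you would replace `raw` with the contents of the downloaded file and adapt the field names to whatever the feed actually uses.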