In the wake of the surprising (to say the least) electoral result, I started a few projects to try to understand the politics of the country. One thing I wanted to understand was the impact of demographic and underlying situational variables (e.g. health, income, unemployment) on how people voted. Was the vote about Obamacare? Was it about lost jobs? Was it about education levels? Or was it all racism? Theories have been floated, but I haven't seen a rigorous evaluation of these hypotheses. What's below is just an exploratory analysis, but the data does point in some interesting directions.
What follows is a series of visualizations of a large, aggregated dataset of demographic, situational, and electoral data. Sources for the demographic and situational data are listed here, and the electoral data is from the New York Times.
The type of visualization is called a self-organizing map. Roughly speaking, each hexagon is a group of counties; the map arranges the counties so that similar ones are closer to each other and dissimilar ones are further apart:
For any given variable (here, the proportion of residents in the counties who graduated from high school) the map is a heatmap. Redder colors mean the counties index higher; bluer colors mean they index lower. Here, the upper right holds the counties where fewer people have a high school diploma, and the lower left holds the most educated.
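For readers who want to see the mechanics, here is a minimal sketch of how a self-organizing map can be trained and then colored by one variable. It is not the code behind the maps in this post: it uses a rectangular grid instead of hexagons, plain Python instead of a mapping library, and the function names (train_som, nearest_unit, component_plane) and default parameters are all my own assumptions for illustration.

```python
import math
import random


def nearest_unit(weights, x):
    """Return the (row, col) of the unit whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows=5, cols=5, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM on a rectangular grid (the maps in the post are hexagonal)."""
    rng = random.Random(seed)
    # Start every unit at a randomly chosen data point.
    weights = [[list(rng.choice(data)) for _ in range(cols)] for _ in range(rows)]
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)              # learning rate decays toward zero
        sigma = sigma0 * (1.0 - frac) + 0.5  # neighbourhood shrinks over time
        x = rng.choice(data)
        br, bc = nearest_unit(weights, x)
        for r in range(rows):
            for c in range(cols):
                # Units near the winner on the grid are pulled toward x more strongly.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                w = weights[r][c]
                for i in range(len(w)):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


def component_plane(weights, data, values):
    """Average one variable over the rows assigned to each unit (None if a unit is empty)."""
    rows, cols = len(weights), len(weights[0])
    total = [[0.0] * cols for _ in range(rows)]
    count = [[0] * cols for _ in range(rows)]
    for x, v in zip(data, values):
        r, c = nearest_unit(weights, x)
        total[r][c] += v
        count[r][c] += 1
    return [
        [total[r][c] / count[r][c] if count[r][c] else None for c in range(cols)]
        for r in range(rows)
    ]
```

Each row of data would be one county's (standardized) variables. Calling component_plane with, say, the high school graduation column gives the grid of averages that a heatmap like the one above would color.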
Below, we look at the voting share for Hillary. The counties are arranged in the same way as above, but since we're looking at a different variable the map is colored differently. (Confusingly, more votes for HRC are red as opposed to the customary blue for liberals, but work with me here.) The reason this map is more organized than the others is that I used this variable to "supervise" the organization (don't worry about the details of this; basically it just guaranteed that this particular coloring, which is the reference point, would be organized).
Now that we have the basics in place, we can look at other variables: let's check a few and see if they line up with the HRC voting share map. What we can do is draw a boundary around the areas that went strongly for Trump and for HRC, like so:
And we'll keep these annotations throughout.
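If you'd rather put a number on "lines up" than eyeball it, one rough way is to label each hexagon by its average HRC share and then average any other variable inside each labelled region. The sketch below assumes each map is a grid of per-hexagon averages (with None for empty hexagons); the function names and the 0.35 / 0.65 cutoffs are hypothetical choices of mine, not the boundaries drawn on the maps here.

```python
def label_unit(share, low=0.35, high=0.65):
    """Label one hexagon from its mean HRC vote share (cutoffs are illustrative assumptions)."""
    if share is None:
        return None
    if share >= high:
        return "HRC"
    if share <= low:
        return "Trump"
    return None  # contested hexagons stay outside both boundaries


def label_regions(share_plane, low=0.35, high=0.65):
    """Apply label_unit to every hexagon of the vote-share map."""
    return [[label_unit(s, low, high) for s in row] for row in share_plane]


def region_means(plane, labels):
    """Average another variable's map within each labelled region."""
    sums, counts = {}, {}
    for plane_row, label_row in zip(plane, labels):
        for value, label in zip(plane_row, label_row):
            if value is None or label is None:
                continue
            sums[label] = sums.get(label, 0.0) + value
            counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}
```

A variable whose two region means are far apart tracks the vote; one whose means are about equal does not.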
Health and insurance:
Uninsurance and health variables like obesity and diabetes don't break down along electoral lines: the split cuts across them, with the areas of highest uninsurance and poorest health going to both candidates:
The unemployment graphs should put the "economic anxiety" argument to rest, as the areas with the highest unemployment went most strongly for HRC and those with the lowest went most strongly for Trump.
A few graphs do line up pretty well with the vote: whiteness and ethnic homogeneity, which also sit basically on top of each other. This would support the hypothesis that the vote for Trump was mainly a cultural (and not a policy) event: white enclaves are reacting against a diminishing place in the cultural landscape, hence the "making things great again":
See for yourself:
My code is available here (it is not very well commented or formatted, but it's there).
As mentioned above, all of these maps are available at this site. If you want to see something similar but with voting swings (the amount each county changed its vote from '12 to '16), you can see that here.