Saturday, 25 June 2016

Did areas with the fewest immigrants vote for Brexit?

Well, as the Germans say, "jein". (By the way, wouldn't it be great if we could declare that as the referendum result?)

Here's a plot of areas' vote for Leave, against their international immigration net inflows 2004-14. I've added a line from a linear regression. Yep, there's a clear negative slope.

But here's the same data with a local smoother, which lets the relationship vary at different parts of the data.


This tells a different story. For low levels of migration of up to 5% – which includes roughly 80% of all local authorities (299 of the 380 in the regressions below) – the vote to leave is flat or increasing with migration. Then there is a long tail of local authorities with very high levels of immigration, and here the leave vote declines with migration.
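To make the idea of a local smoother concrete, here is a minimal sketch. It is not the smoother used for the plot: it is a plain k-nearest-neighbour running mean, swapped in because it fits in a few lines. The function name `knn_smooth`, the parameter `k` and the numbers are all made up for illustration and are not the referendum data.

```python
from statistics import mean


def knn_smooth(x, y, k=3):
    """For each x value, average y over the k observations nearest in x.

    A crude stand-in for a local smoother: the fitted value at each point
    depends only on nearby data, so the curve can rise in one region and
    fall in another.
    """
    pairs = list(zip(x, y))
    fitted = []
    for x0 in x:
        # Pick the k observations whose x is closest to x0.
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x0))[:k]
        fitted.append(mean([yi for _, yi in nearest]))
    return fitted


# Made-up illustrative data (NOT the referendum data): the leave share
# rises with inflow at low inflows, then falls along a long right tail.
inflow = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 15.0, 20.0, 25.0]
leave = [50.0, 52.0, 54.0, 56.0, 58.0, 50.0, 42.0, 34.0, 26.0]

fitted = knn_smooth(inflow, leave, k=3)
for x0, f in zip(inflow, fitted):
    print(f"inflow {x0:5.1f}  smoothed leave {f:5.1f}")
```

A single straight line through these points would slope downwards, while the smoothed values rise over the low-inflow points and only fall in the tail.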

Running linear regressions confirms the story. When you exclude the influential observations with very high migration (inflows of 5% or more), the coefficient on migration flows changes sign and becomes significantly positive: 0.98 (p < 0.01), against -0.62 for all local authorities.

Regressions of vote to leave on migration inflows

                  All Local Authorities   Local Authorities with < 5% inflows
(Intercept)       55.27***                53.65***
                  (0.59)                  (0.65)
inflow_int_pct    -0.62***                0.98**
                  (0.09)                  (0.36)
R2                0.12                    0.02
Adj. R2           0.11                    0.02
Num. obs.         380                     299
RMSE              9.80                    8.82

***p < 0.001, **p < 0.01, *p < 0.05
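The sign flip is easy to reproduce in miniature. The sketch below fits a least-squares line twice, once on all points and once on the points below a 5% cutoff. The helper `ols_intercept_slope` and the data are hypothetical, chosen only to show how a handful of high-inflow observations can dominate the pooled slope; they are not the data behind the table.

```python
from statistics import mean


def ols_intercept_slope(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up illustrative data (NOT the referendum data).
inflow = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 12.0, 18.0, 25.0]
leave = [52.0, 53.0, 54.0, 54.0, 55.0, 56.0, 56.0, 57.0, 58.0, 40.0, 32.0, 25.0]

# Fit on every area.
_, slope_all = ols_intercept_slope(inflow, leave)

# Fit again on areas with inflows below 5% only.
low = [(x, y) for x, y in zip(inflow, leave) if x < 5.0]
_, slope_low = ols_intercept_slope([x for x, _ in low], [y for _, y in low])

print(f"slope, all areas:        {slope_all:.2f}")
print(f"slope, inflows below 5%: {slope_low:.2f}")
```

With these numbers the pooled slope is negative and the low-inflow slope is positive, the same pattern as in the table, although the magnitudes mean nothing.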

Moral: beware of superficially convincing statistics.

Better still, beware of self-righteous memes, which make us feel better about losing an argument, by telling us that our opponents were fools.

Update: Chris Hanretty said, how about a Scotland dummy? And he should know, so I threw one in. Then I threw in all the regions, because of omitted variable bias and what the heck. This is still only local authorities with < 5% inflows:

Regressions of vote to leave on migration inflows

                                 Scotland dummy   Regional dummies
(Intercept)                      55.87***         56.95***
                                 (0.60)           (1.16)
inflow_int_pct                   0.31             0.59*
                                 (0.31)           (0.29)
I(Region == "Scotland")TRUE      -16.10***
RegionEast Midlands                               1.95
RegionLondon                                      -14.37***
RegionNorth East                                  2.35
RegionNorth West                                  -0.45
RegionScotland                                    -17.24***
RegionSouth East                                  -5.13***
RegionSouth West                                  -2.70
RegionWales                                       -3.53
RegionWest Midlands                               2.79
RegionYorkshire and The Humber                    2.48
R2                               0.28             0.43
Adj. R2                          0.28             0.41
Num. obs.                        299              299
RMSE                             7.58             6.86

***p < 0.001, **p < 0.01, *p < 0.05
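For anyone wondering what the regional dummies do to the migration coefficient: with a full set of region dummies, the slope on migration is identified only from variation within regions. A standard result (Frisch-Waugh-Lovell) says you get the same slope by subtracting each region's mean from both variables and regressing the remainders. The sketch below shows that mechanism on made-up data with two hypothetical regions "A" and "B". The function names are illustrative, and the example is deliberately extreme; the actual change in the table is much milder.

```python
from collections import defaultdict
from statistics import mean


def slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def within_group_slope(x, y, groups):
    """Slope of y on x after removing each group's own means.

    Equivalent to the coefficient on x in a regression that also
    includes a dummy for every group.
    """
    xs, ys = defaultdict(list), defaultdict(list)
    for xi, yi, g in zip(x, y, groups):
        xs[g].append(xi)
        ys[g].append(yi)
    x_mean = {g: mean(v) for g, v in xs.items()}
    y_mean = {g: mean(v) for g, v in ys.items()}
    x_dm = [xi - x_mean[g] for xi, g in zip(x, groups)]
    y_dm = [yi - y_mean[g] for yi, g in zip(y, groups)]
    return slope(x_dm, y_dm)


# Made-up illustrative data (NOT the referendum data): region "B" has
# higher inflows and a much lower leave share, but within each region
# the leave share rises with inflow.
inflow = [1.0, 2.0, 3.0, 3.0, 4.0, 5.0]
leave = [60.0, 61.0, 62.0, 40.0, 41.0, 42.0]
region = ["A", "A", "A", "B", "B", "B"]

pooled = slope(inflow, leave)
within = within_group_slope(inflow, leave, region)
print(f"pooled slope:        {pooled:.2f}")
print(f"within-region slope: {within:.2f}")
```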

Results are a bit wobbly, but there is still nothing that looks like a negative slope. Your mileage may vary. All goods subject to status. Objects in the rear view mirror may look like the European Union.

Second update.
Someone on Twitter said that my second picture above "focuses on tiny variations in the middle but ignores the vast trend". This is an understandable mistake and I worried that someone might make it. The 'tiny variations in the middle', where the trendline goes up, look small on the graph, but that is where 80% of the data is. To clarify, here is the same data, divided into deciles. Each column shows 1/10 of the data, from the 10% of areas with the least migrant inflow to the 10% of areas with the most. This is hump-shaped, as I said, and the only decline is in the top two deciles. (But to be fair, the only increase is in the bottom two deciles, so maybe my regression above is also a bit misleading; there's no reason to lump the middle of the data either with the bottom or the top.)
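The decile view is simple to compute: sort areas by inflow, cut them into ten equal-sized groups, and average the leave vote within each group. Here is a minimal sketch, assuming at least ten observations; `decile_means` is a hypothetical helper and the numbers are invented to have a hump shape, so they are not the referendum data.

```python
from statistics import mean


def decile_means(x, y, n_bins=10):
    """Mean of y within each of n_bins equal-sized groups, ordered by x.

    Assumes len(x) >= n_bins so that no group is empty.
    """
    pairs = sorted(zip(x, y))  # order areas by x, lowest first
    n = len(pairs)
    means = []
    for k in range(n_bins):
        lo = k * n // n_bins
        hi = (k + 1) * n // n_bins
        means.append(mean([yi for _, yi in pairs[lo:hi]]))
    return means


# Made-up illustrative data (NOT the referendum data): 20 areas, with
# the leave share rising at the bottom, flat in the middle and falling
# at the top.
inflow = [i * 0.5 for i in range(1, 21)]
leave = [50.0, 52.0, 54.0, 55.0, 55.0, 55.0, 55.0, 55.0, 55.0, 55.0,
         55.0, 55.0, 55.0, 55.0, 55.0, 55.0, 50.0, 48.0, 40.0, 38.0]

by_decile = decile_means(inflow, leave)
for d, m in enumerate(by_decile, start=1):
    print(f"decile {d:2d}: mean leave share {m:.1f}")
```

Because every column holds the same number of areas, the flat middle gets its proper weight instead of being squeezed into the left edge of a scatterplot.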

One obvious point to make is that in the top two deciles, migrants with UK citizenship may have voted to remain. So I don't think this data proves much about "exposure to immigration"; we have to be careful of the ecological fallacy here, i.e. of inferring individual attitudes from aggregate data.