I recently finished reading The Signal and the Noise by prediction guru and stats wizard Nate Silver (here’s the book on Amazon, and here’s Silver’s FiveThirtyEight blog-cum-website). Silver is well known for his extremely accurate predictions and commentary related to US elections, but his knowledge of and interest in issues related to prediction range far and wide. His book deals with many subtleties associated with prediction. It goes quite deep into the statistical issues without pulling any punches, yet remains broadly accessible to readers.
Silver does not discuss anything as radical as open borders, and in general, does not discuss normative questions at all, preferring to stick to his area of expertise: the accuracy and precision of predictions and forecasts and the problems associated with trying to make good predictions and forecasts. Nonetheless, my guess after reading Silver’s book is that he would be extremely skeptical of any claims regarding the effects of open borders, which are way “out of sample.” In particular, I’m guessing Silver would be unimpressed with claims that open borders would double world GDP. At any rate, reading Silver makes me more skeptical of claims made about the effects of open borders with allegedly high confidence. If you believe in Knightian uncertainty as a concept, you may well take the view that the uncertainty associated with open borders is Knightian in nature, and that most attempts at quantifying its impact are flawed. This might also explain why, even though there is a broad economist consensus supporting somewhat more open borders, few economists commit to going all the way to open borders. My co-blogger Nathan noted this explicitly in a comment on another blog post.
Even in areas where we are looking at “out of sample” predictions, however, all is not lost. One idea that Silver returns to repeatedly in his book is that one should keep and use every piece of data. Judging the effects of open borders might be very difficult, and we may end up with a huge range (i.e., low precision). But we can still use some data points. The type of question that somebody like Silver, starting from the outside view, would ask is: “Of the people making predictions regarding the effects of changes in migration policy regimes, who has the better prediction track record?” Or “Of the various methods used to predict the effects of changes in migration policy regimes, which methods have the better prediction track record?” Ideally, what we’d need to make this kind of judgment is:
- A large number of data points,
- all of which have outcomes that can be agreed upon clearly,
- with information about what prediction each side made prior to the event, and
- with information about what the outcome was.
Weather prediction is one such example. There are a large number of data points (the daily maximum and minimum temperature and precipitation statistics in many cities over half a century). The final value of each data point is broadly agreed upon, though there are measurement error issues. The values predicted by organizations such as the National Weather Service and the Weather Channel are also available. All the ingredients for an analysis are therefore in place, and Silver in his book mentions one such analysis. The analysis finds that both the National Weather Service and the Weather Channel are fairly accurate, but that the Weather Channel (deliberately, it turns out) inflates the probability of precipitation on days when that probability is extremely low. This phenomenon is now known as wet bias.
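To make the idea of a calibration check concrete, here is a minimal sketch in Python. It groups forecasts by the stated probability of rain and compares each group with how often it actually rained. The function name `calibration_table` and all the numbers are my own inventions for illustration, not data from Silver's book or from any forecaster. A well-calibrated forecaster's stated and observed figures should roughly match; a wet bias would show up as a low stated probability paired with an even lower observed frequency.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes):
    """Group forecasts by stated probability (rounded to one decimal place)
    and return {stated_probability: (count, observed_frequency)}."""
    buckets = defaultdict(list)
    for stated, happened in zip(forecasts, outcomes):
        buckets[round(stated, 1)].append(1 if happened else 0)
    return {
        stated: (len(hits), sum(hits) / len(hits))
        for stated, hits in sorted(buckets.items())
    }


# Invented data: 20 days with a stated 20% chance of rain (it rained on 1),
# and 10 days with a stated 70% chance of rain (it rained on 7).
forecasts = [0.2] * 20 + [0.7] * 10
outcomes = [True] * 1 + [False] * 19 + [True] * 7 + [False] * 3

table = calibration_table(forecasts, outcomes)
for stated, (count, observed) in table.items():
    print(f"stated {stated:.0%}, observed {observed:.0%} over {count} days")
```

In this made-up record, the 70% forecasts look well calibrated, while the 20% forecasts overstate the chance of rain, which is the pattern that wet bias describes.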
Predictions in the political and economic realm don’t fare as well. There are a reasonably large number of data points regarding the outcomes of various electoral races, which satisfy the necessary conditions (lots of data points, clear outcomes, information about each side’s predictions, and information about the outcome) that allow us to get a sense of the quality of political predictions. The data isn’t as extensive as for weather, but it is still quite extensive. Silver finds that while predictions that relied on statistically valid polling techniques tended to do well, predictions made by political pundits on television didn’t. Silver finds a similarly disappointing story when it comes to economic forecasting. He is also critical of people who make predictions and forecasts without specifying the margin of error or the distribution, and who simply give a point estimate. In the discussion, Silver alludes to Tetlock’s study of prediction records and his distinction between “foxes” and “hedgehogs” (see here for an article co-authored by Tetlock with a summary of the idea).
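One standard way to compare track records when forecasters state probabilities is the Brier score: the average squared gap between the stated probability and what actually happened (1 if the event occurred, 0 if not), with lower being better. The sketch below is my own illustration, not something taken from Silver's book. The two forecasters and every number in it are invented, and `brier_score` is a name I made up for the helper.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between stated probabilities and 0/1 outcomes.
    Lower is better; 0 is a perfect record."""
    if len(probabilities) != len(outcomes):
        raise ValueError("need exactly one outcome per forecast")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Invented results for five races: 1 means the predicted candidate won.
outcomes = [1, 0, 1, 1, 0]

# Hypothetical forecaster who states hedged probabilities.
hedged_forecasts = [0.8, 0.3, 0.7, 0.6, 0.2]

# Hypothetical forecaster who only ever says "certain" or "no chance".
confident_forecasts = [1.0, 1.0, 1.0, 0.0, 0.0]

hedged_score = brier_score(hedged_forecasts, outcomes)
confident_score = brier_score(confident_forecasts, outcomes)
print(f"hedged: {hedged_score:.3f}, confident: {confident_score:.3f}")
```

The score punishes confident misses heavily, which is one reason a bare point estimate delivered with certainty can fare badly against a forecast that admits its own uncertainty. With only a handful of events, as in this toy example, the comparison is of course very noisy.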
When two sides are debating an issue and relying heavily on empirical claims about the future to make their respective cases, you’d naturally be curious about the prediction records of the two sides with respect to past predictions. There are two additional complications over and above the obvious measurement difficulties that apply particularly to political debates such as migration policy debates:
- The specific people engaging in the debate are usually different each time. Most pro-immigration groups and people around today weren’t there when the Immigration and Nationality Act of 1965 was passed. The same is true of the anti-immigration groups and people. Given this complication, each side can happily claim allegiance to the correct claims made historically by their side, and disown the incorrect claims as having been made by others they don’t support. This can be partly overcome by trying to come up with objective metrics of just how similar arguments offered today are to the failed arguments of the past, but there are many then versus now “outs” to deflect claims of objective similarity between the present and the past.
- Relatedly, it can be argued that proponents of an argument weren’t making it because they actually believed it, but rather because they were trying to rally public support to their cause, knowing that they would need to lie to (or at any rate, exaggerate their case to) a public that did not share their normative views. (I discussed incentives to lie about immigration enforcement in an earlier post.)
Although these two difficulties present a challenge, there is probably much to be gained from a retrospective analysis of past changes in migration regimes and the predictions made by various people during those changes. Significant changes are better because (a) more people are likely to make explicit predictions of the effects of significant changes, and (b) the larger effect size makes it easy to determine what actually happened. Unfortunately, significant changes are also fewer in number, so we do not have the “large number of data points” that would allow for good calibration of the accuracy of predictions. But we’ve just got to deal with that uncertainty. It’s better than completely ignoring the past.
Relatedly, looking at migration regime changes sufficiently far back in the past also gives us some idea of the more long term effects of the changes. BK, one of the skeptics of open borders in our comments, has argued that the benefits of migration are front-loaded, while the costs take decades to unfold (see for instance here and here). Evaluating such concerns would require us to look at the long-term effects of past migration regime changes.
My co-blogger Chris Hendrix plans to begin a series that looks at various instances of open borders becoming more closed, along with the predictions and rationales offered at the time (expect to read Chris’s introductory post soon!). Later, one of us (perhaps Chris again, perhaps I, or perhaps one of our other bloggers) will be looking at instances of immigration liberalization and the predictions and arguments accompanying and opposing them. I’m particularly interested in the Immigration and Nationality Act of 1965 in the United States and the Rivers of Blood Speech by Enoch Powell in 1968 in the UK. The historical analysis will hopefully help us better calibrate the accuracy of predictions and forecasts about changes to migration regimes, hence better enabling us to evaluate the plausibility of claims such as “double world GDP” or end of poverty from the outside view.