Nate Silver describes “How we’re forecasting the primaries” using confidence intervals. Never mind that the estimates are a few weeks old, and put entirely to one side any predictions he makes or will make. I’m only interested in this one interpretive portion of the method, as Silver describes it:
In our interactive, you’ll see a bunch of funky-looking curves like the ones below for each candidate; they represent the model’s estimate of the possible distribution of his vote share. The red part of the curve represents a candidate’s 80 percent confidence interval. **If the model is calibrated correctly, then he should finish within this range 80 percent of the time, above it 10 percent of the time, and below it 10 percent of the time.** (My emphasis.)
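As an aside, it may help to spell out what the calibration claim amounts to operationally. Here is a minimal simulation sketch; the normal shape and the parameters are illustrative assumptions of mine, not Silver’s actual model:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Illustrative stand-in for the model's forecast distribution of a
# candidate's vote share (an assumption of convenience; Silver's
# "funky-looking curves" are not normal).
forecast_mean, forecast_sd = 30.0, 5.0

# The 80 percent interval runs from the 10th to the 90th percentile
# of the forecast distribution.
lo, hi = norm.ppf([0.10, 0.90], loc=forecast_mean, scale=forecast_sd)

# "Calibrated correctly" says actual finishes behave like draws from
# the forecast distribution; check the within/above/below frequencies.
finishes = rng.normal(forecast_mean, forecast_sd, size=100_000)

print("below: ", np.mean(finishes < lo))                        # ~0.10
print("within:", np.mean((lo <= finishes) & (finishes <= hi)))  # ~0.80
print("above: ", np.mean(finishes > hi))                        # ~0.10
```

Note that the 80 percent here is a long-run frequency over many such forecasts; nothing yet says what to think about any single interval.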
OK. We look up the link to “confidence interval”:
How to Interpret Confidence Intervals
Suppose that a 90% confidence interval states that the population mean is greater than 100 and less than 200. How would you interpret this statement?
Some people think this means there is a 90% chance that the population mean falls between 100 and 200. This is incorrect. Like any population parameter, the population mean is a constant, not a random variable. It does not change. The probability that a constant falls within any given range is always 0.00 or 1.00.
The confidence level describes the uncertainty associated with a sampling method. Suppose we used the same sampling method to select different samples and to compute a different interval estimate for each sample. Some interval estimates would include the true population parameter and some would not. **A 90% confidence level means that we would expect 90% of the interval estimates to include the population parameter**; a 95% confidence level means that 95% of the intervals would include the parameter; and so on.
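That frequency reading is easy to check by simulation. A minimal sketch, borrowing the glossary’s own example of a mean between 100 and 200, and assuming (purely for illustration) normal samples with known standard deviation:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

mu_true, sigma, n = 150.0, 40.0, 25  # mu_true plays the fixed, unknown constant
z = norm.ppf(0.95)                   # multiplier for a two-sided 90% interval
half = z * sigma / np.sqrt(n)        # half-width of each interval estimate

# 100,000 repetitions of the same sampling method: each sample mean
# xbar yields a different interval estimate [xbar - half, xbar + half].
xbars = rng.normal(mu_true, sigma / np.sqrt(n), size=100_000)

# The intervals vary from sample to sample; mu_true does not.
# About 90% of the intervals cover it.
print(np.mean(np.abs(xbars - mu_true) <= half))  # ~0.90
```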
The bold portion of the definition alludes to what I call a “rubbing off” construal of a method’s error probability: If the particular inference—here an interval estimate—is a (relevant) instance of a method that is correct with probability (1 – α), then the (1 – α) “rubs off” on the particular estimate. Put to one side my qualification that it be a “relevant” instance. I’m wondering what’s supposed to “rub off,” according to Silver.
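In symbols (standard notation, not Silver’s), the method’s performance claim is the pre-data statement

\[
P_{\mu}\big(L(X) \le \mu \le U(X)\big) = 1 - \alpha \quad \text{for all } \mu,
\]

where the bounds L(X) and U(X) are random because the sample X has yet to be observed; the probability attaches to the random interval, not to μ.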
Following the definition in Silver’s glossary, it’s the degree of uncertainty that’s rubbing off. A more common construal is in terms of degree of “confidence”. (Should we prefer one to the other?) The main thing is that the probability characterizes the performance of the estimation method, but the result of rubbing off is to assign the particular estimate, not a probability, but something else, be it “uncertainty” or “confidence”.
The severity construal I recommend differs somewhat from “rubbing off” interpretations,[1] but never mind that. What I’m wondering is how Silver’s glossary definition underwrites his claim:
If the model is calibrated correctly,[2] then he should finish within this range 80 percent of the time, above it 10 percent of the time, and below it 10 percent of the time.
“This range” would seem to refer to a particular estimate, but then Nate’s interpretation is that there’s an 80% chance the population mean falls between the specific lower and upper bounds, a 10% chance it falls above them, and a 10% chance it falls below them. Yet this is just what his definition of confidence intervals correctly calls an incorrect construal.
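Formally, once the sample x is observed and the bounds l = L(x) and u = U(x) are fixed numbers, nothing random remains, and (as his glossary itself says)

\[
P\big(l \le \mu \le u\big) \in \{0,\, 1\},
\]

so the 80 percent cannot simply be transferred to the particular interval.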
What do you think’s going on?
[1] See, for example, “Duality: Confidence intervals and the severity of tests”.
[2] The model being calibrated correctly, I take it, refers to the model assumptions being approximately met by the data in the case at hand.