
Despite the fact that Fisherians and Neyman-Pearsonians alike regard observed significance levels, or P values, as error probabilities, we occasionally hear allegations (typically from those who are neither Fisherian nor N-P theorists) that P values are actually not error probabilities. The denials tend to go hand in hand with allegations that P values exaggerate evidence against a null hypothesis—a problem whose cure invariably invokes measures that are at odds with both Fisherian and N-P tests. The Berger and Sellke (1987) article from a recent post is a good example of this. When leading figures put forward a statement that looks to be straightforwardly statistical, others tend to simply repeat it without inquiring whether the allegation actually mixes in issues of interpretation and statistical philosophy. So I wanted to go back and look at their arguments. I will post this in installments.

**1. Some assertions from Fisher, N-P, and Bayesian camps**

Here are some assertions from Fisherian, Neyman-Pearsonian and Bayesian camps: (I make no attempt at uniformity in writing the “P-value”, but retain the quotes as written.)

*a) From the Fisherian camp (Cox and Hinkley):*

*“For given observations **y** we calculate t = t_{obs} = t(**y**), say, and the level of significance p_{obs} by*

*p_{obs} = Pr(T > t_{obs}; H_{0}).*

*….Hence p_{obs} is the probability that we would mistakenly declare there to be evidence against H_{0}, were we to regard the data under analysis as being just decisive against H_{0}.”* (Cox and Hinkley 1974, 66).

Thus p_{obs} would be the Type I error probability associated with the test that takes t_{obs} as its cutoff for rejecting H_{0}.
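
To make that identification concrete, here is a minimal Python sketch. It rests on assumptions that are not in Cox and Hinkley's text: the test statistic T is taken to be standard normal under H_{0} (a one-sided z-type test), and the value t_{obs} = 1.96 and the function names are purely illustrative. The sketch computes p_{obs} = Pr(T > t_{obs}; H_{0}) and then checks by simulation that a rule rejecting H_{0} whenever T > t_{obs} errs, under H_{0}, at roughly that rate.

```python
import math
import random


def p_value_upper(t_obs: float) -> float:
    """Pr(T > t_obs; H0), ASSUMING T ~ N(0, 1) under H0 (illustrative choice)."""
    return 0.5 * math.erfc(t_obs / math.sqrt(2.0))


def simulated_type1_rate(t_obs: float, n_trials: int = 200_000, seed: int = 0) -> float:
    """Relative frequency of erroneous rejections when H0 is true.

    The (hypothetical) rule is: reject H0 whenever T > t_obs, i.e. treat
    the observed value as just decisive against H0.
    """
    rng = random.Random(seed)
    rejections = sum(1 for _ in range(n_trials) if rng.gauss(0.0, 1.0) > t_obs)
    return rejections / n_trials


t_obs = 1.96  # illustrative observed value, not taken from the post
p_obs = p_value_upper(t_obs)
rate = simulated_type1_rate(t_obs)
print(f"p_obs = {p_obs:.4f}; simulated Type I error rate = {rate:.4f}")
```

Under these assumptions the two numbers should agree up to simulation noise, which is the sense in which p_{obs} can be read as an error probability of the associated test.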

*b) From the Neyman-Pearson (N-P) camp (Lehmann and Romano):*

*“[I]t is good practice to determine not only whether the hypothesis is accepted or rejected at the given significance level, but also to determine the smallest significance level…at which the hypothesis would be rejected for the given observation. This number, the so-called p-value gives an idea of how strongly the data contradict the hypothesis. It also enables others to reach a verdict based on the significance level of their choice.”* (Lehmann and Romano 2005, 63-4)
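
The "smallest significance level" reading can be sketched in the same hedged way. The assumptions here are again illustrative and not Lehmann and Romano's: T is standard normal under the hypothesis, the level-α test rejects when T exceeds the upper-α critical value, and t_{obs} = 2.1 is a made-up observation. With a strict inequality in the rejection rule, the p-value is the boundary of the levels at which the test rejects: each reader's verdict at a chosen α matches whether α exceeds p_{obs}.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # ASSUMED null distribution of T: N(0, 1)


def rejects_at_level(t_obs: float, alpha: float) -> bool:
    """Fixed-level test: reject when T exceeds the upper-alpha critical value."""
    critical_value = STD_NORMAL.inv_cdf(1.0 - alpha)
    return t_obs > critical_value


def p_value_upper(t_obs: float) -> float:
    """Pr(T > t_obs) under the assumed null distribution."""
    return 1.0 - STD_NORMAL.cdf(t_obs)


t_obs = 2.1  # illustrative observation, not from the post
p_obs = p_value_upper(t_obs)
levels = [0.001, 0.01, 0.025, 0.05, 0.10]

# Each reader picks a level and reaches a verdict from the same data.
verdicts = {alpha: rejects_at_level(t_obs, alpha) for alpha in levels}
for alpha, rejected in verdicts.items():
    print(f"alpha = {alpha:<5}: {'reject' if rejected else 'do not reject'}")
print(f"p_obs = {p_obs:.4f}")
```

Reporting p_{obs} alone therefore lets others recover the accept/reject verdict at whatever level they prefer, which is the practical point of the quoted passage.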

Very similar quotations are easily found, and are regarded as uncontroversial—even by Bayesians whose contributions stood at the foot of Berger and Sellke’s argument that P values exaggerate the evidence against the null.