
**1. An Assumed Law of Statistical Evidence (law of likelihood)**

Nearly all critical discussions of frequentist error statistical inference (significance tests, confidence intervals, p- values, power, etc.) start with the following general assumption about the nature of inductive evidence or support:

Data *x* are better evidence for hypothesis *H*_{1} than for *H*_{0} if *x* are more probable under *H*_{1} than under *H*_{0}.

Ian Hacking (1965) called this the **logic of support**: *x* supports hypothesis *H*_{1} more than *H*_{0} if *H*_{1} is more **likely**, given *x*, than is *H*_{0}:

Pr(*x*; *H*_{1}) > Pr(*x*; *H*_{0}).

[With likelihoods, the data *x* are fixed; the hypotheses vary.]

Or,

*x* is evidence for *H*_{1} over *H*_{0} if the **likelihood ratio** **LR** (*H*_{1} over *H*_{0}) is greater than 1.

The law is given in other ways besides, but it’s the same general idea. (Some will take the LR as actually quantifying the support; others leave it qualitative.)
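As a concrete (hypothetical) illustration of the comparison the law makes, the sketch below computes binomial likelihoods for an invented outcome of 60 heads in 100 tosses, under *H*_{0}: p = 0.5 and *H*_{1}: p = 0.6; none of these numbers come from the post itself:

```python
from math import comb

def binom_lik(p, k, n):
    """Likelihood of heads-probability p given k heads in n tosses."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical data: 60 heads in 100 tosses.
lik_H0 = binom_lik(0.5, 60, 100)  # H0: the coin is fair
lik_H1 = binom_lik(0.6, 60, 100)  # H1: heads-probability 0.6
LR = lik_H1 / lik_H0

# By the law of likelihood, x counts as evidence for H1 over H0
# exactly when LR > 1.
print(LR > 1)  # → True
```

Here LR is roughly 7.5, so on the law's own terms the (invented) data favor *H*_{1} over *H*_{0}.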

In terms of rejection:

“An hypothesis should be rejected if and only if there is some rival hypothesis much better supported [i.e., much more likely] than it is.” (Hacking 1965, 89)

**2. Barnard (British Journal for the Philosophy of Science)**

But this “law” will immediately be seen to fail on our minimal *severity requirement*. Hunting for an impressive fit, or trying and trying again, it’s easy to find a rival hypothesis *H*_{1} much better “supported” than *H*_{0} even when *H*_{0} is true. Or, as Barnard (1972) puts it, “there always is such a rival hypothesis, viz. that things just had to turn out the way they actually did” (p. 129). *H*_{0}: the coin is fair, gets a small likelihood (.5)^{k} given k tosses of a coin, while *H*_{1}: the probability of heads is 1 just on those tosses that yield a head, renders the sequence of k outcomes maximally likely. This is an example of Barnard’s “things just had to turn out as they did”. Or, to use an example with P-values: a statistically significant difference, being improbable under the null *H*_{0}, will afford high likelihood to any number of explanations that fit the data well.
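Barnard's point can be made concrete with a short sketch (all numbers hypothetical): even when the fair-coin hypothesis is true, the rival that assigns probability 1 to whatever sequence actually occurred is always "better supported" on the likelihood comparison:

```python
import random

random.seed(1)
k = 20
tosses = [random.choice("HT") for _ in range(k)]  # data from a genuinely fair coin

lik_H0 = 0.5 ** k   # fair coin: every particular sequence gets (1/2)^k
lik_rival = 1.0     # "things just had to turn out this way": observed sequence certain

# The data-dictated rival beats the true H0 by a factor of 2^20.
print(lik_rival / lik_H0)  # → 1048576.0
```

The rival is constructed *after* seeing the data, so it wins the likelihood comparison no matter what sequence occurs, which is exactly why the law fails the severity requirement.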

**3. Breaking the law (of likelihood) by going to the “second,” error statistical level:**

How does it fail our severity requirement? First look at what the frequentist error statistician must always do to critique an inference: she must consider the capability of the inference method that *purports* to provide evidence for a claim. She goes to a higher level or metalevel, as it were. In this case, the likelihood ratio plays the role of the needed statistic *d*(*X*). To put it informally, she asks:

What’s the probability the method would yield an LR disfavoring *H*_{0} compared to some alternative *H*_{1} even if *H*_{0} is true?
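That error-probability question can be estimated by simulation. The sketch below (parameters and cutoff are all hypothetical choices, not from the original) generates coin-toss data under a true *H*_{0}: p = 0.5, lets the rival *H*_{1} be hunted from the data itself (the best-fitting heads-probability), and counts how often the resulting LR disfavors *H*_{0} by a given factor:

```python
import random
from math import comb

def binom_lik(p, k, n):
    """Likelihood of heads-probability p given k heads in n tosses."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

random.seed(2)
n, trials, cutoff = 100, 2000, 8   # hypothetical experiment size and LR cutoff
count = 0
for _ in range(trials):
    k = sum(random.random() < 0.5 for _ in range(n))  # data generated under a true H0
    p_hat = k / n                   # rival H1 chosen after the fact to fit the data
    LR = binom_lik(p_hat, k, n) / binom_lik(0.5, k, n)
    count += LR > cutoff

rate = count / trials
# Estimated probability that the hunted-for rival "wins" by factor 8
# even though H0 is true.
print(rate)
```

Because the alternative is selected to maximize fit, this probability is not negligible; computing it is the move to the error statistical "second level" that the law of likelihood, taken alone, never makes.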
