**1. An Assumed Law of Statistical Evidence (law of likelihood)**

Nearly all critical discussions of frequentist error statistical inference (significance tests, confidence intervals, p-values, power, etc.) start with the following general assumption about the nature of inductive evidence or support:

Data $\mathbf{x}$ are better evidence for hypothesis $H_1$ than for $H_0$ if $\mathbf{x}$ *are* more probable under $H_1$ than under $H_0$.

Ian Hacking (1965) called this the *logic of support*: $\mathbf{x}$ supports hypothesis $H_1$ more than $H_0$ if $H_1$ is more **likely**, given $\mathbf{x}$, than is $H_0$:

$\Pr(\mathbf{x}; H_1) > \Pr(\mathbf{x}; H_0)$.

[With likelihoods, the data $\mathbf{x}$ are fixed, the hypotheses vary.]*

Or,

$\mathbf{x}$ is evidence for $H_1$ over $H_0$ if the **likelihood ratio** (**LR**) of $H_1$ over $H_0$ is greater than 1.

It is given in other ways besides, but it’s the same general idea. (Some will take the LR as actually quantifying the support, others leave it qualitative.)
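As a concrete illustration of the comparison, here is a minimal Python sketch that computes the LR for coin-tossing data under two point hypotheses. The function names, the data (7 heads in 10 tosses), and the two hypothesized values of the heads probability (0.7 and 0.5) are all assumptions chosen for illustration; none of them come from Hacking.

```python
from math import comb


def binomial_likelihood(heads: int, tosses: int, p: float) -> float:
    """Pr(x; H): probability of `heads` heads in `tosses` tosses when Pr(heads) = p."""
    return comb(tosses, heads) * p**heads * (1 - p) ** (tosses - heads)


def likelihood_ratio(heads: int, tosses: int, p1: float, p0: float) -> float:
    """LR of H1 (Pr(heads) = p1) over H0 (Pr(heads) = p0) for the same fixed data."""
    return binomial_likelihood(heads, tosses, p1) / binomial_likelihood(heads, tosses, p0)


# Hypothetical data: 7 heads in 10 tosses.
# Hypothetical hypotheses: H1 says Pr(heads) = 0.7, H0 says Pr(heads) = 0.5.
lr = likelihood_ratio(7, 10, p1=0.7, p0=0.5)
print(f"LR (H1 over H0) = {lr:.3f}")
print("x favors H1 over H0" if lr > 1 else "x does not favor H1 over H0")
```

Note that the data are held fixed in both calls to `binomial_likelihood`; only the hypothesized value of `p` varies.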

In terms of rejection:

“An hypothesis should be rejected if and only if there is some rival hypothesis much better supported [i.e., much more likely] than it is.” (Hacking 1965, 89)

**2. Barnard (British Journal for the Philosophy of Science)**

But this “law” will immediately be seen to fail on our minimal *severity requirement*. By hunting for an impressive fit, or trying and trying again, one can easily find a rival hypothesis $H_1$ much better “supported” than $H_0$ even when $H_0$ is true. Or, as Barnard (1972) puts it, “there always is such a rival hypothesis, viz. that things just had to turn out the way they actually did” (1972, p. 129).

$H_0$: the coin is fair, gets a small likelihood, $(.5)^k$, given $k$ tosses of a coin, while $H_1$: the probability of heads is 1 just on those tosses that yield a head, renders the sequence of $k$ outcomes maximally likely. This is an example of Barnard’s “things just had to turn out as they did”. Or, to use an example with P-values: a statistically significant difference, being improbable under the null $H_0$, will afford high likelihood to any number of explanations that fit the data well.
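Written out for any particular observed sequence $\mathbf{x}$ of $k$ tosses, the coin example gives the following likelihoods and likelihood ratio (a worked restatement of the example, with LR taken as the ratio of the likelihood under $H_1$ to that under $H_0$):

$$
\Pr(\mathbf{x}; H_0) = (.5)^k, \qquad \Pr(\mathbf{x}; H_1) = 1, \qquad \mathrm{LR} = \frac{\Pr(\mathbf{x}; H_1)}{\Pr(\mathbf{x}; H_0)} = 2^k .
$$

With $k = 10$ tosses, for instance, the LR is $2^{10} = 1024$ in favor of $H_1$, whatever sequence happened to occur.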

**3. Breaking the law (of likelihood) by going to the “second,” error statistical level:**

How does it fail our severity requirement? First look at what the frequentist error statistician must always do to critique an inference: she must consider the capability of the inference method that *purports* to provide evidence for a claim. She goes to a higher level or metalevel, as it were. In this case, the likelihood ratio plays the role of the needed statistic $d(\mathbf{X})$. To put it informally, she asks:

What’s the probability the method would yield an LR disfavoring $H_0$ compared to some alternative $H_1$ even if $H_0$ is true?
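For the coin example, this question can be explored by simulation. The Python sketch below generates sequences of tosses from a fair coin (so $H_0$ is true) and records how often the LR exceeds 1. It uses two ways of picking the rival: a Barnard-style rival tailored to the observed sequence, and, purely for contrast, a rival fixed in advance. The function names, the number of tosses (20), the number of trials, the seed, and the fixed alternative (heads probability 0.8) are all assumptions made for illustration, not part of the original discussion.

```python
import random


def likelihood(seq, heads_probs):
    """Probability of the exact 0/1 sequence, given a heads probability for each toss."""
    prob = 1.0
    for outcome, p in zip(seq, heads_probs):
        prob *= p if outcome == 1 else 1 - p
    return prob


def lr_tailored(seq):
    """LR of the data-dependent rival (heads probability 1 exactly on the tosses
    that landed heads, 0 on the others) over H0: fair coin."""
    tailored = [1.0 if outcome == 1 else 0.0 for outcome in seq]
    return likelihood(seq, tailored) / likelihood(seq, [0.5] * len(seq))


def lr_prespecified(seq, p1=0.8):
    """LR of a rival fixed before seeing the data (hypothetical p1) over H0: fair coin."""
    return likelihood(seq, [p1] * len(seq)) / likelihood(seq, [0.5] * len(seq))


def rate_lr_disfavors_h0(lr_fn, k=20, trials=10_000, seed=1):
    """Relative frequency of LR > 1 when the data really come from a fair coin."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seq = [1 if rng.random() < 0.5 else 0 for _ in range(k)]
        if lr_fn(seq) > 1:
            hits += 1
    return hits / trials


tailored_rate = rate_lr_disfavors_h0(lr_tailored)
prespecified_rate = rate_lr_disfavors_h0(lr_prespecified)
print(f"Tailored rival:     LR > 1 in {tailored_rate:.1%} of fair-coin samples")
print(f"Prespecified rival: LR > 1 in {prespecified_rate:.1%} of fair-coin samples")
```

Under these assumptions, the tailored rival yields an LR of $2^k$ for every sequence, so the method disfavors $H_0$ every time even though $H_0$ is true. The prespecified rival does so only in a small fraction of samples.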