Sitting in the airport . . . a temporary escape from Elba, which I’m becoming more and more loath to leave. I fear that some might agree, rightly, that Kadane’s “trivial test” is no indictment of significance tests and yet for the WRONG reason. I don’t want to beat a dead horse, but perhaps a certain confusion is going to obstruct understanding later on. Let us abbreviate “tails” on a coin toss that lands tails 5% of the time, as “a rare coin toss outcome”. Some seem to reason: since a rare coin toss outcome is an event with probability .05 REGARDLESS of the truth or falsity of a hypothesis H, then the test is still a legitimate significance test with significance level .05; it is just a lousy one, with no discriminating ability. I claim it is no significance test at all, and that there is an important equivocation going on (in some letters I’ve received), one which I hoped would be skirted by the analogy with ordinary hypothesis testing in science.

Heading off this confusion was the key rationale for my discussion in the Kuru post. Finding no nucleic acid in prions is inconsistent, or virtually so, with the hypothesis H: all pathogens are transmitted with nucleic acid. The observed results are anomalous for the central dogma H BECAUSE they are counter to what H says we would expect. If you maintain that the “rare coin toss outcome” is anomalous for a statistical null hypothesis H, then you would also have to say it is anomalous for H: all pathogens have nucleic acid. But it is obvious this is false in the case of the scientific hypothesis. It must also be rejected in the case of the statistical hypothesis (Rule #1).
A legitimate statistical test hypothesis must tell us (i.e., let us compute) how improbably far different experimental outcomes are from what would be expected under H. It is correct to regard experimental results as anomalous for a hypothesis H only if, and only because, they run counter to what H tells us would occur in a universe where H is correct. A hypothesis on pathogen transmission, say, does not tell us the improbability of the rare coin toss outcome. Thus it is no significance test at all. As I wrote in the Kuru post: It is not that infectious protein events are “very improbable” in their own right (however one construes this); it is rather that these events are counter to, and forbidden under, the assumption of the hypothesis H.
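A rough numerical sketch may make the point concrete. It assumes, purely for illustration, that the data are 10 Bernoulli trials with success probability theta; the function name and its parameters are hypothetical and not taken from Kadane's example. The "test" rejects whenever an auxiliary coin lands tails, so the data are generated but never consulted.

```python
import random


def trivial_test_rejection_rate(theta, alpha=0.05, trials=100_000, seed=0):
    """Estimate how often the coin-toss 'test' rejects H.

    theta is the (assumed) true success probability generating the data.
    The rejection decision uses only an auxiliary coin with P(tails) = alpha.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Data are drawn under theta but play no role in the decision.
        _data = [rng.random() < theta for _ in range(10)]
        # "Rare coin toss outcome": tails with probability alpha.
        if rng.random() < alpha:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    for theta in (0.1, 0.5, 0.9):
        print(theta, trivial_test_rejection_rate(theta))
```

Under these assumptions the estimated rejection rate stays near .05 whatever value theta takes, which is just another way of saying that the rejection event is not something H tells us to expect or not to expect.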
But what about randomized tests to obtain an arbitrary alpha level for discrete random variables?
Classical books like Lehmann or Hogg and Craig present them in the Neyman-Pearson context of hypothesis testing. Wouldn’t it be something similar to Kadane’s “joke”?
I understand that your (and Spanos’s) approach to unifying Fisher and NP would probably exclude randomized tests because they are not inductive inference, but the “classical” tradition presents alpha as the type I error probability and presents randomized tests to control type I errors… So, by reductio ad absurdum, Kadane’s joke is not that wrong, is it?
Hi, Mayo… I don’t know if I made myself clear last time.
What do you think about randomized tests?
When you have a discrete distribution for your statistic, if you want an arbitrary size for the test, you need to randomize the critical region of the test. Isn’t it as silly as Kadane has put it?
Would an error statistics approach recommend this kind of test?
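For readers unfamiliar with the construction being asked about, here is a minimal sketch of a randomized test, assuming a one-sided binomial setup (X ~ Binomial(n, p), H: p = p0 against p > p0). The setup and all function names are illustrative assumptions, not drawn from the textbooks mentioned. The test rejects outright when X > c and rejects with probability gamma when X = c, with c and gamma chosen so that the size is exactly alpha.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def randomized_test(n, p0, alpha):
    """Return (c, gamma) giving a test of exact size alpha under p0.

    Reject if X > c; if X == c, reject with probability gamma.
    """
    tail = 0.0  # running value of P(X > c) under p0
    for c in range(n, -1, -1):
        pc = binom_pmf(c, n, p0)
        if tail + pc > alpha:
            # Including all of {X = c} would overshoot alpha,
            # so only a fraction gamma of that mass is used.
            return c, (alpha - tail) / pc
        tail += pc
    return -1, 0.0  # alpha >= 1: always reject


def rejection_probability(n, p, c, gamma):
    """P(reject) when the true success probability is p."""
    upper = sum(binom_pmf(k, n, p) for k in range(c + 1, n + 1))
    boundary = binom_pmf(c, n, p) if c >= 0 else 0.0
    return upper + gamma * boundary


if __name__ == "__main__":
    c, gamma = randomized_test(n=10, p0=0.5, alpha=0.05)
    print(c, gamma)                                 # c = 8, gamma ≈ 0.893
    print(rejection_probability(10, 0.5, c, gamma))  # size under H
    print(rejection_probability(10, 0.8, c, gamma))  # under an alternative
```

In this sketch the auxiliary randomization enters only at the boundary value X = c; the rest of the rejection region is determined by the data, so the rejection probability changes with p, unlike a rule that ignores the data entirely.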