We had an excellent discussion at our symposium yesterday: “How Many Sigmas to Discovery? Philosophy and Statistics in the Higgs Experiments” with Robert Cousins, Allan Franklin and Kent Staley. Slides from my presentation, “Statistical Flukes, the Higgs Discovery, and 5 Sigma” are posted below (we each only had 20 minutes, so this is clipped, but much came out in the discussion). Even the challenge I read about this morning as to what exactly the Higgs researchers discovered (and I’ve no clue if there’s anything to the idea of a “techni-higgs particle”) — would not invalidate* the knowledge of the experimental effects severely tested.
*Although, as always, there may be a reinterpretation of the results. But I think the article is an isolated bit of speculation. I’ll update if I hear more.
The last time the issue of the p-value police came up, it was in the context of a larger argument. I wrote an objection that gave the misimpression that I was objecting to all of the elements of the argument, which led to a great deal of talking past one another. This time I won’t make that mistake.
The claim is that the phrase “there is less than a 1 in 3.5 million chance that the results are a statistical fluke” describes an ordinary error probability. (Certainly the particular numbers quoted are the result of an ordinary p-value calculation, but that’s beside the point.) It requires an enormous twisting of words to make this claim. Let’s examine the part of the construction to which a probability is being ascribed: “the results are a statistical fluke”. A simple and direct reading of this phrase is that it is tantamount to “the null hypothesis is true”. The results being referred to are actually in hand, and those specific results either are (i.e., frequentist probability one), or are not (i.e., frequentist probability zero), a statistical fluke, in the sense that an indefinite repetition of the experiment would see the “bump” disappear. On slide 15 we find the claim that the null hypothesis “does not say that the observed results… are flukes”, but if we hold that the null hypothesis is true, then that is *exactly* what we have to conclude about the observed results! (And on slide 16 we see the substitution of the event “d(X) > 5” for the actual event proper to the p-value calculation, to wit, “d(X) > d(x_obs)”, which supports the argument by dropping “x_obs” from the picture.)
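Just to fix ideas on where the quoted number comes from: the “1 in 3.5 million” is the one-sided tail probability of a standard normal test statistic beyond 5 sigma, i.e. P(d(X) ≥ d(x_obs) | H0) with d(x_obs) = 5. A minimal sketch of that arithmetic (the variable names here are mine, not anything from the slides):

```python
import math

def one_sided_p(z: float) -> float:
    """One-sided tail probability P(Z >= z) for a standard normal Z,
    computed via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# p-value for an observed 5-sigma excess under the null hypothesis
p = one_sided_p(5.0)
print(f"p = {p:.4e}")           # roughly 2.87e-07
print(f"about 1 in {1/p:,.0f}") # roughly 1 in 3.5 million
```

Note that this is a probability assigned to the generic event {d(X) ≥ 5} under H0, computed before conditioning on the data in hand; it is not a probability that these particular results are a fluke, which is precisely the distinction at issue above.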
So in my view, the p-value police have the right of it. Insofar as there is some sensible and reasonable thought behind the reported phrase, that thought is best understood as a posterior probability — a probability that the hypothesis is true given these results — and not as an ordinary frequentist error probability.