“This will be my last post on the (irksome) Birnbaum argument!” she says with her fingers (or perhaps toes) crossed. But really, really it is (at least until midnight 2013). In fact the following brief remarks are all said, more clearly, in my (old) PAPER, new paper, Mayo 2010, Cox & Mayo 2011 (appendix), and in posts connected to this U-Phil: Blogging the likelihood principle, new summary 10/31/12*.
What’s the catch?
In my recent “Ton o’ Bricks” post, many readers were struck by the implausibility of letting the evidential interpretation of x’* be influenced by the properties of experiments known not to have produced x’*. Yet it is altogether common to be told that, should a sampling theorist try to block this, “unfortunately there is a catch” (Ghosh, Delampady, and Samanta 2006, 38): We would be forced to embrace the strong likelihood principle (SLP, or LP, for short), at least according to an infamous argument by Allan Birnbaum (who himself rejected the LP [i]).
It is not uncommon to see statistics texts argue that in frequentist theory one is faced with the following dilemma: either to deny the appropriateness of conditioning on the precision of the tool chosen by the toss of a coin, or else to embrace the strong likelihood principle, which entails that frequentist sampling distributions are irrelevant to inference once the data are obtained. This is a false dilemma. . . . The “dilemma” argument is therefore an illusion. (Cox and Mayo 2010, 298)
In my many detailed expositions, I have explained the source of the illusion and sleight of hand from a number of perspectives (I will not repeat references here). While I appreciate the care that Hennig and Gandenberger have taken in their U-Phils (and wish them all the luck in published outgrowths), it is clear to me that they are not hearing (or are unwittingly blocking) the scre-e-e-e-ching of the brakes!
No revolution, no breakthrough!
Berger and Wolpert, in their famous monograph The Likelihood Principle, identify the core issue:
The philosophical incompatibility of the LP and the frequentist viewpoint is clear, since the LP deals only with the observed x, while frequentist analyses involve averages over possible observations. . . . Enough direct conflicts have been . . . seen to justify viewing the LP as revolutionary from a frequentist perspective. (Berger and Wolpert 1988, 65-66)[ii]
If Birnbaum’s proof does not apply to a frequentist sampling theorist, then there is neither a revolution nor a breakthrough (as Savage called it). The SLP holds just for methodologies in which it holds . . . We are going in circles.
Block my counterexamples, please!
Since Birnbaum’s argument has stood for over fifty years, I’ve given it the maximal run for its money, and haven’t tried to block its premises, however questionable its key moves may appear. Despite such latitude, I’ve shown that the “proof” to the SLP conclusion will not wash, and I’m just a wee bit disappointed that Hennig and Gandenberger haven’t wrestled with my specific argument, or shown just where they think my debunking fails. What would this require?
Since the SLP is a universal generalization, it requires only a single counterexample to falsify it. In fact, every violation of the SLP within frequentist sampling theory, I show, is a counterexample to it! In other words, using the language from the definition of the SLP, the onus is on Birnbaum to show that, for any x’* that is a member of an SLP pair (E’, E”) with given, different probability models f’, f”, x’* and x”* should have the identical evidential import for an inference concerning parameter θ, on pain of facing “the catch” above, i.e., being forced to allow the import of data known to have come from E’ to be altered by unperformed experiments known not to have produced x’*.
If one is to release the brakes from my screeching halt, defenders of Birnbaum might try to show that the SLP counterexamples lead me to “the catch” as alleged. I have considered two well-known violations of the SLP. Can it be shown that a contradiction with the WCP or SP follows? I say no. Neither Hennig[ii] nor Gandenberger shows otherwise.
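For readers who want to see what an SLP pair looks like in numbers, here is a minimal sketch using the textbook binomial versus negative binomial case. It is offered only as an illustration and is not necessarily one of the two violations referred to above. The specific figures (9 successes and 3 failures, null value θ = 0.5, a one-sided test) and all function names are assumptions chosen for the example. E’ fixes the number of trials at 12; E” samples until the 3rd failure. The two likelihoods are proportional in θ, yet the sampling distributions, and hence the p-values, differ.

```python
from math import comb

# Illustrative assumptions: 9 successes, 3 failures, H0: theta = 0.5 vs theta > 0.5.

def binomial_lik(theta, n=12, k=9):
    # E': number of trials n is fixed in advance
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

def negbin_lik(theta, r=3, k=9):
    # E'': sample until the r-th failure; k successes observed along the way
    return comb(k + r - 1, k) * theta**k * (1 - theta)**r

def binomial_pvalue(n=12, k=9, theta0=0.5):
    # P(at least k successes in n trials) under theta0
    return sum(comb(n, j) * theta0**j * (1 - theta0)**(n - j)
               for j in range(k, n + 1))

def negbin_pvalue(r=3, k=9, theta0=0.5):
    # P(at least k successes before the r-th failure) under theta0
    below = sum(comb(j + r - 1, j) * theta0**j * (1 - theta0)**r
                for j in range(k))
    return 1 - below

# The likelihood ratio does not depend on theta: the outcomes form an SLP pair.
ratios = [binomial_lik(t) / negbin_lik(t) for t in (0.2, 0.5, 0.8)]
p_binomial = binomial_pvalue()
p_negbin = negbin_pvalue()

print("likelihood ratios:", ratios)
print("binomial p-value:", round(p_binomial, 4))
print("negative binomial p-value:", round(p_negbin, 4))
```

Under these assumptions the likelihood ratio is the same constant at every θ, while the p-values come out at roughly 0.073 and 0.033. The SLP says the two outcomes must have identical evidential import; a sampling theorist who reports the p-value relevant to the experiment actually performed violates it.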
In my tracing out of Birnbaum’s arguments, I strived to assume that he would not be giving us circular arguments. To say that “I can prove that your methodology must obey the SLP,” and then to set out to do so by declaring “Hey Presto! Assume sampling distributions are irrelevant (once the data are in hand),” is a neat trick, but it assumes what it purports to prove. All other interpretations are shown to be unsound.
[i] Birnbaum himself, soon after presenting his result, rejected the SLP. As Birnbaum puts it, “the likelihood concept cannot be construed so as to allow useful appraisal, and thereby possible control, of probabilities of erroneous interpretations.” (Birnbaum 1969, p. 128.)
(We use LP and SLP synonymously here.)
[ii] Hennig initially concurred with me, but says a person convinced him to get back on the Birnbaum bus (even though Birnbaum got off it [i]).
Some other, related, posted discussions: Brakes on Breakthrough Part 1 (12/06/11) & Part 2 (12/07/11); Don’t Birnbaumize that experiment (12/08/12); Midnight with Birnbaum re-blog (12/31/12). The initial call to this U-Phil, the extension, details here, the post from my 28 Nov. seminar (LSE), and the original post by Gandenberger.
Birnbaum, A. (1962). “On the Foundations of Statistical Inference”, Journal of the American Statistical Association 57 (298), 269-306.
Savage, L. J., Barnard, G., Cornfield, J., Bross, I., Box, G., Good, I., Lindley, D., Clunies-Ross, C., Pratt, J., Levene, H., Goldman, T., Dempster, A., Kempthorne, O., and Birnbaum, A. (1962). On the foundations of statistical inference: “Discussion (of Birnbaum 1962)”, Journal of the American Statistical Association 57 (298), 307-326.
Birnbaum, A. (1970). Statistical Methods in Scientific Inference (letter to the editor). Nature 225, 1033.
Cox, D. R. and Mayo, D. (2010). “Objectivity and Conditionality in Frequentist Inference” in Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science (D. Mayo & A. Spanos eds.), CUP, 276-304.
…and if that’s not enough, search this blog.
Mayo: Well, your counterexamples are not counterexamples to the mathematical content of the proof. Both premises of Birnbaum’s proof can be fulfilled by functions Ev which do not depend on the sampling distribution (such as, trivially but not meaningfully in terms of interpretation, a constant function). And for such functions the proof is valid.
Your “counterexamples” are based on premises that add something to the purely mathematical content of the CP and SP as formulated by Birnbaum, by enforcing the inference to *differ* between what is yielded by the sufficient statistic in the mixture experiment regarded as a whole and what is yielded in the one of the two mixed experiments that actually brought forth the observed data. You are right in saying that a *reasonable* error statistical inference should differ between the two, but Birnbaum’s original formulation doesn’t enforce this for the function Ev.
So your counterexamples do *not* fulfill the original CP and SP as formulated by Birnbaum (although one can argue that they fulfill a reasonably worded “SP for sampling distributions”) and can therefore not invalidate the proof. Neither can you say that Birnbaum’s original premises cannot both be fulfilled at the same time without invalidating the proof, because functions Ev exist which do fulfill CP and SP and which do not invalidate the argument in the proof stripped of any interpretative implications (you may say that these functions have to suppress information that is needed for good inference, but that’s a problem with interpretation, not with mathematics).
This became clear to me only when I discussed this with P. Dawid who showed me a purely mathematical proof whereas Birnbaum’s original proof (against which I had initially read your arguments) still uses some “interpretative” jargon such as “having evidential meaning” that to me seems to obscure mathematical matters.