Christian Robert’s *reply* grows out of my last blogpost. On Xi’an’s Og:

A quick reply from my own Elba, in the Dolomiti: your arguments (about the sad consequences of the SLP) are not convincing wrt the derivation of SLP=WCP+SP. If I built a procedure that reports (E_{1},x*) whenever I observe (E_{1},x*) or (E_{2},y*), this obeys the sufficiency principle; doesn’t it? (Sorry to miss your talk!)

Mayo’s response to Xi’an on the “sad consequences of the SLP.”[i]

This is a useful reply (so to me it’s actually not ‘flogging’ the SLP[ii]), and, in fact, I think Xi’an will now see why my arguments are convincing! Let’s use Xi’an’s procedure to make a parametric inference about q. Getting the report x* from Xi’an’s procedure, we know it could have come from E_{1} or E_{2}. In that case, the WCP forbids us from using either individual experiment to compute the inference implication. We use the sampling distribution of T_{B}.

Birnbaum’s statistic T_{B} is a technically sufficient statistic for Birnbaum’s experiment E_{B} (the conditional distribution of Z given T_{B} is independent of q). The question of whether this is the relevant or legitimate way to compute the inference when it is given that y* came from E_{2} is the big question. The WCP says it is not. Now you are free to use Xi’an’s procedure (free to Birnbaumize) but that does not yield the SLP. Nor did Birnbaum think it did. That’s why he goes on to say: “Never mind. Don’t use Xi’an’s procedure. Compute the inference using E_{2} just as the WCP tells you to. You know it came from E_{2}. Isn’t that what David Cox taught us in 1958?”

Fine. But still no SLP! Note it’s not that SP and WCP conflict, it’s WCP and Birnbaumization that conflict. The application of a principle will always be relative to the associated model used to frame the question.[iii]

These points are all spelled out clearly in my paper: [I can’t get double subscripts here. E_{B} is the same as E-B][iv]

Given y*, the WCP says do not Birnbaumize. One is free to do so, but not to simultaneously claim to hold the WCP in relation to the given y*, on pain of logical contradiction. If one does choose to Birnbaumize, and to construct T_{B}, admittedly, the known outcome y* yields the same value of T_{B} as would x*. Using the sample space of E_{B} yields: (B): Infr_{E-B}[x*] = Infr_{E-B}[y*]. This is based on the convex combination of the two experiments, and differs from both Infr_{E1}[x*] and Infr_{E2}[y*]. So again, any SLP violation remains. Granted, if only the value of T_{B} is given, using Infr_{E-B} may be appropriate. For then we are given only the disjunction: Either (E_{1},x*) or (E_{2},y*). In that case one is barred from using the implication from either individual E_{i}. A holder of WCP might put it this way: once (E,z) is given, whether E arose from a q-irrelevant mixture, or was fixed all along, should not matter to the inference; but whether a result was Birnbaumized or not should, and does, matter.

There is no logical contradiction in holding that if data are analyzed one way (using the convex combination in E_{B}), a given answer results, and if analyzed another way (via WCP) one gets quite a different result. One may consistently apply both the E_{B} and the WCP directives to the same result, in the same experimental model, only in cases where WCP makes no difference. To claim the WCP never makes a difference, however, would entail that there can be no SLP violations, which would make the argument circular. Another possibility would be to hold, as Birnbaum ultimately did, that the SLP is “clearly plausible” (Birnbaum 1968, 301) only in “the severely restricted case of a parameter space of just two points” where these are predesignated (Birnbaum 1969, 128). But SLP violations remain.

Note: The final draft of my paper uses equations that do not transfer directly to this blog. Hence, these sections are from a draft of my paper.

[i] Although I didn’t call them “sad,” I think it would be too bad to accept the SLP’s consequences. Listen to Birnbaum:

The likelihood principle is incompatible with the main body of modern statistical theory and practice, notably the Neyman-Pearson theory of hypothesis testing and of confidence intervals, and incompatible in general even with such well-known concepts as standard error of an estimate and significance level. (Birnbaum 1968, 300)

That is why Savage called it “a breakthrough” result. In the end, however, Birnbaum could not give up on control of error probabilities. He held the SLP only for the trivial case of predesignated simple hypotheses. (Or, perhaps he spied the gap in his argument? I suspect, from his writings, that he realized his argument went through only for such cases that do not violate the SLP.)

[ii] Readers may feel differently.

[iii] Excerpt from a draft of my paper:

*Model checking.* An essential part of the statements of the principles SP, WCP, and SLP is that the validity of the model is granted as adequately representing the experimental conditions at hand (Birnbaum 1962, 491). Thus, accounts that adhere to the SLP are not thereby prevented from analyzing features of the data such as residuals, which are relevant to questions of checking the statistical model itself. There is some ambiguity on this point in Casella and R. Berger (2002):

Most model checking is, necessarily, based on statistics other than a sufficient statistic. For example, it is common practice to examine residuals from a model. . . Such a practice immediately violates the Sufficiency Principle, since the residuals are not based on sufficient statistics. (Of course such a practice directly violates the [strong] LP also.) (Casella and R. Berger 2002, 295-6)

They warn that before considering the SLP and WCP, “we must be comfortable with the model” (296). It seems to us more accurate to regard the principles as inapplicable, rather than violated, when the adequacy of the relevant model is lacking.

Birnbaum, A. 1968. “Likelihood.” In *International Encyclopedia of the Social Sciences*, 9:299–301. New York: Macmillan and the Free Press.

———. 1969. “Concepts of Statistical Evidence.” In *Philosophy, Science, and Method: Essays in Honor of Ernest Nagel*, edited by S. Morgenbesser, P. Suppes, and M. G. White, 112–143. New York: St. Martin’s Press.

Casella, G., and R. L. Berger. 2002. *Statistical Inference*. 2nd ed. Belmont, CA: Duxbury Press.

Mayo 2013, (http://arxiv-web3.library.cornell.edu/pdf/1302.7021v2.pdf)

Hopefully I’m not too late in posting this for it not to be an annoyance, but I am very confused about some things and I was hoping you could clarify. First, the story, as I understand it is:

As statisticians, in the background for each of us is some inference function Inf(E, y) such that when I plug in some experiment E and data realization y, an inference pops out. You, caring about things like sampling distributions, will specify your inference function Inf(E, z) such that things like the sampling distribution in the experiment matter quite a bit.

My inference paradigm obeys the WCP if in a mixed experiment E_{mix} my inference function obeys Inf(E_{mix}, (1, y)) = Inf(E_1, y) and Inf(E_{mix}, (2, y)) = Inf(E_2, y) for all y’s, where my data (j, y) is such that j tells me which experiment I did and y is the realization from that experiment. My inference paradigm obeys the SP, on the other hand, if whenever T(y) is a sufficient statistic for an experiment E, and T(y) = T(y’), then Inf(E, y) = Inf(E, y’). If either my rough characterization of WCP or SP is off, feel free to stop reading and correct me (if you have the time).

Birnbaum then constructs a sufficient statistic for E_{mix}, T, such that T(1, y’) = T(2, y”) for any y’ and y” giving rise to proportional likelihoods in E_1 and E_2. Hence, if we obey the SP and WCP, we have Inf(E_1, y’) = Inf(E_{mix}, (1, y’)) = Inf(E_{mix}, (2, y”)) = Inf(E_2, y”) by (WCP, SP, WCP).
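As a concrete illustration of this construction (not taken from the paper or from the comment itself), here is a minimal sketch that uses the textbook binomial versus negative-binomial pair as a stand-in for E_1 and E_2. The coin-tossing setup, the numbers 12, 9 and 3, and the names `lik_e1`, `lik_e2` and `birnbaum_T` are all assumptions chosen for illustration. The sketch only checks that the two likelihoods are proportional in theta and that a Birnbaum-style T identifies the paired outcomes.

```python
from fractions import Fraction
from math import comb

# Hypothetical SLP pair (an assumption for illustration only):
#   E_1: toss a coin exactly 12 times, observe 9 heads.
#   E_2: toss until the 3rd tail appears, observe 9 heads before it.
X_STAR = 9  # heads observed in E_1
Y_STAR = 9  # heads observed in E_2

def lik_e1(theta, n=12, heads=X_STAR):
    """Binomial likelihood: P(`heads` heads in n tosses | theta)."""
    return comb(n, heads) * theta**heads * (1 - theta)**(n - heads)

def lik_e2(theta, tails=3, heads=Y_STAR):
    """Negative-binomial likelihood: P(`heads` heads before the `tails`-th tail | theta)."""
    return comb(heads + tails - 1, heads) * theta**heads * (1 - theta)**tails

def birnbaum_T(j, y):
    """Birnbaum-style statistic on the mixture: merge the paired outcomes
    (1, x*) and (2, y*); report every other outcome unchanged."""
    if (j, y) in {(1, X_STAR), (2, Y_STAR)}:
        return (1, X_STAR)
    return (j, y)

# The likelihood ratio is the same constant for every theta tried.
thetas = [Fraction(k, 10) for k in range(1, 10)]
ratios = {lik_e1(t) / lik_e2(t) for t in thetas}
print(ratios)                               # a single constant ratio
print(birnbaum_T(1, 9), birnbaum_T(2, 9))   # the same reported value
```

Exact fractions are used so that the proportionality check is not blurred by floating-point rounding.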

Now, in your counterexample you are considering E_1 as a fixed design and E_2 as an optional stopping experiment, and you appear to be taking Inf(E, y) for all relevant experiments to have the range [0, 1], apparently basing inferences on p-values. You seem to agree that for the value of y considered in both experiments, Inf(E_1, y) = 0.05 and Inf(E_2, y) = 0.37, following from the sampling distribution after conditioning on the experiment.

You also accept the WCP, and hence your inference function Inf(E, y) must satisfy the equalities Inf(E_{mix}, (1, y)) = 0.05 and Inf(E_{mix}, (2, y)) = 0.37. But why does this not violate the sufficiency principle? I consider the mixed experiment E_{mix}, construct my sufficient statistic T, a la Birnbaum, and note that T(1, y) = T(2, y). If I am obeying SP, I must have Inf(E_{mix}, (1, y)) = Inf(E_{mix}, (2, y)); but 0.05 is not equal to 0.37 and hence your inference paradigm Inf(E, y) does not obey SP since there exists an experiment E with sufficient statistic T, and points y and y’ such that T(y) = T(y’) but Inf(E, y) = Inf(E, y’) fails.
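The shape of this worry can be reproduced numerically. The sketch below is an illustration under stated assumptions, not the fixed-design versus optional-stopping example that yields 0.05 and 0.37: it substitutes the textbook binomial versus negative-binomial pair, tests theta = 0.5 one-sidedly against theta > 0.5, and uses hypothetical names (`inf_e1`, `inf_e2`, `inf_mix`, `birnbaum_T`). It shows only that a p-value-based Inf that conditions on the experiment actually run takes two different values at points that T maps together. Whether that counts as a violation of SP, or as SP being applied within a different model, is exactly what is in dispute in this thread.

```python
from fractions import Fraction
from math import comb

def inf_e1(heads, n=12):
    """Inf(E_1, y): P(at least `heads` heads in n fair tosses)."""
    return Fraction(sum(comb(n, k) for k in range(heads, n + 1)), 2**n)

def inf_e2(heads, tails=3):
    """Inf(E_2, y): P(at least `heads` heads before the `tails`-th tail),
    i.e. fewer than `tails` tails among the first heads+tails-1 fair tosses."""
    m = heads + tails - 1
    return Fraction(sum(comb(m, k) for k in range(tails)), 2**m)

def inf_mix(j, y):
    """WCP-obeying inference on the mixture: use the experiment that was run."""
    return inf_e1(y) if j == 1 else inf_e2(y)

def birnbaum_T(j, y, star=9):
    """Birnbaum-style statistic: merge (1, star) and (2, star), else identity."""
    return (1, star) if y == star and j in (1, 2) else (j, y)

p1 = inf_mix(1, 9)
p2 = inf_mix(2, 9)
print(p1, float(p1))  # 299/4096, about 0.073
print(p2, float(p2))  # 67/2048, about 0.033
print(birnbaum_T(1, 9) == birnbaum_T(2, 9), p1 == p2)  # True False
```

Here 0.073 and 0.033 play the role that 0.05 and 0.37 play in the comment above: same value of T, different conditional p-values.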

If you could point out the error in my thinking, I would be grateful. A point of serious confusion for me is that your argument seems to involve arguing that Xi’an is implicitly arguing you must use the sampling distribution of T. Xi’an’s arguments aren’t imploring you to use the unconditional sampling distribution of T to make an inference; once you have specified your Inf(E, y), all the work is done and it either obeys SP and WCP as a function, or it does not – if Birnbaum’s argument goes through, then your Inf(E, y) does not.

t: Not too late, but each of your queries is dealt with in a much, much clearer and more specific way in my paper:

http://arxiv-web3.library.cornell.edu/pdf/1302.7021v2.pdf

than I can ever do on a blog comment. The SP always applies within a model of an experiment. If one draws inference implications based on the E-B (hypothetical mixture) model, the SLP pairs are inferentially equivalent (within E-B). Does that demonstrate the SLP? Why not? Why didn’t Birnbaum just stop there? Please try going back to the new paper—the product of months and months of clarifying precisely these points. Please write again if you’re still not convinced (as to the counterexamples to)

(SP + WCP) entails SLP.

t: so shall I assume you now see the light? (hopefully so). It’s true that “if Birnbaum’s argument goes through, then” mine does not (and if everything is grass, then all pencils are grass). So, while I’m swamped at the moment, I’m not dismissing your query, whoever you are. There have also been numerous discussions by others on this issue that can be searched on this blog.

I haven’t gotten around to going over your paper with a fine toothed comb and wouldn’t want to waste your time with a response until I did so that I don’t ask for clarification on something that you address in the paper. I do have questions/concerns about the arguments, and I’d be happy to have you clarify as much as you are willing to once I’ve gone over it in detail. The concern on my end is that you are presumably a very busy person and it might not be worth your time to go back-and-forth on an inefficient medium.

My initial impression is that things would be much clearer to me if the paper were from a more mathematical perspective. Given that you are making “radical” statements, it would be very helpful to provide additional mathematical rigor as then it might be easier to pinpoint where Birnbaum is going wrong; I would be surprised, though, if what you are proposing is the same formalism that appears in (say) Berger and Wolpert – which is perhaps the point? Your conclusions would be more believable if the argument is that the usual formalism isn’t appropriate, and in the “correct” formalism the implication fails, which is what is going on in Michael Evans’s recent paper on the arXiv – interpreting SP, WCP, and SLP as distinct relations on the collection of inference bases and that the relevant statement is SP \cup WCP \iff SLP, which fails (I didn’t find this particularly convincing, not that anyone should care what I think).

Mr. t: Instead of giving your vague initial impressions, and expressions of confident disbelief, why don’t you please read my latest paper? It’s a mere 18 pages, quite painless, and contains the relevant formalism. Birnbaum’s argument was quite informal as is the central principle on which it is based. In fact, the root of the whole problem concerns some equivocations of language and of formal logic! (If statisticians used quantifiers the problem would have never occurred.) I know Birnbaum’s work very well, have studied his original papers and letters (which I have thanks to R. Giere who worked with him at NYU), and, as a logician, I think I bring out the key issue that is hidden in the typical presentation. I can’t do your work for you, t, and shouldn’t have to appeal to authority, but as you can see in the comments, Evans agrees with me. Also of relevance:

• Mayo, D. G. (2010). “An Error in the Argument from Conditionality and Sufficiency to the Likelihood Principle”

• Cox, D. R. and Mayo, D. G. (2010). “Objectivity and Conditionality in Frequentist Inference”

Well, the main point of my response was to let you know why I hadn’t responded, and inform of my intention to read the paper in detail. Until you drew attention to my silence I intended only to respond after I *had* read your paper in detail. Apparently something in it rubbed you the wrong way, so I’m sorry.

My comments about mathematical rigor had more to do with the fact that your paper doesn’t work on a set-theoretic formalism the way (say) Evans does in the paper I referenced. I thought writing in that style might be more convincing to others, since he is very explicit about the mathematical objects he considers.

t: Nothing rubbed me the wrong way. I’m glad you’ll read the paper. On set-theoretical notation, if you look at the early Evans, Fraser, and Monette paper, you’ll see they raise an issue about why Birnbaum’s notation is/(appears to them to be) at odds with the more usual set-theoretic notation. Anyway, I follow Birnbaum’s (in)formalism. Thanks.

I’m in the process of an attempt to read your paper very closely, but I’m very confused by your notation. I’ve read closely up to page 15, but the confusion is building early. What exactly is Infr_{E}[z] as a mathematical object, and what does the arrow buy us? It seems on page 15 that you state that your (arrow) is actually a function while Infr_{E} seems to accept arguments from all sorts of different domains. For an explicit example, I have no idea what Infr_{E_i}[(E_mix, z_i)] on page 11 is – to say nothing of monsters like Infr_{E_i}[(E_i & Irrel, z_i)] on page 12 (I do understand what each of the individual players in the statement are, just not the way they are put together in that expression). Is (E, z) (arrow) playing the role that Ev(E, z) plays for Berger and Wolpert? If not, what is playing that role?

Another point of confusion: What exactly is the difference between E_B and E_{mix}? Both are claimed to result from the “convex combination” of E_1 and E_2. When you refer to a convex combination of E_1 and E_2, I presume you mean the collection of convex combinations of the associated probability measures making up E_1 and E_2? But the convex combination of the pms does not carry information on which of E_1 and E_2 is performed, and it’s clear that any sampling-theory inference based on the convex combination can’t possibly obey WCP if we actually know which experiment is performed, so perhaps this is not what you mean? I presume you don’t mean the convex combination of the inferences, as this makes no sense to me if the space of inferences is not a field – but on page 5 you state “Infr_{E_mix}[z] is always understood as the convex combination over the elements of the mixture”?

I’m really making an honest effort to understand what you’ve written, but it is frustrating not to understand your language. I’ve also noted what are, I think, typos and I can list them in another post if it would be helpful to you.

t: I cannot make out your symbols on comments: I recommend you write to me directly at error@vt.edu. Sorry, but I am unable to spend time going back over this material for anonymous commentators, and in any event, blog comments are a terrible way to communicate symbolic materials. If you write to me directly, there’s a much greater chance of avoiding further misunderstandings.

If you do not wish to do so, I can only say that if you would read carefully, the answers to the questions you raise should be evident. This material may be found discussed by myself and others on this blog (search SLP). Most of the Birnbaum references are also attached to blogposts.

Thanks for your interest.