My paper, “On the Birnbaum Argument for the Strong Likelihood Principle,” has been accepted by *Statistical Science*. The latest version is here. (It differs from all versions posted anywhere.) If you spot any typos, please let me know (error@vt.edu). If you can’t open this link, please write to me and I’ll send it directly. As always, comments and queries are welcome.

I appreciate the considerable feedback on the SLP on this blog. Interested readers may search this blog for quite a lot of discussion of the SLP (e.g., here and here), including links to the central papers, “U-Phils” (commentaries) by others (e.g., here, here, and here), amusing notes (e.g., Don’t Birnbaumize that experiment my friend, and Midnight with Birnbaum), and more.

Abstract: An essential component of inference based on familiar frequentist notions, such as p-values, significance and confidence levels, is the relevant sampling distribution. This feature results in violations of a principle known as the strong likelihood principle (SLP), the focus of this paper. In particular, if outcomes $x^{*}$ and $y^{*}$ from experiments $E_{1}$ and $E_{2}$ (both with unknown parameter $\theta$), have different probability models $f_{1}(\cdot)$, $f_{2}(\cdot)$, then even though $f_{1}(x^{*};\theta) = c f_{2}(y^{*};\theta)$ for all $\theta$, outcomes $x^{*}$ and $y^{*}$ may have different implications for an inference about $\theta$. Although such violations stem from considering outcomes other than the one observed, we argue, this does not require us to consider experiments other than the one performed to produce the data. David Cox (1958) proposes the Weak Conditionality Principle (WCP) to justify restricting the space of relevant repetitions. The WCP says that once it is known which $E_{i}$ produced the measurement, the assessment should be in terms of the properties of $E_{i}$. The surprising upshot of Allan Birnbaum’s (1962) argument is that the SLP appears to follow from applying the WCP in the case of mixtures, and so uncontroversial a principle as sufficiency (SP). But this would preclude the use of sampling distributions. The goal of this article is to provide a new clarification and critique of Birnbaum’s argument. Although his argument purports that [(WCP and SP), entails SLP], we show how data may violate the SLP while holding both the WCP and SP. Such cases also refute [WCP entails SLP].

Key words: Birnbaumization, likelihood principle (weak and strong), sampling theory, sufficiency, weak conditionality
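A standard textbook illustration of the kind of SLP violation the abstract describes compares binomial and negative binomial sampling. It is not taken from the paper itself; the numbers (9 successes and 3 failures, null value 0.5) and the function names are assumptions chosen for illustration. Under both experiments the likelihoods are proportional for every $\theta$, yet the one-sided p-values differ because the sampling distributions differ.

```python
from math import comb

def binom_lik(theta, n=12, x=9):
    """Likelihood of x successes in a fixed number n of trials (E_1)."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def negbin_lik(theta, r=3, y=9):
    """Likelihood of y successes observed before the r-th failure (E_2)."""
    return comb(y + r - 1, r - 1) * theta**y * (1 - theta)**r

def binom_pvalue(theta0=0.5, n=12, x=9):
    """P(X >= x) under theta0 when the number of trials n is fixed."""
    return sum(comb(n, k) * theta0**k * (1 - theta0)**(n - k)
               for k in range(x, n + 1))

def negbin_pvalue(theta0=0.5, r=3, y=9):
    """P(Y >= y) under theta0 when sampling stops at the r-th failure."""
    below = sum(comb(k + r - 1, r - 1) * theta0**k * (1 - theta0)**r
                for k in range(y))
    return 1 - below

if __name__ == "__main__":
    # The likelihood ratio is the same constant c for every theta tried.
    for theta in (0.2, 0.5, 0.8):
        print(theta, binom_lik(theta) / negbin_lik(theta))
    # The p-values nevertheless differ between the two experiments.
    print("binomial p-value:", binom_pvalue())
    print("negative binomial p-value:", negbin_pvalue())
```

With these assumed numbers the constant is c = 4, while the p-values are 299/4096 (about 0.073) for the binomial experiment and 67/2048 (about 0.033) for the negative binomial one, so a frequentist assessment can differ even though the likelihood functions are proportional.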

Hi Mayo.

Could you comment on this paper some day: http://arxiv.org/pdf/1311.0081.pdf

It seems to me that, even though he defends p-values, his approach is completely different from yours (he seems to be a likelihoodist).

I would appreciate your comments on this.

Best Regards

Hey, neat — the author of that paper is Michael Lew, our local likelihoodlum.

On a quick scan I note the (erroneous) allegation that p-values aren’t error probabilities, and some of the standard alarms about “hybridization”, happily dismissing what Fisher, Neyman and Pearson have said.

I read the paper, and my impression is that he only sees “error” as mistakes in decisions or interpretations. This seems to be different in kind from “error” as in measurement error. I think he is making Fisher’s arguments regarding how to take meaning through the likelihood function. But I would call it “error” probability when it measures the probability we should see a “higher” p-value than observed when the null hypothesis is true. I do not think that perspective was considered in the paper.

john byrd: Yup. It’s a disagreement about what kinds of quantities deserve the label “error probability” — mere semantics. The really interesting differences lie elsewhere.

No, I take it back. It makes a substantive difference to his argument on pg 18.

I also think his defense is quite poor, and it is actually based only on t.test simulations.

Mayo: I can’t see the paper when I click the link. Maybe I’d need to sign in/create an account if I don’t have one already?

Christian: I sent it to you directly, I have no idea why the attachment wouldn’t work. It’s supposed to. Sorry. Mayo

Mayo: I read it. I get it. (Finally.)

Corey: Which paper are you referring to having read and gotten? Birnbaum (I hope)?

Mayo: Yup, Birnbaum.

All: We have revised how the Birnbaum paper is linked; it should be open to all now. Please write to me if you can’t open it, and I’ll attach a copy.

Deborah,

Congratulations on the acceptance of your paper. Will it be a discussion paper?

To a sampling theorist, Birnbaum’s argument excludes frequentist inference from the very beginning, doesn’t it?

For Birnbaum’s argument to be sound, we must set aside the underlying sampling distributions.

Yes, it will be a discussion paper. No, Birnbaum’s argument is certainly not intended to preclude frequentist inference; quite the opposite. As I note in my paper, Birnbaum describes the result as relevant for non-Bayesian, non-likelihoodist statistics. The WCP is a frequentist principle, and the reason Savage deemed the result a “breakthrough” is that he felt that people wouldn’t stop at the halfway house of likelihoods, but would go all the way to Bayesian inference once they saw Birnbaum. Of course, if one already accepts the SLP, then one doesn’t need Birnbaum to prove it follows from SP and WCP.

You say for Birnbaum’s argument to be sound, we must ignore sampling distributions. To be sound, it must be valid and have all true premises, but this is impossible–if it’s an argument for the SLP as claimed. To say, on the other hand, that “I have shown that in order to infer we should ignore sampling distributions, we must assume we should ignore sampling distributions” is to say, I’ve given no proof of it at all. Birnbaum didn’t intend that. He would have otherwise been done on day one. In fact he struggled with different ways to prove it. Still, the argument is not so trivial to see through because it involves one or more subtle equivocations that I discuss in my earlier (2010) paper, intended, perhaps, more for philosophers.

Thanks Deborah,

What I meant to say was that, restricted to the frequentist inferences,

1. If (WCP), then not-(SP)

or

2. if (SP), then not-(WCP).

That is to say:

(WCP) & (SP) => not-(frequentist inferences)

No, of course not. I take SP to be sufficiency principle by the way.

SP is sufficiency principle, is that what you meant?

Well the link to my paper is now fixed, last chance for corrections. Thanx.

OK, finally I had the time to read it.

I’m fine with the argument.

I’ll make a quick attempt to explain how this is compatible with what I’ve claimed over the last few months, namely that Birnbaum’s proof is *mathematically* correct, although it doesn’t have the interpretation that Birnbaum thought it has.

In Birnbaum’s paper there is Ev(E,x), and in yours there is Infr_E(x). Birnbaum later has Ev(E*,(E,x)) and you have something like Infr_E_mix(E,x), where Birnbaum’s E* in your paper is rather called E_mix or E_B. These look like just different notations of the same thing, but they are not.

In your paper, Infr_E_mix(E,x) means that the given information is that experiment E has been carried out within E_mix, but that inference is to be computed based on the sampling distribution belonging to E_mix. Birnbaum has no notation for this. In his paper, Ev(E*,(E,x)) means that experiment E has been carried out within E*, but it does *not* mean that the inference should take into account that the underlying sampling distribution of E* should be used (note that I use the word “meaning” here to refer to how Birnbaum defined Ev mathematically, *not* to the intended interpretation). Therefore, Birnbaum makes the WCP imply Ev(E*,(E,x))=Ev(E,x), whereas in your paper Infr_E_mix(E,x) is not equal to Infr_E(x).

Birnbaum’s notation forces the mathematical formalism to ignore the sampling distribution of E* and therefore it finally allows him in the proof to forget that x was generated within E*, whereas your notation makes the reader realise that the information that we are within E_mix cannot easily be dropped.
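One way to see the distinction numerically is a rough sketch in the spirit of Cox’s two-instrument example. It is not a rendering of either paper’s formal definitions; the standard deviations, the fair coin, the observed value, and the function names are all assumptions made for illustration. A fair coin picks one of two instruments measuring $\theta$ with normal error, and we test $\theta = 0$ against $\theta > 0$. The p-value computed from the sampling distribution of the instrument actually used differs from the one computed from the sampling distribution of the whole mixture.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1

def conditional_pvalue(x, sigma):
    """P(X >= x) under theta = 0, given the instrument with sd sigma was used."""
    return 1.0 - STD_NORMAL.cdf(x / sigma)

def mixture_pvalue(x, sigmas=(1.0, 10.0), weights=(0.5, 0.5)):
    """P(X >= x) under theta = 0, averaged over which instrument gets picked."""
    return sum(w * conditional_pvalue(x, s) for w, s in zip(weights, sigmas))

if __name__ == "__main__":
    x_observed = 2.0  # hypothetical reading from the precise instrument
    print("conditional on precise instrument:", conditional_pvalue(x_observed, 1.0))
    print("unconditional over the mixture:   ", mixture_pvalue(x_observed))
```

With these assumed values the conditional p-value is below 0.05 while the mixture p-value is above 0.2, which is why it matters whether the notation lets the fact that x arose within the mixture be dropped.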

You may say that my reading of Birnbaum implies that Birnbaum’s argument is circular, because it is not *proved* that the sampling distribution can be ignored; it is rather an artifact of Birnbaum’s notation. Fair enough. Still, the problem is not in the mathematics. Birnbaum’s Ev can be a mathematically legitimate object used in a legitimate proof; the only problem is that if this is done, it can’t be interpreted as having all the information content that it should have if one wanted to identify it with frequentist inference.

Anyway, as said before, I’m happy enough with your paper and if you want to insist that Birnbaum got the *mathematical proof* wrong, I’ll keep quiet about this from now on but still feel slightly smug about having understood not only your argument but also what to say about a certain line of defence of Birnbaum that you haven’t directly addressed.

Christian: Let me express my thanks to you in these discussions.

One last thing: it is an interesting idea to claim it is “not in the mathematics”. What is one doing in proving a claim, mathematics or metamathematics? It can be described as both. The unusual thing here is that although the argument is “within” the logical formalism (of deductive logic), the fact that it considers “principles” (of statistical inference) which Birnbaum himself describes as merely “intuitive”, and the fact that there are rival choices of the principles one might adopt, does bring the discussion, I think, into metamathematics, or metalogic. And my criticism of it is also metalogical, as it must be.

In my argument I separate the formally defined mathematical object Ev from its interpretation. Birnbaum proved a theorem about a mathematical object Ev as *formally* defined in his paper (ignoring all informal stuff that was written around it to explain what it means), assuming formally defined principles WCP and SP (ignoring that somebody may find them “intuitive” or could be expected to “accept” them for general evidence). I accept that without the interpretation of Ev(E,x) as a general formalisation of evidence from an experiment E yielding x the “Theorem” can be seen as not particularly interesting. So I fully agree with your criticism of what people (including, to some extent, Birnbaum himself) have made of it.

Probably, by characterising your argument as “metalogical”, we agree anyway. I understand that, although your formal definitions are slightly different from Birnbaum’s, you have a legitimate claim stating that yours are rather what Birnbaum should have considered, given what he was after.

Perhaps we’ll never really know what he took himself to be doing. I’ve been through his papers, notes, and what letters I have multiple times, getting shades of meaning. I can show where he admits it won’t go through with WCP (early on), then tries censoring, then says essentially nothing on it for a few years, then declares it only holds for severely restricted cases such as predesignated point against point hypotheses. Then he commits suicide.

Christian: I like your point that his notation results in precluding the sampling distribution from being taken account of. Incidentally, my notation, originally, also allowed the dual reading, which I thought was a good way to show how the flaw creeps in. But it appeared not everyone liked the philosopher’s strategy of showing equivocation, so I fixed the meaning of an “inference implication” under E-B, so that it was not allowed to shift in the argument.

On the other point, I’ll just say that in order to prove a claim C from premises A, B, you need “(A and B) entails C” to be a theorem. Having that, it follows that accepting A and B leads to inferring C. But if A and B were allowed to be anything logically equivalent to C, then you could prove anything.

In a related argument, Evans recently argues that there is a gap preventing one from moving from A and B to C. I demonstrate the gap by showing you can hold A, B and not-C.

Even though I get your point, I think it is incorrect (and unfair) for people to claim now that they weren’t misled when they thought he’d derived C (the SLP) from principles accepted by sampling theorists. They most certainly did think he’d shown that, and it shouldn’t be alleged that everyone knew it was just arguing in a circle. One needs to reread the profundity attached to this result, as reiterated by Savage, Berger and Wolpert and others.