
U-PHIL: Gandenberger & Hennig: Blogging Birnbaum’s Proof

Defending Birnbaum’s Proof

Greg Gandenberger
PhD student, History and Philosophy of Science
Master’s student, Statistics
University of Pittsburgh

In her 1996 Error and the Growth of Experimental Knowledge, Professor Mayo argued against the Likelihood Principle on the grounds that it does not allow one to control long-run error rates in the way that frequentist methods do.  This argument seems to me the kind of response a frequentist should give to Birnbaum’s proof.  It does not require arguing that Birnbaum’s proof is unsound: a frequentist can accommodate Birnbaum’s conclusion (two experimental outcomes are evidentially equivalent if they have the same likelihood function) by claiming that respecting evidential equivalence is less important than achieving certain goals for which frequentist methods are well suited.

More recently, Mayo has shown that Birnbaum’s premises cannot be reformulated as claims about what sampling distribution should be used for inference while retaining the soundness of his proof.  It does not follow that Birnbaum’s proof is unsound, because Birnbaum’s original premises are not claims about what sampling distribution should be used for inference but instead state sufficient conditions for experimental outcomes to be evidentially equivalent.

In a recent blog post, Mayo acknowledges that the premises she uses in her argument against Birnbaum’s proof differ from Birnbaum’s original premises: she distinguishes between “the Sufficiency Principle (general)” and “the Sufficiency Principle applied in sampling theory.”  One could make a similar distinction for the Weak Conditionality Principle.  There is indeed no way to formulate Sufficiency and Weak Conditionality Principles “applied in sampling theory” that are consistent and imply the Likelihood Principle.  This fact is not surprising: sampling theory is incompatible with the Likelihood Principle!
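A standard textbook illustration, not drawn from the discussion above, may make this incompatibility concrete. Suppose a coin lands heads 9 times in 12 tosses (hypothetical numbers). If the number of tosses was fixed at 12 in advance, the sampling model is binomial; if tossing continued until the third tail, it is negative binomial. The two likelihood functions are proportional, so the Likelihood Principle treats the two outcomes as evidentially equivalent, yet the one-sided p-values for testing θ = 0.5 against θ > 0.5 differ. The function names below are illustrative only.

```python
from math import comb


def binomial_likelihood(theta, n=12, heads=9):
    """Likelihood of `heads` heads when the number of tosses n is fixed."""
    return comb(n, heads) * theta**heads * (1 - theta) ** (n - heads)


def negbinomial_likelihood(theta, tails=3, heads=9):
    """Likelihood of `heads` heads when tossing stops at the `tails`-th tail."""
    # The last toss must be a tail, so only the first heads+tails-1 tosses vary.
    return comb(heads + tails - 1, heads) * theta**heads * (1 - theta) ** tails


def binomial_p_value(n=12, heads=9, theta0=0.5):
    """P(at least `heads` heads in n tosses) under theta0."""
    return sum(
        comb(n, k) * theta0**k * (1 - theta0) ** (n - k)
        for k in range(heads, n + 1)
    )


def negbinomial_p_value(tails=3, heads=9, theta0=0.5):
    """P(at least `heads` heads before the `tails`-th tail) under theta0.

    This equals the probability of at most tails-1 tails
    in the first heads+tails-1 tosses.
    """
    m = heads + tails - 1
    return sum(
        comb(m, j) * (1 - theta0) ** j * theta0 ** (m - j)
        for j in range(tails)
    )


# The likelihood ratio is the same constant for every theta.
for theta in (0.2, 0.5, 0.8):
    ratio = binomial_likelihood(theta) / negbinomial_likelihood(theta)
    print(f"theta={theta}: likelihood ratio = {ratio:.3f}")

print(f"binomial p-value:          {binomial_p_value():.4f}")     # 0.0730
print(f"negative binomial p-value: {negbinomial_p_value():.4f}")  # 0.0327
```

The constant ratio shows that the two outcomes have the same likelihood function up to a multiplicative factor, while the two p-values depend on which sampling distribution is used.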

Birnbaum himself insisted that his premises were to be understood as “equivalence relations” rather than as “substitution rules” (i.e., rules about what sampling distribution should be used for inference) and recognized that understanding them in this way was necessary for his proof.  As he put it in his 1975 rejoinder to Kalbfleisch’s response to his proof, “It was the adoption of an unqualified equivalence formulation of conditionality, and related concepts, which led, in my 1962 paper, to the monster of the likelihood axiom” (263).

Because Mayo’s argument against Birnbaum’s proof requires reformulating Birnbaum’s premises, it is best understood as an argument not for the claim that Birnbaum’s original proof is invalid, but rather for the claim that Birnbaum’s proof is valid only when formulated in a way that is irrelevant to a sampling theorist.  Reformulating Birnbaum’s premises as claims about what sampling distribution should be used for inference is the only way for a fully committed sampling theorist to understand them.  Any other formulation of those premises is either false or question-begging.

Mayo’s argument makes good sense when understood in this way, but it requires a strong prior commitment to sampling theory. Whether various arguments for sampling theory such as those Mayo gives in Error and the Growth of Experimental Knowledge are sufficient to warrant such a commitment is a topic for another day.  To those who lack such a commitment, Birnbaum’s original premises may seem quite compelling.  Mayo has not refuted the widespread view that those premises do in fact entail the Likelihood Principle.

Mayo has objected to this line of argument by claiming that her reformulations of Birnbaum’s principles are just instantiations of Birnbaum’s principles in the context of frequentist methods. But they cannot be instantiations in a literal sense because they are imperatives, whereas Birnbaum’s original premises are declaratives.  They are instead instructions that a frequentist would have to follow in order to avoid violating Birnbaum’s principles. The fact that one cannot follow them both is only an objection to Birnbaum’s principles on the question-begging assumption that evidential meaning depends on sampling distributions.


Birnbaum’s proof is not wrong but error statisticians don’t need to bother

Christian Hennig
Department of Statistical Science
University College London

I was impressed by Mayo’s arguments in “Error and Inference” when I came across them for the first time. To some extent, I still am. However, I have also seen versions of Birnbaum’s theorem and proof presented in a mathematically sound fashion with which I as a mathematician had no issue.

After having discussed this a bit with Phil Dawid, and having thought and read more on the issue, my conclusion is that
1) Birnbaum’s theorem and proof are correct (apart from small mathematical issues resolved later in the literature), and they are not vacuous (i.e., there are evidence functions that fulfill them without any contradiction in the premises),
2) however, Mayo’s arguments actually do raise an important problem with Birnbaum’s reasoning.

Here is why. Note that Mayo’s arguments are based on the implicit (error statistical) assumption that the sampling distribution of an inference method is relevant. In that case, application of the sufficiency principle to Birnbaum’s mixture distribution enforces the use of the sampling distribution under the mixture distribution as it is, whereas application of the conditionality principle enforces the use of the sampling distribution under the experiment that actually produced the data, which is different in the usual examples. So the problem is not that Birnbaum’s proof is wrong, but that enforcing both principles at the same time in the mixture experiment is in contradiction to the relevance of the sampling distribution (and therefore to error statistical inference). It is a case in which the sufficiency principle suppresses information that is clearly relevant under the conditionality principle. This means that the justification of the sufficiency principle (namely that all relevant information is in the sufficient statistic) breaks down in this case.

Frequentists/error statisticians therefore don’t need to worry about the likelihood principle because they shouldn’t accept the sufficiency principle in the generality that is required for Birnbaum’s proof.
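The gap between the two sampling distributions can be sketched with Cox’s well-known two-instrument example, used here as an assumed stand-in rather than as Birnbaum’s own mixture construction. A fair coin chooses between a precise instrument (standard deviation 1) and an imprecise one (standard deviation 10), both unbiased for the same quantity. The numbers and the function name are hypothetical.

```python
from math import sqrt


def mixture_sd(sds, probs):
    """Standard deviation of a measurement under the mixture experiment.

    All components are assumed to share the same mean, so the mixture
    variance is the probability-weighted average of the component variances.
    """
    return sqrt(sum(p * sd**2 for sd, p in zip(sds, probs)))


sds = (1.0, 10.0)    # precise and imprecise instrument (hypothetical values)
probs = (0.5, 0.5)   # a fair coin picks the instrument

print(f"conditional sd, precise instrument:   {sds[0]:.3f}")
print(f"conditional sd, imprecise instrument: {sds[1]:.3f}")
print(f"unconditional sd under the mixture:   {mixture_sd(sds, probs):.3f}")
```

The unconditional figure (the square root of 50.5, about 7.1) matches neither instrument, which is why an error statistician who knows which instrument produced the data would want to report the conditional sampling distribution.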

Having understood this, I toyed with the idea of writing this down as a publishable paper, but I have now come across a paper in which this argument can already be found (although in a less straightforward and more mathematical manner), namely:
M. J. Evans, D. A. S. Fraser and G. Monette (1986) On Principles and Arguments to Likelihood. Canadian Journal of Statistics 14, 181-194, particularly Section 7 (the rest is interesting, too).

NOTE: This is the last of this group of U-Phils. Mayo will issue a brief response tomorrow. Background to these U-Phils may be found here.

Categories: Philosophy of Statistics, Statistics, U-Phil

U-PHIL: Hennig and Gelman on Wasserman (2011)

Two further contributions in relation to

“Low Assumptions, High Dimensions” (2011)

Please also see: “Deconstructing Larry Wasserman” by Mayo, and Comments by Spanos

Christian Hennig:  Some comments on Larry Wasserman, “Low Assumptions, High Dimensions”

I enjoyed reading this stimulating paper. These are very important issues indeed. I’ll comment on both main concepts in the text.

1) Low Assumptions. I think that the term “assumption” is routinely misused and misunderstood in statistics. In Wasserman’s paper I can’t see such misuse explicitly, but I think that the “message” of the paper may be easily misunderstood because Wasserman doesn’t do much to stop people from this kind of misunderstanding.

Here is what I mean. The arithmetic mean can be derived as the optimal estimator under an i.i.d. Gaussian model, which is often interpreted as the “model assumption” behind it. However, we don’t really need the Gaussian distribution to be true for the mean to do a good job. Sometimes the mean will do a bad job in a non-Gaussian situation (for example in the presence of gross outliers), but sometimes not. The median has nice robustness properties and is seen as admissible for ordinal data. It is therefore usually associated with “weaker assumptions”. However, the median may be worse than the mean in a situation where the Gaussian “assumption” of the mean is grossly violated. At UCL we ask students on a -2/-1/0/1/2 Likert scale for their general opinion about our courses. The distributions that we get here are strongly discrete and the scale is usually interpreted as of ordinal type. Still, for ranking courses, the median is fairly useless (pretty much all courses end up with a median of 0 or 1), whereas the arithmetic mean can still detect statistically significant, meaningful differences between courses.

Why? Because it’s not only the “official” model assumptions that matter but also whether a statistic uses all the data in an appropriate manner for the given application. Here it’s fatal that the median ignores all differences among observations north and south of it.
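A minimal sketch of the course-ranking point, using made-up response counts for two hypothetical courses (these are not actual UCL data). Both courses end up with the same median, while the means separate them clearly.

```python
from statistics import mean, median


def expand(counts):
    """Turn {score: number_of_responses} into a flat list of scores."""
    return [score for score, n in counts.items() for _ in range(n)]


# Hypothetical response counts on the -2/-1/0/1/2 scale, 50 students each.
course_a = expand({-2: 2, -1: 5, 0: 10, 1: 25, 2: 8})
course_b = expand({-2: 0, -1: 1, 0: 8, 1: 21, 2: 20})

for name, scores in (("A", course_a), ("B", course_b)):
    print(f"course {name}: median = {median(scores)}, mean = {mean(scores):.2f}")
# course A: median = 1.0, mean = 0.64
# course B: median = 1.0, mean = 1.20
```

The median only records which category the middle response falls into, so the shift of many responses from 1 to 2 in course B is invisible to it; the mean picks that shift up. Whether a difference of this size is statistically significant would need a separate test, which this sketch does not attempt.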

Categories: Philosophy of Statistics, Statistics, U-Phil
