Allan Birnbaum: Foundations of Probability and Statistics (27 May 1923 – 1 July 1976)

27 May 1923-1 July 1976

Today is Allan Birnbaum’s birthday. In honor of his birthday this year, I’m posting the articles in the Synthese volume that was dedicated to his memory in 1977. The editors describe it as their way of “paying homage to Professor Birnbaum’s penetrating and stimulating work on the foundations of statistics”. I paste a few snippets from the articles by Giere and Birnbaum. If you’re interested in statistical foundations, and are unfamiliar with Birnbaum, here’s a chance to catch up. (Even if you are familiar with him, you may be unaware of some of these key papers.)


Synthese Volume 36, No. 1 Sept 1977: Foundations of Probability and Statistics, Part I

Editorial Introduction:

This special issue of Synthese on the foundations of probability and statistics is dedicated to the memory of Professor Allan Birnbaum. Professor Birnbaum’s essay ‘The Neyman-Pearson Theory as Decision Theory; and as Inference Theory; with a Criticism of the Lindley-Savage Argument for Bayesian Theory’ was received by the editors of Synthese in October, 1975, and a decision was made to publish a special symposium consisting of this paper together with several invited comments and related papers. The sad news about Professor Birnbaum’s death reached us in the summer of 1976, but the editorial project could nevertheless be completed according to the original plan. By publishing this special issue we wish to pay homage to Professor Birnbaum’s penetrating and stimulating work on the foundations of statistics. We are grateful to Professor Ronald Giere who wrote an introductory essay on Professor Birnbaum’s concept of statistical evidence and who compiled a list of Professor Birnbaum’s publications.


Table of Contents

SUFFICIENCY, CONDITIONALITY AND LIKELIHOOD “In December of 1961 Birnbaum presented the paper ‘On the Foundations of Statistical Inference’ (Birnbaum [19]) at a special discussion meeting of the American Statistical Association. Among the discussants was L. J. Savage who pronounced it “a landmark in statistics”. Explicitly denying any “intent to speak with exaggeration or rhetorically”, Savage described the occasion as “momentous in the history of statistics”. “It would be hard”, he said, “to point to even a handful of comparable events” (Birnbaum [19], pp. 307-8). The reasons for Savage’s enthusiasm are obvious. Birnbaum claimed to have shown that two principles widely held by non-Bayesian statisticians (sufficiency and conditionality) jointly imply an important consequence of Bayesian statistics (likelihood).”[1]

INTRODUCTION AND SUMMARY ….Two contrasting interpretations of the decision concept are formulated: behavioral, applicable to ‘decisions’ in a concrete literal sense as in acceptance sampling; and evidential, applicable to ‘decisions’ such as ‘reject H’ in a research context, where the pattern and strength of statistical evidence concerning statistical hypotheses is of central interest. Typical standard practice is characterized as based on the confidence concept of statistical evidence, which is defined in terms of evidential interpretations of the ‘decisions’ of decision theory. These concepts are illustrated by simple formal examples with interpretations in genetic research, and are traced in the writings of Neyman, Pearson, and other writers. The Lindley-Savage argument for Bayesian theory is shown to have no direct cogency as a criticism of typical standard practice, since it is based on a behavioral, not an evidential, interpretation of decisions.

[1] By “likelihood” here, Giere means the (strong) Likelihood Principle (SLP). Dotted through the first 3 years of this blog are a number of (formal and informal) posts on his SLP result, and my argument as to why it is unsound. I wrote a paper on this that appeared in Statistical Science 2014. You can find it along with a number of comments and my rejoinder in this post: Statistical Science: The Likelihood Principle Issue is Out. The consequences of having found his proof unsound give a new lease on life to statistical foundations, or so I argue in my rejoinder.

Some content on this page was disabled on October 4, 2022 as a result of a DMCA takedown notice from Ithaka.

7 thoughts on “Allan Birnbaum: Foundations of Probability and Statistics (27 May 1923 – 1 July 1976)”

  1. Steven McKinney

    Did Philip Dawid ever soften on this stance regarding your logical analysis of Birnbaum’s argument for the SLP? (Your Statistical Science paper of 2014)

    “Mayo has been attacking a straw man, and Birnbaum’s result, S & C => L, remains entirely untouched by her criticisms.”

    You discuss his critique in your rejoinder – just curious if he ever changed his mind on this. It’s a pretty stark stance.

    • Hi Steven: I never got a reaction to my carefully crafted rejoinder to him. He seemed to think all he had to do was repeat the claim I had disproved, and deny my argument could hold. I thought I showed very specifically how his reply applied to Evans but not to me. In fact, I actually turned his criticism into a criticism of Evans–one that I had escaped.

  2. Do people think the reference to Giordano Bruno on p. 98 of Neyman’s paper refers (as an exaggeration) to frequentists burning Bayesian heretics, or the other way around? I’ve had some discussion about different interpretations of this.

  3. Two ironic points about Neyman’s paper:
    1. J. Berger takes it to demonstrate that Neyman’s “frequentism” involves something like an empirical Bayes or diagnostic model of tests where what matters is some variation on a posterior probability or rate of error. The truth is that Neyman was reacting to Fisher and others who claimed his account was only relevant to repeated sampling from the same population. So he was at pains to show the error probabilities also held over different populations.

    2. Pearson said he was overjoyed at this paper. The irony is that it has Neyman even more behavioristic than before. The reason Pearson welcomed it is that he was severely disappointed that recent previous papers by Neyman were so focused on applications. At least this one reflects on foundations, and general principles of inference.

    Note, by the way, Neyman’s reference to Borel and Bertrand (this connects to a recent post “les miserables citations”).

  4. p. 145, Le Cam
    “It is characteristic of the pistimetric and preferential theories available
    at the present time that they do not attempt a formalization of the concept
    of experiment and tend to treat experiments and fortuitous observations
    alike. In fact, the main reason for their periodic return to fashion seems to
    be that they claim to hold the magic which permits to draw conclusions
    from whatever data and whatever features one happens to notice”.

  5. I recommend likelihoodlums and others read the Pratt article. I always find he’d already worked out all the computations people use, and I’d forgotten he tries to consider the alpha and beta in Birnbaum’s approach as data-dependent tail areas rather than prespecified. Birnbaum was never clear. The bounds on p. 66 (with likelihoods) are especially useful.

    I also recall Pratt being perhaps the only one to find a problem with Birnbaum’s argument in the very first set of comments. He suggested replacing the weak conditionality principle with another principle, but it doesn’t work. No one else seemed to get the depth of his point.

    • Carlos Ungil

      > (Pratt) tries to consider the alpha and beta in Birnbaum’s approach as data-dependent tail areas rather than prespecified. Birnbaum was never clear.

      What do you mean? (The last sentence seems to imply that Birnbaum was not clear about whether the tail areas were observed or prespecified.)

      I have a question on Birnbaum’s paper that maybe someone can answer. After discussing “three-decision tests” (p. 35):

      d1: strong evidence for H2 as against H1
      d2: neutral or weak evidence
      d3: strong evidence for H1 as against H2.

      He continues: “The ad hoc character of two-decision tests has not been eliminated, but reappears in such three-decision tests; and is illustrated once more by considering the possible alternative four-decision test which could be determined similarly by using also the test characterized by (0.02,0.02) above.”

      How would that four-decision test be defined?
