I said I’d make some comments on Birnbaum’s letter to Nature (linked in my last post), which is relevant to today’s Seminar session (at the LSE*), as well as to Normal Deviate’s recent discussion of frequentist inference in terms of constructing procedures with good long-run “coverage”. (It is also relevant to the current U-Phil.)
NATURE VOL. 225 MARCH 14, 1970 (1033)
LETTERS TO THE EDITOR
Statistical methods in Scientific Inference
It is regrettable that Edwards’s interesting article, supporting the likelihood and prior likelihood concepts, did not point out the specific criticisms of likelihood (and Bayesian) concepts that seem to dissuade most theoretical and applied statisticians from adopting them. As one whom Edwards particularly credits with having ‘analysed in depth…some attractive properties’ of the likelihood concept, I must point out that I am not now among the ‘modern exponents’ of the likelihood concept. Further, after suggesting that the notion of prior likelihood was plausible as an extension or analogue of the usual likelihood concept (ref. 2, p. 200), I have pursued the matter through further consideration and rejection of both the likelihood concept and various proposed formalizations of prior information and opinion (including prior likelihood). I regret not having expressed my developing views in any formal publication between 1962 and late 1969 (just after ref. 1 appeared). My present views have now, however, been published in an expository but critical article (ref. 3, see also ref. 4), and so my comments here will be restricted to several specific points that Edwards raised.
If there has been ‘one rock in a shifting scene’ of general statistical thinking and practice in recent decades, it has not been the likelihood concept, as Edwards suggests, but rather the concept by which confidence limits and hypothesis tests are usually interpreted, which we may call the confidence concept of statistical evidence. This concept is not part of the Neyman-Pearson theory of tests and confidence region estimation, which denies any role to concepts of statistical evidence, as Neyman consistently insists. The confidence concept takes from the Neyman-Pearson approach techniques for systematically appraising and bounding the probabilities (under respective hypotheses) of seriously misleading interpretations of data. (The absence of a comparable property in the likelihood and Bayesian approaches is widely regarded as a decisive inadequacy.) The confidence concept also incorporates important but limited aspects of the likelihood concept: the sufficiency concept, expressed in the general refusal to use randomized tests and confidence limits when they are recommended by the Neyman-Pearson approach; and some applications of the conditionality concept. It is remarkable that this concept, an incompletely formalized synthesis of ingredients borrowed from mutually incompatible theoretical approaches, is evidently useful continuously in much critically informed statistical thinking and practice. (emphasis mine)
While inferences of many sorts are evident everywhere in scientific work, the existence of precise, general and accurate schemas of scientific inference remains a problem. Mendelian examples like those of Edwards and my 1969 paper seem particularly appropriate as case-study material for clarifying issues and facilitating effective communication among interested statisticians, scientific workers and philosophers and historians of science.
New York University
Courant Institute of Mathematical Sciences,
251 Mercer Street,
New York, NY 10012
Possibly Birnbaum’s confidence concept, sometimes written (Conf), is what Normal Deviate has in mind (as a key requirement of frequentist inference?). In Birnbaum 1977 (24), he states it more fully as follows:
(Conf): A concept of statistical evidence is not plausible unless it finds ‘strong evidence for J as against H’ with small probability (α) when H is true, and with much larger probability (1 – β) when J is true.
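To see what (Conf) demands numerically, here is a minimal sketch for one simple setup of my own choosing, not Birnbaum’s: H and J are two simple hypotheses about a Normal mean with known standard deviation, and “strong evidence for J as against H” is taken to mean that the sample mean exceeds a fixed cutoff. The function name, the parameter names and the particular numbers are all illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def conf_error_probabilities(mu_h, mu_j, sigma, n, cutoff):
    """Return (alpha, power) for the rule: report 'strong evidence for J
    as against H' whenever the sample mean of n observations exceeds cutoff.

    Assumed setup (illustrative only): observations are Normal with known
    standard deviation sigma; H says the mean is mu_h, J says it is mu_j.
    """
    se = sigma / sqrt(n)  # standard error of the sample mean
    # Probability the rule reports evidence for J when H is true (alpha).
    alpha = 1.0 - NormalDist(mu_h, se).cdf(cutoff)
    # Probability the rule reports evidence for J when J is true (1 - beta).
    power = 1.0 - NormalDist(mu_j, se).cdf(cutoff)
    return alpha, power


# Hypothetical numbers: H: mean 0, J: mean 1, sigma 1, n 25, cutoff 0.5.
alpha, power = conf_error_probabilities(
    mu_h=0.0, mu_j=1.0, sigma=1.0, n=25, cutoff=0.5
)
print(f"alpha = {alpha:.4f}, 1 - beta = {power:.4f}")
```

With these assumed numbers the rule reports evidence for J only rarely when H is true and almost always when J is true, which is the pattern (Conf) asks for. Note that both quantities are pre-data error probabilities of the rule, not an assessment of any particular observed result.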
Birnbaum says N-P methods do not have “concepts of evidence”–a term that he seems to have invented–essentially because Neyman talked of “inductive behavior” and Wald and others couched statistical methods in decision-theoretic terms. I have been urging that we consider instead how the tools may actually be used, and not be restricted by the statistical philosophies of this or that founder (not to mention that so many of their statements are tied up with personality disputes). That appears to be what Birnbaum is doing in erecting (Conf) to capture how he thinks the methods are used for “informative” scientific inference in practice.
Still, since Birnbaum’s (Conf) seems to allude to pre-trial error probabilities, I regard (Conf) as too “behavioristic”. Some of his papers hint at the possibility that he would have wanted to use it in a (post-data) assessment of how well (or poorly) various claims were actually tested. (Aside from that, he also tends to focus on simple statistical hypotheses, though (Conf) need not be so restricted.)
I think that Fisher (1955) is essentially correct in maintaining that “When, therefore, Neyman denies the existence of inductive reasoning he is merely expressing a verbal preference”. It is a verbal preference one can also find in Popper’s view of corroboration. (He, and current-day critical rationalists, also hold that all reasoning is deductive, and that probability arises to evaluate degrees of severity, well-testedness or corroboration, not inductive confirmation.) This blog may be searched for more on Popper and the rest….
*Thanks to all who attended! Feel free to write with questions: firstname.lastname@example.org
Edwards, A. W. F., Nature, 222, 1233 (1969).
Birnbaum, A., in Philosophy, Science and Method: Essays in Honor of Ernest Nagel (edited by Morgenbesser, S., Suppes, P., and White, M.) (St. Martin’s Press, NY, 1969).
Likelihood in International Encyclopedia of the Social Sciences (Crowell-Collier, NY, 1968).
Birnbaum, A. (1977). “The Neyman-Pearson theory as decision theory, and as inference theory; with a criticism of the Lindley-Savage argument for Bayesian theory”. Synthese 36 (1): 19-49.