The search for an agreement on numbers across different statistical philosophies is an understandable pastime in foundations of statistics. Perhaps identifying matching or unified numbers, apart from what they might mean, would offer a glimpse as to shared underlying goals? Jim Berger (2003) assures us there is no sacrilege in agreeing on methodology without philosophy, claiming “while the debate over interpretation can be strident, statistical practice is little affected as long as the reported numbers are the same” (Berger, 2003, p. 1).
Do readers agree?
Neyman and Pearson (or perhaps it was mostly Neyman) set out to determine when tests of statistical hypotheses may be considered “independent of probabilities a priori” ([p. 201). In such cases, frequentist and Bayesian may agree on a critical or rejection region.
The agreement between “default” Bayesians and frequentists in the case of one-sided Normal (IID) testing (known σ) is very familiar. As noted in Ghosh, Delampady, and Samanta (2006, p. 35), if we wish to reject a null value when “the posterior odds against it are 19:1 or more, i.e., if posterior probability of H0 is < .05” then the rejection region matches that of the corresponding test of H0, (at the .05 level) if that were the null hypothesis. By contrast, they go on to note the also familiar fact that there would be disagreement between the frequentist and Bayesian if one were instead testing the two sided: H0: μ=μ0 vs. H1: μ≠μ0 with known σ. In fact, the same outcome that would be regarded as evidence against the null in the one-sided test (for the default Bayesian and frequentist) can result in statistically significant results being construed as no evidence against the null —for the Bayesian-- or even evidence for it (due to a spiked prior).[i] Read more
By: Stephen Senn
This year marks the 50th anniversary of RA Fisher’s death. It is a good excuse, I think, to draw attention to an aspect of his philosophy of significance testing. In his extremely interesting essay on Fisher, Jimmie Savage drew attention to a problem in Fisher’s approach to testing. In describing Fisher’s aversion to power functions Savage writes, ‘Fisher says that some tests are more sensitive than others, and I cannot help suspecting that that comes to very much the same thing as thinking about the power function.’ (Savage 1976) (P473).
The modern statistician, however, has an advantage here denied to Savage. Savage’s essay was published posthumously in 1976 and the lecture on which it was based was given in Detroit on 29 December 1971 (P441). At that time Fisher’s scientific correspondence did not form part of his available oeuvre but in1990 Henry Bennett’s magnificent edition of Fisher’s statistical correspondence (Bennett 1990) was published and this throws light on many aspects of Fisher’s thought including on significance tests. Read more
by E.S. PEARSON (1955)
SUMMARY: This paper contains a reply to some criticisms made by Sir Ronald Fisher in his recent article on “Scientific Methods and Scientific Induction”.
Controversies in the field of mathematical statistics seem largely to have arisen because statisticians have been unable to agree upon how theory is to provide, in terms of probability statements, the numerical measures most helpful to those who have to draw conclusions from observational data. We are concerned here with the ways in which mathematical theory may be put, as it were, into gear with the common processes of rational thought, and there seems no reason to suppose that there is one best way in which this can be done. If, therefore, Sir Ronald Fisher recapitulates and enlarges on his views upon statistical methods and scientific induction we can all only be grateful, but when he takes this opportunity to criticize the work of others through misapprehension of their views as he has done in his recent contribution to this Journal (Fisher 1955), it is impossible to leave him altogether unanswered.
In honor of R.A. Fisher’s birthday this week (Feb 17), in a year that will mark 50 years since his death, we will post the “Triad” exchange between Fisher, Pearson and Neyman, and other guest contributions*
by Sir Ronald Fisher (1955)
The attempt to reinterpret the common tests of significance used in scientific research as though they constituted some kind of acceptance procedure and led to “decisions” in Wald’s sense, originated in several misapprehensions and has led, apparently, to several more.
The three phrases examined here, with a view to elucidating they fallacies they embody, are:
- “Repeated sampling from the same population”,
- Errors of the “second kind”,
- “Inductive behavior”.
Mathematicians without personal contact with the Natural Sciences have often been misled by such phrases. The errors to which they lead are not only numerical.
TO CONTINUE READING R. A. FISHER’S PAPER, CLICK HERE.
*If you wish to contribute something in connection to Fisher, send to email@example.com