I continue a week of Fisherian posts begun on his birthday (Feb 17). This is his contribution to the “Triad”–an exchange between Fisher, Neyman and Pearson 20 years after the Fisher-Neyman break-up. The other two are below. They are each very short and are worth your rereading.
“Statistical Methods and Scientific Induction”
by Sir Ronald Fisher (1955)
The attempt to reinterpret the common tests of significance used in scientific research as though they constituted some kind of acceptance procedure and led to “decisions” in Wald’s sense, originated in several misapprehensions and has led, apparently, to several more.
The three phrases examined here, with a view to elucidating the fallacies they embody, are:
- “Repeated sampling from the same population”,
- Errors of the “second kind”,
- “Inductive behavior”.
Mathematicians without personal contact with the Natural Sciences have often been misled by such phrases. The errors to which they lead are not only numerical.
To continue reading Fisher’s paper.
“Note on an Article by Sir Ronald Fisher”
by Jerzy Neyman (1956)
(1) FISHER’S allegation that, contrary to some passages in the introduction and on the cover of the book by Wald, this book does not really deal with experimental design is unfounded. In actual fact, the book is permeated with problems of experimentation. (2) Without consideration of hypotheses alternative to the one under test and without the study of probabilities of the two kinds, no purely probabilistic theory of tests is possible. (3) The conceptual fallacy of the notion of fiducial distribution rests upon the lack of recognition that valid probability statements about random variables usually cease to be valid if the random variables are replaced by their particular values. The notorious multitude of “paradoxes” of fiducial theory is a consequence of this oversight. (4) The idea of a “cost function for faulty judgments” appears to be due to Laplace, followed by Gauss.
“Statistical Concepts in Their Relation to Reality”
by E.S. Pearson (1955)
Controversies in the field of mathematical statistics seem largely to have arisen because statisticians have been unable to agree upon how theory is to provide, in terms of probability statements, the numerical measures most helpful to those who have to draw conclusions from observational data. We are concerned here with the ways in which mathematical theory may be put, as it were, into gear with the common processes of rational thought, and there seems no reason to suppose that there is one best way in which this can be done. If, therefore, Sir Ronald Fisher recapitulates and enlarges on his views upon statistical methods and scientific induction we can all only be grateful, but when he takes this opportunity to criticize the work of others through misapprehension of their views as he has done in his recent contribution to this Journal (Fisher 1955, “Statistical Methods and Scientific Induction”), it is impossible to leave him altogether unanswered.
In the first place it seems unfortunate that much of Fisher’s criticism of Neyman and Pearson’s approach to the testing of statistical hypotheses should be built upon a “penetrating observation” ascribed to Professor G.A. Barnard, the assumption involved in which happens to be historically incorrect. There was no question of a difference in point of view having “originated” when Neyman “reinterpreted” Fisher’s early work on tests of significance “in terms of that technological and commercial apparatus which is known as an acceptance procedure”. There was no sudden descent upon British soil of Russian ideas regarding the function of science in relation to technology and to five-year plans. It was really much simpler–or worse. The original heresy, as we shall see, was a Pearson one!…
To continue reading, “Statistical Concepts in Their Relation to Reality” click HERE
I used to think that this triad basically supplied all you needed to know, or most of it, about the philosophical and foundational disputes between these characters. I no longer do*. In fact, Fisher’s paper, I now think, is so misleading that I was hesitant to even post it. It reaffirms the mythical history which, while corresponding at a very superficial level to what is actually going on, reinforces the misleading picture, held almost everywhere, that Fisherian statistics is incompatible with N-P (or at least N) statistics. I call this position “incompatibilism”. The intriguing thing is that this matter is scarcely of merely historical interest. Amazingly enough, it’s directly connected to the confusions about statistical significance tests and cognate tools based on error probabilities of methods. I’ve blogged a lot about this over the past couple of years. Interested readers can search.
*Let me qualify this: If you already have a deep understanding of exactly why the mythical history is wrong, coupled with a reasonably good understanding of the statistical tools, then the triad actually does encompass the high points of the landscape of N-F debates on statistical foundations. Only then one has to read these pieces ironically.
I will share some points on Neyman’s contribution that I missed, or didn’t recognize the importance of, in decades of reading Neyman 1956.
The first I discovered, with Aris Spanos, around 2005, concerns the 3 roles for power on p. 290. I’d missed the third role until I found him discussing it in two other papers that we started to call Neyman’s hidden papers. (You can find links on this blog). The third role uses power post data to ascertain whether and when a failure to reject a null hypothesis counts as evidence “confirming” that the discrepancy being tested is less than some value. He brings it up in criticizing the philosopher Carnap. It’s akin to power analysis, but can also be seen to underwrite a post-data severity analysis.
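To make the third role concrete, here is a minimal sketch, not taken from Neyman 1956. It assumes a one-sided Normal test with known σ (H0: μ ≤ μ0 against μ > μ0, rejecting when √n(x̄ − μ0)/σ exceeds the α cutoff). The function names `power` and `severity_upper` and all the numbers are hypothetical, chosen only for illustration. The first function gives the ordinary power against an alternative μ1. The second gives a post-data severity-style assessment of the claim "μ ≤ μ1" after a non-rejection, computed as P(X̄ > x̄_obs; μ = μ1).

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal, mean 0 and sd 1


def power(mu1, mu0, sigma, n, alpha):
    """P(test rejects H0; mu = mu1) for the one-sided test of H0: mu <= mu0.

    Assumes a Normal sample of size n with known sigma (illustrative setup).
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)      # rejection cutoff for the test statistic
    shift = sqrt(n) * (mu1 - mu0) / sigma        # mean of the statistic when mu = mu1
    return 1 - STD_NORMAL.cdf(z_alpha - shift)


def severity_upper(mu1, xbar_obs, sigma, n):
    """P(Xbar > xbar_obs; mu = mu1).

    After a non-rejection, a high value indicates the claim 'mu <= mu1'
    has been well probed: a larger sample mean would very probably have
    occurred were mu as large as mu1.
    """
    return 1 - STD_NORMAL.cdf(sqrt(n) * (xbar_obs - mu1) / sigma)


if __name__ == "__main__":
    # Hypothetical numbers: mu0 = 0, sigma = 1, n = 100, alpha = 0.025,
    # observed mean 0.1 (test statistic 1.0, so H0 is not rejected).
    for mu1 in (0.2, 0.4):
        print(mu1,
              round(power(mu1, 0.0, 1.0, 100, 0.025), 3),
              round(severity_upper(mu1, 0.1, 1.0, 100), 3))
```

With these assumed numbers the test has power of about 0.98 against μ1 = 0.4 but only about 0.52 against μ1 = 0.2, so on a power-analytic reading the non-rejection is good grounds for "μ < 0.4" and poor grounds for "μ < 0.2". The severity-style figure differs from power in using the observed mean rather than the cutoff, which is why it is a post-data assessment.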
The second, the importance of which I only found around a year ago, is the reference to Bartlett on p. 292 on fiducial. It was a paper by Sandy Zabell that led me to go back and reread Bartlett. It turns out Spanos already knew all this. But I was always disregarding fiducial inference, as so many do, given its conundrums, and the fallacious instantiation Neyman discusses on this same page. Bartlett had shown Fisher’s fiducial probability didn’t have repeated sampling properties, so then Fisher starts denying that he ever wanted them, and rewrites some sentences from older works. Interested readers will find some recent posts by searching this blog for “fiducial”.
As for Egon’s wonderful piece, the only new revelation that came to me a year or so ago, upon reading Pearson’s book on Student, has to do with his sitting on a gate, pondering how to justify tests: he was overlooking apples and not blackcurrants. For the post discussing this, see https://errorstatistics.com/2016/08/18/history-of-statistics-sleuths-out-there-ideas-came-into-my-head-as-i-sat-on-a-gate-overlooking-an-experimental-blackcurrant-plot-no-wait-it-was-apples-probably. But I also have a theory about what Egon unconsciously means when he speaks of being “suddenly smitten” with doubt while sitting on that gate. My theory is that he’s suddenly smitten with the woman his cousin (who ran the apple orchard) was due to marry, and she fell for him too. When he and Neyman proved the N-P lemma, Egon finally felt bold enough to declare his love.