This continues my previous post: “Can’t take the fiducial out of Fisher…” in recognition of Fisher’s birthday, February 17. These 2 posts reflect my working out of these ideas in writing Section 5.8 of *Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars* (SIST, CUP 2018).* *Here’s all of Section 5.8 (“Neyman’s Performance and Fisher’s Fiducial Probability”) for your Saturday night reading.*

Move up 20 years to the famous 1955/56 exchange between Fisher and Neyman. Fisher clearly connects Neyman’s adoption of a behavioristic-performance formulation to his denying the soundness of fiducial inference. When “Neyman denies the existence of inductive reasoning, he is merely expressing a verbal preference. For him ‘reasoning’ means what ‘deductive reasoning’ means to others.” (Fisher 1955, p. 74).

Fisher was right that Neyman’s calling the outputs of statistical inferences “actions” merely expressed Neyman’s preferred way of talking. Nothing earth-shaking turns on the choice to dub every inference “an act of making an inference”.[i] The “rationality” or “merit” goes into the rule. Neyman, much like Popper, had a good reason for drawing a bright red line between his use of probability (for corroboration or probativeness) and its use by ‘probabilists’ (who assign probability to hypotheses). Fisher’s fiducial probability was in danger of blurring this very distinction. Popper said, and Neyman would have agreed, that he had no problem with our using the word induction so long as it was kept clear that it meant testing hypotheses severely.

In Fisher’s next few sentences, things get very interesting. In reinforcing his choice of language, Fisher continues, Neyman “seems to claim that the statement (a) ‘μ has a probability of 5 per cent. of exceeding M’ is a different statement from (b) ‘M has a probability of 5 per cent. of falling short of μ’.” There’s no problem about equating these two so long as M is a random variable. But watch what happens in the next sentence. [I’m using M rather than *X*; Fisher’s paper uses lower case *x* in the following, though clearly he means *X* in [1].] According to Fisher,

Neyman violates ‘the principles of deductive logic [by accepting a] statement such as

[1] Pr{(M – ts) < μ < (M + ts)} = α,

as rigorously demonstrated, and yet, when numerical values are available for the statistics M and s, so that on substitution of these and use of the 5 per cent. value of t, the statement would read

[2] Pr{92.99 < μ < 93.01} = 95 per cent.,

to deny to this numerical statement any validity. This evidently is to deny the syllogistic process of making a substitution in the major premise of terms which the minor premise establishes as equivalent’ (Fisher 1955, p. 75).

But the move from (1) to (2) is fallacious! Could Fisher really be committing this fallacious probabilistic instantiation? I.J. Good (1971) describes how many felt, and often still feel:

…if we do not examine the fiducial argument carefully, it seems almost inconceivable that Fisher should have made the error which he did in fact make. It is because (i) it seemed so unlikely that a man of his stature should persist in the error, and (ii) because he modestly says (…[1959], p. 54) his 1930 explanation ‘left a good deal to be desired’, that so many people assumed for so long that the argument was correct. They lacked the daring to question it.

In responding to Fisher, Neyman (1956, p. 292) declares himself at his wit’s end in trying to find a way to convince Fisher of the inconsistencies in moving from (1) to (2).

When these explanations did not suffice to convince Sir Ronald of his mistake, I was tempted to give up. However, in a private conversation David Blackwell suggested that Fisher’s misapprehension may be cleared up by the examination of several simple examples. They illustrate the general rule that valid probability statements regarding relations involving random variables may cease and usually do cease to be valid if random variables are replaced by their observed particular values. (p. 292)[ii]

“Thus if X is a normal random variable with mean zero and an arbitrary variance greater than zero, we may agree” [that Pr(X < 0) = .5. But observing, say, X = 1.7 yields Pr(1.7 < 0) = .5, which is clearly illicit]. “It is doubtful whether the chaos and confusion now reigning in the field of fiducial argument were ever equaled in any other doctrine. The source of this confusion is the lack of realization that equation (1) does not imply (2)” (Neyman 1956).
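To make Neyman’s simple example concrete, here is a minimal simulation sketch. The standard deviation, number of draws, and seed are my own illustrative choices (Neyman leaves the variance arbitrary), and the function name is hypothetical. The statement Pr(X < 0) = .5 describes the random variable over repetitions; once 1.7 is substituted for X, what remains is a plain arithmetic claim that is simply false.

```python
import random


def frequency_below_zero(n_draws, sigma=2.0, seed=0):
    """Relative frequency of the event X < 0 for X ~ Normal(0, sigma^2).

    sigma, n_draws, and seed are illustrative assumptions, not values
    taken from Neyman (1956).
    """
    rng = random.Random(seed)
    hits = sum(rng.gauss(0.0, sigma) < 0 for _ in range(n_draws))
    return hits / n_draws


# Over many repetitions, the event X < 0 occurs about half the time.
freq = frequency_below_zero(100_000)

# After observing X = 1.7, "1.7 < 0" is not a random event at all.
observed = 1.7
realized_event = observed < 0

print(f"relative frequency of X < 0 over repetitions: {freq:.3f}")
print(f"is the observed value 1.7 below zero? {realized_event}")
```

The first number refers to the process that generates X; the second is a fact about one observed value. Treating them as the same kind of statement is the instantiation at issue.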

For decades scholars have tried to figure out what Fisher might have meant, and while the matter remains unsettled, this much is agreed: The instantiation that Fisher is yelling about, 20 years after the creation of N-P tests and the break with Neyman, is fallacious. Fiducial probabilities can only properly attach to the method. Keeping to “performance” language is a sure way to avoid the illicit slide from (1) to (2). Once the intimate tie-ins with Fisher’s fiducial argument are recognized, the rhetoric of the Neyman-Fisher dispute takes on a completely new meaning. When Fisher says “Neyman only cares for acceptance sampling contexts”, as he does after around 1950, he’s really saying Neyman thinks fiducial inference is contradictory unless it’s viewed in terms of properties of the method in (actual or hypothetical) repetitions. The fact that Neyman (with the contributions of Wald, and later Robbins) went overboard in his behaviorism [iii], to the extent that even Egon wanted to divorce him (ending his 1955 reply to Fisher with the claim that inductive behavior was “Neyman’s field rather than mine”), is a different matter.
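The sense in which the probability attaches to the method can be illustrated with a small simulation of the interval in [1]. This is only a sketch under assumptions of my own: a true mean of 93, a standard deviation of 0.02, samples of size 10, s read as the estimated standard error of M, and a hard-coded two-sided 5 per cent. value of t for 9 degrees of freedom (2.262). None of these numbers comes from Fisher or Neyman beyond the echo of the interval in [2].

```python
import math
import random
import statistics

# Two-sided 5 per cent. value of t for 9 degrees of freedom (n = 10).
T_CRIT_DF9 = 2.262


def interval(sample):
    """Return (M - t*s, M + t*s), with s the estimated standard error of M."""
    m = statistics.mean(sample)
    s = statistics.stdev(sample) / math.sqrt(len(sample))
    return m - T_CRIT_DF9 * s, m + T_CRIT_DF9 * s


def coverage(mu=93.0, sigma=0.02, n=10, reps=20_000, seed=1):
    """Fraction of repetitions in which the interval covers the true mu.

    mu, sigma, n, reps, and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    covered = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lo, hi = interval(sample)
        covered += lo < mu < hi  # each realized interval either covers or not
    return covered / reps


rate = coverage()
print(f"long-run coverage of the method: {rate:.3f}")
```

The printed rate is a property of the procedure over (hypothetical) repetitions. Inside the loop, each particular interval with its numerical endpoints either contains μ or it does not, which is why the 95 per cent. figure cannot simply be transferred to a statement like [2].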

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[i] Fisher also commonly spoke of the output of tests as actions. Neyman rightly says that he is only following Fisher. As the years went by, Fisher came to renounce things he himself had said earlier, in the midst of polemics against Neyman.

[ii] But surely this is the kind of simple example that would have been brought forward right off the bat, before the more elaborate, infamous cases (Fisher-Behrens). Did Fisher ever say “oh now I see my mistake” as a result of these simple examples? Not to my knowledge. So I find this statement of Neyman’s about the private conversation with Blackwell a little curious. Anyone know more about it?

[iii] At least in his theory, but not in his practice. A relevant post is “distinguishing tests of statistical hypotheses and tests of significance might have been a lapse of someone’s pen”.

Fisher, R.A. (1955). “Statistical Methods and Scientific Induction”.

Good, I.J. (1971b). In reply to comments on his “The probabilistic explication of information, evidence, surprise, causality, explanation and utility”. In Godambe and Sprott (1971).

Neyman, J. (1956). “Note on an Article by Sir Ronald Fisher”.

Pearson, E.S. (1955). “Statistical Concepts in Their Relation to Reality”.

____________________________

*Earlier excerpts and mementos from SIST up to Dec 31, 2018 are here.

Jan 10, 2019 Excerpt from SIST is here.

Jan 13, 2019 Mementos from SIST (Excursion 4) are here. These are summaries of all 4 tours.