# Aris Spanos: The Enduring Legacy of R. A. Fisher

More Fisher insights from A. Spanos, this one from two years ago:

One of R. A. Fisher’s (17 February 1890 – 29 July 1962) most remarkable, but least recognized, achievements was to initiate the recasting of statistical induction. Fisher (1922) pioneered modern frequentist statistics as a model-based approach to statistical induction anchored on the notion of a statistical model, formalized by:

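$$\mathcal{M}_{\theta}(\mathbf{x})=\{f(\mathbf{x};\theta),\ \theta\in\Theta\},\quad \mathbf{x}\in\mathbb{R}^{n},\quad \Theta\subset\mathbb{R}^{m},\quad m<n, \tag{1}$$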

where the distribution of the sample $f(\mathbf{x};\theta)$ ‘encapsulates’ the probabilistic information in the statistical model.
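As a concrete illustration of (1), consider the simple Normal model. This example is an assumption added here for exposition, not one spelled out in the post:

$$\mathcal{M}_{\theta}(\mathbf{x}):\quad X_k \sim \mathsf{NIID}(\mu,\sigma^{2}),\quad k=1,2,\ldots,n,\qquad \theta:=(\mu,\sigma^{2})\in\Theta:=\mathbb{R}\times\mathbb{R}_{+},$$

$$f(\mathbf{x};\theta)=\prod_{k=1}^{n}\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x_k-\mu)^{2}}{2\sigma^{2}}\right),\qquad \mathbf{x}\in\mathbb{R}^{n}.$$

Here $m=2$, so the requirement $m<n$ holds for any sample with $n\geq 3$ observations, and every probabilistic assumption about the sample (Normality, independence, identical distribution) is carried by the single function $f(\mathbf{x};\theta)$.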

Before Fisher, the notion of a statistical model was vague and often implicit, and its role was primarily confined to the description of the distributional features of the data in hand using the histogram and the first few sample moments, implicitly imposing random (IID) samples. The problem was that statisticians at the time would use descriptive summaries of the data to claim generality beyond the data in hand $\mathbf{x}_0:=(x_1,x_2,\ldots,x_n)$. As late as the 1920s, the problem of statistical induction was understood by Karl Pearson in terms of invoking (i) the ‘stability’ of empirical results for subsequent samples and (ii) a prior distribution for $\theta$.

Fisher was able to recast statistical inference by turning Karl Pearson’s approach, proceeding from data $\mathbf{x}_0$ in search of a frequency curve $f(x;\vartheta)$ to describe its histogram, on its head. He proposed to begin with a prespecified $\mathcal{M}_{\theta}(\mathbf{x})$ (a ‘hypothetical infinite population’), and view $\mathbf{x}_0$ as a ‘typical’ realization thereof; see Spanos (1999).

In my mind, Fisher’s most enduring contribution is his devising a general way to ‘operationalize’ errors by embedding the material experiment into $\mathcal{M}_{\theta}(\mathbf{x})$, and taming errors via probabilification, i.e. to define frequentist error probabilities in the context of a statistical model. These error probabilities (a) are deductively derived from the statistical model, and (b) provide a measure of the ‘effectiveness’ of the inference procedure: how often a certain method will give rise to correct inferences concerning the underlying ‘true’ Data Generating Mechanism (DGM). This cast aside the need for a prior. Both of these key elements, the statistical model and the error probabilities, have been refined and extended by Mayo’s error statistical approach (EGEK 1996). Learning from data is achieved when an inference is reached by an inductive procedure which, with high probability, will yield true conclusions from valid inductive premises (a statistical model); Mayo and Spanos (2011).
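The sense in which error probabilities are deduced from the model can be illustrated with a small simulation. The sketch below is not from the post: it assumes a simple Normal model with known variance 1, a true null H0: mu = 0 tested against mu > 0, and a hypothetical helper name `simulated_type_i_error`. The rejection threshold is derived from the model alone, before any data are seen; the simulation then checks how often the procedure errs when data really are generated by that model.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulated_type_i_error(n=20, alpha=0.05, reps=20000, seed=1):
    """Relative frequency of rejecting a true H0: mu = 0 (against mu > 0).

    Illustrative assumptions: X_k ~ NIID(0, 1) with the variance known,
    and the test rejects when sqrt(n) * mean(x) exceeds the upper-alpha
    quantile of the standard Normal distribution.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    # Threshold deduced from the model: P(Z > c_alpha; H0) = alpha.
    c_alpha = NormalDist().inv_cdf(1.0 - alpha)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # data generated under H0
        z = sqrt(n) * fmean(sample)  # test statistic, N(0, 1) under H0
        if z > c_alpha:
            rejections += 1
    return rejections / reps


print(simulated_type_i_error())  # should land close to the nominal alpha
```

If the model assumptions fail (say, the data are dependent), the actual error frequency can drift away from the nominal one, which is why the validity of the statistical model matters for the reliability of the inference.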

Frequentist statistical inference was largely in place by the late 1930s. Fisher, almost single-handedly, created the current theory of ‘optimal’ point estimation and formalized significance testing based on p-value reasoning. In the early 1930s Neyman and Pearson (N-P) proposed an ‘optimal’ theory for hypothesis testing, by modifying/extending Fisher’s significance testing. By the late 1930s Neyman proposed an ‘optimal’ theory for interval estimation analogous to N-P testing. Despite these developments in frequentist statistics, its philosophical foundations, concerned with the proper form of the underlying inductive reasoning, were in a confused state. Fisher was arguing for ‘inductive inference’, spearheaded by his significance testing in conjunction with p-values and his fiducial probability for interval estimation. Neyman was arguing for ‘inductive behavior’ based on N-P testing and confidence interval estimation firmly grounded on pre-data error probabilities.

The last exchange between these pioneers took place in the mid 1950s (see Fisher, 1955; Neyman, 1956; Pearson, 1955) and left the philosophical foundations of the field in a state of confusion, with many more questions than answers.

One of the key issues of disagreement was the relevance of alternative hypotheses and the role of pre-data error probabilities in frequentist testing, i.e. the irrelevance of errors of the “second kind”, as Fisher (1955, p. 69) framed the issue. My take on this issue is that Fisher did understand the importance of alternative hypotheses and the power of the test, as shown by his remarks about its ‘sensitivity’:

“By increasing the size of the experiment, we can render it more sensitive, meaning by this that it will allow of the detection of a lower degree of sensory discrimination, or, in other words, of a quantitatively smaller departure from the null hypothesis.” (Fisher, 1935, p. 22)

If this is not the same as increasing the power of the test by increasing the sample size, I do not know what it is! What Fisher and many subsequent commentators did not appreciate enough was that Neyman and Pearson defined the relevant alternative hypotheses in a very specific way: to be the complement to the null relative to the prespecified statistical model $\mathcal{M}_{\theta}(\mathbf{x})$:

$$H_0:\ \theta\in\Theta_0 \quad\text{vs.}\quad H_1:\ \theta\in\Theta_1, \tag{2}$$

where $\Theta_0$ and $\Theta_1$ constitute a partition of the parameter space $\Theta$. That made the evaluation of power possible, and rendered Fisher’s comment about type II errors:

“Such errors are therefore incalculable both in frequency and in magnitude merely from the specification of the null hypothesis.”

simply misplaced.
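To see why the partition makes power calculable, and why it grows with the sample size as Fisher’s ‘sensitivity’ remark suggests, here is a minimal sketch. It is an illustration under assumptions not made in the post: a simple Normal model with known standard deviation, the one-sided hypotheses H0: mu <= mu0 vs. H1: mu > mu0 (which partition the real line), and a hypothetical helper name `power_one_sided_z`.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(mu1, mu0=0.0, sigma=1.0, n=25, alpha=0.05):
    """Probability of rejecting H0: mu <= mu0 when the true mean is mu1.

    Illustrative assumptions: X_k ~ NIID(mu, sigma^2) with sigma known,
    and the test rejects when sqrt(n) * (mean(x) - mu0) / sigma exceeds
    the upper-alpha quantile of the standard Normal distribution.
    """
    std_normal = NormalDist()
    c_alpha = std_normal.inv_cdf(1.0 - alpha)  # rejection threshold
    # Under mu = mu1 the test statistic is N(shift, 1).
    shift = sqrt(n) * (mu1 - mu0) / sigma
    return 1.0 - std_normal.cdf(c_alpha - shift)


# Power against the same small departure (mu1 = 0.2) for growing n.
for n in (25, 100, 400):
    print(n, round(power_one_sided_z(0.2, n=n), 3))
```

Because every point `mu1` in the alternative belongs to the same prespecified model, the probability of a type II error at that point is just one minus the value returned, so it is calculable once the model, and not merely the null, is specified.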

Let me finish with a quotation from Fisher (1935) that I find very insightful and as relevant today as it was then:

“In the field of pure research no assessment of the cost of wrong conclusions, or of delay in arriving at more correct conclusions can conceivably be more than a pretence, and in any case such an assessment would be inadmissible and irrelevant in judging the state of the scientific evidence.” (pp. 25-26)

References

Fisher, R. A. (1922), “On the mathematical foundations of theoretical statistics,” Philosophical Transactions of the Royal Society A, 222: 309-368.

Fisher, R. A. (1935), The Design of Experiments, Oliver and Boyd, Edinburgh.

Fisher, R. A. (1955), “Statistical methods and scientific induction,” Journal of the Royal Statistical Society, B, 17: 69-78.

Mayo, D. G. and A. Spanos (2011), “Error Statistics,” pp. 151-196 in the Handbook of Philosophy of Science, vol. 7: Philosophy of Statistics, D. Gabbay, P. Thagard, and J. Woods (editors), Elsevier.

Neyman, J. (1956), “Note on an Article by Sir Ronald Fisher,” Journal of the Royal Statistical Society, B, 18: 288-294.

Pearson, E. S. (1955), “Statistical Concepts in Their Relation to Reality,” Journal of the Royal Statistical Society, B, 17: 204-207.

Spanos, A. (1999), Probability Theory and Statistical Inference: Econometric Modeling with Observational Data, Cambridge University Press, Cambridge.
