*Today is R.A. Fisher’s birthday. I’ll reblog some Fisherian items this week with a few new remarks. This paper comes just before the conflicts with Neyman and Pearson (N-P) erupted. Fisher links his tests and sufficiency, to the Neyman and Pearson lemma in terms of power. It’s as if we may see Fisher and N-P as ending up in a similar place while starting from different origins, as David Cox might say [1]. Unfortunately, the blow-up that occurred soon after is behind today’s misdirected war vs statistical significance tests.* I quote just the most relevant portions…the full article is linked below.** Happy Birthday Fisher!* Continue reading

# Posts Tagged With: Ronald Fisher

## Happy Birthday R.A. Fisher: ‘Two New Properties of Mathematical Likelihood’

## Happy Birthday R.A. Fisher: ‘Two New Properties of Mathematical Likelihood’

*Today is R.A. Fisher’s birthday. I will post some Fisherian items this week in recognition of it*. This paper comes just before the conflicts with Neyman and Pearson erupted. Fisher links his tests and sufficiency, to the Neyman and Pearson lemma in terms of power. We may see them as ending up in a similar place while starting from different origins. I quote just the most relevant portions…the full article is linked below. Happy Birthday Fisher!*

“Two New Properties of Mathematical Likelihood“

by R.A. Fisher, F.R.S.

Proceedings of the Royal Society, Series A, 144: 285-307 (1934)

The property that where a sufficient statistic exists, the likelihood, apart from a factor independent of the parameter to be estimated, is a function only of the parameter and the sufficient statistic, explains the principle result obtained by Neyman and Pearson in discussing the efficacy of tests of significance. Neyman and Pearson introduce the notion that any chosen test of a hypothesis H_{0} is more powerful than any other equivalent test, with regard to an alternative hypothesis H_{1}, when it rejects H_{0} in a set of samples having an assigned aggregate frequency ε when H_{0} is true, and the greatest possible aggregate frequency when H_{1} is true. If any group of samples can be found within the region of rejection whose probability of occurrence on the hypothesis H_{1} is less than that of any other group of samples outside the region, but is not less on the hypothesis H_{0}, then the test can evidently be made more powerful by substituting the one group for the other. Continue reading

## Happy Birthday R.A. Fisher: ‘Two New Properties of Mathematical Likelihood’

*Today is R.A. Fisher’s birthday. I’ll post some Fisherian items this week in honor of it. This paper comes just before the conflicts with Neyman and Pearson erupted. Fisher links his tests and sufficiency, to the Neyman and Pearson lemma in terms of power. It’s as if we may see them as ending up in a similar place while starting from different origins. I quote just the most relevant portions…the full article is linked below. Happy Birthday Fisher!*

“Two New Properties of Mathematical Likelihood“

by R.A. Fisher, F.R.S.

Proceedings of the Royal Society, Series A, 144: 285-307 (1934)

The property that where a sufficient statistic exists, the likelihood, apart from a factor independent of the parameter to be estimated, is a function only of the parameter and the sufficient statistic, explains the principle result obtained by Neyman and Pearson in discussing the efficacy of tests of significance. Neyman and Pearson introduce the notion that any chosen test of a hypothesis H_{0} is more powerful than any other equivalent test, with regard to an alternative hypothesis H_{1}, when it rejects H_{0} in a set of samples having an assigned aggregate frequency ε when H_{0} is true, and the greatest possible aggregate frequency when H_{1} is true. If any group of samples can be found within the region of rejection whose probability of occurrence on the hypothesis H_{1} is less than that of any other group of samples outside the region, but is not less on the hypothesis H_{0}, then the test can evidently be made more powerful by substituting the one group for the other. Continue reading

## Guest Blog: ARIS SPANOS: The Enduring Legacy of R. A. Fisher

**By Aris Spanos**

One of R. A. Fisher’s (17 February 1890 — 29 July 1962) most remarkable, but least recognized, achievement was to initiate the recasting of statistical induction. Fisher (1922) pioneered modern frequentist statistics as a model-based approach to statistical induction anchored on the notion of a statistical model, formalized by:

M_{θ}(**x**)={f(**x**;θ); θ∈Θ**}**; **x**∈R^{n };Θ⊂R^{m}; m < n; (1)

where the distribution of the sample f(**x**;θ) ‘encapsulates’ the probabilistic information in the statistical model.

Before Fisher, the notion of a statistical model was vague and often implicit, and its role was primarily conﬁned to the description of the distributional features of the data in hand using the histogram and the ﬁrst few sample moments; implicitly imposing random (IID) samples. The problem was that statisticians at the time would use descriptive summaries of the data to claim generality beyond the data in hand **x**_{0}:=(x_{1},x_{2},…,x_{n}) As late as the 1920s, the problem of statistical induction was understood by Karl Pearson in terms of invoking (i) the ‘stability’ of empirical results for subsequent samples and (ii) a prior distribution for θ.

Fisher was able to recast statistical inference by turning Karl Pearson’s approach, proceeding from data **x**_{0 }in search of a frequency curve f(x;ϑ) to describe its histogram, on its head. He proposed to begin with a prespeciﬁed M_{θ}(**x**) (a ‘hypothetical inﬁnite population’), and view x_{0 }as a ‘typical’ realization thereof; see Spanos (1999). Continue reading

## R.A. Fisher: ‘Two New Properties of Mathematical Likelihood’

*Today is R.A. Fisher’s birthday. I’ll post some different Fisherian items this week in honor of it. This paper comes just before the conflicts with Neyman and Pearson erupted. Fisher links his tests and sufficiency, to the Neyman and Pearson lemma in terms of power. It’s as if we may see them as ending up in a similar place while starting from different origins. I quote just the most relevant portions…the full article is linked below. Happy Birthday Fisher!*

“Two New Properties of Mathematical Likelihood“

by R.A. Fisher, F.R.S.

Proceedings of the Royal Society, Series A, 144: 285-307 (1934)

The property that where a sufficient statistic exists, the likelihood, apart from a factor independent of the parameter to be estimated, is a function only of the parameter and the sufficient statistic, explains the principle result obtained by Neyman and Pearson in discussing the efficacy of tests of significance. Neyman and Pearson introduce the notion that any chosen test of a hypothesis H_{0} is more powerful than any other equivalent test, with regard to an alternative hypothesis H_{1}, when it rejects H_{0} in a set of samples having an assigned aggregate frequency ε when H_{0} is true, and the greatest possible aggregate frequency when H_{1} is true. Continue reading

## Stephen Senn: Fisher’s Alternative to the Alternative

**As part of the week of recognizing R.A.Fisher (February 17, 1890 – July 29, 1962), I reblog Senn from 3 years ago. **

*‘Fisher’s alternative to the alternative’*

*By: Stephen Senn*

[2012 marked] the 50th anniversary of RA Fisher’s death. It is a good excuse, I think, to draw attention to an aspect of his philosophy of significance testing. In his extremely interesting essay on Fisher, Jimmie Savage drew attention to a problem in Fisher’s approach to testing. In describing Fisher’s aversion to power functions Savage writes, ‘Fisher says that some tests are *more sensitive* than others, and I cannot help suspecting that that comes to very much the same thing as thinking about the power function.’ (Savage 1976) (P473).

The modern statistician, however, has an advantage here denied to Savage. Savage’s essay was published posthumously in 1976 and the lecture on which it was based was given in Detroit on 29 December 1971 (P441). At that time Fisher’s scientific correspondence did not form part of his available oeuvre but in 1990 Henry Bennett’s magnificent edition of Fisher’s statistical correspondence (Bennett 1990) was published and this throws light on many aspects of Fisher’s thought including on significance tests.

The key letter here is Fisher’s reply of 6 October 1938 to Chester Bliss’s letter of 13 September. Bliss himself had reported an issue that had been raised with him by Snedecor on 6 September. Snedecor had pointed out that an analysis using inverse sine transformations of some data that Bliss had worked on gave a different result to an analysis of the original values. Bliss had defended his (transformed) analysis on the grounds that a) if a transformation always gave the same result as an analysis of the original data there would be no point and b) an analysis on inverse sines was a sort of weighted analysis of percentages with the transformation more appropriately reflecting the weight of information in each sample. Bliss wanted to know what Fisher thought of his reply.

Fisher replies with a ‘shorter catechism’ on transformations which ends as follows: Continue reading

## R.A. Fisher: ‘Two New Properties of Mathematical Likelihood’: Just before breaking up (with N-P)

*In recognition of R.A. Fisher’s birthday tomorrow, I will post several entries on him. I find this (1934) paper to be intriguing –immediately before the conflicts with Neyman and Pearson erupted. It represents essentially the last time he could take their work at face value, without the professional animosities that almost entirely caused, rather than being caused by, the apparent philosophical disagreements and name-calling everyone focuses on. Fisher links his tests and sufficiency, to the Neyman and Pearson lemma in terms of power. It’s as if we may see them as ending up in a very similar place (no pun intended) while starting from different origins. I quote just the most relevant portions…the full article is linked below. I’d blogged it earlier here. You may find some gems in it.*

**‘Two new Properties of Mathematical Likelihood’**

by R.A. Fisher, F.R.S.

Proceedings of the Royal Society, Series A, 144: 285-307(1934)

The property that where a sufficient statistic exists, the likelihood, apart from a factor independent of the parameter to be estimated, is a function only of the parameter and the sufficient statistic, explains the principle result obtained by Neyman and Pearson in discussing the efficacy of tests of significance. Neyman and Pearson introduce the notion that any chosen test of a hypothesis H_{0} is more powerful than any other equivalent test, with regard to an alternative hypothesis H_{1}, when it rejects H_{0} in a set of samples having an assigned aggregate frequency ε when H_{0} is true, and the greatest possible aggregate frequency when H_{1} is true.

If any group of samples can be found within the region of rejection whose probability of occurrence on the hypothesis H_{1} is less than that of any other group of samples outside the region, but is not less on the hypothesis H_{0}, then the test can evidently be made more powerful by substituting the one group for the other. Continue reading

## STEPHEN SENN: Fisher’s alternative to the alternative

*By: Stephen Senn*

This year [2012] marks the 50th anniversary of RA Fisher’s death. It is a good excuse, I think, to draw attention to an aspect of his philosophy of significance testing. In his extremely interesting essay on Fisher, Jimmie Savage drew attention to a problem in Fisher’s approach to testing. In describing Fisher’s aversion to power functions Savage writes, ‘Fisher says that some tests are *more sensitive* than others, and I cannot help suspecting that that comes to very much the same thing as thinking about the power function.’ (Savage 1976) (P473).

The modern statistician, however, has an advantage here denied to Savage. Savage’s essay was published posthumously in 1976 and the lecture on which it was based was given in Detroit on 29 December 1971 (P441). At that time Fisher’s scientific correspondence did not form part of his available oeuvre but in1990 Henry Bennett’s magnificent edition of Fisher’s statistical correspondence (Bennett 1990) was published and this throws light on many aspects of Fisher’s thought including on significance tests.

The key letter here is Fisher’s reply of 6 October 1938 to Chester Bliss’s letter of 13 September. Bliss himself had reported an issue that had been raised with him by Snedecor on 6 September. Snedecor had pointed out that an analysis using inverse sine transformations of some data that Bliss had worked on gave a different result to an analysis of the original values. Bliss had defended his (transformed) analysis on the grounds that a) if a transformation always gave the same result as an analysis of the original data there would be no point and b) an analysis on inverse sines was a sort of weighted analysis of percentages with the transformation more appropriately reflecting the weight of information in each sample. Bliss wanted to know what Fisher thought of his reply.

Fisher replies with a ‘shorter catechism’ on transformations which ends as follows: Continue reading

## R.A. Fisher: ‘Two New Properties of Mathematical Likelihood’

*Exactly 1 year ago: I find this to be an intriguing discussion–before some of the conflicts with N and P erupted. Fisher links his tests and sufficiency, to the Neyman and Pearson lemma in terms of power. It’s as if we may see them as ending up in a similar place while starting from different origins. I quote just the most relevant portions…the full article is linked below.*

by R.A. Fisher, F.R.S.

Proceedings of the Royal Society, Series A, 144: 285-307 (1934)

The property that where a sufficient statistic exists, the likelihood, apart from a factor independent of the parameter to be estimated, is a function only of the parameter and the sufficient statistic, explains the principle result obtained by Neyman and Pearson in discussing the efficacy of tests of significance. Neyman and Pearson introduce the notion that any chosen test of a hypothesis H_{0} is more powerful than any other equivalent test, with regard to an alternative hypothesis H_{1}, when it rejects H_{0} in a set of samples having an assigned aggregate frequency ε when H_{0} is true, and the greatest possible aggregate frequency when H_{1} is true.

If any group of samples can be found within the region of rejection whose probability of occurrence on the hypothesis H_{1} is less than that of any other group of samples outside the region, but is not less on the hypothesis H_{0}, then the test can evidently be made more powerful by substituting the one group for the other. Continue reading

## Aris Spanos: The Enduring Legacy of R. A. Fisher

More Fisher insights from A. Spanos, this from 2 years ago:

One of R. A. Fisher’s (17 February 1890 — 29 July 1962) most remarkable, but least recognized, achievement was to initiate the recasting of statistical induction. Fisher (1922) pioneered modern frequentist statistics as a model-based approach to statistical induction anchored on the notion of a statistical model, formalized by:

M_{θ}(**x**)={f(**x**;θ); θ∈Θ**}**; **x**∈R^{n };Θ⊂R^{m}; m < n; (1)

where the distribution of the sample f(**x**;θ) ‘encapsulates’ the probabilistic information in the statistical model.

Before Fisher, the notion of a statistical model was vague and often implicit, and its role was primarily conﬁned to the description of the distributional features of the data in hand using the histogram and the ﬁrst few sample moments; implicitly imposing random (IID) samples. The problem was that statisticians at the time would use descriptive summaries of the data to claim generality beyond the data in hand **x**_{0}:=(x_{1},x_{2},…,x_{n}). As late as the 1920s, the problem of statistical induction was understood by Karl Pearson in terms of invoking (i) the ‘stability’ of empirical results for subsequent samples and (ii) a prior distribution for θ.

Fisher was able to recast statistical inference by turning Karl Pearson’s approach, proceeding from data **x**_{0 }in search of a frequency curve f(x;ϑ) to describe its histogram, on its head. He proposed to begin with a prespeciﬁed M_{θ}(**x**) (a ‘hypothetical inﬁnite population’), and view x_{0 }as a ‘typical’ realization thereof; see Spanos (1999).

In my mind, Fisher’s most enduring contribution is his devising a general way to ‘operationalize’ errors by embedding the material experiment into M_{θ}(**x**), and taming errors via probabiliﬁcation, i.e. to deﬁne frequentist error probabilities in the context of a statistical model. These error probabilities are (a) deductively derived from the statistical model, and (b) provide a measure of the ‘eﬀectiviness’ of the inference procedure: how often a certain method will give rise to correct inferences concerning the underlying ‘true’ Data Generating Mechanism (DGM). This cast aside the need for a prior. Both of these key elements, the statistical model and the error probabilities, have been reﬁned and extended by Mayo’s error statistical approach (EGEK 1996). Learning from data is achieved when an inference is reached by an inductive procedure which, with high probability, will yield true conclusions from valid inductive premises (a statistical model); Mayo and Spanos (2011). Continue reading

## R.A. Fisher: ‘Two New Properties of Mathematical Likelihood’

*I find this to be an intriguing discussion–before some of the conflicts with N and P erupted. Fisher links his tests and sufficiency, to the Neyman and Pearson lemma in terms of power. It’s as if we may see them as ending up in a similar place while starting from different origins. I quote just the most relevant portions…the full article is linked below.*

by R.A. Fisher, F.R.S.

Proceedings of the Royal Society, Series A, 144: 285-307 (1934)

The property that where a sufficient statistic exists, the likelihood, apart from a factor independent of the parameter to be estimated, is a function only of the parameter and the sufficient statistic, explains the principle result obtained by Neyman and Pearson in discussing the efficacy of tests of significance. Neyman and Pearson introduce the notion that any chosen test of a hypothesis H_{0} is more powerful than any other equivalent test, with regard to an alternative hypothesis H_{1}, when it rejects H_{0} in a set of samples having an assigned aggregate frequency ε when H_{0} is true, and the greatest possible aggregate frequency when H_{1} is true.

If any group of samples can be found within the region of rejection whose probability of occurrence on the hypothesis H_{1} is less than that of any other group of samples outside the region, but is not less on the hypothesis H_{0}, then the test can evidently be made more powerful by substituting the one group for the other.

Consequently, for the most powerful test possible the ratio of the probabilities of occurrence on the hypothesis H_{0} to that on the hypothesis H_{1} is less in all samples in the region of rejection than in any sample outside it. For samples involving continuous variation the region of rejection will be bounded by contours for which this ratio is constant. The regions of rejection will then be required in which the likelihood of H_{0} bears to the likelihood of H_{1}, a ratio less than some fixed value defining the contour. (295)…

It is evident, at once, that such a system is only possible when the class of hypotheses considered involves only a single parameter θ, or, what come to the same thing, when all the parameters entering into the specification of the population are definite functions of one of their number. In this case, the regions defined by the uniformly most powerful test of significance are those defined by the estimate of maximum likelihood, T. For the test to be uniformly most powerful, moreover, these regions must be independent of θ showing that the statistic must be of the special type distinguished as sufficient. Such sufficient statistics have been shown to contain all the information which the sample provides relevant to the value of the appropriate parameter θ . It is inevitable therefore that if such a statistic exists it should uniquely define the contours best suited to discriminate among hypotheses differing only in respect of this parameter; and it is surprising that Neyman and Pearson should lay it down as a preliminary consideration that ‘the tesitng of statistical hypotheses cannot be treated as a problem in estimation.’ When tests are considered only in relation to sets of hypotheses specified by one or more variable parameters, the efficacy of the tests can be treated directly as the problem of estimation of these parameters. Regard for what has been established in that theory, apart from the light it throws on the results already obtained by their own interesting line of approach, should also aid in treating the difficulties inherent in cases in which no sufficient statistics exists. (296)

## Two New Properties of Mathematical Likelihood

*Note: I find this to be an intriguing, if perhaps little-known, discussion, long before the conflicts reflected in the three articles (the “triad”) below, Here Fisher links his tests to the Neyman and Pearson lemma in terms of power. I invite your deconstructions/comments.*

by R.A. Fisher, F.R.S.

Proceedings of the Royal Society, Series A, 144: 285-307 (1934)

To Thomas Bayes must be given the credit of broaching the problem of using the concepts of mathematical probability in discussing problems of inductive inference, in which we argue from the particular to the general; or, in statistical phraselogy, argue from the sample to the population, from which, *ex hypothesi*, the sample was drawn. Bayes put forward, with considerable caution, a method by which such problems could be reduced to the form of problems of probability. His method of doing this depended essentially on postulating *a priori *knowledge, not of the particular population of which our observations form a sample, but of an imaginary population of populations from which this population was regarded as having been drawn at random. Clearly, if we have possession of such *a priori* knowledge, our problem is not properly an inductive one at all, for the population under discussion is then regarded merely as a particular case of a general type, of which we already possess exact knowledge, and are therefore in a position to draw exact deductive inferences.

## Guest Blogger. ARIS SPANOS: The Enduring Legacy of R. A. Fisher

**By Aris Spanos**

One of R. A. Fisher’s (17 February 1890 — 29 July 1962) most remarkable, but least recognized, achievement was to initiate the recasting of statistical induction. Fisher (1922) pioneered modern frequentist statistics as a model-based approach to statistical induction anchored on the notion of a statistical model, formalized by:

M_{θ}(**x**)={f(**x**;θ); θ∈Θ**}**; **x**∈R^{n };Θ⊂R^{m}; m < n; (1)

where the distribution of the sample f(**x**;θ) ‘encapsulates’ the probabilistic information in the statistical model.

Before Fisher, the notion of a statistical model was vague and often implicit, and its role was primarily conﬁned to the description of the distributional features of the data in hand using the histogram and the ﬁrst few sample moments; implicitly imposing random (IID) samples. The problem was that statisticians at the time would use descriptive summaries of the data to claim generality beyond the data in hand **x**_{0}:=(x_{1},x_{2},…,x_{n}) As late as the 1920s, the problem of statistical induction was understood by Karl Pearson in terms of invoking (i) the ‘stability’ of empirical results for subsequent samples and (ii) a prior distribution for θ.

## Guest Blogger. STEPHEN SENN: Fisher’s alternative to the alternative

This year marks the 50th anniversary of RA Fisher’s death. It is a good excuse, I think, to draw attention to an aspect of his philosophy of significance testing. In his extremely interesting essay on Fisher, Jimmie Savage drew attention to a problem in Fisher’s approach to testing. In describing Fisher’s aversion to power functions Savage writes, ‘Fisher says that some tests are *more sensitive* than others, and I cannot help suspecting that that comes to very much the same thing as thinking about the power function.’ (Savage 1976) (P473).

The modern statistician, however, has an advantage here denied to Savage. Savage’s essay was published posthumously in 1976 and the lecture on which it was based was given in Detroit on 29 December 1971 (P441). At that time Fisher’s scientific correspondence did not form part of his available oeuvre but in1990 Henry Bennett’s magnificent edition of Fisher’s statistical correspondence (Bennett 1990) was published and this throws light on many aspects of Fisher’s thought including on significance tests. Continue reading

## JERZY NEYMAN: Note on an Article by Sir Ronald Fisher

** **

*By Jerzy Neyman (1956)
*

**Summary**

(1) FISHER’S allegation that, contrary to some passages in the introduction and on the cover of the book by Wald, this book does not really deal with experimental design is unfounded. In actual fact, the book is permeated with problems of experimentation. (2) Without consideration of hypotheses alternative to the one under test and without the study of probabilities of the two kinds, no purely probabilistic theory of tests is possible. (3) The conceptual fallacy of the notion of fiducial distribution rests upon the lack of recognition that valid probability statements about random variables usually cease to be valid if the random variables are replaced by their particular values. The notorious multitude of “paradoxes” of fiducial theory is a consequence of this oversight. (4) The idea of a “cost function for faulty judgments” appears to be due to Laplace, followed by Gauss.

## R.A.FISHER: Statistical Methods and Scientific Inference

**In honor of R.A. Fisher’s birthday this week (Feb 17), in a year that will mark 50 years since his death, we will post the “Triad” exchange between Fisher, Pearson and Neyman, and other guest contributions* **

*by Sir Ronald Fisher (1955)
*

**SUMMARY**

The attempt to reinterpret the common tests of significance used in scientific research as though they constituted some kind of acceptance procedure and led to “decisions” in Wald’s sense, originated in several misapprehensions and has led, apparently, to several more.

The three phrases examined here, with a view to elucidating they fallacies they embody, are:

- “Repeated sampling from the same population”,
- Errors of the “second kind”,
- “Inductive behavior”.

Mathematicians without personal contact with the Natural Sciences have often been misled by such phrases. The errors to which they lead are not only numerical.

TO CONTINUE READING R. A. FISHER’S PAPER, CLICK HERE.

*If you wish to contribute something in connection to Fisher, send to error@vt.edu