phil/history of stat

A recent “brown bag” I gave in Philo at Va Tech: “What is the Philosophy of Statistics? (and how I was drawn to it)”

Posted on May 8, 2025 by Mayo

I gave a talk last week as part of the VT Department of Philosophy’s “brown bag” series. Here’s the blurb:

What is the Philosophy of Statistics? (and how I was drawn to it)

I give an introductory discussion of two key philosophical controversies in statistics in relation to today’s “replication crisis” in science: the role of probability, and the nature of evidence, in error-prone inference. I begin with a simple principle: We don’t have evidence for a claim C if little, if anything, has been done that would have found C false (or specifically flawed), even if it is. Along the way, I sprinkle in some autobiographical reflections.
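A minimal sketch of the principle, under assumptions that are not in the blurb: suppose a researcher runs k independent tests, each at level alpha, and reports only whichever one comes out nominally significant. The function name and the numbers are hypothetical illustrations. If every null hypothesis is true, the chance of finding at least one "significant" result grows quickly with k, so the hunting procedure has done little that would have found the claim false even if it is.

```python
def prob_at_least_one_false_alarm(alpha: float, k: int) -> float:
    """Probability that at least one of k independent level-alpha tests
    rejects when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** k


if __name__ == "__main__":
    # Illustrative values only: alpha = 0.05 and a few choices of k.
    for k in (1, 5, 20):
        print(k, round(prob_at_least_one_false_alarm(0.05, k), 3))
```

With alpha = 0.05 and k = 20 the probability is roughly 0.64, far from the nominal 0.05 that a single pre-specified test would have.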

My slides are at the end of this post: Continue reading →

Categories: 2 way street: Stat & Phil of Sci, phil/history of stat, significance tests, stopping rule | Leave a comment

Happy Birthday R.A. Fisher: “Statistical methods and Scientific Induction” with replies by Neyman and E.S. Pearson

Posted on February 17, 2024 by Mayo

17 Feb 1890-29 July 1962

Today is R.A. Fisher’s birthday! I am reblogging what I call the “Triad”–an exchange between Fisher, Neyman and Pearson (N-P) published 20 years after the Fisher-Neyman break-up. While my favorite is still the reply by E.S. Pearson, which alone should have shattered Fisher’s allegations that N-P “reinterpret” tests of significance as “some kind of acceptance procedure”, all three are chock full of gems for different reasons. They are short and worth rereading. Neyman’s article pulls back the cover on what is really behind Fisher’s over-the-top polemics, what with Russian 5-year plans and commercialism in the U.S. Not only is Fisher jealous that N-P tests came to overshadow “his” tests, he is furious at Neyman for driving home the fact that Fisher’s fiducial approach had been shown to be inconsistent (by others). The flaw is illustrated by Neyman in his portion of the triad. Details may be found in my book, SIST (2018) especially pp 388-392 linked to here. It speaks to a common fallacy seen every day in interpreting confidence intervals. As for Neyman’s “behaviorism”, Pearson’s last sentence is revealing.

HAPPY BIRTHDAY R.A. FISHER! Continue reading →

Categories: E.S. Pearson, Fisher, Neyman, phil/history of stat | 1 Comment

Happy Birthday R.A. Fisher: “Statistical methods and Scientific Induction” with replies by Neyman and E.S. Pearson

Posted on February 17, 2023 by Mayo

17 Feb 1890-29 July 1962

Today is R.A. Fisher’s birthday! I am reblogging what I call the “Triad”–an exchange between Fisher, Neyman and Pearson (N-P) published 20 years after the Fisher-Neyman break-up. My seminar on PhilStat is studying these this week, so it’s timely. While my favorite is still the reply by E.S. Pearson, which alone should have shattered Fisher’s allegations that N-P “reinterpret” tests of significance as “some kind of acceptance procedure”, all three are chock full of gems for different reasons. They are short and worth rereading. Neyman’s article pulls back the cover on what is really behind Fisher’s over-the-top polemics, what with Russian 5-year plans and commercialism in the U.S. Not only is Fisher jealous that N-P tests came to overshadow “his” tests, he is furious at Neyman for driving home the fact that Fisher’s fiducial approach had been shown to be inconsistent (by others). The flaw is illustrated by Neyman in his portion of the triad. I discuss this briefly in my Philosophy of Science Association paper from a few months ago (slides are here*). Further details may be found in my book, SIST (2018) especially pp 388-392 linked to here. It speaks to a common fallacy seen every day in interpreting confidence intervals. As for Neyman’s “behaviorism”, Pearson’s last sentence is revealing.

HAPPY BIRTHDAY R.A. FISHER! Continue reading →

Categories: E.S. Pearson, Fisher, Neyman, phil/history of stat | Leave a comment

Statistical Concepts in Their Relation to Reality–E.S. Pearson

Posted on August 17, 2022 by Mayo

11 August 1895 – 12 June 1980

This is my third and final post marking Egon Pearson’s birthday (Aug. 11). The focus is his little-known paper: “Statistical Concepts in Their Relation to Reality” (Pearson 1955). I’ve linked to it several times over the years, but always find a new gem or two, despite its being so short. E. Pearson rejected some of the familiar tenets that have come to be associated with Neyman and Pearson (N-P) statistical tests, notably the idea that the essential justification for tests resides in repeated applications or long-run control of rates of erroneous interpretations–what he termed the “behavioral” rationale of tests. In an unpublished letter E. Pearson wrote to Birnbaum (1974), he talks about N-P theory admitting of two interpretations: behavioral and evidential:

“I think you will pick up here and there in my own papers signs of evidentiality, and you can say now that we or I should have stated clearly the difference between the behavioral and evidential interpretations. Certainly we have suffered since in the way the people have concentrated (to an absurd extent often) on behavioral interpretations”.

(Nowadays, it might be said that some people concentrate to an absurd extent on “science-wise error rates” in their view of statistical tests as dichotomous screening devices.) Continue reading →

Categories: Egon Pearson, phil/history of stat, Philosophy of Statistics | Tags: E S Pearson, Egon Pearson, Statistical hypothesis testing | 1 Comment

R.A. Fisher: “Statistical methods and Scientific Induction” with replies by Neyman and E.S. Pearson

Posted on February 18, 2022 by Mayo

17 Feb 1890-29 July 1962

In recognition of Fisher’s birthday (Feb 17), I reblog what I call the “Triad”–an exchange between Fisher, Neyman and Pearson (N-P) a full 20 years after the Fisher-Neyman break-up–adding a few new introductory remarks here. While my favorite is still the reply by E.S. Pearson, which alone should have shattered Fisher’s allegations that N-P “reinterpret” tests of significance as “some kind of acceptance procedure”, they are all chock full of gems for different reasons. They are short and worth rereading. Neyman’s article pulls back the cover on what is really behind Fisher’s over-the-top polemics, what with Russian 5-year plans and commercialism in the U.S. Not only is Fisher jealous that N-P tests came to overshadow “his” tests, he is furious at Neyman for driving home the fact that Fisher’s fiducial approach had been shown to be inconsistent (by others). The flaw is glaring and is illustrated very simply by Neyman in his portion of the triad. Further details may be found in my book, SIST (2018) especially pp 388-392 linked to here. It speaks to a common fallacy seen every day in interpreting confidence intervals. As for Neyman’s “behaviorism”, Pearson’s last sentence is revealing. Continue reading →

Categories: E.S. Pearson, Fisher, Neyman, phil/history of stat | Leave a comment

Happy Birthday R.A. Fisher: ‘Two New Properties of Mathematical Likelihood’

Posted on February 17, 2022 by Mayo

17 February 1890–29 July 1962

Today is R.A. Fisher’s birthday. I’ll reblog some Fisherian items this week with a few new remarks. This paper comes just before the conflicts with Neyman and Pearson (N-P) erupted. Fisher links his tests and sufficiency to the Neyman and Pearson lemma in terms of power. It’s as if we may see Fisher and N-P as ending up in a similar place while starting from different origins, as David Cox might say [1]. Unfortunately, the blow-up that occurred soon after is behind today’s misdirected war vs statistical significance tests.* I quote just the most relevant portions…the full article is linked below.** Happy Birthday Fisher! Continue reading →

Categories: Fisher, phil/history of stat | Tags: Bayesianism, induction, Ronald Fisher, significance tests | Leave a comment

R.A. Fisher: “Statistical methods and Scientific Induction” with replies by Neyman and E.S. Pearson

Posted on February 21, 2021 by Mayo

In recognition of Fisher’s birthday (Feb 17), I reblog his contribution to the “Triad”–an exchange between Fisher, Neyman and Pearson 20 years after the Fisher-Neyman break-up. The other two are below. My favorite is the reply by E.S. Pearson, but all are chock full of gems for different reasons. They are each very short and are worth your rereading. Continue reading →

Categories: E.S. Pearson, Fisher, Neyman, phil/history of stat | Leave a comment

R. A. Fisher: How an Outsider Revolutionized Statistics (Aris Spanos)

Posted on February 18, 2021 by Mayo

This is a belated birthday post for R.A. Fisher (17 February, 1890-29 July, 1962)–it’s a guest post from earlier on this blog by Aris Spanos that has gotten the highest number of hits over the years.

Happy belated birthday to R.A. Fisher!

‘R. A. Fisher: How an Outsider Revolutionized Statistics’

by Aris Spanos

Few statisticians will dispute that R. A. Fisher (February 17, 1890 – July 29, 1962) is the father of modern statistics; see Savage (1976), Rao (1992). Inspired by William Gosset’s (1908) paper on the Student’s t finite sampling distribution, he recast statistics into the modern model-based induction in a series of papers in the early 1920s. He put forward a theory of optimal estimation based on the method of maximum likelihood that has changed only marginally over the last century. His significance testing, spearheaded by the p-value, provided the basis for the Neyman-Pearson theory of optimal testing in the early 1930s. According to Hald (1998) Continue reading →

Categories: Fisher, phil/history of stat, Spanos | 2 Comments

G.A. Barnard’s 105th Birthday: The Bayesian “catch-all” factor: probability vs likelihood

Posted on September 24, 2020 by Mayo

G. A. Barnard: 23 Sept 1915-30 July, 2002

Yesterday was statistician George Barnard’s 105th birthday. To acknowledge it, I reblog an exchange between Barnard, Savage (and others) on likelihood vs probability. The exchange is from pp 79-84 (of what I call) “The Savage Forum” (Savage, 1962).[i] A portion appears on p. 420 of my Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars (2018, CUP). Six other posts on Barnard are linked below, including 2 guest posts, (Senn, Spanos); a play (pertaining to our first meeting), and a letter Barnard wrote to me in 1999. Continue reading →

Categories: Barnard, phil/history of stat, Statistics | 10 Comments

R. A. Fisher: How an Outsider Revolutionized Statistics (Aris Spanos)

Posted on February 21, 2020 by Mayo

This is a belated birthday post for R.A. Fisher (17 February, 1890-29 July, 1962)–it’s a guest post from earlier on this blog by Aris Spanos.

Happy belated birthday to R.A. Fisher!

‘R. A. Fisher: How an Outsider Revolutionized Statistics’

by Aris Spanos

Categories: Fisher, phil/history of stat, Spanos | 2 Comments

My paper, “P values on Trial” is out in Harvard Data Science Review

Posted on February 1, 2020 by Mayo

My new paper, “P Values on Trial: Selective Reporting of (Best Practice Guides Against) Selective Reporting” is out in Harvard Data Science Review (HDSR). HDSR describes itself as “A Microscopic, Telescopic, and Kaleidoscopic View of Data Science”. The editor-in-chief is Xiao-li Meng, a statistician at Harvard. He writes a short blurb on each article in his opening editorial of the issue. Continue reading →

Categories: multiple testing, P-values, significance tests, Statistics | 29 Comments

Posts of Christmas Past (1): 13 howlers of significance tests (and how to avoid them)

Posted on December 24, 2019 by Mayo

I’m reblogging a post from Christmas past–exactly 7 years ago. Guess what I gave as the number 1 (of 13) ~~howler~~ well-worn criticism of statistical significance tests, haunting us back in 2012–all of which are put to rest in Mayo and Spanos 2011? Yes, it’s the frightening allegation that statistical significance tests forbid using any background knowledge! The researcher is imagined to start with a “blank slate” in each inquiry (no memories of fallacies past), and then unthinkingly apply a purely formal, automatic, accept-reject machine. What’s newly frightening (in 2019) is the credulity with which this apparition is now being met (by some). I make some new remarks below the post from Christmas past: Continue reading →

Categories: memory lane, significance tests, Statistics | Tags: criticism of frequentist methods | Leave a comment

Statistical Concepts in Their Relation to Reality–E.S. Pearson

Posted on August 15, 2019 by Mayo

11 August 1895 – 12 June 1980

In marking Egon Pearson’s birthday (Aug. 11), I’ll post some Pearson items this week. They will contain some new reflections on older Pearson posts on this blog. Today, I’m posting “Statistical Concepts in Their Relation to Reality” (Pearson 1955). I’ve linked to it several times over the years, but always find a new gem or two, despite its being so short. E. Pearson rejected some of the familiar tenets that have come to be associated with Neyman and Pearson (N-P) statistical tests, notably the idea that the essential justification for tests resides in a long-run control of rates of erroneous interpretations–what he termed the “behavioral” rationale of tests. In an unpublished letter E. Pearson wrote to Birnbaum (1974), he talks about N-P theory admitting of two interpretations: behavioral and evidential:

“I think you will pick up here and there in my own papers signs of evidentiality, and you can say now that we or I should have stated clearly the difference between the behavioral and evidential interpretations. Certainly we have suffered since in the way the people have concentrated (to an absurd extent often) on behavioral interpretations”.

Categories: Egon Pearson, phil/history of stat, Philosophy of Statistics | Tags: E S Pearson, Egon Pearson, Statistical hypothesis testing | Leave a comment

Guest Blog: R. A. Fisher: How an Outsider Revolutionized Statistics (Aris Spanos)

Posted on February 20, 2019 by Mayo

In recognition of R.A. Fisher’s birthday on February 17…a week of Fisher posts!

‘R. A. Fisher: How an Outsider Revolutionized Statistics’

by Aris Spanos

“Fisher was a genius who almost single-handedly created the foundations for modern statistical science, without detailed study of his predecessors. When young he was ignorant not only of the Continental contributions but even of contemporary publications in English.” (p. 738)

What is not so well known is that Fisher was the ultimate outsider when he brought about this change of paradigms in statistical science. As an undergraduate, he studied mathematics at Cambridge, and then did graduate work in statistical mechanics and quantum theory. His meager knowledge of statistics came from his study of astronomy; see Box (1978). That, however, did not stop him from publishing his first paper in statistics in 1912 (still an undergraduate) on “curve fitting”, questioning Karl Pearson’s method of moments and proposing a new method that was eventually to become the likelihood method in his 1921 paper. Continue reading →

Categories: Fisher, phil/history of stat, Phil6334/ Econ 6614, Spanos, Statistics | 2 Comments

R.A. Fisher: “Statistical methods and Scientific Induction”

Posted on February 19, 2019 by Mayo

17 February 1890 — 29 July 1962

“Statistical Methods and Scientific Induction”

by Sir Ronald Fisher (1955)

SUMMARY

The attempt to reinterpret the common tests of significance used in scientific research as though they constituted some kind of acceptance procedure and led to “decisions” in Wald’s sense, originated in several misapprehensions and has led, apparently, to several more.

The three phrases examined here, with a view to elucidating the fallacies they embody, are:

“Repeated sampling from the same population”,
Errors of the “second kind”,
“Inductive behavior”.

Mathematicians without personal contact with the Natural Sciences have often been misled by such phrases. The errors to which they lead are not only numerical.

To continue reading Fisher’s paper.

“Note on an Article by Sir Ronald Fisher”

by Jerzy Neyman (1956)

Summary

Categories: E.S. Pearson, Fisher, Neyman, phil/history of stat | 1 Comment

Happy Birthday R.A. Fisher: ‘Two New Properties of Mathematical Likelihood’

Posted on February 17, 2019 by Mayo

17 February 1890–29 July 1962

Today is R.A. Fisher’s birthday. I will post some Fisherian items this week in recognition of it*. This paper comes just before the conflicts with Neyman and Pearson erupted. Fisher links his tests and sufficiency to the Neyman and Pearson lemma in terms of power. We may see them as ending up in a similar place while starting from different origins. I quote just the most relevant portions…the full article is linked below. Happy Birthday Fisher!

“Two New Properties of Mathematical Likelihood”

by R.A. Fisher, F.R.S.

Proceedings of the Royal Society, Series A, 144: 285-307 (1934)

The property that where a sufficient statistic exists, the likelihood, apart from a factor independent of the parameter to be estimated, is a function only of the parameter and the sufficient statistic, explains the principal result obtained by Neyman and Pearson in discussing the efficacy of tests of significance. Neyman and Pearson introduce the notion that any chosen test of a hypothesis H₀ is more powerful than any other equivalent test, with regard to an alternative hypothesis H₁, when it rejects H₀ in a set of samples having an assigned aggregate frequency ε when H₀ is true, and the greatest possible aggregate frequency when H₁ is true. If any group of samples can be found within the region of rejection whose probability of occurrence on the hypothesis H₁ is less than that of any other group of samples outside the region, but is not less on the hypothesis H₀, then the test can evidently be made more powerful by substituting the one group for the other. Continue reading →
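Fisher’s substitution argument can be sketched numerically. The example below is an illustration under assumptions that are not in Fisher’s paper: a binomial model with n = 10 trials, H₀: θ = 0.5, H₁: θ = 0.7, and two hypothetical rejection regions with the same frequency ε under H₀. The region made up of the outcomes with the largest likelihood ratio has the greater frequency under H₁, and the likelihood ratio depends on the sample only through the sufficient statistic x (the number of successes).

```python
from math import comb


def binom_pmf(x: int, n: int, theta: float) -> float:
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def region_prob(region, n: int, theta: float) -> float:
    """Aggregate frequency of a set of outcomes under a given theta."""
    return sum(binom_pmf(x, n, theta) for x in region)


def likelihood_ratio(x: int, n: int, theta0: float, theta1: float) -> float:
    """Likelihood of H1 relative to H0 at outcome x."""
    return binom_pmf(x, n, theta1) / binom_pmf(x, n, theta0)


# Illustrative (assumed) setup, not taken from Fisher's paper.
n, theta0, theta1 = 10, 0.5, 0.7
lr_region = {9, 10}    # outcomes with the largest likelihood ratios
other_region = {0, 1}  # same frequency under H0, by symmetry at theta = 0.5

if __name__ == "__main__":
    print("size (LR region):    ", region_prob(lr_region, n, theta0))
    print("size (other region): ", region_prob(other_region, n, theta0))
    print("power (LR region):   ", region_prob(lr_region, n, theta1))
    print("power (other region):", region_prob(other_region, n, theta1))
```

Swapping the outcomes {0, 1} for {9, 10} leaves ε unchanged while raising the frequency of rejection under H₁, which is the substitution described in the passage.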

Categories: Fisher, phil/history of stat, Phil6334/ Econ 6614, Statistics | Tags: Bayesianism, induction, Ronald Fisher, significance tests | Leave a comment

Egon Pearson’s Heresy

Posted on August 11, 2018 by Mayo

E.S. Pearson: 11 Aug 1895-12 June 1980.

Today is Egon Pearson’s birthday. In honor of his birthday, I am posting “Statistical Concepts in Their Relation to Reality” (Pearson 1955). I’ve posted it several times over the years, but always find a new gem or two, despite its being so short. E. Pearson rejected some of the familiar tenets that have come to be associated with Neyman and Pearson (N-P) statistical tests, notably the idea that the essential justification for tests resides in a long-run control of rates of erroneous interpretations–what he termed the “behavioral” rationale of tests. In an unpublished letter E. Pearson wrote to Birnbaum (1974), he talks about N-P theory admitting of two interpretations: behavioral and evidential:

“I think you will pick up here and there in my own papers signs of evidentiality, and you can say now that we or I should have stated clearly the difference between the behavioral and evidential interpretations. Certainly we have suffered since in the way the people have concentrated (to an absurd extent often) on behavioral interpretations”.

Continue reading →

Categories: phil/history of stat, Philosophy of Statistics, Statistics | Tags: E S Pearson, Egon Pearson, Statistical hypothesis testing | 2 Comments

“Intentions (in your head)” is the code word for “error probabilities (of a procedure)”: Allan Birnbaum’s Birthday

Posted on May 27, 2018 by Mayo

27 May 1923-1 July 1976

Today is Allan Birnbaum’s birthday. Birnbaum’s (1962) classic “On the Foundations of Statistical Inference,” in Breakthroughs in Statistics (volume I 1993), concerns a principle that remains at the heart of today’s controversies in statistics–even if it isn’t obvious at first: the Likelihood Principle (LP) (also called the strong likelihood principle, SLP, to distinguish it from the weak LP [1]). According to the LP/SLP, given the statistical model, the information from the data is fully contained in the likelihood ratio. Thus, properties of the sampling distribution of the test statistic vanish (as I put it in my slides from this post)! But error probabilities are all properties of the sampling distribution. Thus, embracing the LP (SLP) blocks our error statistician’s direct ways of taking into account “biasing selection effects” (slide #10). [Posted earlier here.] Interestingly, as seen in a 2018 post on Neyman, Neyman did discuss this paper, but had an odd reaction that I’m not sure I understand. (Check it out.) Continue reading →
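The contrast can be sketched with the standard textbook example of binomial versus negative binomial sampling. The numbers here are assumptions chosen for illustration and do not come from Birnbaum’s paper: 9 successes and 3 failures are observed, either with the number of trials fixed at 12 in advance or with sampling continued until the 3rd failure. The likelihood ratios agree exactly, so the LP treats the two results as evidentially equivalent, yet the p-values for testing θ = 0.5 against θ > 0.5 differ because the sampling distributions differ.

```python
from math import comb


def binomial_p_value(successes: int, n: int, theta0: float = 0.5) -> float:
    """P(at least `successes` successes in n fixed trials) under theta0."""
    return sum(
        comb(n, k) * theta0**k * (1 - theta0) ** (n - k)
        for k in range(successes, n + 1)
    )


def negative_binomial_p_value(successes: int, failures: int, theta0: float = 0.5) -> float:
    """P(at least `successes` successes before the `failures`-th failure) under theta0.

    Equivalent to: at most failures - 1 failures in the first
    successes + failures - 1 trials.
    """
    m = successes + failures - 1
    return sum(
        comb(m, j) * (1 - theta0) ** j * theta0 ** (m - j)
        for j in range(failures)
    )


def likelihood_ratio(successes: int, failures: int, theta1: float, theta0: float, const: int) -> float:
    """Likelihood at theta1 over likelihood at theta0; `const` is the
    model-specific combinatorial factor, which cancels in the ratio."""
    def lik(t: float) -> float:
        return const * t**successes * (1 - t) ** failures
    return lik(theta1) / lik(theta0)


if __name__ == "__main__":
    s, f = 9, 3  # illustrative data (assumed)
    print("binomial p-value:         ", binomial_p_value(s, s + f))
    print("negative binomial p-value:", negative_binomial_p_value(s, f))
    print("LR, binomial:             ", likelihood_ratio(s, f, 0.75, 0.5, comb(12, 9)))
    print("LR, negative binomial:    ", likelihood_ratio(s, f, 0.75, 0.5, comb(11, 9)))
```

In this example the fixed-sample p-value is 299/4096 (about 0.073) and the p-value under the stop-at-the-3rd-failure rule is 67/2048 (about 0.033), even though the likelihood ratios are identical.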

Categories: Birnbaum, Birnbaum Brakes, frequentist/Bayesian, Likelihood Principle, phil/history of stat, Statistics | 7 Comments

R.A. Fisher: “Statistical methods and Scientific Induction”

Posted on February 20, 2018 by Mayo

I continue a week of Fisherian posts begun on his birthday (Feb 17). This is his contribution to the “Triad”–an exchange between Fisher, Neyman and Pearson 20 years after the Fisher-Neyman break-up. The other two are below. They are each very short and are worth your rereading.

17 February 1890 — 29 July 1962

“Statistical Methods and Scientific Induction”

by Sir Ronald Fisher (1955)

SUMMARY

The three phrases examined here, with a view to elucidating the fallacies they embody, are:

“Repeated sampling from the same population”,
Errors of the “second kind”,
“Inductive behavior”.

Mathematicians without personal contact with the Natural Sciences have often been misled by such phrases. The errors to which they lead are not only numerical.

To continue reading Fisher’s paper.

“Note on an Article by Sir Ronald Fisher”

by Jerzy Neyman (1956)

Summary

(1) FISHER’S allegation that, contrary to some passages in the introduction and on the cover of the book by Wald, this book does not really deal with experimental design is unfounded. In actual fact, the book is permeated with problems of experimentation. (2) Without consideration of hypotheses alternative to the one under test and without the study of probabilities of the two kinds, no purely probabilistic theory of tests is possible. (3) The conceptual fallacy of the notion of fiducial distribution rests upon the lack of recognition that valid probability statements about random variables usually cease to be valid if the random variables are replaced by their particular values. The notorious multitude of “paradoxes” of fiducial theory is a consequence of this oversight. (4) The idea of a “cost function for faulty judgments” appears to be due to Laplace, followed by Gauss.
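Neyman’s point (3) can be sketched with a small simulation. Everything in it is an assumption made for illustration (a normal model with known σ, the sample size, the seed, the function names). Before the data are observed, the random interval X̄ ± 1.96σ/√n covers μ with probability 0.95. Once X̄ is replaced by its observed value, the particular interval either contains μ or it does not, and the 0.95 no longer attaches to it.

```python
import random
from math import sqrt


def realized_interval(rng: random.Random, mu: float, sigma: float, n: int, z: float = 1.96):
    """Draw one sample of size n and return the realized z-interval for mu."""
    xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


def coverage(mu: float = 0.0, sigma: float = 1.0, n: int = 25,
             reps: int = 20000, seed: int = 1) -> float:
    """Fraction of repeated samples whose interval covers the true mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        lo, hi = realized_interval(rng, mu, sigma, n)
        if lo <= mu <= hi:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # Long-run property of the procedure: close to 0.95.
    print("coverage over repetitions:", coverage())
    # A single realized interval: it covers mu = 0 or it does not.
    lo, hi = realized_interval(random.Random(7), 0.0, 1.0, 25)
    print("one realized interval:", (lo, hi), "covers mu:", lo <= 0.0 <= hi)
```

The 0.95 describes the performance of the interval-forming procedure over repetitions. Reading it as the probability that one realized interval contains μ is the substitution of particular values for random variables that Neyman warns against.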

“Statistical Concepts in Their Relation to Reality”

by E.S. Pearson (1955)

Controversies in the field of mathematical statistics seem largely to have arisen because statisticians have been unable to agree upon how theory is to provide, in terms of probability statements, the numerical measures most helpful to those who have to draw conclusions from observational data. We are concerned here with the ways in which mathematical theory may be put, as it were, into gear with the common processes of rational thought, and there seems no reason to suppose that there is one best way in which this can be done. If, therefore, Sir Ronald Fisher recapitulates and enlarges on his views upon statistical methods and scientific induction we can all only be grateful, but when he takes this opportunity to criticize the work of others through misapprehension of their views as he has done in his recent contribution to this Journal (Fisher 1955 “Statistical Methods and Scientific Induction”), it is impossible to leave him altogether unanswered.

In the first place it seems unfortunate that much of Fisher’s criticism of Neyman and Pearson’s approach to the testing of statistical hypotheses should be built upon a “penetrating observation” ascribed to Professor G.A. Barnard, the assumption involved in which happens to be historically incorrect. There was no question of a difference in point of view having “originated” when Neyman “reinterpreted” Fisher’s early work on tests of significance “in terms of that technological and commercial apparatus which is known as an acceptance procedure”. There was no sudden descent upon British soil of Russian ideas regarding the function of science in relation to technology and to five-year plans. It was really much simpler–or worse. The original heresy, as we shall see, was a Pearson one!…

To continue reading, “Statistical Concepts in Their Relation to Reality” click HERE

Categories: E.S. Pearson, fiducial probability, Fisher, Neyman, phil/history of stat, Phil6334/ Econ 6614 | 3 Comments

R. A. Fisher: How an Outsider Revolutionized Statistics (Aris Spanos)

Posted on February 19, 2018 by Mayo

In recognition of R.A. Fisher’s birthday on February 17….

‘R. A. Fisher: How an Outsider Revolutionized Statistics’

by Aris Spanos

“Fisher was a genius who almost single-handedly created the foundations for modern statistical science, without detailed study of his predecessors. When young he was ignorant not only of the Continental contributions but even of contemporary publications in English.” (p. 738)

Categories: Fisher, phil/history of stat, Spanos, Statistics | 3 Comments

© Deborah G. Mayo, Error Statistics Philosophy, 2011-2018 All Rights Reserved.