Statistical Challenges in Assessing and Fostering the Reproducibility of Scientific Results
I generally find National Academy of Science (NAS) manifestos highly informative. I only gave a quick reading to around 3/4 of this one. I thank Hilda Bastian for twittering the link. Before giving my impressions, I’m interested to hear what readers think, whenever you get around to having a look. Here’s from the intro*:
Questions about the reproducibility of scientific research have been raised in numerous settings and have gained visibility through several high-profile journal and popular press articles. Quantitative issues contributing to reproducibility challenges have been considered (including improper data management and analysis, inadequate statistical expertise, and incomplete data, among others), but there is no clear consensus on how best to approach or to minimize these problems…
3 years ago…
MONTHLY MEMORY LANE: 3 years ago: February 2013. I mark in red three posts that seem most apt for general background on key issues in this blog. Posts that are part of a “unit” or a group of “U-Phils” (you [readers] philosophize) count as one. Feb. 2013 reminds me how much the issue of the Likelihood Principle figured in this blog. I group the four posts on the Likelihood Principle, in burgundy, as one. Those unaware of the issue, or updating a statistics text in the next few months, might want to see what all the hoopla is about. (For the latest, please see ). The three in green are on Fisher. New questions or comments on any posts can be placed on this post.
- (2/2) U-Phil: Ton o’ Bricks
- (2/4) January Palindrome Winner
- (2/6) Mark Chang (now) gets it right about circularity
- (2/8) From Gelman’s blog: philosophy and the practice of Bayesian statistics
- (2/9) New kvetch: Filly Fury
- (2/10) U-PHIL: Gandenberger & Hennig: Blogging Birnbaum’s Proof
- (2/11) U-Phil: Mayo’s response to Hennig and Gandenberger
- (2/13) Statistics as a Counter to Heavyweights…who wrote this?
- (2/16) Fisher and Neyman after anger management?
- (2/17) R. A. Fisher: how an outsider revolutionized statistics
- (2/20) Fisher: from ‘Two New Properties of Mathematical Likelihood’
- (2/23) Stephen Senn: Also Smith and Jones
- (2/26) PhilStock: DO < $70
- (2/26) Statistically speaking…
I exclude those reblogged fairly recently. Monthly memory lanes began at the blog’s 3-year anniversary in Sept. 2014.
The discussion culminated in this publication in Statistical Science. For a very informal, final look, see this post.
This continues my previous post: “Can’t take the fiducial out of Fisher…” in recognition of Fisher’s birthday, February 17. I supply a few more intriguing articles you may find enlightening to read and/or reread on a Saturday night.
Move up 20 years to the famous 1955/56 exchange between Fisher and Neyman. Fisher clearly connects Neyman’s adoption of a behavioristic-performance formulation to his denying the soundness of fiducial inference. When “Neyman denies the existence of inductive reasoning, he is merely expressing a verbal preference. For him ‘reasoning’ means what ‘deductive reasoning’ means to others.” (Fisher 1955, p. 74).
Fisher was right that Neyman’s calling the outputs of statistical inferences “actions” merely expressed Neyman’s preferred way of talking. Nothing earth-shaking turns on the choice to dub every inference “an act of making an inference”.[i] The “rationality” or “merit” goes into the rule. Neyman, much like Popper, had a good reason for drawing a bright red line between his use of probability (for corroboration or probativeness) and its use by ‘probabilists’ (who assign probability to hypotheses). Fisher’s fiducial probability was in danger of blurring this very distinction. Popper said, and Neyman would have agreed, that he had no problem with our using the word induction so long as it was kept clear that it meant testing hypotheses severely. Continue reading
R.A. Fisher: February 17, 1890 – July 29, 1962
In recognition of R.A. Fisher’s birthday today, I’ve decided to share some thoughts on a topic that has so far been absent from this blog: Fisher’s fiducial probability. Happy Birthday Fisher.
[Neyman and Pearson] “began an influential collaboration initially designed, it would seem, primarily to clarify Fisher’s writing. This led to their theory of testing hypotheses and to Neyman’s development of confidence intervals, aiming to clarify Fisher’s idea of fiducial intervals” (D.R. Cox, 2006, p. 195).
The entire episode of fiducial probability is fraught with minefields. Many say it was Fisher’s biggest blunder; others suggest it still hasn’t been understood. The majority of discussions omit the side trip to the Fiducial Forest altogether, finding the surrounding brambles too thorny to penetrate. Besides, a fascinating narrative about the Fisher-Neyman-Pearson divide has managed to bloom and grow while steering clear of fiducial probability–never mind that it remained a centerpiece of Fisher’s statistical philosophy. I now think that this is a mistake. It was thought, following Lehmann (1993) and others, that we could take the fiducial out of Fisher and still understand the core of the Neyman-Pearson vs Fisher (or Neyman vs Fisher) disagreements. We can’t. Quite aside from the intrinsic interest in correcting the “he said/he said” of these statisticians, the issue is intimately bound up with the current (flawed) consensus view of frequentist error statistics.
So what’s fiducial inference? I follow Cox (2006), adapting for the case of the lower limit: Continue reading
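For readers who don’t click through, the bare bones of the argument can be sketched as follows (my own reconstruction of the standard textbook version for a lower limit on a normal mean with known variance, not Cox’s exact notation):

```latex
% Sketch (reconstruction, not the post's own continuation):
% normal mean mu, known variance sigma_0^2.
\[
Y_1,\dots,Y_n \sim N(\mu,\sigma_0^2), \qquad
Z = \frac{\sqrt{n}\,(\bar{Y}-\mu)}{\sigma_0} \sim N(0,1).
\]
Since $P(Z \le k_c) = 1-c$ for the normal quantile $k_c$, rearranging gives
\[
P\!\left(\mu \ge \bar{Y} - k_c\,\sigma_0/\sqrt{n}\right) = 1-c .
\]
The fiducial step replaces the random $\bar{Y}$ with its observed value
$\bar{y}$ and reads the result as a probability statement about $\mu$ itself:
\[
P\!\left(\mu \ge \bar{y} - k_c\,\sigma_0/\sqrt{n}\right) = 1-c,
\qquad \text{i.e. } \mu \sim N(\bar{y},\,\sigma_0^2/n).
\]
```

The controversy turns on that last step: the confidence reading attaches the 1 − c to the procedure over repeated samples, while the fiducial reading transfers it to μ itself once the data are in hand.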
Polling estimation & rubbing off
Nate Silver describes “How we’re forecasting the primaries” using confidence intervals. Never mind that the estimates are a few weeks old, and put entirely to one side any predictions he makes or will make. I’m only interested in this one interpretive portion of the method, as Silver describes it:
In our interactive, you’ll see a bunch of funky-looking curves like the ones below for each candidate; they represent the model’s estimate of the possible distribution of his vote share. The red part of the curve represents a candidate’s 80 percent confidence interval. If the model is calibrated correctly, then he should finish within this range 80 percent of the time, above it 10 percent of the time, and below it 10 percent of the time. (My emphasis.)
OK. But when we look up the link to confidence interval, this seems to fall squarely within (what is correctly described as) the incorrect way to interpret intervals. Continue reading
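The distinction is easy to check by simulation: the “80 percent” belongs to the procedure over repeated samples, not to any one realized interval. A minimal sketch (my own illustration, not Silver’s model; the normal setup and all parameter values are assumptions for the demo):

```python
import random
import statistics

def coverage_simulation(mu=0.0, sigma=1.0, n=25, trials=2000, seed=42):
    """Estimate how often an 80% confidence interval for the mean
    covers the true mu over many repeated samples.

    z = 1.2816 is the standard normal 0.90 quantile, so xbar +/- z*sigma/sqrt(n)
    is a two-sided 80% interval (known-sigma case, for simplicity)."""
    z = 1.2816
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        half = z * sigma / n ** 0.5
        if xbar - half <= mu <= xbar + half:
            hits += 1
    return hits / trials
```

Running `coverage_simulation()` returns a fraction close to 0.80: the guarantee is about the long run of intervals generated this way. Sliding from that to “this particular computed interval contains the true value with probability 0.80” is exactly the incorrect interpretation at issue.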