Monthly Archives: November 2015

Return to the Comedy Hour: P-values vs posterior probabilities (1)

Posted on November 28, 2015 by Mayo

Some recent criticisms of statistical tests of significance have breathed brand new life into some very old howlers, many of which have been discussed on this blog. One variant, which I think returns to the scene every decade (for 50+ years?), takes a “disagreement on numbers” to show a problem with significance tests even from a “frequentist” perspective. Since it’s Saturday night, let’s listen in on one of the comedy hours from 3 years ago (0) (new notes in red):

Did you hear the one about the frequentist significance tester when he was shown the nonfrequentist nature of p-values?

JB [Jim Berger]: I just simulated a long series of tests on a pool of null hypotheses, and I found that among tests with p-values of .05, at least 22%—and typically over 50%—of the null hypotheses are true!(1)

Frequentist Significance Tester: Scratches head: But rejecting the null with a p-value of .05 ensures erroneous rejection no more than 5% of the time!

Raucous laughter ensues!

(Hah, hah,…. I feel I’m back in high school: “So funny, I forgot to laugh!”)

The frequentist tester should retort:

Frequentist Significance Tester: But you assumed 50% of the null hypotheses are true, and computed P(H₀|x) (imagining P(H₀)= .5)—and then assumed my p-value should agree with the number you get, if it is not to be misleading!

Yet, our significance tester is not heard from as they move on to the next joke….
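
To see why the two numbers need not agree, here is a minimal simulation sketch in the spirit of the one JB describes. It is not Berger's actual simulation: the 50% share of true nulls is the assumption named in the retort above, while the two-sided z-test, the N(0, tau²) spread of true effects under the alternatives, the .04 to .05 band standing in for “p-values of .05”, and every function and variable name are illustrative assumptions.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic z."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate(n_tests=400_000, prior_null=0.5, tau=2.0, seed=0):
    """Simulate a pool of z-tests and return two different proportions.

    Assumptions (all hypothetical): a share `prior_null` of the nulls is
    true; under the alternatives the true mean is drawn from N(0, tau^2);
    each test observes a single z ~ N(mean, 1).
    """
    rng = random.Random(seed)
    null_total = null_rejected = 0
    near_05_total = near_05_null = 0

    for _ in range(n_tests):
        is_null = rng.random() < prior_null
        mean = 0.0 if is_null else rng.gauss(0.0, tau)
        p = two_sided_p(rng.gauss(mean, 1.0))

        # Error rate: how often a TRUE null is rejected at the .05 level.
        if is_null:
            null_total += 1
            if p <= 0.05:
                null_rejected += 1

        # Pool proportion: among p-values close to .05, how many nulls are true.
        if 0.04 <= p <= 0.05:
            near_05_total += 1
            if is_null:
                near_05_null += 1

    return null_rejected / null_total, near_05_null / near_05_total


type1_rate, frac_null_near_05 = simulate()
print(f"P(p <= .05 | null true), estimated:         {type1_rate:.3f}")
print(f"share of true nulls among p in [.04, .05]:  {frac_null_near_05:.3f}")
```

Under these assumptions the first number should come out close to .05 and the second noticeably larger. They answer different questions: the first is an error rate over true nulls, the second a proportion that depends on the assumed pool (change `prior_null` or `tau` and it moves), which is the tester's point.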

Categories: Bayesian/frequentist, Comedy, PBP, significance tests, Statistics | 27 Comments

3 YEARS AGO (NOVEMBER 2012): MEMORY LANE

Posted on November 25, 2015 by Mayo

MONTHLY MEMORY LANE: 3 years ago: November 2012. I mark in red three posts that seem most apt for general background on key issues in this blog.[1] Please check out others that didn’t make the “bright red cut”. If you’re interested in the Likelihood Principle, check “Blogging Birnbaum” and “Likelihood Links”. If you think P-values are hard to explain, see how the “Bad News Bears” struggle to decipher Bayesian probability. (Some of the posts allude to seminars I was giving at the London School of Economics 3 years ago.)

November 2012

(11/04) PhilStat: So you’re looking for a Ph.D. dissertation topic?
(11/07) Seminars at the London School of Economics: Contemporary Problems in Philosophy of Statistics
(11/10) Bad news bears: ‘Bayesian bear’ rejoinder – reblog
(11/12) new rejected post: kvetch (and query)
(11/14) continuing the comments. …
(11/16) Philosophy of Science Association (PSA) 2012 Program
(11/18) What is Bayesian/Frequentist Inference? (from the normal deviate)
(11/18) New kvetch/PhilStock: Rapiscan Scam
(11/19) Comments on Wasserman’s “what is Bayesian/frequentist inference?”
(11/21) Irony and Bad Faith: Deconstructing Bayesians – reblog
(11/23) Announcement: 28 November: My Seminar at the LSE (Contemporary PhilStat)
(11/25) Likelihood Links [for 28 Nov. Seminar and Current U-Phil]
(11/28) Blogging Birnbaum: on Statistical Methods in Scientific Inference
(11/30) Error Statistics (brief overview)

[1] I exclude those reblogged fairly recently. Posts that are part of a “unit” or a group of “U-Phils” count as one. Monthly memory lanes began at the blog’s 3-year anniversary in Sept. 2014.

Categories: 3-year memory lane, Statistics | 1 Comment

Erich Lehmann: Neyman-Pearson & Fisher on P-values

Posted on November 20, 2015 by Mayo

Today is Erich Lehmann’s birthday (20 November 1917 – 12 September 2009). Lehmann was Neyman’s first student at Berkeley (Ph.D. 1942), and his framing of Neyman-Pearson (NP) methods has had an enormous influence on the way we typically view them.

I got to know Erich in 1997, shortly after publication of EGEK (1996). One day, I received a bulging, six-page, handwritten letter from him in tiny, extremely neat scrawl (and many more after that). He began by telling me that he was sitting in a very large room at an ASA (American Statistical Association) meeting where they were shutting down the conference book display (or maybe they were setting it up), and on a very long, wood table sat just one book, all alone, shiny red. He said he wondered if it might be of interest to him! So he walked up to it…. It turned out to be my Error and the Growth of Experimental Knowledge (1996, Chicago), which he reviewed soon after[0]. (What are the chances?) Some related posts on Lehmann’s letter are here and here.

One of Lehmann’s more philosophical papers is Lehmann (1993), “The Fisher, Neyman-Pearson Theories of Testing Hypotheses: One Theory or Two?” We haven’t discussed it before on this blog. Here are some excerpts (blue) and remarks (black).

Erich Lehmann 20 November 1917 – 12 September 2009

…A distinction frequently made between the approaches of Fisher and Neyman-Pearson is that in the latter the test is carried out at a fixed level, whereas the principal outcome of the former is the statement of a p value that may or may not be followed by a pronouncement concerning significance of the result [p.1243].

The history of this distinction is curious. Throughout the 19th century, testing was carried out rather informally. It was roughly equivalent to calculating an (approximate) p value and rejecting the hypothesis if this value appeared to be sufficiently small. … Fisher, in his 1925 book and later, greatly reduced the needed tabulations by providing tables not of the distributions themselves but of selected quantiles. … These tables allow the calculation only of ranges for the p values; however, they are exactly suited for determining the critical values at which the statistic under consideration becomes significant at a given level. As Fisher wrote in explaining the use of his [chi square] table (1946, p. 80):

In preparing this table we have borne in mind that in practice we do not want to know the exact value of P for any observed [chi square], but, in the first place, whether or not the observed value is open to suspicion. If P is between .1 and .9, there is certainly no reason to suspect the hypothesis tested. If it is below .02, it is strongly indicated that the hypothesis fails to account for the whole of the facts. We shall not often be astray if we draw a conventional line at .05 and consider that higher values of [chi square] indicate a real discrepancy.

Similarly, he also wrote (1935, p. 13) that “it is usual and convenient for experimenters to take 5 percent as a standard level of significance, in the sense that they are prepared to ignore all results which fail to reach this standard ...” ….
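
Lehmann's point about quantile tables can be illustrated with a short sketch. The four critical values below are the standard upper-tail chi-square values for 1 degree of freedom at P = .10, .05, .02 and .01; the function names, the choice of 1 degree of freedom and the observed value 4.5 are assumptions made for the example, not taken from Lehmann or Fisher.

```python
# Selected quantiles only, in the style of a printed table:
# (upper-tail probability P, critical value) for chi-square with 1 d.f.
CHI2_DF1_TABLE = [(0.10, 2.706), (0.05, 3.841), (0.02, 5.412), (0.01, 6.635)]


def bracket_p(chi2, table=CHI2_DF1_TABLE):
    """Return (lower, upper) such that lower < p <= upper.

    A table of quantiles cannot give the exact p-value, only the two
    tabulated probabilities that the observed statistic falls between.
    """
    lower, upper = 0.0, 1.0
    for prob, critical in table:
        if chi2 >= critical:
            upper = min(upper, prob)   # statistic reaches this quantile: p <= prob
        else:
            lower = max(lower, prob)   # statistic falls short of it: p > prob
    return lower, upper


def is_significant(chi2, level=0.05, table=CHI2_DF1_TABLE):
    """Fixed-level use of the same table: compare with one critical value."""
    return chi2 >= dict(table)[level]


observed = 4.5  # hypothetical observed chi-square value
print("p-value lies in the range", bracket_p(observed))
print("significant at the .05 level:", is_significant(observed))
```

The same table serves both uses Lehmann mentions: `bracket_p` gives only a range for the p value, while `is_significant` reads off the critical value for a conventional level directly.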

Categories: Neyman, P-values, phil/history of stat, Statistics | Tags: Error and the Growth of Experimental Knowledge review, Lehmann | 4 Comments

“What does it say about our national commitment to research integrity?”

Posted on November 13, 2015 by Mayo

There’s an important guest editorial by Keith Baggerly and C.K. Gunsalus in today’s issue of the Cancer Letter: “Penalty Too Light” on the Duke U. (Potti/Nevins) cancer trial fraud*. Here are some excerpts.

publication date: Nov 13, 2015

Penalty Too Light

What does it say about our national commitment to research integrity that the Department of Health and Human Services’ Office of Research Integrity has concluded that a five-year ban on federal research funding for one individual researcher is a sufficient response to a case involving millions of taxpayer dollars, completely fabricated data, and hundreds to thousands of patients in invasive clinical trials?

This week, ORI released a notice of “final action” in the case of Anil Potti, M.D. The ORI found that Dr. Potti engaged in several instances of research misconduct and banned him from receiving federal funding for five years.

(See my previous post.)

The principles involved are important and the facts complicated. This was not just a matter of research integrity. This was also a case involving direct patient care and millions of dollars in federal and other funding. The duration and extent of deception were extreme. The case catalyzed an Institute of Medicine review of genomics in clinical trials and attracted national media attention.

If there are no further conclusions coming from ORI and if there are no other investigations under way—despite the importance of the issues involved and the five years that have elapsed since research misconduct investigation began, we do not know—a strong argument can be made that neither justice nor the research community have been served by this outcome. Continue reading →

Categories: Anil Potti, fraud, science communication, Statistics | 3 Comments

Findings of the Office of Research Misconduct on the Duke U (Potti/Nevins) cancer trial fraud: No one is punished but the patients

Posted on November 9, 2015 by Mayo

Findings of Research Misconduct
A Notice by the Health and Human Services Dept

on 11/09/2015
AGENCY: Office of the Secretary, HHS.
ACTION: Notice.

-----------------------------------------------------------------------

SUMMARY: Notice is hereby given that the Office of Research Integrity 
(ORI) has taken final action in the following case:
    Anil Potti, M.D., Duke University School of Medicine: Based on the 
reports of investigations conducted by Duke University School of 
Medicine (Duke) and additional analysis conducted by ORI in its 
oversight review, ORI found that Dr. Anil Potti, former Associate 
Professor of Medicine, Duke, engaged in research misconduct in research 
supported by National Heart, Lung, and Blood Institute (NHLBI), 
National Institutes of Health (NIH), grant R01 HL072208 and National 
Cancer Institute (NCI), NIH, grants R01 CA136530, R01 CA131049, K12 
CA100639, R01 CA106520, and U54 CA112952.
    ORI found that Respondent engaged in research misconduct by 
including false research data in the following published papers, 
submitted manuscript, grant application, and the research record as 
specified in 1-3 below. Specifically, ORI found that: Continue reading →

Categories: Anil Potti, reproducibility, Statistical fraudbusting, Statistics | 12 Comments

S. McKinney: On Efron’s “Frequentist Accuracy of Bayesian Estimates” (Guest Post)

Posted on November 5, 2015 by Mayo

Steven McKinney, Ph.D.
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

On Bradley Efron’s: “Frequentist Accuracy of Bayesian Estimates”

Bradley Efron has produced another fine set of results, yielding a valuable estimate of variability for a Bayesian estimate derived from a Markov Chain Monte Carlo algorithm, in his latest paper “Frequentist accuracy of Bayesian estimates” (J. R. Statist. Soc. B (2015) 77, Part 3, pp. 617–646). I give a general overview of Efron’s brilliance via his Introduction discussion (his words “in double quotes”).

“1. Introduction

The past two decades have witnessed a greatly increased use of Bayesian techniques in statistical applications. Objective Bayes methods, based on neutral or uninformative priors of the type pioneered by Jeffreys, dominate these applications, carried forward on a wave of popularity for Markov chain Monte Carlo (MCMC) algorithms. Good references include Ghosh (2011), Berger (2006) and Kass and Wasserman (1996).”

A nice concise summary, one that should bring joy to anyone interested in Bayesian methods after all the Bayesian-bashing of the middle 20th century. Efron himself has crafted many beautiful results in the Empirical Bayes arena. He has reviewed important differences between Bayesian and frequentist outcomes that point to some as-yet unsettled issues in statistical theory and philosophy such as his scales of evidence work.

Categories: Bayesian/frequentist, objective Bayesians, Statistics | 44 Comments


© Deborah G. Mayo, Error Statistics Philosophy, 2011-2018 All Rights Reserved.