3 years ago…
MONTHLY MEMORY LANE: 3 years ago: January 2013. I mark in red three posts that seem most apt for general background on key issues in this blog. Posts that are part of a “unit” or a group of “U-Phils” (you [readers] philosophize) count as one. It was tough to pick just 3 this month. I’m putting the 2 “U-Phils” in burgundy–nearly red. They involve reader contributions on the likelihood principle–a major topic in foundations of statistics. Please check out the others. New questions or comments can be placed on this post.
- (1/2) Severity as a ‘Metastatistical’ Assessment
- (1/4) Severity Calculator
- (1/6) Guest post: Bad Pharma? (S. Senn)
- (1/9) RCTs, skeptics, and evidence-based policy
- (1/10) James M. Buchanan
- (1/11) Aris Spanos: James M. Buchanan: a scholar, teacher and friend
- (1/12) Error Statistics Blog: Table of Contents
- (1/15) Ontology & Methodology: Second call for Abstracts, Papers
- (1/18) New Kvetch/PhilStock
- (1/19) Saturday Night Brainstorming and Task Forces: (2013) TFSI on NHST (2015 update).
- (1/22) New PhilStock
- (1/23) P-values as posterior odds?
- (1/26) Coming up: December U-Phil Contributions….
- (1/27) U-Phil: S. Fletcher & N.Jinn
- (1/30) U-Phil: J. A. Miller: Blogging the SLP
I exclude those reblogged fairly recently. Monthly memory lanes began at the blog’s 3-year anniversary in September 2014.
When they sought to subject Uri Geller to the scrutiny of scientists, magicians had to be brought in, because only they were sufficiently trained to spot the subtle sleight-of-hand shifts by which the magician tricks through misdirection. We, too, have to be magicians to discern the subtle misdirections and shifts of meaning in discussions of statistical significance tests (and other methods)–even within the same statistical guide. We needn’t suppose anything deliberately devious is going on at all! Often, a statistical guidebook reflects shifts of meaning that grow out of one or another critical argument. These days, such arguments trickle down quickly to statistical guidebooks, thanks to popular articles on the “statistics crisis in science”. The danger is that the guidebooks themselves wind up containing inconsistencies. To adopt the magician’s stance is to be on the lookout for the standard sleights of hand. There aren’t that many.
I don’t know Jim Frost, but he gives statistical guidance at the Minitab blog. The purpose of my previous post was to point out that Frost uses the probability of a Type I error in two incompatible ways in his posts on significance tests. I assumed he’d want to clear this up, but so far he has not. His response to a comment I made on his blog is this:
waiting for the other shoe to drop…
Do you ever find yourself holding your breath when reading an exposition of significance tests that’s going swimmingly so far? If you’re a frequentist in exile, you know what I mean. I’m sure others feel this way too. When I came across Jim Frost’s posts on The Minitab Blog, I thought I might actually have located a success story. He does a good job explaining P-values (with charts), the duality between P-values and confidence levels, and even rebuts the latest “test ban” (the “Don’t Ask, Don’t Tell” policy). The merely descriptive reports of observed differences that the editors recommend, Frost shows, are uninterpretable without a corresponding P-value or the equivalent. So far, so good. I have only small quibbles, such as the use of “likelihood” when probability is meant, and various other nitpicky things. But watch how in some places significance levels are defined as the usual error probabilities and error rates–indeed in the glossary for the site–while in other places it is denied that they provide error rates. In those places, error probabilities and error rates shift their meaning to posterior probabilities, based on priors representing the “prevalence” of true null hypotheses.
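To see why these two uses of “error rate” are incompatible, here is a minimal numerical sketch of the shift described above. The numbers (prevalence of true nulls, power) are illustrative assumptions of mine, not figures from Frost’s posts: the Type I error rate is a property of the test procedure, fixed at α, whereas the “urn of nulls” quantity P(H0 true | rejection) depends on the assumed prevalence and power, and can be very different.

```python
# Illustrative sketch (my own numbers, not from the Minitab posts) of the
# two incompatible meanings of "error rate".
# Assumed urn model: a fraction `prevalence` of tested null hypotheses are true;
# each test runs at significance level alpha, with the stated power otherwise.

alpha = 0.05        # Type I error rate: P(reject | H0 true) -- fixed by the test
power = 0.80        # assumed P(reject | H0 false)
prevalence = 0.9    # assumed prior fraction of true nulls in the "urn"

# The shifted, posterior-style "error rate": P(H0 true | rejection),
# via Bayes' theorem on the urn model.
p_reject = prevalence * alpha + (1 - prevalence) * power
posterior_null = prevalence * alpha / p_reject

print(f"Type I error rate (alpha): {alpha:.3f}")    # 0.050
print(f"P(H0 true | rejection):    {posterior_null:.3f}")  # 0.360
```

With these assumptions the two quantities differ by a factor of seven; calling both “the probability of a Type I error” is the sleight of hand at issue.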
The allegation that P-values overstate the evidence against the null hypothesis continues to be taken as gospel in discussions of significance tests. All such discussions, however, assume a notion of “evidence” that’s at odds with significance tests–generally likelihood ratios, or Bayesian posterior probabilities (conventional or of the “I’m selecting hypotheses from an urn of nulls” variety). I’m reblogging the bulk of an earlier post as background for a new post to appear tomorrow. It’s not that a single small P-value provides good evidence of a discrepancy (even assuming the model, and no biasing selection effects); Fisher and others warned against over-interpreting an “isolated” small P-value long ago. The problem is that the current formulation of the “P-values overstate the evidence” meme is attached to a sleight of hand (on meanings) that is introducing brand new misinterpretations into an already confused literature!
1. What you should ask…
When you hear the familiar refrain, “We all know that P-values overstate the evidence against the null hypothesis”, denying that the P-value aptly measures evidence, what you should ask is:
“What do you mean by overstating the evidence against a hypothesis?”
One honest answer is:
“What I mean is that when I put a lump of prior probability π0 > 1/2 on a point null H0 (or a very small interval around it), the P-value is smaller than my Bayesian posterior probability on H0.”
Your reply might then be: (a) P-values are not intended as posteriors in H0, and (b) P-values can be used to determine whether there is evidence of inconsistency with a null hypothesis at various levels, and to distinguish how well or poorly tested claims are–depending on the type of question asked. Reporting the discrepancies that are “poorly” warranted is what controls any overstatement of the discrepancies indicated.
You might toss in the query: Why do you assume that “the” correct measure of evidence (for scrutinizing the P-value) is via the Bayesian posterior?
If you wanted to go even further you might rightly ask: And by the way, what warrants your lump of prior probability on the null? (See Section 3. A Dialogue.)
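The “honest answer” above can be made concrete with a standard worked example (the numbers are mine, chosen for illustration; this is the familiar spike-and-slab setup, not a calculation from the post): with half the prior probability lumped on a point null, a result that is just significant at the 0.05 level can yield a posterior probability on H0 of around 0.6.

```python
# Illustrative spike-and-slab calculation (my own assumed numbers).
# Model: X_i ~ N(theta, 1), n = 100. H0: theta = 0 gets prior mass pi0 = 0.5;
# under the alternative, theta ~ N(0, tau^2) with tau = 1 (the "slab").
# Observed: z = 1.96, i.e. two-sided P-value ~ 0.05.
import math

def norm_pdf(x, sd):
    """Density of N(0, sd^2) at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

n, sigma, tau, pi0 = 100, 1.0, 1.0, 0.5
z = 1.96
xbar = z * sigma / math.sqrt(n)                        # observed sample mean

m0 = norm_pdf(xbar, sigma / math.sqrt(n))              # marginal of xbar under H0
m1 = norm_pdf(xbar, math.sqrt(tau**2 + sigma**2 / n))  # marginal under the slab
post_null = pi0 * m0 / (pi0 * m0 + (1 - pi0) * m1)     # P(H0 | data)

p_value = 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))  # two-sided P-value

print(f"two-sided P-value: {p_value:.3f}")   # ~0.05
print(f"P(H0 | data):      {post_null:.3f}")  # ~0.60
```

So the P-value (0.05) is indeed far smaller than this Bayesian’s posterior on H0 (about 0.6)–which is exactly what the refrain trades on, and exactly why replies (a) and (b) above matter: the two numbers answer different questions.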
The record number of hits on this blog goes to “When Bayesian Inference shatters,” where Houman Owhadi presents a “Plain Jane” explanation of results now published in “On the Brittleness of Bayesian Inference”. A follow-up appeared one year ago. Here’s how their paper begins:
Professor of Applied and Computational Mathematics and Control and Dynamical Systems, Computing + Mathematical Sciences, California Institute of Technology, USA
“On the Brittleness of Bayesian Inference”
ABSTRACT: With the advent of high-performance computing, Bayesian methods are becoming increasingly popular tools for the quantification of uncertainty throughout science and industry. Since these methods can impact the making of sometimes critical decisions in increasingly complicated contexts, the sensitivity of their posterior conclusions with respect to the underlying models and prior beliefs is a pressing question to which there currently exist positive and negative answers. We report new results suggesting that, although Bayesian methods are robust when the number of possible outcomes is finite or when only a finite number of marginals of the data-generating distribution are unknown, they could be generically brittle when applied to continuous systems (and their discretizations) with finite information on the data-generating distribution. If closeness is defined in terms of the total variation (TV) metric or the matching of a finite system of generalized moments, then (1) two practitioners who use arbitrarily close models and observe the same (possibly arbitrarily large amount of) data may reach opposite conclusions; and (2) any given prior and model can be slightly perturbed to achieve any desired posterior conclusion. The mechanism causing brittleness/robustness suggests that learning and robustness are antagonistic requirements, which raises the possibility of a missing stability condition when using Bayesian inference in a continuous world under finite information.
© 2015, Society for Industrial and Applied Mathematics
Permalink: http://dx.doi.org/10.1137/130938633
Winner of the December 2015 Palindrome contest
Mike Jacovides: Associate Professor of Philosophy at Purdue University
Palindrome: Emo, notable Stacy began a memory by Rome. Manage by cats, Elba to Nome.
The requirement: A palindrome using “memory” or “memories” (and Elba, of course).
Book choice (out of 12 or more): Error and the Growth of Experimental Knowledge (D. Mayo 1996, Chicago)
Bio: Mike Jacovides is an Associate Professor of Philosophy at Purdue University. He’s just finishing a book whose title is constantly changing, but which may end up being called Locke’s Image of the World and the Scientific Revolution.
Statement: My interest in palindromes was sparked by my desire to learn more about the philosophy of statistics. The fact that you can learn about the philosophy of statistics by writing a palindrome seems like evidence that anything can cause anything, but maybe once I read the book, I’ll learn that it isn’t. I am glad that ‘emo, notable Stacy’ worked out, I have to say.
Congratulations Mike! I hope you’ll continue to pursue philosophy of statistics! We need much more of that. Good choice of book prize too. D. Mayo
David Mellor, from the Center for Open Science, emailed me asking if I’d announce his Preregistration Challenge on my blog, and I’m glad to do so. You win $1,000 if your properly preregistered paper is published. The recent replication effort in psychology showed, despite the common refrain – “it’s too easy to get low P-values” – that in preregistered replication attempts it’s actually very difficult to get small P-values. (I call this the “paradox of replication”.) Here’s our e-mail exchange from this morning:
Dear Deborah Mayo,
I’m reaching out to individuals who I think may be interested in our recently launched competition, the Preregistration Challenge (https://cos.io/prereg). Based on your blogging, I thought it could be of interest to you and to your readers.
In case you are unfamiliar with it, preregistration specifies in advance the precise study protocols and analytical decisions before data collection, in order to separate the hypothesis-generating exploratory work from the hypothesis testing confirmatory work.
Though required by law in clinical trials, preregistration is virtually unknown within the basic sciences. We are trying to encourage this new behavior by offering $1,000 prizes to 1,000 researchers for publishing the results of their preregistered work.
Please let me know if this is something you would consider blogging about or sharing in other ways. I am happy to discuss further.
David Mellor, PhD
Project Manager, Preregistration Challenge, Center for Open Science
Deborah Mayo to David, 10:33 AM:
David: Yes I’m familiar with it, and I hope that it encourages people to avoid data-dependent determinations that bias results. It shows the importance of statistical accounts that can pick up on such biasing selection effects. On the other hand, coupling prereg with some of the flexible inference accounts now in use won’t really help. Moreover, there may, in some fields, be a tendency to research a non-novel, fairly trivial result.
And if they’re going to preregister, why not go blind as well? Will they?
Mayo
This headliner appeared two years ago, but to a sparse audience (likely because it was during winter break), so Management’s giving him another chance…
You might not have thought there could be new material for 2014, but there is, and if you look a bit more closely, you’ll see that it’s actually not Jay Leno who is standing up there at the mike….
It’s Sir Harold Jeffreys himself! And his (very famous) joke, I admit, is funny. So, since it’s Saturday night, let’s listen in on Sir Harold’s howler* in criticizing the use of p-values.
“Did you hear the one about significance testers rejecting H0 because of outcomes H0 didn’t predict?
‘What’s unusual about that?’ you ask?
What’s unusual, is that they do it when these unpredicted outcomes haven’t even occurred!”
[The actual quote from Jeffreys: Using p-values implies that “An hypothesis that may be true is rejected because it has failed to predict observable results that have not occurred. This seems a remarkable procedure.” (Jeffreys 1939, 316)]
I say it’s funny, so to see why I’ll strive to give it a generous interpretation.