My “April 1” posts for the past 5 years have been so close to the truth or possible truth that they weren’t always spotted as April Fool’s pranks, which is what made them genuine April Fool’s pranks. (After a few days I labeled them as such, or revealed it in a comment.) So since it’s Saturday night on the last night of April, I’m reblogging my 5 posts from the first days of April. (Which fooled you the most?)
Monthly Archives: April 2016
MONTHLY MEMORY LANE: 3 years ago–March & April 2013. I missed March memory lane, so both are combined here. I mark in red the three posts most apt for a general background on key issues in this blog. I’ve added some remarks in blue this month, for some of the posts that are not marked in red.
- (3/1) capitalizing on chance–Worth a look (has a pic of Mayo gambling)!
- (3/4) Big Data or Pig Data?–Funny & clever (guest post)!
- (3/7) Stephen Senn: Casting Stones
- (3/10) Blog Contents 2013 (Jan & Feb)
- (3/11) S. Stanley Young: Scientific Integrity and Transparency
- (3/13) Risk-Based Security: Knives and Axes–Funny, strange!
- (3/15) Normal Deviate: Double Misunderstandings About p-values–worth keeping in mind.
- (3/17) Update on Higgs data analysis: statistical flukes (1)
- (3/21) Telling the public why the Higgs particle matters
- (3/23) Is NASA suspending public education and outreach?
- (3/27) Higgs analysis and statistical flukes (part 2)
- (3/31) possible progress on the comedy hour circuit?–One of my favorites, a bit of progress
In honor of Jerzy Neyman’s birthday today, a local acting group is putting on a short theater production based on a screenplay I wrote: “Les Miserables Citations” (“Those Miserable Quotes”). The “miserable” citations are those everyone loves to cite, from Neyman and Pearson’s early joint 1933 paper:
We are inclined to think that as far as a particular hypothesis is concerned, no test based upon the theory of probability can by itself provide any valuable evidence of the truth or falsehood of that hypothesis.
But we may look at the purpose of tests from another viewpoint. Without hoping to know whether each separate hypothesis is true or false, we may search for rules to govern our behavior with regard to them, in following which we insure that, in the long run of experience, we shall not be too often wrong. (Neyman and Pearson 1933, pp. 290-1).
When the rejection ratio (1 – β)/α turns evidence on its head, for those practicing in an error-statistical tribe (ii)
I’m about to hear Jim Berger give a keynote talk this afternoon at a FUSION conference I’m attending. The conference goal is to link Bayesian, frequentist and fiducial approaches: BFF. (Program is here. See the blurb below.) April 12 update below*. Berger always has novel and intriguing approaches to testing, so I was especially curious about the new measure. It’s based on a 2016 paper by Bayarri, Benjamin, Berger, and Sellke (BBBS 2016): Rejection Odds and Rejection Ratios: A Proposal for Statistical Practice in Testing Hypotheses. They recommend:
“that researchers should report what we call the ‘pre-experimental rejection ratio’ when presenting their experimental design and researchers should report what we call the ‘post-experimental rejection ratio’ (or Bayes factor) when presenting their experimental results.” (BBBS 2016)….
“The (pre-experimental) ‘rejection ratio’ Rpre, the ratio of statistical power to significance threshold (i.e., the ratio of the probability of rejecting under H1 and H0 respectively), is shown to capture the strength of evidence in the experiment for H1 over H0.”
If you’re seeking a comparative probabilist measure, the ratio of power/size can look like a likelihood ratio in favor of the alternative. To a practicing member of an error statistical tribe, however, whether along the lines of N, P, or F (Neyman, Pearson or Fisher), things can look topsy-turvy.
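To make the arithmetic behind the pre-experimental rejection ratio concrete, here is a minimal sketch of power divided by size, (1 – β)/α, for a one-sided z-test with known σ. The test setup, the function names, and the numbers (effect 0.5, σ = 1, n = 25, α = 0.05) are illustrative assumptions of mine, not taken from BBBS 2016.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha):
    """Power of a one-sided z-test of H0: mu = 0 against mu = effect (> 0).

    Assumes a known standard deviation sigma and sample size n
    (an illustrative setup, not one taken from the paper).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # cutoff for rejecting H0
    shift = effect * sqrt(n) / sigma         # mean of the z statistic under H1
    return 1 - std_normal.cdf(z_crit - shift)


def rejection_ratio(power, alpha):
    """Pre-experimental rejection ratio: P(reject | H1) / P(reject | H0)."""
    return power / alpha


alpha = 0.05
power = z_test_power(effect=0.5, sigma=1.0, n=25, alpha=alpha)
print(f"power = {power:.3f}, R_pre = {rejection_ratio(power, alpha):.1f}")
```

Note that the ratio is fixed by the design (α, the sample size, and the alternative chosen to compute power) before any data are seen, so every rejection at that α gets the same ratio no matter how far the observed result falls from the cutoff.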
Manan Shah channels Jack Nicholson in “The Shining” to win this month’s palindrome contest (and the book of his choice).*
Winner of March 2016 Contest: Manan Shah
Palindrome: I was able to. I did add well. Liking is, I say, as evil as dad’s aloof. Delivery reviled sign: “I red rum”. Examine men I’m axe murdering. Is delivery reviled? Fool! As dad’s alive, say as I sign: “I kill lewd dad.” Idiot Elba saw I.
The requirements: In addition to using Elba, a candidate for a winning palindrome must have used examine (or examined or examination).
Bio: Manan Shah is a mathematician and owner of Think. Plan. Do. LLC (www.ThinkPlanDoLLC.com). He writes at www.mathmisery.com and is looking to publish his first book, hopefully by the end of this year. He holds a PhD in Mathematics from Florida State University.
I’ll be speaking at U of Minnesota tomorrow. I’m glad to see a group with interest in philosophical foundations of statistics as well as the foundations of experiment and measurement in psychology. I will post my slides afterwards. Come by if you’re in the neighborhood.
University of Minnesota
“The ASA (2016) Statement on P-values and
How to Stop Refighting the Statistics Wars”
April 8, 2016 at 3:35 p.m.
Deborah G. Mayo
Department of Philosophy, Virginia Tech
The CLA Quantitative Methods
Minnesota Center for Philosophy of Science
275 Nicholson Hall
216 Pillsbury Drive SE
University of Minnesota
This will be a mixture of my current take on the “statistics wars” together with my reflections on the recent ASA document on P-values. I was invited over a year ago by Niels Waller, a co-author of Paul Meehl. I’ll never forget when I was there in 1997: Paul Meehl was in the audience, waving my book in the air–EGEK (1996)–and smiling!
I could have told them that the degree of accordance enabling the “6 principles” on p-values was unlikely to be replicated when it came to most of the “other approaches” with which some would supplement or replace significance tests–notably Bayesian updating, Bayes factors, or likelihood ratios (confidence intervals are dual to hypothesis tests). [My commentary is here.] So now they may be advising a “hold off” or “go slow” approach until some consilience is achieved. Is that it? I don’t know. I was tweeted an article about the background chatter taking place behind the scenes; I wasn’t one of the people interviewed for this. Here are some excerpts; I may add more later after it has had time to sink in. (Check back later.)
“Reaching for Best Practices in Statistics: Proceed with Caution Until a Balanced Critique Is In”
“[A]ll of the other approaches*, as well as most statistical tools, may suffer from many of the same problems as the p-values do. What level of likelihood ratio in favor of the research hypothesis will be acceptable to the journal? Should scientific discoveries be based on whether posterior odds pass a specific threshold (P3)? Does either measure the size of an effect (P5)?…How can we decide about the sample size needed for a clinical trial—however analyzed—if we do not set a specific bright-line decision rule? 95% confidence intervals or credence intervals…offer no protection against selection when only those that do not cover 0, are selected into the abstract (P4).” (Benjamini, ASA commentary, pp. 3-4)
What’s sauce for the goose is sauce for the gander, right? Many statisticians seconded George Cobb, who urged “the board to set aside time at least once every year to consider the potential value of similar statements” to the recent ASA p-value report. Disappointingly, a preliminary survey of leaders in statistics, many from the original p-value group, aired striking disagreements on best and worst practices with respect to these other approaches. The Executive Board is contemplating a variety of recommendations, minimally, …