Given all the recent attention to kvetching about significance tests, it’s an apt time to reblog Aris Spanos’ overview of the error statistician talking back to the critics. A related paper for your Saturday night reading is Mayo and Spanos (2011). It mixes the error statistical philosophy of science with its philosophy of statistics, introduces severity, and responds to 13 criticisms and howlers.
I’m going to comment on some of the ASA discussion contributions I hadn’t discussed earlier. Please share your thoughts in relation to any of this.
It was first blogged here, as part of our seminar 2 years ago.
 For those seeking a bit more balance to the main menu offered in the ASA Statistical Significance Reference list.
See also on this blog:
A. Spanos, “Recurring controversies about p-values and confidence intervals revisited”
A. Spanos, “Lecture on frequentist hypothesis testing”
Comments get unwieldy after 100, so here’s a chance to continue the “due to chance” discussion in some roomier quarters. (There seem to be at least two distinct lanes being travelled.) Now one of the main reasons I run this blog is to discover potential clues to solving or making progress on thorny philosophical problems I’ve been wrangling with for a long time. I think I extracted some illuminating gems from the discussion here, but I don’t have time to write them up, and won’t for a bit, so I’ve parked a list of comments wherein the golden extracts lie (I think) over at my Rejected Posts blog. (They’re all my comments, but as influenced by readers, so I thank you!) Over there, there’s no “return and resubmit”, but around a dozen posts have eventually made it over here, tidied up. Please continue the discussion on this blog (I don’t even recommend going over there). You can link to your earlier comments by clicking on the date.
 The Spiegelhalter (PVP) link is here.
There’s something about “Principle 2” in the ASA document on p-values that I couldn’t address in my brief commentary, but is worth examining more closely.
2. P-values do not measure (a) the probability that the studied hypothesis is true, or (b) the probability that the data were produced by random chance alone.
(a) is true, but what about (b)? That’s what I’m going to focus on, because I think it is often misunderstood. It was discussed earlier on this blog in relation to the Higgs experiments and deconstructing “the probability the results are ‘statistical flukes’”. So let’s examine:
My invited comments on the ASA Document on P-values*
The American Statistical Association is to be credited with opening up a discussion into p-values; now an examination of the foundations of other key statistical concepts is needed.
Statistical significance tests are a small part of a rich set of “techniques for systematically appraising and bounding the probabilities (under respective hypotheses) of seriously misleading interpretations of data” (Birnbaum 1970, p. 1033). These may be called error statistical methods (or sampling theory). The error statistical methodology supplies what Birnbaum called the “one rock in a shifting scene” (ibid.) in statistical thinking and practice. Misinterpretations and abuses of tests, warned against by the very founders of the tools, shouldn’t be the basis for supplanting them with methods unable or less able to assess, control, and alert us to erroneous interpretations of data.
unscrambling soap words clears me of this deed (aosp)
Remember “Repligate”? [“Some Ironies in the Replication Crisis in Social Psychology”] and, more recently, the much publicized attempt to replicate 100 published psychology articles by the Open Science Collaboration (OSC) [“The Paradox of Replication”]? Well, some of the critics involved in Repligate have just come out with a criticism of the OSC results, claiming they’re way, way off in their low estimate of replications in psychology. (The original OSC report is here.) I’ve only scanned the critical article quickly, but some bizarre statistical claims leap out at once. (Where do they get this notion about confidence intervals?) It’s published in Science! There’s also a response from the OSC researchers. Neither group adequately scrutinizes the validity of many of the artificial experiments and proxy variables, an issue I’ve been on about for a while. Without firming up the statistics-research link, no statistical fixes can help. I’m linking to the articles here for your weekend reading. I invite your comments! For some reason a whole bunch of items of interest, under the banner of “statistics and the replication crisis,” are all coming out at around the same time, and who can keep up? March 7 brings yet more! (Stay tuned.)