Error Statistics Philosophy: Blog Contents (7 years) [i]
By: D. G. Mayo
Dear Reader: I began this blog 7 years ago (Sept. 3, 2011)! A big celebration is taking place at the Elbar Room this evening, both for the blog and the appearance of my new book: Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars (CUP). While a special rush edition made an appearance on Sept. 3, in time for the RSS meeting in Cardiff, it was decided to hold off on the festivities until copies of the book were officially available (yesterday)! If you’re in the neighborhood, stop by for some Elba Grease.
Many of the discussions in the book were importantly influenced (corrected and improved) by readers’ comments on the blog over the years. I thank readers for their input. Please peruse the offerings below, taking advantage of the discussions by guest posters and readers! I posted the first 3 sections of Tour I (in Excursion i) here, here, and here.
This blog will return to life, although I’m not yet sure of exactly what form it will take. Ideas are welcome. The tone of a book differs from a blog, so we’ll have to see what voice emerges here.
D. Mayo
4 years ago!
This is a belated birthday post for E.S. Pearson (11 August 1895-12 June, 1980). It’s basically a post from 2012 which concerns an issue of interpretation (long-run performance vs probativeness) that’s badly confused these days. I’ve recently been scouring around the history and statistical philosophies of Neyman, Pearson and Fisher for purposes of a book soon to be completed. I recently discovered a little anecdote that calls for a correction in something I’ve been saying for years. While it’s little more than a point of trivia, it’s in relation to Pearson’s (1955) response to Fisher (1955)–the last entry in this post. I’ll wait until tomorrow or the next day to share it, to give you a chance to read the background.
Are methods based on error probabilities of use mainly to supply procedures which will not err too frequently in some long run (performance)? Or is it the other way round: that the control of long-run error properties is of crucial importance for probing the causes of the data at hand (probativeness)? I say no to the former and yes to the latter. This, I think, was also the view of Egon Sharpe (E.S.) Pearson.
Cases of Type A and Type B
“How far then, can one go in giving precision to a philosophy of statistical inference?” (Pearson 1947, 172)
I first blogged this letter here. Below the references are some more recent blog links of relevance to this issue.
Dear Reader: I am typing in some excerpts from a letter Stephen Senn shared with me in relation to my April 28, 2012 blogpost. It is a letter to the editor of Statistics in Medicine in response to S. Goodman. It contains several important points that get to the issues we’ve been discussing. You can read the full letter here. Sincerely, D. G. Mayo
STATISTICS IN MEDICINE, LETTER TO THE EDITOR
From: Stephen Senn*
Some years ago, in the pages of this journal, Goodman gave an interesting analysis of ‘replication probabilities’ of p-values. Specifically, he considered the possibility that a given experiment had produced a p-value that indicated ‘significance’ or near significance (he considered the range p=0.10 to 0.001) and then calculated the probability that a study with equal power would produce a significant result at the conventional level of significance of 0.05. He showed, for example, that given an uninformative prior, and (subsequently) a resulting p-value that was exactly 0.05 from the first experiment, the probability of significance in the second experiment was 50 per cent. A more general form of this result is as follows. If the first trial yields p=α then the probability that a second trial will be significant at significance level α (and in the same direction as the first trial) is 0.5.
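For readers who want to check the 0.5 figure, here is a minimal sketch in Python. It is not taken from the letter. It assumes a normal model with known variance, two equally precise trials summarized by z-statistics, a flat prior on the true standardized effect, and two-sided p-values; the function name `replication_probability` is hypothetical. Under those assumptions the predictive distribution of the second z-statistic, given the first, is normal with mean equal to the first z-statistic and variance 2.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def replication_probability(p_first: float, alpha: float = 0.05) -> float:
    """Predictive probability that a second, equally precise trial is
    significant at two-sided level alpha in the same direction as the first.

    Assumptions (illustrative, not from the source): normal model with known
    variance, flat prior on the true effect, two-sided p-values.
    """
    # z-statistic matching the first trial's two-sided p-value
    z_first = STD_NORMAL.inv_cdf(1 - p_first / 2)
    # critical value for the second trial
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Flat prior: posterior for the effect is N(z_first, 1), so the
    # predictive distribution of the second z-statistic is N(z_first, 2).
    return 1 - STD_NORMAL.cdf((z_crit - z_first) / sqrt(2))


if __name__ == "__main__":
    for p in (0.10, 0.05, 0.01, 0.001):
        print(f"first-trial p = {p}: {replication_probability(p):.3f}")
```

When the first p-value equals α, the first z-statistic sits exactly on the critical value, so the predictive distribution is centered on the threshold and half of its mass lies beyond it, whatever α is. Smaller first-trial p-values push the probability above 0.5, larger ones below it.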
Junk Science (as first coined).* Have you ever noticed in wranglings over evidence-based policy that it’s always one side that’s politicizing the evidence—the side whose policy one doesn’t like? The evidence on the near side, or your side, however, is solid science. Let’s call those who first coined the term “junk science” Group 1. For Group 1, junk science is bad science that is used to defend pro-regulatory stances, whereas sound science would identify errors in reports of potential risk. (Yes, this was the first popular use of “junk science”, to my knowledge.) For the challengers—let’s call them Group 2—junk science is bad science that is used to defend the anti-regulatory stance, whereas sound science would identify potential risks, advocate precautionary stances, and recognize errors where risk is denied.
Both groups agree that politicizing science is very, very bad—but it’s only the other group that does it!
A given print exposé exploring the distortions of fact on one side or the other routinely showers wild praise on their side’s (their science’s and their policy’s) objectivity, their adherence to the facts, just the facts. How impressed might we be with the text or the group that admitted to its own biases?