Monthly Archives: December 2014

Midnight With Birnbaum (Happy New Year)

Just as in the past 3 years since I’ve been blogging, I revisit that spot in the road at 11 p.m.*, just outside the Elbar Room, get into a strange-looking taxi, and head to “Midnight With Birnbaum”. I wonder if they’ll come for me this year, given that my Birnbaum article is out… This is what the place I am taken to looks like. [It’s 6 hrs later here, so I’m about to leave…]

You know how in that (not-so) recent movie, “Midnight in Paris,” the main character (I forget who plays it, I saw it on a plane) is a writer finishing a novel, and he steps into a cab that mysteriously picks him up at midnight and transports him back in time where he gets to run his work by such famous authors as Hemingway and Virginia Woolf? He is impressed when his work earns their approval and he comes back each night in the same mysterious cab… Well, imagine an error statistical philosopher is picked up in a mysterious taxi at midnight (New Year’s Eve 2011, 2012, 2013, 2014) and is taken back fifty years and, lo and behold, finds herself in the company of Allan Birnbaum.[i] There are a couple of brief (12/31/14) updates at the end.



ERROR STATISTICIAN: It’s wonderful to meet you Professor Birnbaum; I’ve always been extremely impressed with the important impact your work has had on philosophical foundations of statistics.  I happen to be writing on your famous argument about the likelihood principle (LP).  (whispers: I can’t believe this!)

BIRNBAUM: Ultimately you know I rejected the LP as failing to control the error probabilities needed for my Confidence concept.

ERROR STATISTICIAN: Yes, but I actually don’t think your argument shows that the LP follows from such frequentist concepts as sufficiency S and the weak conditionality principle WLP.[ii]  Sorry,…I know it’s famous…

BIRNBAUM:  Well, I shall happily invite you to take any case that violates the LP and allow me to demonstrate that the frequentist is led to inconsistency, provided she also wishes to adhere to the WLP and sufficiency (although less than S is needed).

ERROR STATISTICIAN: Well I happen to be a frequentist (error statistical) philosopher; I have recently (2006) found a hole in your proof,…well I hope we can discuss it.

BIRNBAUM: Well, well, well: I’ll bet you a bottle of Elba Grease champagne that I can demonstrate it! Continue reading

Categories: Birnbaum Brakes, Statistics, strong likelihood principle | 2 Comments

To raise the power of a test is to lower (not raise) the “hurdle” for rejecting the null (Ziliak and McCloskey 3 years on)

I said I’d reblog one of the 3-year “memory lane” posts marked in red, with a few new comments (in burgundy), from time to time. So let me comment on one referring to Ziliak and McCloskey on power (from Oct. 2011). I would think they’d want to correct some wrong statements, or explain their shifts in meaning. My hope is that, 3 years on, they’ll be ready to do so. By mixing some correct definitions with erroneous ones, they introduce more confusion into the discussion.

From my post 3 years ago: “The Will to Understand Power”: In this post, I will adhere precisely to the text, and offer no new interpretation of tests. Type 1 and 2 errors and power are just formal notions with formal definitions.  But we need to get them right (especially if we are giving expert advice).  You can hate the concepts; just define them correctly please.  They write:

“The error of the second kind is the error of accepting the null hypothesis of (say) zero effect when the null is in fact false, that is, when (say) such and such a positive effect is true.”

So far so good (keeping in mind that “positive effect” refers to a parameter discrepancy, say δ, not an observed difference).

And the power of a test to detect that such and such a positive effect δ is true is equal to the probability of rejecting the null hypothesis of (say) zero effect when the null is in fact false, and a positive effect as large as δ is present.


Let this alternative be abbreviated H’(δ):

H’(δ): there is a positive effect as large as δ.

Suppose the test rejects the null when it reaches a significance level of .01.

(1) The power of the test to detect H’(δ) =

P(test rejects null at .01 level; H’(δ) is true).

Say it is 0.85.
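
To make definition (1) concrete, here is a minimal sketch in Python. It assumes, purely for illustration, a one-sided Normal test of a zero mean with known standard deviation; that setup, the function names `cutoff` and `power`, and the numbers passed to them are not from the post. The semicolon in (1) means the probability is computed under H’(δ); nothing here is conditioned on the outcome of the test.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal, mean 0 and sd 1


def cutoff(alpha: float) -> float:
    """Value the standardized test statistic must reach to reject at level alpha."""
    return STD_NORMAL.inv_cdf(1.0 - alpha)


def power(delta: float, sigma: float, n: int, alpha: float) -> float:
    """P(test rejects null at level alpha; true mean is delta).

    Assumed setup (hypothetical, not from the post): n independent Normal
    observations with known sd sigma, null mean 0, one-sided alternative.
    """
    shift = delta * n ** 0.5 / sigma  # mean of the test statistic under delta
    return 1.0 - STD_NORMAL.cdf(cutoff(alpha) - shift)


if __name__ == "__main__":
    delta, sigma, n = 0.5, 1.0, 25  # illustrative numbers only
    for alpha in (0.01, 0.05):
        print(f"alpha={alpha}: cutoff={cutoff(alpha):.3f}, "
              f"power={power(delta, sigma, n, alpha):.3f}")
```

Under these assumptions, moving from the .01 level to the .05 level lowers the cutoff the test statistic must reach and, with δ, σ and n held fixed, raises the power. With δ set to 0, the computed power equals the significance level itself.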

“If the power of a test is high, say, 0.85 or higher, then the scientist can be reasonably confident that at minimum the null hypothesis (of, again, zero effect if that is the null chosen) is false and that therefore his rejection of it is highly probably correct”. (Z & M, 132-3).

But this is not so. Perhaps they are slipping into the cardinal error of mistaking (1) for a posterior probability:

(1’) P(H’(δ) is true| test rejects null at .01 level)! Continue reading

Categories: 3-year memory lane, power, Statistics | 6 Comments


3 years ago...


MONTHLY MEMORY LANE: 3 years ago: December 2011. I mark in red 3 posts that seem most apt for general background on key issues in this blog.*

*I announced this new, once-a-month feature at the blog’s 3-year anniversary. I will repost and comment on one of the 3-year old posts from time to time. [I’ve yet to repost and comment on the one from Oct. 2011, but will very shortly.] For newcomers, here’s your chance to catch up; for old timers, this is philosophy: rereading is essential!


Nov. 2011

Oct. 2011

Sept. 2011 (Within “All She Wrote (so far)”)

Categories: 3-year memory lane, blog contents, Statistics | Leave a comment

All I want for Chrismukkah is that critics & “reformers” quit howlers of testing (after 3 yrs of blogging)! So here’s Aris Spanos “Talking Back!”




This was initially posted as slides from our joint Spring 2014 seminar: “Talking Back to the Critics Using Error Statistics”. (You can enlarge them.) Related reading is Mayo and Spanos (2011).


Categories: Error Statistics, fallacy of rejection, Phil6334, reforming the reformers, Statistics | 27 Comments

Derailment: Faking Science: A true story of academic fraud, by Diederik Stapel (translated into English)

Diederik Stapel’s book, “Ontsporing”, has been translated into English, with some modifications. From what I’ve read, it’s interesting in a bizarre, fraudster-porn sort of way.

Faking Science: A true story of academic fraud

Diederik Stapel
Translated by Nicholas J.L. Brown

Nicholas J. L. Brown
Strasbourg, France
December 14, 2014



Foreword to the Dutch edition

I’ve spun off, lost my way, crashed and burned; whatever you want to call it. It’s not much fun. I was doing fine, but then I became impatient, overambitious, reckless. I wanted to go faster and better and higher and smarter, all the time. I thought it would help if I just took this one tiny little shortcut, but then I found myself more and more often in completely the wrong lane, and in the end I wasn’t even on the road at all. I left the road where I should have gone straight on, and made my own, spectacular, destructive, fatal accident. I’ve ruined my life, but that’s not the worst of it. My recklessness left a multiple pile-up in its wake, which caught up almost everyone important to me: my wife and children, my parents and siblings, colleagues, students, my doctoral candidates, the university, psychology, science, all involved, all hurt or damaged to some degree or other. That’s the worst part, and it’s something I’m going to have to learn to live with for the rest of my life, along with the shame and guilt. I’ve got more regrets than hairs on my head, and an infinite amount of time to think about them. Continue reading

Categories: Statistical fraudbusting, Statistics | 4 Comments

Announcing Kent Staley’s new book, An Introduction to the Philosophy of Science (CUP)


Kent Staley has written a clear and engaging introduction to PhilSci that manages to blend the central topics of philosophy of science with current philosophy of statistics. Quite possibly, Staley explains Error Statistics more clearly in many ways than I do in his 10-page section, 9.4. CONGRATULATIONS STALEY*

You can get this book for free by merely writing one of the simpler palindromes in the December contest.

Here’s an excerpt from that section:



9.4 Error-statistical philosophy of science and severe testing

Deborah Mayo has developed an alternative approach to the interpretation of frequentist statistical inference (Mayo 1996). But the idea at the heart of Mayo’s approach is one that can be stated without invoking probability at all. ….

Mayo takes the following “minimal scientific principle for evidence” to be uncontroversial:

Principle 3 (Minimal principle for evidence) Data x₀ provide poor evidence for H if they result from a method or procedure that has little or no ability of finding flaws in H, even if H is false. (Mayo and Spanos, 2009, 3) Continue reading

Categories: Announcement, Palindrome, Statistics, StatSci meets PhilSci | 10 Comments

S. Stanley Young: Are there mortality co-benefits to the Clean Power Plan? It depends. (Guest Post)




S. Stanley Young, PhD
Assistant Director
Bioinformatics
National Institute of Statistical Sciences
Research Triangle Park, NC

Are there mortality co-benefits to the Clean Power Plan? It depends.

Some years ago, I listened to a series of lectures on finance. The professor would ask a rhetorical question, pause to give you some time to think, and then, more often than not, answer his question with, “It depends.” Are there mortality co-benefits to the Clean Power Plan? Is mercury coming from power plants leading to deaths? Well, it depends.

So, rhetorically, is an increase in CO2 a bad thing? There is good and bad in everything. Well, for plants an increase in CO2 is a good thing. They grow faster. They convert CO2 into more food and fiber. They give off more oxygen, which is good for humans. Plants appear to be CO2 starved.

It is argued that CO2 is a greenhouse gas and an increase in CO2 will raise temperatures, ice will melt, sea levels will rise, and coastal areas will flood, etc. It depends. In theory yes, in reality, maybe. But a lot of other events must be orchestrated simultaneously. Obviously, that scenario depends on other things as, for the last 18 years, CO2 has continued to go up and temperatures have not. So it depends on other factors, solar radiance, water vapor, El Niño, sun spots, cosmic rays, Earth’s precession, etc., just what the professor said.


So suppose ambient temperatures do go up a few degrees. On balance, is that bad for humans? The evidence is overwhelming that warmer is better for humans. One or two examples are instructive. First, consider Cox et al. (2013), titled “Warmer is healthier: Effects on mortality rates of changes in average fine particulate matter (PM2.5) concentrations and temperatures in 100 U.S. cities.” To quote from the abstract of that paper, “Increases in average daily temperatures appear to significantly reduce average daily mortality rates, as expected from previous research.” Here is their plot of daily mortality rate versus maximum temperature. It is clear that as the maximum temperature in a city goes up, mortality goes down. So if the net effect of increasing CO2 is increasing temperature, there should be a reduction in deaths. Continue reading

Categories: evidence-based policy, junk science, Statistics | 35 Comments

Msc. Kvetch: What does it mean for a battle to be “lost by the media”?

1. What does it mean for a debate to be “media driven” or a battle to be “lost by the media”? In my last post, I noted that until a few weeks ago, I’d never heard of a “power morcellator.” Nor had I heard of the AAGL, the American Association of Gynecologic Laparoscopists. In an article, “Battle over morcellation lost ‘in the media’” (Nov 26, 2014), Susan London reports on a recent meeting of the AAGL[i]:

The media played a major role in determining the fate of uterine morcellation, suggested a study reported at a meeting sponsored by AAGL.

“How did we lose this battle of uterine morcellation? We lost it in the media,” asserted lead investigator Dr. Adrian C. Balica, director of the minimally invasive gynecologic surgery program at the Robert Wood Johnson Medical School in New Brunswick, N.J.

The “investigation” Balica led consisted of collecting Internet search data using something called the Google Adwords Keyword Planner:

Results showed that the average monthly number of Google searches for the term ‘morcellation’ held steady throughout most of 2013 at about 250 per month, reported Dr. Balica. There was, however, a sharp uptick in December 2013 to more than 2,000 per month, and the number continued to rise to a peak of about 18,000 per month in July 2014. A similar pattern was seen for the terms ‘morcellator,’ ‘fibroids in uterus,’ and ‘morcellation of uterine fibroid.’

The “vitals” of the study are summarized at the start of the article:

Key clinical point: Relevant Google searches rose sharply as the debate unfolded.

Major finding: The mean monthly number of searches for “morcellation” rose from about 250 in July 2013 to 18,000 in July 2014.

Data source: An analysis of Google searches for terms related to the power morcellator debate.

Disclosures: Dr. Balica disclosed that he had no relevant conflicts of interest.

2. Here’s my question: Does a high correlation between Google searches and debate-related terms signify that the debate is “media driven”? I suppose you could call it that, but Dr. Balica is clearly suggesting that something not quite kosher, or not fully factual, was responsible for losing “this battle of uterine morcellation”, downplaying the substantial data and real events that drove people (like me) to search the terms upon hearing the FDA announcement in November. Continue reading

Categories: msc kvetch, PhilStat Law, science communication, Statistics | 11 Comments

How power morcellators inadvertently spread uterine cancer

Until a few weeks ago, I’d never even heard of a “power morcellator.” Nor was I aware of the controversy that has pitted defenders of a woman’s right to choose a minimally invasive laparoscopic procedure in removing fibroids (enabled by the power morcellator) against those who decry the danger it poses in spreading an undetected uterine cancer throughout a woman’s abdomen. The most outspoken member of the anti-morcellation group is surgeon Hooman Noorchashm. His wife, Dr. Amy Reed, had a laparoscopic hysterectomy that resulted in morcellating a hidden cancer, progressing it to Stage IV sarcoma. Below is their video (link is here), followed by a recent FDA warning. I may write this in stages or parts. (I will withhold my view for now; I’d like to know what you think.)

Morcellation: (The full Article is here.)


FDA Safety Communication:

UPDATED Laparoscopic Uterine Power Morcellation in Hysterectomy and Myomectomy: FDA Safety Communication

The following information updates our April 17, 2014 communication.

Date Issued: Nov. 24, 2014

Laparoscopic power morcellators are medical devices used during different types of laparoscopic (minimally invasive) surgeries. These can include certain procedures to treat uterine fibroids, such as removing the uterus (hysterectomy) or removing the uterine fibroids (myomectomy). Morcellation refers to the division of tissue into smaller pieces or fragments and is often used during laparoscopic surgeries to facilitate the removal of tissue through small incision sites.

When used for hysterectomy or myomectomy in women with uterine fibroids, laparoscopic power morcellation poses a risk of spreading unsuspected cancerous tissue, notably uterine sarcomas, beyond the uterus. The FDA is warning against using laparoscopic power morcellators in the majority of women undergoing hysterectomy or myomectomy for uterine fibroids. Health care providers and patients should carefully consider available alternative treatment options for the removal of symptomatic uterine fibroids.

Summary of Problem and Scope: 
Uterine fibroids are noncancerous growths that develop from the muscular tissue of the uterus. Most women will develop uterine fibroids (also called leiomyomas) at some point in their lives, although most cause no symptoms[1]. In some cases, however, fibroids can cause symptoms, including heavy or prolonged menstrual bleeding, pelvic pressure or pain, and/or frequent urination, requiring medical or surgical therapy.

Many women choose to undergo laparoscopic hysterectomy or myomectomy because these procedures are associated with benefits such as a shorter post-operative recovery time and a reduced risk of infection compared to abdominal hysterectomy and myomectomy[2]. Many of these laparoscopic procedures are performed using a power morcellator.

Based on an FDA analysis of currently available data, we estimate that approximately 1 in 350 women undergoing hysterectomy or myomectomy for the treatment of fibroids is found to have an unsuspected uterine sarcoma, a type of uterine cancer that includes leiomyosarcoma. At this time, there is no reliable method for predicting or testing whether a woman with fibroids may have a uterine sarcoma.

If laparoscopic power morcellation is performed in women with unsuspected uterine sarcoma, there is a risk that the procedure will spread the cancerous tissue within the abdomen and pelvis, significantly worsening the patient’s long-term survival. While the specific estimate of this risk may not be known with certainty, the FDA believes that the risk is higher than previously understood. Continue reading

Categories: morcellation: FDA warning, Statistics | 7 Comments

“Probing with Severity: Beyond Bayesian Probabilism and Frequentist Performance” (Dec 3 Seminar slides)

Below are the slides from my Rutgers seminar for the Department of Statistics and Biostatistics yesterday, since some people have been asking me for them. The abstract is here. I don’t know how explanatory a bare outline like this can be, but I’d be glad to try and answer questions[i]. I am impressed at how interested in foundational matters I found the statisticians (both faculty and students) to be. (There were even a few philosophers in attendance.) It was especially interesting to explore, prior to the seminar, possible connections between severity assessments and confidence distributions, where the latter are along the lines of Min-ge Xie (some recent papers of his may be found here).

“Probing with Severity: Beyond Bayesian Probabilism and Frequentist Performance”

[i]They had requested a general overview of some issues in philosophical foundations of statistics. Much of this will be familiar to readers of this blog.



Categories: Bayesian/frequentist, Error Statistics, Statistics | 11 Comments

My Rutgers Seminar: tomorrow, December 3, on philosophy of statistics

I’ll be talking about philosophy of statistics tomorrow afternoon at Rutgers University, in the Statistics and Biostatistics Department, if you happen to be in the vicinity and are interested.


Seminar Speaker: Professor Deborah Mayo, Virginia Tech

Title: Probing with Severity: Beyond Bayesian Probabilism and Frequentist Performance

Time: 3:20 – 4:20pm, Wednesday, December 3, 2014

Place: 552 Hill Center


Probing with Severity: Beyond Bayesian Probabilism and Frequentist Performance

Getting beyond today’s most pressing controversies revolving around statistical methods, I argue, requires scrutinizing their underlying statistical philosophies. Two main philosophies about the roles of probability in statistical inference are probabilism and performance (in the long-run). The first assumes that we need a method of assigning probabilities to hypotheses; the second assumes that the main function of statistical method is to control long-run performance. I offer a third goal: controlling and evaluating the probativeness of methods. An inductive inference, in this conception, takes the form of inferring hypotheses to the extent that they have been well or severely tested. A report of poorly tested claims must also be part of an adequate inference. I develop a statistical philosophy in which error probabilities of methods may be used to evaluate and control the stringency or severity of tests. I then show how the “severe testing” philosophy clarifies and avoids familiar criticisms and abuses of significance tests and cognate methods (e.g., confidence intervals). Severity may be threatened in three main ways: fallacies of statistical tests, unwarranted links between statistical and substantive claims, and violations of model assumptions.

Categories: Announcement, Statistics | 4 Comments
