Statistics

No headache power (for Deirdre)


Deirdre McCloskey’s comment leads me to try to give a “no headache” treatment of some key points about the power of a statistical test. (Trigger warning: formal stat people may dislike the informality of my exercise.)

We all know that for a given test, as the probability of a type 1 error goes down the probability of a type 2 error goes up (and power goes down).

And as the probability of a type 2 error goes down (and power goes up), the probability of a type 1 error goes up. Leaving everything else the same. There’s a trade-off between the two error probabilities. (No free lunch.) No headache powder called for.

So if someone said, as the power increases, the probability of a type 1 error decreases, they’d be saying: As the type 2 error decreases, the probability of a type 1 error decreases! That’s the opposite of a trade-off. So you’d know automatically they’d made a mistake or were defining things in a way that differs from standard NP statistical tests.

Before turning to my little exercise, I note that power is defined in terms of a test’s cut-off for rejecting the null, whereas a severity assessment always considers the actual value observed (attained power). Here I’m just trying to clarify regular old power, as defined in an N-P test.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Let’s use a familiar oversimple example to fix the trade-off in our minds so that it cannot be dislodged. Our old friend, test T+: We’re testing the mean of a Normal distribution with n iid samples, and (for simplicity) known, fixed σ:

H0: µ ≤ 0 against H1: µ > 0

Let σ = 2 and n = 25, so (σ/√n) = .4. To avoid those annoying X-bars, I will use M for the sample mean. I will abbreviate (σ/√n) as σx.

  • Test T+ is a rule: reject H0 iff M > m*
  • Power of a test T+ is computed in relation to values of µ > 0.
  • The power of T+ against alternative µ = µ1 is Pr(T+ rejects H0; µ = µ1) = Pr(M > m*; µ = µ1)

We may abbreviate this as: POW(T+, α, µ = µ1). Continue reading
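For readers who like to see the numbers, here is a minimal sketch (in Python) of how POW(T+, α, µ1) is computed for this example, and of the trade-off. The alternative µ1 = 0.8 and the particular α levels are my illustrative choices; the excerpt above leaves them general.

```python
# Minimal sketch of test T+ (H0: mu <= 0 vs H1: mu > 0) with sigma = 2, n = 25,
# so sigma_x = sigma/sqrt(n) = 0.4. Reject H0 iff M > m*, where m* = z_alpha * sigma_x.
from scipy.stats import norm

sigma, n = 2.0, 25
sigma_x = sigma / n ** 0.5          # 0.4

def cutoff(alpha, mu0=0.0):
    """Rejection cut-off m*: T+ rejects H0 iff M > m*."""
    return mu0 + norm.ppf(1 - alpha) * sigma_x

def power(alpha, mu1):
    """POW(T+, alpha, mu1) = Pr(M > m*; mu = mu1)."""
    return 1 - norm.cdf((cutoff(alpha) - mu1) / sigma_x)

for alpha in (0.05, 0.025, 0.01):
    print(f"alpha = {alpha:.3f}   m* = {cutoff(alpha):.2f}   "
          f"POW(T+, alpha, mu1 = 0.8) = {power(alpha, 0.8):.2f}")
# alpha = 0.050   m* = 0.66   POW = 0.64
# alpha = 0.025   m* = 0.78   POW = 0.52
# alpha = 0.010   m* = 0.93   POW = 0.37
```

Pushing the type 1 error probability down raises the cut-off m* and lowers the power against the same alternative: exactly the trade-off described above.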

Categories: power, statistical tests, Statistics | 6 Comments

Blog Contents: Oct.- Dec. 2014

BLOG CONTENTS: OCT – DEC 2014*

OCTOBER 2014

  • 10/01 Oy Faye! What are the odds of not conflating simple conditional probability and likelihood with Bayesian success stories?
  • 10/05 Diederik Stapel hired to teach “social philosophy” because students got tired of success stories… or something (rejected post)
  • 10/07 A (Jan 14, 2014) interview with Sir David Cox by “Statistics Views”
  • 10/10 BREAKING THE (Royall) LAW! (of likelihood) (C)
  • 10/14 Gelman recognizes his error-statistical (Bayesian) foundations
  • 10/18 PhilStat/Law: Nathan Schachtman: Acknowledging Multiple Comparisons in Statistical Analysis: Courts Can and Must
  • 10/22 September 2014: Blog Contents
  • 10/25 3 YEARS AGO: MONTHLY MEMORY LANE
  • 10/26 To Quarantine or not to Quarantine?: Science & Policy in the time of Ebola
  • 10/31 Oxford Gaol: Statistical Bogeymen

NOVEMBER 2014

  • 11/01 Philosophy of Science Assoc. (PSA) symposium on Philosophy of Statistics in the Higgs Experiments “How Many Sigmas to Discovery?”
  • 11/09 “Statistical Flukes, the Higgs Discovery, and 5 Sigma” at the PSA
  • 11/11 The Amazing Randi’s Million Dollar Challenge
  • 11/12 A biased report of the probability of a statistical fluke: Is it cheating?
  • 11/15 Why the Law of Likelihood is bankrupt–as an account of evidence
  • 11/18 Lucien Le Cam: “The Bayesians Hold the Magic”
  • 11/20 Erich Lehmann: Statistician and Poet
  • 11/22 Msc Kvetch: “You are a Medical Statistic”, or “How Medical Care Is Being Corrupted”
  • 11/25 How likelihoodists exaggerate evidence from statistical tests
  • 11/30 3 YEARS AGO: MONTHLY (Nov.) MEMORY LANE

 

DECEMBER 2014

  • 12/02 My Rutgers Seminar: tomorrow, December 3, on philosophy of statistics
  • 12/04 “Probing with Severity: Beyond Bayesian Probabilism and Frequentist Performance” (Dec 3 Seminar slides)
  • 12/06 How power morcellators inadvertently spread uterine cancer
  • 12/11 Msc. Kvetch: What does it mean for a battle to be “lost by the media”?
  • 12/13 S. Stanley Young: Are there mortality co-benefits to the Clean Power Plan? It depends. (Guest Post)
  • 12/17 Announcing Kent Staley’s new book, An Introduction to the Philosophy of Science (CUP)
  • 12/21 Derailment: Faking Science: A true story of academic fraud, by Diederik Stapel (translated into English)
  • 12/23 All I want for Chrismukkah is that critics & “reformers” quit howlers of testing (after 3 yrs of blogging)! So here’s Aris Spanos “Talking Back!”
  • 12/26 3 YEARS AGO: MONTHLY (Dec.) MEMORY LANE
  • 12/29 To raise the power of a test is to lower (not raise) the “hurdle” for rejecting the null (Ziliak and McCloskey 3 years on)
  • 12/31 Midnight With Birnbaum (Happy New Year)

* Compiled by Jean A. Miller

Categories: blog contents, Statistics | Leave a comment

Midnight With Birnbaum (Happy New Year)

Just as in the past 3 years since I’ve been blogging, I revisit that spot in the road at 11 p.m.*, just outside the Elbar Room, get into a strange-looking taxi, and head to “Midnight With Birnbaum”. I wonder if they’ll come for me this year, given that my Birnbaum article is out… This is what the place I am taken to looks like. [It’s 6 hrs later here, so I’m about to leave…]

You know how in that (not-so) recent movie, “Midnight in Paris,” the main character (I forget who plays it, I saw it on a plane) is a writer finishing a novel, and he steps into a cab that mysteriously picks him up at midnight and transports him back in time where he gets to run his work by such famous authors as Hemingway and Virginia Woolf?  He is impressed when his work earns their approval and he comes back each night in the same mysterious cab… Well, imagine an error statistical philosopher is picked up in a mysterious taxi at midnight (New Year’s Eve 2011, 2012, 2013, 2014) and is taken back fifty years and, lo and behold, finds herself in the company of Allan Birnbaum.[i] There are a couple of brief (12/31/14) updates at the end.


ERROR STATISTICIAN: It’s wonderful to meet you, Professor Birnbaum; I’ve always been extremely impressed with the important impact your work has had on philosophical foundations of statistics.  I happen to be writing on your famous argument about the likelihood principle (LP).  (whispers: I can’t believe this!)

BIRNBAUM: Ultimately you know I rejected the LP as failing to control the error probabilities needed for my Confidence concept.

ERROR STATISTICIAN: Yes, but I actually don’t think your argument shows that the LP follows from such frequentist concepts as sufficiency S and the weak conditionality principle WCP.[ii]  Sorry,… I know it’s famous…

BIRNBAUM:  Well, I shall happily invite you to take any case that violates the LP and allow me to demonstrate that the frequentist is led to inconsistency, provided she also wishes to adhere to the WCP and sufficiency (although less than S is needed).

ERROR STATISTICIAN: Well, I happen to be a frequentist (error statistical) philosopher; I have recently (2006) found a hole in your proof… er… well, I hope we can discuss it.

BIRNBAUM: Well, well, well: I’ll bet you a bottle of Elba Grease champagne that I can demonstrate it! Continue reading

Categories: Birnbaum Brakes, Statistics, strong likelihood principle | Tags: , , , | 2 Comments

To raise the power of a test is to lower (not raise) the “hurdle” for rejecting the null (Ziliak and McCloskey 3 years on)

I said I’d reblog one of the 3-year “memory lane” posts marked in red, with a few new comments (in burgundy), from time to time. So let me comment on one referring to Ziliak and McCloskey on power (from Oct. 2011). I would think they’d want to correct some wrong statements, or explain their shifts in meaning. My hope is that, 3 years on, they’ll be ready to do so. By mixing some correct definitions with erroneous ones, they introduce more confusion into the discussion.

From my post 3 years ago: “The Will to Understand Power”: In this post, I will adhere precisely to the text, and offer no new interpretation of tests. Type 1 and 2 errors and power are just formal notions with formal definitions.  But we need to get them right (especially if we are giving expert advice).  You can hate the concepts; just define them correctly please.  They write:

“The error of the second kind is the error of accepting the null hypothesis of (say) zero effect when the null is in fact false, that is, when (say) such and such a positive effect is true.”

So far so good (keeping in mind that “positive effect” refers to a parameter discrepancy, say δ, not an observed difference).

And the power of a test to detect that such and such a positive effect δ is true is equal to the probability of rejecting the null hypothesis of (say) zero effect when the null is in fact false, and a positive effect as large as δ is present.

Fine.

Let this alternative be abbreviated H’(δ):

H’(δ): there is a positive effect as large as δ.

Suppose the test rejects the null when it reaches a significance level of .01.

(1) The power of the test to detect H’(δ) =

P(test rejects null at .01 level; H’(δ) is true).

Say it is 0.85.

“If the power of a test is high, say, 0.85 or higher, then the scientist can be reasonably confident that at minimum the null hypothesis (of, again, zero effect if that is the null chosen) is false and that therefore his rejection of it is highly probably correct”. (Z & M, 132-3).

But this is not so.  Perhaps they are slipping into the cardinal error of mistaking (1) for a posterior probability:

(1’) P(H’(δ) is true | test rejects null at .01 level)! Continue reading
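To make vivid that (1) and (1’) are different quantities, here is a toy two-point Bayes calculation in Python (only H0 and H’(δ) in play, with prior probabilities invented purely for illustration; this is not the post’s severity analysis). The posterior in (1’) depends on a prior that the power computation in (1) never touches.

```python
# Toy illustration that (1), the power, is not (1'), a posterior probability.
alpha, power = 0.01, 0.85           # the test's error probabilities, as in (1)

def posterior_given_reject(prior_H1):
    """Bayes' theorem over just the two point hypotheses H0 and H'(delta)."""
    return (power * prior_H1) / (power * prior_H1 + alpha * (1 - prior_H1))

for prior in (0.5, 0.1, 0.01):
    print(f"Pr(H'(delta)) = {prior:.2f}  ->  "
          f"Pr(H'(delta) | reject) = {posterior_given_reject(prior):.2f}")
# 0.50 -> 0.99,  0.10 -> 0.90,  0.01 -> 0.46
```

The posterior is whatever the prior makes it, while the power stays fixed at 0.85, so a high power alone cannot certify that a rejection is “highly probably correct”.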

Categories: 3-year memory lane, power, Statistics | Tags: , , | 6 Comments

3 YEARS AGO: MONTHLY (Dec.) MEMORY LANE

3 years ago…

MONTHLY MEMORY LANE: 3 years ago: December 2011. I mark in red 3 posts that seem most apt for general background on key issues in this blog.*

*I announced this new, once-a-month feature at the blog’s 3-year anniversary. I will repost and comment on one of the 3-year old posts from time to time. [I’ve yet to repost and comment on the one from Oct. 2011, but will very shortly.] For newcomers, here’s your chance to catch up; for old timers, this is philosophy: rereading is essential!

Previous 3 YEAR MEMORY LANES:

Nov. 2011

Oct. 2011

Sept. 2011 (Within “All She Wrote (so far)”)

Categories: 3-year memory lane, blog contents, Statistics | Leave a comment

All I want for Chrismukkah is that critics & “reformers” quit howlers of testing (after 3 yrs of blogging)! So here’s Aris Spanos “Talking Back!”


 

This was initially posted as slides from our joint Spring 2014 seminar: “Talking Back to the Critics Using Error Statistics”. (You can enlarge them.) Related reading is Mayo and Spanos (2011).


Categories: Error Statistics, fallacy of rejection, Phil6334, reforming the reformers, Statistics | 27 Comments

Derailment: Faking Science: A true story of academic fraud, by Diederik Stapel (translated into English)

Diederik Stapel’s book, “Ontsporing”, has been translated into English, with some modifications. From what I’ve read, it’s interesting in a bizarre, fraudster-porn sort of way.

Faking Science: A true story of academic fraud

Diederik Stapel
Translated by Nicholas J.L. Brown

Nicholas J. L. Brown (nick.brown@free.fr)
Strasbourg, France
December 14, 2014


Foreword to the Dutch edition

I’ve spun off, lost my way, crashed and burned; whatever you want to call it. It’s not much fun. I was doing fine, but then I became impatient, overambitious, reckless. I wanted to go faster and better and higher and smarter, all the time. I thought it would help if I just took this one tiny little shortcut, but then I found myself more and more often in completely the wrong lane, and in the end I wasn’t even on the road at all. I left the road where I should have gone straight on, and made my own, spectacular, destructive, fatal accident. I’ve ruined my life, but that’s not the worst of it. My recklessness left a multiple pile-up in its wake, which caught up almost everyone important to me: my wife and children, my parents and siblings, colleagues, students, my doctoral candidates, the university, psychology, science, all involved, all hurt or damaged to some degree or other. That’s the worst part, and it’s something I’m going to have to learn to live with for the rest of my life, along with the shame and guilt. I’ve got more regrets than hairs on my head, and an infinite amount of time to think about them. Continue reading

Categories: Statistical fraudbusting, Statistics | Tags: | 4 Comments

Announcing Kent Staley’s new book, An Introduction to the Philosophy of Science (CUP)


Kent Staley has written a clear and engaging introduction to PhilSci that manages to blend the central topics of philosophy of science with current philosophy of statistics. Quite possibly, in his 10-page section 9.4, Staley explains Error Statistics more clearly in many ways than I do. CONGRATULATIONS STALEY*

You can get this book for free by merely writing one of the simpler palindromes in the December contest.

Here’s an excerpt from that section:


9.4 Error-statistical philosophy of science and severe testing

Deborah Mayo has developed an alternative approach to the interpretation of frequentist statistical inference (Mayo 1996). But the idea at the heart of Mayo’s approach is one that can be stated without invoking probability at all. ….

Mayo takes the following “minimal scientific principle for evidence” to be uncontroversial:

Principle 3 (Minimal principle for evidence) Data xo provide poor evidence for H if they result from a method or procedure that has little or no ability of finding flaws in H, even if H is false. (Mayo and Spanos 2009, 3) Continue reading

Categories: Announcement, Palindrome, Statistics, StatSci meets PhilSci | Tags: | 10 Comments

S. Stanley Young: Are there mortality co-benefits to the Clean Power Plan? It depends. (Guest Post)


 

S. Stanley Young, PhD
Assistant Director, Bioinformatics
National Institute of Statistical Sciences
Research Triangle Park, NC

Are there mortality co-benefits to the Clean Power Plan? It depends.

Some years ago, I listened to a series of lectures on finance. The professor would ask a rhetorical question, pause to give you some time to think, and then, more often than not, answer his question with, “It depends.” Are there mortality co-benefits to the Clean Power Plan? Is mercury coming from power plants leading to deaths? Well, it depends.

So, rhetorically, is an increase in CO2 a bad thing? There is good and bad in everything. Well, for plants an increase in CO2 is a good thing. They grow faster. They convert CO2 into more food and fiber. They give off more oxygen, which is good for humans. Plants appear to be CO2 starved.

It is argued that CO2 is a greenhouse gas and an increase in CO2 will raise temperatures, ice will melt, sea levels will rise, and coastal areas will flood, etc. It depends. In theory yes, in reality, maybe. But a lot of other events must be orchestrated simultaneously. Obviously, that scenario depends on other things as, for the last 18 years, CO2 has continued to go up and temperatures have not. So it depends on other factors: solar radiance, water vapor, El Niño, sun spots, cosmic rays, Earth’s precession, etc., just what the professor said.


So suppose ambient temperatures do go up a few degrees. On balance, is that bad for humans? The evidence is overwhelming that warmer is better for humans. One or two examples are instructive. First, Cox et al. (2013), with the title, “Warmer is healthier: Effects on mortality rates of changes in average fine particulate matter (PM2.5) concentrations and temperatures in 100 U.S. cities.” To quote from the abstract of that paper, “Increases in average daily temperatures appear to significantly reduce average daily mortality rates, as expected from previous research.” Here is their plot of daily mortality rate versus maximum temperature. It is clear that as the maximum temperature in a city goes up, mortality goes down. So if the net effect of increasing CO2 is increasing temperature, there should be a reduction in deaths. Continue reading

Categories: evidence-based policy, junk science, Statistics | Tags: | 35 Comments

Msc. Kvetch: What does it mean for a battle to be “lost by the media”?

1.  What does it mean for a debate to be “media driven” or a battle to be “lost by the media”? In my last post, I noted that until a few weeks ago, I’d never heard of a “power morcellator.” Nor had I heard of the AAGL, the American Association of Gynecologic Laparoscopists. In an article, “Battle over morcellation lost ‘in the media’” (Nov 26, 2014), Susan London reports on a recent meeting of the AAGL.[i]

The media played a major role in determining the fate of uterine morcellation, suggested a study reported at a meeting sponsored by AAGL.

“How did we lose this battle of uterine morcellation? We lost it in the media,” asserted lead investigator Dr. Adrian C. Balica, director of the minimally invasive gynecologic surgery program at the Robert Wood Johnson Medical School in New Brunswick, N.J.

The “investigation” Balica led consisted of collecting Internet search data using something called the Google Adwords Keyword Planner:

Results showed that the average monthly number of Google searches for the term ‘morcellation’ held steady throughout most of 2013 at about 250 per month, reported Dr. Balica. There was, however, a sharp uptick in December 2013 to more than 2,000 per month, and the number continued to rise to a peak of about 18,000 per month in July 2014. A similar pattern was seen for the terms ‘morcellator,’ ‘fibroids in uterus,’ and ‘morcellation of uterine fibroid.’

The “vitals” of the study are summarized at the start of the article:

Key clinical point: Relevant Google searches rose sharply as the debate unfolded.

Major finding: The mean monthly number of searches for “morcellation” rose from about 250 in July 2013 to 18,000 in July 2014.

Data source: An analysis of Google searches for terms related to the power morcellator debate.

Disclosures: Dr. Balica disclosed that he had no relevant conflicts of interest.

2. Here’s my question: Does a sharp rise in Google searches for debate-related terms signify that the debate is “media driven”? I suppose you could call it that, but Dr. Balica is clearly suggesting that something not quite kosher, or not fully factual, was responsible for losing “this battle of uterine morcellation”, downplaying the substantial data and real events that drove people (like me) to search the terms upon hearing the FDA announcement in November. Continue reading

Categories: msc kvetch, PhilStat Law, science communication, Statistics | 11 Comments

How power morcellators inadvertently spread uterine cancer

Until a few weeks ago, I’d never even heard of a “power morcellator.” Nor was I aware of the controversy that has pitted defenders of a woman’s right to choose a minimally invasive laparoscopic procedure for removing fibroids (enabled by the power morcellator) against those who decry the danger it poses in spreading an undetected uterine cancer throughout a woman’s abdomen. The most outspoken member of the anti-morcellation group is surgeon Hooman Noorchashm. His wife, Dr. Amy Reed, had a laparoscopic hysterectomy that resulted in morcellating a hidden cancer, advancing it to Stage IV sarcoma. Below is their video (link is here), followed by a recent FDA warning. I may write this in stages or parts. (I will withhold my view for now; I’d like to know what you think.)

Morcellation: (The full Article is here.)

^^^^^^^^^^^^^^^^^^^

FDA Safety Communication:

UPDATED Laparoscopic Uterine Power Morcellation in Hysterectomy and Myomectomy: FDA Safety Communication

http://www.fda.gov/MedicalDevices/Safety/AlertsandNotices/ucm424443.htm

The following information updates our April 17, 2014 communication.

Date Issued: Nov. 24, 2014

Product: 
Laparoscopic power morcellators are medical devices used during different types of laparoscopic (minimally invasive) surgeries. These can include certain procedures to treat uterine fibroids, such as removing the uterus (hysterectomy) or removing the uterine fibroids (myomectomy). Morcellation refers to the division of tissue into smaller pieces or fragments and is often used during laparoscopic surgeries to facilitate the removal of tissue through small incision sites.

Purpose: 
When used for hysterectomy or myomectomy in women with uterine fibroids, laparoscopic power morcellation poses a risk of spreading unsuspected cancerous tissue, notably uterine sarcomas, beyond the uterus. The FDA is warning against using laparoscopic power morcellators in the majority of women undergoing hysterectomy or myomectomy for uterine fibroids. Health care providers and patients should carefully consider available alternative treatment options for the removal of symptomatic uterine fibroids.

Summary of Problem and Scope: 
Uterine fibroids are noncancerous growths that develop from the muscular tissue of the uterus. Most women will develop uterine fibroids (also called leiomyomas) at some point in their lives, although most cause no symptoms1. In some cases, however, fibroids can cause symptoms, including heavy or prolonged menstrual bleeding, pelvic pressure or pain, and/or frequent urination, requiring medical or surgical therapy.

Many women choose to undergo laparoscopic hysterectomy or myomectomy because these procedures are associated with benefits such as a shorter post-operative recovery time and a reduced risk of infection compared to abdominal hysterectomy and myomectomy2. Many of these laparoscopic procedures are performed using a power morcellator.

Based on an FDA analysis of currently available data, we estimate that approximately 1 in 350 women undergoing hysterectomy or myomectomy for the treatment of fibroids is found to have an unsuspected uterine sarcoma, a type of uterine cancer that includes leiomyosarcoma. At this time, there is no reliable method for predicting or testing whether a woman with fibroids may have a uterine sarcoma.

If laparoscopic power morcellation is performed in women with unsuspected uterine sarcoma, there is a risk that the procedure will spread the cancerous tissue within the abdomen and pelvis, significantly worsening the patient’s long-term survival. While the specific estimate of this risk may not be known with certainty, the FDA believes that the risk is higher than previously understood. Continue reading

Categories: morcellation: FDA warning, Statistics | Tags: | 9 Comments

“Probing with Severity: Beyond Bayesian Probabilism and Frequentist Performance” (Dec 3 Seminar slides)

Below are the slides from my Rutgers seminar for the Department of Statistics and Biostatistics yesterday, since some people have been asking me for them. The abstract is here. I don’t know how explanatory a bare outline like this can be, but I’d be glad to try and answer questions.[i] I am impressed at how interested in foundational matters I found the statisticians (both faculty and students) to be. (There were even a few philosophers in attendance.) It was especially interesting to explore, prior to the seminar, possible connections between severity assessments and confidence distributions, where the latter are along the lines of Min-ge Xie (some recent papers of his may be found here).

“Probing with Severity: Beyond Bayesian Probabilism and Frequentist Performance”

[i]They had requested a general overview of some issues in philosophical foundations of statistics. Much of this will be familiar to readers of this blog.

 

 

Categories: Bayesian/frequentist, Error Statistics, Statistics | 11 Comments

My Rutgers Seminar: tomorrow, December 3, on philosophy of statistics

picture-216-1I’ll be talking about philosophy of statistics tomorrow afternoon at Rutgers University, in the Statistics and Biostatistics Department, if you happen to be in the vicinity and are interested.

RUTGERS UNIVERSITY DEPARTMENT OF STATISTICS AND BIOSTATISTICS www.stat.rutgers.edu

Seminar Speaker:     Professor Deborah Mayo, Virginia Tech

Title:           Probing with Severity: Beyond Bayesian Probabilism and Frequentist Performance

Time:          3:20 – 4:20 pm, Wednesday, December 3, 2014
Place:         552 Hill Center

ABSTRACT

Getting beyond today’s most pressing controversies revolving around statistical methods, I argue, requires scrutinizing their underlying statistical philosophies. Two main philosophies about the roles of probability in statistical inference are probabilism and performance (in the long run). The first assumes that we need a method of assigning probabilities to hypotheses; the second assumes that the main function of statistical method is to control long-run performance. I offer a third goal: controlling and evaluating the probativeness of methods. An inductive inference, in this conception, takes the form of inferring hypotheses to the extent that they have been well or severely tested. A report of poorly tested claims must also be part of an adequate inference. I develop a statistical philosophy in which error probabilities of methods may be used to evaluate and control the stringency or severity of tests. I then show how the “severe testing” philosophy clarifies and avoids familiar criticisms and abuses of significance tests and cognate methods (e.g., confidence intervals). Severity may be threatened in three main ways: fallacies of statistical tests, unwarranted links between statistical and substantive claims, and violations of model assumptions.

Categories: Announcement, Statistics | 4 Comments

3 YEARS AGO: MONTHLY (Nov.) MEMORY LANE

3 years ago…

MONTHLY MEMORY LANE: 3 years ago: November 2011. I mark in red 3 posts that seem most apt for general background on key issues in this blog.*

  • (11/1) RMM-4:“Foundational Issues in Statistical Modeling: Statistical Model Specification and Validation*” by Aris Spanos, in Rationality, Markets, and Morals (Special Topic: Statistical Science and Philosophy of Science: Where Do/Should They Meet?”)
  • (11/3) Who is Really Doing the Work?*
  • (11/5) Skeleton Key and Skeletal Points for (Esteemed) Ghost Guest
  • (11/9) Neyman’s Nursery 2: Power and Severity [Continuation of Oct. 22 Post]
  • (11/12) Neyman’s Nursery (NN) 3: SHPOWER vs POWER
  • (11/15) Logic Takes a Bit of a Hit!: (NN 4) Continuing: Shpower (“observed” power) vs Power
  • (11/18) Neyman’s Nursery (NN5): Final Post
  • (11/21) RMM-5: “Low Assumptions, High Dimensions” by Larry Wasserman, in Rationality, Markets, and Morals (Special Topic: Statistical Science and Philosophy of Science: Where Do/Should They Meet?”) See also my deconstruction of Larry Wasserman.
  • (11/23) Elbar Grease: Return to the Comedy Hour at the Bayesian Retreat
  • (11/28) The UN Charter: double-counting and data snooping
  • (11/29) If you try sometime, you find you get what you need!

*I announced this new, once-a-month feature at the blog’s 3-year anniversary. I will repost and comment on one of the 3-year old posts from time to time. [I’ve yet to repost and comment on the one from Oct. 2011, but will shortly.] For newcomers, here’s your chance to catch up; for old timers, this is philosophy: rereading is essential!

Previous 3 YEAR MEMORY LANES:

 Oct. 2011

Sept. 2011 (Within “All She Wrote (so far)”)

Categories: 3-year memory lane, Bayesian/frequentist, Statistics | Leave a comment

How likelihoodists exaggerate evidence from statistical tests


I insist on point against point, no matter how much it hurts

Have you ever noticed that some leading advocates of a statistical account, say a testing account A, upon discovering account A is unable to handle a certain kind of important testing problem that a rival testing account, account B, has no trouble at all with, will mount an argument that being able to handle that kind of problem is actually a bad thing? In fact, they might argue that testing account B is not a “real” testing account because it can handle such a problem? You have? Sure you have, if you read this blog. But that’s only a subliminal point of this post.

I’ve had three posts recently on the Law of Likelihood (LL): Breaking the [LL](a)(b)[c], and [LL] is bankrupt. Please read at least one of them for background. All deal with Royall’s comparative likelihoodist account, which some will say only a few people even use, but I promise you that these same points come up again and again in foundational criticisms from entirely other quarters.[i]

An example from Royall is typical: He makes it clear that an account based on the (LL) is unable to handle composite tests, even simple one-sided tests for which account B supplies uniformly most powerful (UMP) tests. He concludes, not that his test comes up short, but that any genuine test or ‘rule of rejection’ must have a point alternative!  Here’s the case (Royall, 1997, pp. 19-20):

[M]edical researchers are interested in the success probability, θ, associated with a new treatment. They are particularly interested in how θ relates to the old treatment’s success probability, believed to be about 0.2. They have reason to hope θ is considerably greater, perhaps 0.8 or even greater. To obtain evidence about θ, they carry out a study in which the new treatment is given to 17 subjects, and find that it is successful in nine.
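For concreteness, here is a quick Python computation of the binomial likelihoods in Royall’s example. The comparison at the maximum likelihood value 9/17 is my addition for illustration; the excerpt itself only singles out 0.2 and 0.8.

```python
# Binomial likelihoods for Royall's example: 9 successes in 17 subjects.
from scipy.stats import binom

n, x = 17, 9

def lik(theta):
    """Likelihood of success probability theta given 9 successes in 17 trials."""
    return binom.pmf(x, n, theta)

print(f"L(0.8)/L(0.2)  = {lik(0.8) / lik(0.2):.1f}")    # 4.0
print(f"L(9/17)/L(0.2) = {lik(x / n) / lik(0.2):.1f}")  # about 91
print(f"L(9/17)/L(0.8) = {lik(x / n) / lik(0.8):.1f}")  # about 23
```

The (LL) only ever compares point values of θ such as these; it offers no direct verdict on the composite claim θ > 0.2 that interests the researchers.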

Let me interject that of all of Stephen Senn’s posts on this blog, my favorite is the one where he zeroes in on the proper way to think about the discrepancy we hope to find (the .8 in this example). (See note [ii]) Continue reading

Categories: law of likelihood, Richard Royall, Statistics | Tags: | 18 Comments

Msc Kvetch: “You are a Medical Statistic”, or “How Medical Care Is Being Corrupted”

A NYT op-ed the other day, “How Medical Care Is Being Corrupted” (by Pamela Hartzband and Jerome Groopman, physicians on the faculty of Harvard Medical School), gives a good sum-up of what I fear is becoming the new normal, even under so-called “personalized medicine”.

WHEN we are patients, we want our doctors to make recommendations that are in our best interests as individuals. As physicians, we strive to do the same for our patients.

But financial forces largely hidden from the public are beginning to corrupt care and undermine the bond of trust between doctors and patients. Insurers, hospital networks and regulatory groups have put in place both rewards and punishments that can powerfully influence your doctor’s decisions.

Continue reading

Categories: PhilStat/Med, Statistics | Tags: | 8 Comments

Erich Lehmann: Statistician and Poet

Erich Lehmann, 20 November 1917 – 12 September 2009

Memory Lane 1 Year (with update): Today is Erich Lehmann’s birthday. The last time I saw him was at the Second Lehmann conference in 2004, at which I organized a session on philosophical foundations of statistics (including David Freedman and D.R. Cox).

I got to know Lehmann, Neyman’s first student, in 1997.  One day, I received a bulging, six-page, handwritten letter from him in tiny, extremely neat scrawl (and many more after that).  He told me he was sitting in a very large room at an ASA meeting where they were shutting down the conference book display (or maybe they were setting it up), and on a very long, dark table sat just one book, all alone, shiny red.  He said he wondered if it might be of interest to him!  So he walked up to it….  It turned out to be my Error and the Growth of Experimental Knowledge (1996, Chicago), which he reviewed soon after. Some related posts on Lehmann’s letter are here and here.

That same year I remember having a last-minute phone call with Erich to ask how best to respond to a “funny Bayesian example” raised by Colin Howson. It is essentially the case of Mary’s positive result for a disease, where Mary is selected randomly from a population where the disease is very rare. See for example here. (It’s just like the case of our high school student Isaac). His recommendations were extremely illuminating, and with them he sent me a poem he’d written (which you can read in my published response here*). Aside from being a leading statistician, Erich had a (serious) literary bent. Continue reading

Categories: highly probable vs highly probed, phil/history of stat, Sir David Cox, Spanos, Statistics | Tags: , | Leave a comment

Lucien Le Cam: “The Bayesians Hold the Magic”

Today is the birthday of Lucien Le Cam (Nov. 18, 1924 – April 25, 2000): Please see my updated 2013 post on him.

 

Categories: Bayesian/frequentist, Statistics | Leave a comment

Why the Law of Likelihood is bankrupt–as an account of evidence


There was a session at the Philosophy of Science Association meeting last week where two of the speakers, Greg Gandenberger and Jiji Zhang, had insightful things to say about the “Law of Likelihood” (LL).[i] Recall from recent posts here and here that the (LL) regards data x as evidence supporting H1 over H0 iff

Pr(x; H1) > Pr(x; H0).

On many accounts, the likelihood ratio also measures the strength of that comparative evidence (Royall 1997, p. 3).[ii]

H0 and H1 are statistical hypotheses that assign probabilities to the random variable X taking value x.  As I recall, the speakers limited H1 and H0 to simple statistical hypotheses (as Richard Royall generally does)–already restricting the account to rather artificial cases, but I put that to one side. Remember, with likelihoods, the data x are fixed, the hypotheses vary.

1. Maximally likely alternatives. I didn’t really disagree with anything the speakers said. I welcomed their recognition that a central problem facing the (LL) is the ease of constructing maximally likely alternatives: so long as Pr(x; H0) < 1, a maximally likely alternative H1 would be evidentially “favored”. There is no onus on the likelihoodist to predesignate the rival; you are free to search, hunt, post-designate and construct a best (or better) fitting rival. If you’re bothered by this, says Royall, then this just means the evidence disagrees with your prior beliefs.
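Here is a minimal sketch of that construction (the coin-flip example and the numbers are mine, not from the post or the speakers): whatever data happen to arrive, a post-designated, maximally likely rival is automatically “favored” under the (LL).

```python
# Post-designating a maximally likely alternative after seeing 20 coin flips.
import random

flips = "".join(random.choice("HT") for _ in range(20))   # any outcome will do
pr_x_H0 = 0.5 ** len(flips)   # fair-coin null: probability of this exact sequence
pr_x_H1 = 1.0                 # constructed rival: "bound to yield exactly this sequence"

print(flips)
print(f"Pr(x; H1) / Pr(x; H0) = {pr_x_H1 / pr_x_H0:,.0f}")   # 1,048,576
```

By the (LL), the data “favor” the constructed H1 over the fair coin by about a million to one, no matter what the 20 flips were.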

After all, Royall famously distinguishes between evidence and belief (recall the evidence-belief-action distinction), and these problematic cases, he thinks, do not vitiate his account as an account of evidence. But I think they do! In fact, I think they render the (LL) utterly bankrupt as an account of evidence. Here are a few reasons. (Let me be clear that I am not pinning Royall’s defense on the speakers[iii], so much as saying it came up in the general discussion[iv].) Continue reading

Categories: highly probable vs highly probed, law of likelihood, Richard Royall, Statistics | 63 Comments

A biased report of the probability of a statistical fluke: Is it cheating?

One year ago I reblogged a post from Matt Strassler, “Nature is Full of Surprises” (2011). In it he claims that

[Statistical debate] “often boils down to this: is the question that you have asked in applying your statistical method the most even-handed, the most open-minded, the most unbiased question that you could possibly ask?

It’s not asking whether someone made a mathematical mistake. It is asking whether they cheated — whether they adjusted the rules unfairly — and biased the answer through the question they chose…”

(Nov. 2014): I am impressed (i.e., struck by the fact) that he goes so far as to call it “cheating”. Anyway, here is the rest of the reblog from Strassler which bears on a number of recent discussions:


“…If there are 23 people in a room, the chance that two of them have the same birthday is 50 percent, while the chance that two of them were born on a particular day, say, January 1st, is quite low, a small fraction of a percent. The more you specify the coincidence, the rarer it is; the broader the range of coincidences at which you are ready to express surprise, the more likely it is that one will turn up.
Continue reading
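As a quick check of the two birthday figures quoted above, here is a short Python calculation (assuming 365 equally likely birthdays, ignoring leap years, and reading the second figure as “at least two of the 23 born on that particular day”):

```python
# Checking the birthday figures: 23 people in a room.
from math import comb

n = 23

# Probability that at least two of the n share some birthday
p_no_match = 1.0
for k in range(n):
    p_no_match *= (365 - k) / 365
print(f"P(some shared birthday among {n})  = {1 - p_no_match:.3f}")     # about 0.507

# Probability that at least two of the n were born on one particular day (say Jan 1st)
p_at_most_one = (364 / 365) ** n + comb(n, 1) * (1 / 365) * (364 / 365) ** (n - 1)
print(f"P(two or more born on a given day) = {1 - p_at_most_one:.4f}")  # about 0.0018
```

Both numbers agree with the quoted claims: roughly a 50 percent chance of some shared birthday, and a small fraction of a percent for a coincidence specified in advance.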

Categories: Higgs, spurious p values, Statistics | 7 Comments
