Statistics

Updating & Downdating: One of the Pieces to Pick up on


Before moving on to a couple of rather different areas, there’s an issue that, while mentioned by both Senn and Gelman, did not come up for discussion; so let me just note it here as one of the pieces to pick up on later.


“It is hard to see what exactly a Bayesian statistician is doing when interacting with a client. There is an initial period in which the subjective beliefs of the client are established. These prior probabilities are taken to be valuable enough to be incorporated in subsequent calculation. However, in subsequent steps the client is not trusted to reason. The reasoning is carried out by the statistician. As an exercise in mathematics it is not superior to showing the client the data, eliciting a posterior distribution and then calculating the prior distribution; as an exercise in inference Bayesian updating does not appear to have greater claims than ‘downdating’ and indeed sometimes this point is made by Bayesians when discussing what their theory implies. (59)…..” Stephen Senn

“As I wrote in 2008, if you could really construct a subjective prior you believe in, why not just look at the data and write down your subjective posterior.” Andrew Gelman commenting on Senn

I’ve even heard subjective Bayesians concur on essentially this point, but I would think that many would take issue with it…no?

Categories: Statistics | 1 Comment

U-PHIL (3): Stephen Senn on Stephen Senn!

I am grateful to Deborah Mayo for having highlighted my recent piece. I am not sure that it deserves the attention it is receiving. Deborah has spotted a flaw in my discussion of pragmatic Bayesianism. In praising the use of background knowledge I can neither be talking about automatic Bayesianism nor about subjective Bayesianism. It is clear that background knowledge ought not generally to lead to uninformative priors (whatever they might be) and so is not really what objective Bayesianism is about. On the other hand, all subjective Bayesians care about is coherence, and it is easy to produce examples where Bayesians quite logically will react differently to evidence; so what exactly is ‘background knowledge’? Continue reading

Categories: Philosophy of Statistics, Statistics, U-Phil | Leave a comment

U-PHIL: Stephen Senn (2): Andrew Gelman

 I agree with Senn’s comments on the impossibility of the de Finetti subjective Bayesian approach.  As I wrote in 2008, if you could really construct a subjective prior you believe in, why not just look at the data and write down your subjective posterior.  The immense practical difficulties with any serious system of inference render it absurd to think that it would be possible to just write down a probability distribution to represent uncertainty.  I wish, however, that Senn would recognize “my” Bayesian approach (which is also that of John Carlin, Hal Stern, Don Rubin, and, I believe, others).  De Finetti is no longer around, but we are!
Categories: Philosophy of Statistics, Statistics, U-Phil | 4 Comments

U-PHIL: Stephen Senn (1): C. Robert, A. Jaffe, and Mayo (brief remarks)

I very much appreciate C. Robert and A. Jaffe sharing some reflections on Stephen Senn’s article for this blog, especially as I have only met these two statisticians recently, at different conferences. My only wish is that they had taken a bit more seriously my request to “hold (a portion of) the text at ‘arm’s length,’ as it were. Cycle around it, slowly. Give it a generous interpretation, then cycle around it again self-critically” (January 13, 2011).  (I conceded it would feel foreign, but I strongly recommend it!)
Since these authors have given bloglinks, I’ll just note them here and give a few brief responses:
Categories: Philosophy of Statistics, Statistics, U-Phil | 3 Comments

RMM-6: Special Volume on Stat Sci Meets Phil Sci

The article “The Renegade Subjectivist: José Bernardo’s Reference Bayesianism” by Jan Sprenger has now been published in our special volume of the on-line journal, Rationality, Markets, and Morals (Special Topic: Statistical Science and Philosophy of Science: Where Do/Should They Meet?)

Abstract: This article motivates and discusses José Bernardo’s attempt to reconcile the  subjective Bayesian framework with a need for objective scientific inference, leading to a special kind of objective Bayesianism, namely reference Bayesianism. We elucidate principal ideas and foundational implications of Bernardo’s approach, with particular attention to the classical problem of testing a precise null hypothesis against an unspecified alternative.

Categories: Philosophy of Statistics, Statistics | Leave a comment

"Philosophy of Statistics": Nelder on Lindley

A friend from Elba surprised me by sending the interesting paper and discussion of Dennis Lindley (2000), “The Philosophy of Statistics,” which I hadn’t seen in years.  She suggested, as especially apt, J. Nelder’s remarks; I recommend the full article and discussion:
(from) Comments by J. Nelder:

Recently (Nelder, 1999) I have argued that statistics should be called statistical science, and that probability theory should be called statistical mathematics (not mathematical statistics). I think that Professor Lindley’s paper should be called the philosophy of statistical mathematics, and within it there is little that I disagree with. However, my interest is in the philosophy of statistical science, which I regard as different. Statistical science is not just about the study of uncertainty but rather deals with inferences about scientific theories from uncertain data. Continue reading

Categories: Statistics | 11 Comments

Mayo Philosophizes on Stephen Senn: "How Can We Cultivate Senn’s-Ability?"

Where’s Mayo?

Although, in one sense, Senn’s remarks echo the passage of Jim Berger’s that we deconstructed a few weeks ago, Senn at the same time seems to reach an opposite conclusion. He points out how, in practice, people who claim to have carried out a (subjective) Bayesian analysis have actually done something very different—but that then they heap credit on the Bayesian ideal. (See also the blog post “Who Is Doing the Work?”) Continue reading

Categories: Philosophy of Statistics, Statistics, U-Phil | 7 Comments

“You May Believe You Are a Bayesian But You Are Probably Wrong”

The following is an extract (58-63) from the contribution by

Stephen Senn  (Full article)

Head of the Methodology and Statistics Group,

Competence Center for Methodology and Statistics (CCMS), Luxembourg

…..

I am not arguing that the subjective Bayesian approach is not a good one to use.  I am claiming instead that the argument is false that because some ideal form of this approach to reasoning seems excellent in theory it therefore follows that in practice using this and only this approach to reasoning is the right thing to do.  A very standard form of argument I do object to is the one frequently encountered in many applied Bayesian papers where the first paragraph lauds the Bayesian approach on various grounds, in particular its ability to synthesize all sources of information, and in the rest of the paper the authors assume that because they have used the Bayesian machinery of prior distributions and Bayes theorem they have therefore done a good analysis. It is this sort of author who believes that he or she is Bayesian but in practice is wrong. (58) Continue reading

Categories: Philosophy of Statistics, Statistics | Leave a comment

PhilStatLaw: Bad-Faith Assertions of Conflicts of Interest?*

In response to an indication that the FDA might need to loosen conflict-of-interest (COI) rules to get sufficient experts to serve on their advisory panels, a list has been proffered of “industry-free” experts capable of serving with “clean hands” (see Oct 10 post: Junk Science). But why not also seek “litigation-free” experts, asks lawyer Nathan Schachtman on his interesting blog (Dec. 28), The Continuing Saga of Bad-Faith Assertions of Conflicts of Interest:
Categories: Statistics | 5 Comments

Don’t Birnbaumize that Experiment my Friend*

(A)  “It is not uncommon to see statistics texts argue that in frequentist theory one is faced with the following dilemma: either to deny the appropriateness of conditioning on the precision of the tool chosen by the toss of a coin[i], or else to embrace the strong likelihood principle which entails that frequentist sampling distributions are irrelevant to inference once the data are obtained.  This is a false dilemma … The ‘dilemma’ argument is therefore an illusion”. (Cox and Mayo 2010, p. 298)
Continue reading

Categories: Statistics | 16 Comments

Model Validation and the LP-(Long Playing Vinyl Record)

A Bayesian acquaintance writes:

Although the Birnbaum result is of primary importance for sampling theorists, I’m still interested in it because many Bayesian statisticians think that model checking violates the likelihood principle, as if this principle is a fundamental axiom of Bayesian statistics.

But this is puzzling for two reasons. First, if the LP does not preclude testing for assumptions (and he is right that it does not[i]), then why not simply explain that rather than appeal to a disproof of something that actually never precluded model testing?   To take the disproof of the LP as grounds to announce: “So there! Now even Bayesians are free to test their models” would seem only to ingrain the original fallacy. Continue reading

Categories: Statistics | Leave a comment

JIM BERGER ON JIM BERGER!

Fortunately, we have Jim Berger interpreting himself this evening (see December 11)

Jim Berger writes: 

A few comments:

1. Objective Bayesian priors are often improper (i.e., have infinite total mass), but this is not a problem when they are developed correctly. But not every improper prior is satisfactory. For instance, the constant prior is known to be unsatisfactory in many situations. The ‘solution’ pseudo-Bayesians often use is to choose a constant prior over a large but bounded set (a ‘weakly informative’ prior), saying it is now proper and so all is well. This is not true; if the constant prior on the whole parameter space is bad, so will be the constant prior over the bounded set. The problem is, in part, that some people confuse proper priors with subjective priors and, having learned that true subjective priors are fine, incorrectly presume that weakly informative proper priors are fine. Continue reading

Categories: Irony and Bad Faith, Statistics, U-Phil | 13 Comments

Contributed Deconstructions: Irony & Bad Faith 3

My efficient Errorstat Blogpeople1 have put forward the following three reader-contributed interpretive efforts2 resulting from the “deconstruction” exercise of December 11 (mine, from the earlier blog, is at the end) of what I consider:

“….an especially intriguing remark by Jim Berger that I think bears upon the current mindset (Jim is aware of my efforts):

Too often I see people pretending to be subjectivists, and then using “weakly informative” priors that the objective Bayesian community knows are terrible and will give ridiculous answers; subjectivism is then being used as a shield to hide ignorance. . . . In my own more provocative moments, I claim that the only true subjectivists are the objective Bayesians, because they refuse to use subjectivism as a shield against criticism of sloppy pseudo-Bayesian practice. (Berger 2006, 463)” (From blogpost, Dec. 11, 2011) Continue reading

Categories: Irony and Bad Faith, Statistics, U-Phil | 11 Comments

The 3 stages of the acceptance of novel truths

There is an often-heard slogan about the stages of the acceptance of novel truths:

First people deny a thing.

Then they belittle it.

Then they say they knew it all along.

I don’t know who was first to state it in one form or another.  Here’s Schopenhauer with a slightly different variant:

“All truth passes through three stages: First, it is ridiculed; Second, it is violently opposed; and Third, it is accepted as self-evident.” – Arthur Schopenhauer

After recently presenting my paper criticizing the Birnbaum result on the likelihood principle (LP),[1] the reception of my analysis seems somewhere around stage two, in some cases moving into stage three (see my blogposts of December 6 and 7, 2011). Continue reading

Categories: Statistics | 2 Comments

If you try sometime, you find you get what you need!

Thanks to Nancy Cartwright, a little ad hoc discussion group has formed: “PhilErrorStat: LSE: Three weeks in (Nov-Dec) 2011.”  I’ll be posting related items on this blog, in the column to your left, over its short lifetime. We’re taking a look at some articles and issues leading up to a paper I’m putting together to give in Madrid next month on the Birnbaum-likelihood principle business (“Breaking Through the Breakthrough”) at a conference (“The Controversy about Hypothesis Testing,” Madrid, December 15-16, 2011).  I hope also to get this group’s feedback as I follow through on responses I’ve been promising to some of the comments and queries I’ve received these past few months. Continue reading
Categories: Statistics | Leave a comment

Elbar Grease: Return to the Comedy Hour at the Bayesian Retreat

I lost a bet last night with my criminologist colleague Katrin H. It turns out that you can order a drink called “Elbar Grease” in London, in a “secret” comedy club in a distant suburb (see Sept. 30 post).[i] The trouble is that it’s not nearly as sour as the authentic drink (not sour enough, in any case, for those of us who lack that aforementioned gene). But I did get to hear some great comedy, which hasn’t happened since early days of exile, and it reminded me of my promise to revisit the “comedy hour at the Bayesian retreat” (see Sept. 3 post). Few things have been the butt of more jokes than examples of so-called “trivial intervals”. Continue reading

Categories: Statistics | Leave a comment

RMM-5: Special Volume on Stat Sci Meets Phil Sci

The article “Low Assumptions, High Dimensions” by Larry Wasserman has now been published in our special volume of the on-line journal, Rationality, Markets, and Morals (Special Topic: Statistical Science and Philosophy of Science: Where Do/Should They Meet?)

Abstract:
These days, statisticians often deal with complex, high dimensional datasets. Researchers in statistics and machine learning have responded by creating many new methods for analyzing high dimensional data. However, many of these new methods depend on strong assumptions. The challenge of bringing low assumption inference to high dimensional settings requires new ways to think about the foundations of statistics. Traditional foundational concerns, such as the Bayesian versus frequentist debate, have become less important.

Categories: Philosophy of Statistics, Statistics | Leave a comment

Neyman’s Nursery (NN5): Final Post

I want to complete the Neyman’s Nursery (NN) meanderings while we have some numbers before us, and while there is a particular example, test T+, on the table.  Despite my warm and affectionate welcoming of the “power analytic” reasoning I unearthed in those “hidden Neyman” papers (see post from Oct. 22), admittedly largely lost in the standard decision-behavior model of tests, it still retains an unacceptable coarseness: power is always calculated relative to the cut-off point cα for rejecting H0.  But rather than throw out the baby with the bathwater, we should keep the logic and take account of the actual value of the statistically insignificant result.

__________________________________

(For those just tuning in, power analytic reasoning aims to avoid the age-old fallacy of taking a statistically insignificant result as evidence of 0 discrepancy from the null hypothesis, by identifying discrepancies that can and cannot be ruled out.  For our test T+, we reason from insignificant results to inferences of the form:  μ < μ0 + γ.

We are illustrating (as does Neyman) with a one-sided test T+, with μ0 = 0 and α = .025.  Spoze σ = 1, n = 25, so X is statistically significant only if it exceeds .392.

Power-analytic reasoning says (in relation to our test T+):

If X is statistically insignificant and the POW(T+, μ= μ1) is high, then X indicates, or warrants inferring (or whatever phrase you like) that  μ < μ1.)

_______________________________

Suppose one had an insignificant result from test T+  and wanted to evaluate the inference:   μ < .4

(it doesn’t matter why just now, this is an illustration).

Since POW(T+, μ = .4) is hardly more than .5, Neyman would say “it was a little rash” to regard the observed mean as indicating μ < .4. He would say this regardless of the actual value of the statistically insignificant result.  There’s no place in the power calculation, as defined, to take into account the actual observed value.1

That is why, although high power to detect μ as large as μ1 is sufficient for regarding the data as good evidence that μ < μ1, it is too coarse to be a necessary condition.  Spoze, for example, that X = 0.

Were μ as large as .4, we would, with high probability (~.98), have observed a larger difference from the null than we did. Therefore, our data provide evidence that μ < .4.2

We might say that the severity associated with μ < .4 is high.  There are many ways to articulate the associated justification—I have done so at length earlier; and of course it is no different from “power analytic reasoning”.  Why consider a miss as good as a mile?

When I first introduced this idea in my Ph.D. dissertation, I assumed researchers already did this in real life, since it introduces no new logic.  But I’ve been surprised not to find it.

I was (and am) puzzled to discover under “observed power” the Shpower computation, which we have already considered and (hopefully) gotten past—at least for present purposes, namely, reasoning from insignificant results to inferences of the form: μ < μ0 + γ.

Granted, there are some computations which you might say lead to virtually the same results as SEV, e.g., certain confidence limits, but even so there are differences of interpretation.3  Let me know if you think I am wrong, there may well be something out there I haven’t seen….
____________
(1) This does not mean the place to enter it is in the hypothesized value of μ under which the power is computed (as with Shpower). This is NOT power, and as we have seen in two posts, it is fallacious to equate it to power or to power analytic reasoning. Note that the Shpower associated with X = 0 is .025—that we are interested in μ < .4 does not enter.

(2) It doesn’t matter here if we use ≤ or < .

(3) For differences between SEV and confidence intervals, see Mayo 1996, Mayo and Spanos 2006, 2011.
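For readers who want to check the numbers in this post and its footnotes, here is a minimal sketch in Python. The function names power, shpower, and severity are mine (chosen for this illustration, not from any package); the test is the post’s T+ with μ0 = 0, σ = 1, n = 25, α = .025.

```python
from math import sqrt
from statistics import NormalDist

# Test T+: H0: mu <= 0 vs H1: mu > 0, with sigma = 1, n = 25, alpha = .025.
Z = NormalDist()  # standard normal
mu0, sigma, n, alpha = 0.0, 1.0, 25, 0.025
se = sigma / sqrt(n)                       # standard error = 0.2
cutoff = mu0 + Z.inv_cdf(1 - alpha) * se   # ~0.392: X is significant only above this

def power(mu1):
    """POW(T+, mu = mu1): probability of a significant result when mu = mu1."""
    return 1 - Z.cdf((cutoff - mu1) / se)

def shpower(xbar):
    """Shpower ('observed power'): power computed at mu set to the observed mean."""
    return power(xbar)

def severity(xbar, mu1):
    """SEV(mu < mu1) for insignificant xbar: P(observing a mean > xbar; mu = mu1)."""
    return 1 - Z.cdf((xbar - mu1) / se)

print(round(cutoff, 3))              # 0.392
print(round(power(0.4), 3))          # 0.516 -- "hardly more than .5"
print(round(shpower(0.0), 3))        # 0.025 -- footnote (1)
print(round(severity(0.0, 0.4), 3))  # 0.977 -- the "~.98" in the text
```

Running it reproduces the contrast in the post: power at μ = .4 is too coarse to warrant μ < .4 on its own, while the severity of μ < .4 given X = 0 is high (~.98), and the Shpower at X = 0 collapses to α = .025.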

Categories: Neyman's Nursery, Statistics | Leave a comment

Logic Takes a Bit of a Hit!: (NN4) Continuing: Shpower ("observed" power) vs Power:

Logic takes a bit of a hit—student driver behind me.  Anyway, managed to get to JFK, and meant to explain a bit more clearly the first “shpower” post.
I’m not saying shpower is illegitimate in its own right, or that it could not have uses, only that finding that the logic for power analytic reasoning does not hold for shpower is no skin off the nose of power analytic reasoning. Continue reading

Categories: Neyman's Nursery, Statistics | Leave a comment

Neyman’s Nursery (NN3): SHPOWER vs POWER

EGEK weighs 1 pound


Before leaving base again, I have a rule to check on weight gain since the start of my last trip.  I put this off til the last minute, especially when, like this time, I know I’ve overeaten while traveling.  The most accurate of the 4 scales I generally use (one is at my doctor’s) is actually in Neyman’s Nursery upstairs.  To my surprise, none of these scales showed any discernible increase over when I left.  At least one of the 4 scales would surely have registered a weight gain of 1 pound or more, had I gained it, and yet none of them do; that is an indication I’ve not gained a pound or more.  I check that each scale reliably indicates 1 pound, because I know that is the weight of the book EGEK (you can even see this on the scale shown), and they each show exactly one pound when EGEK is weighed. Having evidence I’ve gained less than 1 pound, there is even less grounds for supposing I’ve gained as much as 5 pounds, right? Continue reading

Categories: Neyman's Nursery, Statistics | Leave a comment
