U-Phil (Phil 6334) How should “prior information” enter in statistical inference?

On weekends this spring (in connection with Phil 6334, but not limited to seminar participants) I will post relevant “comedy hours”, invitations to analyze short papers or blogs (“U-Phils”, as in “U-philosophize”), and some of my “deconstructions” of articles. To begin with a “U-Phil”, consider a note by Andrew Gelman: “Ethics and the statistical use of prior information” [i].

I invite you to send (to error@vt.edu) informal analyses (“U-Phil”, ~500-750 words) by February 10 [iv]. Indicate if you want your remarks considered for possible posting on this blog.

Writing philosophy differs from other types of writing: Some links to earlier U-Phils are here. Also relevant is this note: “So you want to do a philosophical analysis?”

U-Phil (2/10/14): In section 3 Gelman comments on some of David Cox’s remarks in a (highly informal and non-scripted) conversation we recorded:

“A Statistical Scientist Meets a Philosopher of Science: A Conversation between Sir David Cox and Deborah Mayo,” published in Rationality, Markets and Morals [iii]. (Section 2 has some remarks on Larry Wasserman, by the way.)

Here’s the relevant portion of the conversation:

COX: Deborah, in some fields foundations do not seem very important, but we both think foundations of statistical inference are important; why do you think that is?

MAYO: I think because they ask about fundamental questions of evidence, inference, and probability. I don’t think that foundations of different fields are all alike; because in statistics we’re so intimately connected to the scientific interest in learning about the world, we invariably cross into philosophical questions about empirical knowledge and inductive inference.

COX: One aspect of it is that it forces us to say what it is that we really want to know when we analyze a situation statistically. Do we want to put in a lot of information external to the data, or as little as possible? It forces us to think about questions of that sort.

MAYO: But key questions, I think, are not so much a matter of putting in a lot or a little information. … What matters is the kind of information, and how to use it to learn. This gets to the question of how we manage to be so successful in learning about the world, despite knowledge gaps, uncertainties and errors. To me that’s one of the deepest questions and it’s the main one I care about. I don’t think a (deductive) Bayesian computation can adequately answer it. …

COX: There’s a lot of talk about what used to be called inverse probability and is now called Bayesian theory. That represents at least two extremely different approaches. How do you see the two? Do you see them as part of a single whole? Or as very different?

MAYO: It’s hard to give a single answer, because of a degree of schizophrenia among many Bayesians. … [I]n reality default Bayesians seem to want it both ways. They say: ‘All I’m trying to do is give you a prior to use if you don’t know anything. But of course if you do have prior information, by all means, put it in.’ It’s an exercise that lets them claim to be objective, while inviting you to put in degrees of belief, if you have them. …

COX: Yes, Fisher’s resolution of this issue in the context of the design of experiments was essentially that in designing an experiment you do have all sorts of prior information, and you use that to set up a good experimental design. Then when you come to analyze it, you do not use the prior information. In fact you have very clever ways of making sure that your analysis is valid even if the prior information is totally wrong. If you use the wrong prior information you just got an inefficient design, that’s all.

MAYO: What kind of prior, not prior probability?

COX: No, prior information, for example, a belief that certain situations are likely to give similar outcomes, or a belief that studying this effect is likely to be interesting. There would be informal reasons as to why that is the case that would come into the design, but it does not play any part in the analysis, in his view, and I think that is, on the whole, a very sound approach. Prior information is always there. It might be totally wrong but the investigator must believe something otherwise he or she wouldn’t be studying the issue in the first place.


COX: There are situations where it is very clear that whatever a scientist or statistician might do privately in looking at data, when they present their information to the public or government department or whatever, they should absolutely not use prior information, because the prior opinions on some of these prickly issues of public policy can often be highly contentious with different people with strong and very conflicting views.

MAYO: But they should use existing knowledge.

COX: Knowledge yes. Prior knowledge will go into constructing the model in the first place or even asking the question or even finding it at all interesting. It’s not evidence that should be used if let’s say a group of surgeons claim we are very, very strongly convinced, maybe to probability 0.99, that this surgical procedure works and is good for patients, without inquiring where the 0.99 came from. It’s a very dangerous line of argument. But not unknown.

MAYO: (laughs).

COX: Similar issues arise in public policy on education or criminology, or things like that. There are often very strong opinions expressed that if converted into prior probabilities would give different people very high prior probabilities to conflicting claims. That’s precisely what the scientist doesn’t want.

MAYO: Yes, I agree. …


COX: I have often been connected with government decision-making. The idea that we would present people’s opinions unbacked by evidence would have been treated as ludicrous. We were there as scientists to supposedly provide objective information about the issue. Of course I know there is difficulty with the idea of total objectivity but at least it should connect with truth, to the goal of getting it right.

MAYO: The evidential report should be constrained by the world, by what is actually the case.

COX: Yes.

MAYO: I do find it striking that people could say with a straight face that we frequentists are not allowed to use any background information in using our methods. I have asked them to show me a book that says that, but they have not produced any. I don’t know if this is another one of those secrets shared only by the Bayesian Brotherhood.

COX: Well it’s totally ridiculous isn’t it.

MAYO: Then again, I suppose we don’t see statistical texts remedying this in a way that makes it conspicuous, that acknowledges this criticism and emphasizes that frequentists never advocated doing inference from a blank slate, but that you need to put together pieces, combine other tests and well-probed hypotheses. (We emphasize this in Cox and Mayo 2010.)

COX: Yes, you have to look at all the evidence but the main purpose of statistical analysis is to clarify what it is reasonable to learn from the specific set of limited data. It is a limited objective.


Read the full “conversation” in Rationality, Markets and Morals.

[i] The fifth in a series by Gelman on ethics in statistics: “Ethics and the statistical use of prior information.”

[ii] As recorded (with minimal editing), June 2011. (This is a small portion of a much longer (unpublished) “conversation”; while it began with me as interviewer, here David Cox demonstrates how to be an effective interlocutor.)

[iii] RMM Vol. 2, 2011, 103–114: Special Topic: Statistical Science and Philosophy of Science, edited by Deborah G. Mayo, Aris Spanos and Kent W. Staley (http://www.rmm-journal.de/). This special volume grew out of a conference we organized (with others) in June 2010 at the LSE.

This exchange was first posted on this blog here; a related, earlier discussion on Gelman’s blog is here.

Ordinary comments are, of course, welcome.


2 thoughts on “U-Phil (Phil 6334) How should “prior information” enter in statistical inference?”

  1. Here is another relevant note by Gelman: http://www.stat.columbia.edu/~gelman/research/published/p039-_o.pdf I find it interesting that he seems to embed the use of priors, in this note and the one mentioned in the post, in a broader methodology. The broader methodology requires finding “invariants” when using different sets of priors that give “reasonable” outcomes, or even using what he calls in this note “uninformative” priors. But in that case, it seems that the key question is, what is the criterion for a “reasonable” outcome in a given context?

    And I wonder whether priors don’t turn into hypotheses, or what Duhem called auxiliaries, in this broader methodology, but just aren’t called that. Overall, I find his notes somewhat unorthodox as statements of Bayesian methodology. Finding standards of “reasonability”, for instance, would seem to require either sneaking in intuitive assumptions of the sort David Cox deplores above (which I’m not sure Gelman advocates), or adopting standards of statistical significance (what he calls “acceptable” or “unacceptable” results, for instance, but also comparing results on different priors) that go beyond what I understand Bayesian methodology to be.

    • philosopherpatton: thanks for the note. It seems in sync with what I have found: priors are sensible when they can be given a clear frequentist source and meaning.

      I don’t see how priors serve as Duhemian auxiliaries, which are background and/or ceteris paribus conditions employed in deriving observable predictions.

      Which is not to say that Duhem, who favored aesthetic or simplicity criteria for theories, couldn’t himself be interpreted as a subjective (or even a conventional) Bayesian.
