This is one of the questions high on the “To Do” list I’ve been keeping for this blog. The question grew out of discussions of “updating and downdating” in relation to papers by Stephen Senn (2011) and Andrew Gelman (2011) in *Rationality, Markets, and Morals*.[i]

“As an exercise in mathematics [computing a posterior based on the client’s prior probabilities] is not superior to showing the client the data, eliciting a posterior distribution and then calculating the prior distribution; as an exercise in inference Bayesian updating does not appear to have greater claims than ‘downdating’.” (Senn, 2011, p. 59)

“If you could really express your uncertainty as a prior distribution, then you could just as well observe data and directly write your subjective posterior distribution, and there would be no need for statistical analysis at all.” (Gelman, 2011, p. 77)

But if uncertainty is not expressible as a prior, then a major lynchpin for Bayesian updating seems questionable. If, on the other hand, you can go from the posterior back to the prior, perhaps the data can also lead you to go back and change the prior.

**Is it legitimate to change one’s prior based on the data?**

I don’t mean update it, but reject the one you had and replace it with another. My question may yield different answers depending on the particular Bayesian view. I am prepared to restrict the entire question of changing priors to Bayesian “probabilisms”, meaning accounts where the inference takes the form of updating priors to yield posteriors, or of reporting a comparative Bayes factor. Interpretations can vary. In many Bayesian accounts the prior probability distribution is a way of introducing prior beliefs into the analysis (as with subjective Bayesians) or, conversely, of avoiding the introduction of prior beliefs (as with reference or conventional priors). Empirical Bayesians employ frequentist priors based on similar studies or well-established theory. There are many other variants.

S. SENN: According to Senn, one test of whether an approach is Bayesian is that while

“arrival of new data will, of course, require you to update your prior distribution to being a posterior distribution, no conceivable possible constellation of results can cause you to wish to change your prior distribution. If it does, you had the wrong prior distribution and this prior distribution would therefore have been wrong even for cases that did not leave you wishing to change it.” (Senn, 2011, p. 63)

“If you cannot go back to the drawing board, one seems stuck with priors one now regards as wrong; if one does change them, then what was the meaning of the prior as carrying prior information?” (Senn, 2011, p. 58)

I take it that Senn is referring to a Bayesian prior expressing belief. (He will correct me if I’m wrong.)[ii] Senn takes the upshot to be that priors cannot be changed based on data. **Is there a principled ground for blocking such moves?**

I.J. GOOD: The traditional idea was that one would have thought very hard about one’s prior before proceeding—that’s what Jack Good always said. Good advocated his device of “imaginary results” whereby one would envisage all possible results in advance (1971, p. 431) and choose a prior that you can live with whatever happens. *This could take a long time!* Given how difficult this would be, in practice, Good allowed

“that it is possible after all to change a prior in the light of actual experimental results” [but] “rationality of type II has to be used.” (Good 1971, p. 431)

Maybe this is an example of what Senn calls requiring the informal to come to the rescue of the formal? Good was commenting on D. J. Bartholomew [iii] in the same wonderful volume (edited by Godambe and Sprott).

D. LINDLEY: According to subjective Bayesian Dennis Lindley:

“[I]f a prior leads to an unacceptable posterior then I modify it to cohere with properties that seem desirable in the inference.” (Lindley 1971, p. 436)

This would seem to open the door to all kinds of verification biases, wouldn’t it? This is the same Lindley who famously declared:

“I am often asked if the method gives the *right* answer: or, more particularly, how do you know if you have got the *right* prior. My reply is that I don’t know what is meant by ‘right’ in this context. The Bayesian theory is about *coherence*, not about right or wrong.” (1976, p. 359)

H. KYBURG: Philosopher Henry Kyburg (who wrote a book on subjective probability, but was or became a frequentist) gives what I took to be the standard line (for subjective Bayesians at least):

“There is no way I can be in error in my prior distribution for μ – unless I make a logical error… . It is that very fact that makes this prior distribution perniciously subjective. It represents an assumption that has consequences, but cannot be corrected by criticism or further evidence.” (Kyburg 1993, p. 147)

It can be updated of course via Bayes rule.

D.R. COX: While recognizing the serious problem of “temporal incoherence” (a violation of diachronic Bayes updating), David Cox writes:

“On the other hand [temporal coherency] is not inevitable and there is nothing intrinsically inconsistent in changing prior assessments” in the light of data; however, the danger is that “even initially very surprising effects can *post hoc* be made to seem plausible.” (Cox 2006, p. 78)

An analogous worry would arise, Cox notes, if frequentists permit data-dependent selections of hypotheses (significance seeking, cherry picking, etc.). However, frequentists (if they are not to be guilty of cheating) would need to take into account any adjustments to the overall error probabilities of the test. But the Bayesian is not in the business of computing error probabilities associated with a method for reaching posteriors. At least not traditionally. Would Bayesians even be required to report such shifts of priors? (A principle is needed.)

What if the proposed adjustment of the prior is based on the data and resulting likelihoods, rather than on an impetus to ensure one’s favorite hypothesis gets a desirable posterior? After all, Jim Berger says that prior elicitation typically takes place *after* “the expert has already seen the data” (2006, p. 392). Are experts instructed to try not to take the data into account? In any case, if the prior is determined post-data, then one wonders how it can be seen to reflect information distinct from the data under analysis. All the work in obtaining posteriors would have been accomplished by the likelihoods. There’s also the issue of using the data twice.

**So what do you think is the answer? Does it differ for subjective vs conventional vs other stripes of Bayesian?**

[i]Both were contributions to the RMM (2011) volume: Special Topic: Statistical Science and Philosophy of Science: Where Do (Should) They Meet in 2011 and Beyond? (edited by D. Mayo, A. Spanos, and K. Staley). The volume was an outgrowth of a 2010 conference that Spanos and I (and others) ran in London, and conversations that emerged soon after. See full list of participants, talks and sponsors here.

[ii] Senn and I had a published exchange on his paper that was based on my “deconstruction” of him on this blog, followed by his response! The published comments are here (Mayo) and here (Senn).

[iii] At first I thought Good was commenting on Lindley. Bartholomew came up in this blog in discussing when Bayesians and frequentists can agree on numbers.

**WEEKEND READING**

Gelman, A. 2011. “Induction and Deduction in Bayesian Data Analysis.”

Senn, S. 2011. “You May Believe You Are a Bayesian But You Are Probably Wrong.”

Berger, J. O. 2006. “The Case for Objective Bayesian Analysis.”

Discussions and Responses on Senn and Gelman can be found searching this blog:

Commentary on Berger & Goldstein: Christen, Draper, Fienberg, Kadane, Kass, Wasserman

Rejoinders: Berger, Goldstein

REFERENCES

Berger, J. O. 2006. “The Case for Objective Bayesian Analysis.” *Bayesian Analysis* 1 (3): 385–402.

Cox, D. R. 2006. *Principles of Statistical Inference*. Cambridge, UK: Cambridge University Press.

Gelman, A. 2011. “Induction and Deduction in Bayesian Data Analysis.” *Rationality, Markets and Morals: Studies at the Intersection of Philosophy and Economics* 2 (Special Topic: Statistical Science and Philosophy of Science): 67–78.

Godambe, V. P., and D. A. Sprott, ed. 1971. *Foundations of Statistical Inference*. Toronto: Holt, Rinehart and Winston of Canada.

Good, I. J. 1971. Comment on Bartholomew. In *Foundations of Statistical Inference*, edited by V. P. Godambe and D. A. Sprott, 108–122. Toronto: Holt, Rinehart and Winston of Canada.

Kyburg, H. E. Jr. 1993. “The Scope of Bayesian Reasoning.” In *Philosophy of Science Association: PSA 1992*, vol 2, 139-152. East Lansing: Philosophy of Science Association.

Lindley, D. V. 1971. “The Estimation of Many Parameters.” In *Foundations of Statistical Inference*, edited by V. P. Godambe and D. A. Sprott, 435–455. Toronto: Holt, Rinehart and Winston.

Lindley, D. V. 1976. “Bayesian Statistics.” In *Foundations of Probability Theory, Statistical Inference and Statistical Theories of Science*, edited by W. L. Harper and C. A. Hooker, 353–362. Dordrecht: D. Reidel.

Senn, S. 2011. “You May Believe You Are a Bayesian But You Are Probably Wrong.” *Rationality, Markets and Morals: Studies at the Intersection of Philosophy and Economics* 2 (Special Topic: Statistical Science and Philosophy of Science): 48–66.

You write “But if uncertainty is not expressible as a prior, then a major lynchpin for Bayesian updating seems questionable.” as a response to Gelman’s quote. But I think the important word in Gelman’s quote is *really*, as in: “If you could really express your uncertainty as a prior distribution”.

I’m with Gelman that it’s not possible to *really* express your actual belief, which consists of neurons and brain matter, by a probability distribution. But that doesn’t stop you from *modeling* it, using probability. In the same way, a statistical model can be usefully used to model data even if it’s not *really* the “true” model that generated the data.

For me, one useful way of thinking of priors has been as representing the information one wants to enter into the model, in addition to the information coming through the likelihood; a way of incorporating data (whether it is expert elicitation or your own knowledge) that is not part of *the* data. If one, after the analysis, realizes that the information coming through the prior was off, then it could be OK to change the prior. (An extreme case would be if you’d used a sloppy uniform distribution over the mean, say Uniform(0, 100), but there is overwhelming evidence in the data that the mean is well above 100; then widening the prior and rerunning the analysis would be a reasonable decision.)

Rasmusab: I understand about the “really” in “If you could really express your uncertainty as a prior distribution, then…there would be no need for statistical analysis at all.” (Gelman, 2011, p. 77). But this claim would be true only if the goal was to arrive at expressions of uncertainty. I had thought he meant this was so for subjective or personalist Bayesians. I don’t see why neurons come into it. What happens when you substitute “model your uncertainty” in his phrase? Does it hold?

Modeling variability and aspects of a variable phenomenon is not the same as modeling uncertainty about variability. If the interest is in the phenomenon, there’s no need for the layer of “beliefs about” it. Early personalist Bayesians really and truly held the view that scientific models were just modeling opinions (e.g., Edwards, Lindman, and Savage), and that language has trickled down to modern Bayesianism.

Hi, Mayo. As we discuss in BDA and elsewhere, one can think of one’s statistical model, at any point in time, as a placeholder, an approximation or compromise given constraints of computation and of expressing one’s model. In many settings the following iterative procedure makes sense:

1. Set up a placeholder model (that is, whatever statistical model you might fit).

2. Perform inference (no problem, now that we have Stan!).

3. Look at the posterior inferences. If some of the inferences don’t “make sense,” this implies that you have additional information that has not been incorporated into the model. Improve the model and return to step 1.

If you look carefully you’ll see I said nothing about “prior,” just “model.” So my answer to your question is: Yes, you can change your statistical model. Nothing special about the “prior.” You can change your “likelihood” too.
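Gelman’s three steps can be sketched in miniature. The following is a hypothetical toy illustration (not from the discussion, and using a conjugate normal–normal model in place of Stan): an overconfident placeholder prior produces an inference that “doesn’t make sense” against the data, which signals unincorporated information, so the model is improved and refit.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0
data = rng.normal(12.0, sigma, size=50)   # observations that conflict with the first prior

def fit_normal_mean(data, prior_mean, prior_sd, sigma=1.0):
    """Conjugate normal-normal posterior for the mean, sigma known."""
    post_var = 1.0 / (1.0 / prior_sd**2 + len(data) / sigma**2)
    post_mean = post_var * (prior_mean / prior_sd**2 + data.sum() / sigma**2)
    return post_mean, np.sqrt(post_var)

# Step 1: placeholder model -- an overconfident prior, N(0, 0.1^2).
# Step 2: inference.
m, s = fit_normal_mean(data, prior_mean=0.0, prior_sd=0.1)

# Step 3: look at the inference.  The posterior mean is dragged toward 0
# (it lands near 4, while the sample mean is near 12): it "doesn't make
# sense", so improve the model -- here, a far less dogmatic prior -- and refit.
if abs(m - data.mean()) > 3 * s:
    m, s = fit_normal_mean(data, prior_mean=0.0, prior_sd=10.0)
```

After the refit the posterior mean tracks the sample mean; the point of the sketch is that the check in step 3 is a property of the whole model, not of the prior alone.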

Andrew: Yes, I figured you’d say this which is fine. A question is always, How do you distinguish which part of the model to “blame”?

But let’s put that aside. I want to know if you think there’s any Bayesian probabilism, perhaps subjective, where the prior is not to be changed, i.e., where Senn’s claim is correct.

Clearly, these violations of temporal coherence are not deemed trivial (to probabilists at least) in that they are regarded as disproving that updating need be by Bayes rule. It takes a big piece away from the traditional Bayesian philosophy.

Mayo:

Again, I wouldn’t single out “the prior” here. Conditional on the model, the analysis is what it is. But in practice we move on and we improve our models.

I wouldn’t blur all features of model building. I thought the point of constructing the prior was to get a posterior.

Mayo:

From a Bayesian perspective, the point of constructing the prior *and the data model* is to get a posterior. I will continue to strenuously oppose all formulations in which the data model is taken as God-given and known while the prior is considered subjective and suspect.

Andrew: And I will continue to strenuously oppose all claims that frequentists take the data model as God-given and known, and I question, as well, the allegation that the statistical model, describing how the data are generated, is just as arbitrary, no more testable or comprehensible than a Bayesian prior.

Mayo:

I never said anything about “frequentists.” I just object to you talking about a subjective prior without mentioning a subjective data model.

Firstly, I was referring to subjective Bayesians who want nothing more than for the prior to represent their beliefs, degrees of uncertainty or the like. I wasn’t referring to the objectivity of a method there. One might grant them an entirely objective, operational means to elicit, approximately, these beliefs, perhaps using a conventional “completion strategy” (a term from J. Berger).

So there’s an equivocation here.

However, when it comes to speaking of an objective method for statistical inference or solving some other problem, I am not the least bit squeamish in upholding the objectivity demand, and I deny that the existence of disagreement, imperfection, and judgment entails “subjectivity” of method. That makes everything trivially subjective and so the interesting distinctions we are after disappear.

What’s going on with data models seems to me rather concrete. (At least it can be.) But when I hear of modeling my uncertainties about the variability of data, or opinions about the data, or that the variability is really in me, rather than in my method of sampling from the urn that I just set up with k% white balls (and where I can vary the % and see exactly how it alters the experimental relative frequencies), then things are murkier.

I’m not saying you view prior probabilities as subjective, you say you do not. But I’m still interested in what you might say about the legitimacy of shifting priors in those (subjective) accounts.

Mayo:

I respect your concerns about subjectivity. All I’m saying is that, if you’re concerned about subjectivity, I recommend you be concerned about all those additive models, logistic regressions, etc. etc. that fill up the statistics textbooks. The prior is typically a small part of a statistical model. It can be important, but if your concern is subjectivity I think you’re focused on a minor issue. Instead of “Can you change your Bayesian prior?” I recommend you ask something like, “Can you change your statistical model?”

We’ve been through all that before, and while there are very important distinctions, and the cases shouldn’t be run together, the point of this post has nothing really to do with my concerns about subjectivity/objectivity wrt modeling (even though it came up in my last response to you). I’m sorry if I wasn’t clearer. I really and truly want to know the answer, or attitude toward the question in today’s work, even limiting it to subjective (personalistic) probabilists (which would not include you, but I thought you’d know). I see different answers and wanted to inquire of my readers.

Andrew: I just saw this note (by Senn), linked to on twitter, in which he says he agrees with you that using the data to form the prior is verboten and constitutes cheating. But I don’t have the reference.

http://ba.stat.cmu.edu/journal/2008/vol03/issue03/senn.pdf

I just realized Senn must have been alluding (in this note) to Gelman’s jokey piece, so who knows which parts were intended seriously (by Gelman).

The order in which evidence becomes available does not change the conditional probability given all the evidence (e.g. p(X|A∩B∩C) = p(X|C∩B∩A)). I have argued elsewhere in these blogs that a probability of replication can be calculated conditional on the numerical result of the study alone, and, depending on the statistical model used, the probability of replication excluding ‘null’ in the direction of the result will be the same or approximately the same as 1 − P. Furthermore, this probability of replication given the numerical result alone will be the same or approximately the same as the probability of the same ‘credibility interval’ obtained by assuming uniform priors.

Further information may be used to change this initial probability (e.g. the results of other studies, evidence of bias being created by the methods, etc.). The new probability could be estimated using a ‘subjective Bayesian likelihood’ or some other more ‘transparent’ method (e.g. http://blog.oup.com/2013/09/medical-diagnosis-reasoning-probable-elimination/). This is how the reader of a publication may proceed, by drawing the initial inference from the result alone. The author of course would not know the result when planning the study, so the prior or initial probability of an outcome lying within some ‘result interval’ would have to depend on the information available before the study commenced.
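The order-invariance point above is easy to verify in a toy conjugate setting. This is a hypothetical Beta–Binomial sketch (the batch counts are invented for illustration): updating on batch A then batch B lands on the same posterior as B then A.

```python
def update_beta(a, b, successes, failures):
    """Beta(a, b) prior + binomial data -> Beta(a + successes, b + failures)."""
    return a + successes, b + failures

prior = (1, 1)          # uniform Beta(1, 1)
evidence_A = (7, 3)     # first batch: 7 successes, 3 failures
evidence_B = (2, 8)     # second batch: 2 successes, 8 failures

# Update A-then-B, and B-then-A: the posterior is the same either way,
# mirroring p(X|A∩B) = p(X|B∩A).
post_AB = update_beta(*update_beta(*prior, *evidence_A), *evidence_B)
post_BA = update_beta(*update_beta(*prior, *evidence_B), *evidence_A)
print(post_AB == post_BA)  # True: both are Beta(10, 12)
```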

Given a P(lambda | K_1) we are free to consider a different P(lambda | K_2) anytime we want, for whatever reason we want. What more needs to be said?

Changing your prior is illegitimate. The prior and the likelihood are exchangeable to the degree defined by the model. The prior brings stability because it is like already having some data – but what statistical procedure would allow you to change your data? To change your prior you have to say something like ‘the reason I am doing this is because I have an implicit model that is much more complex than the explicit model I have formally expressed’. This, I think, is what Andrew is referring to. However, as Deborah has quoted me as saying, “the informal has to rescue the formal”, which is not entirely satisfactory.

A practical problem is as follows. The inference from a prior that you know is one you would not change as prior whatever was hurled at it in the way of data, must be stronger than the inference from an apparently identical “placeholder model”. If you see there is a problem with the placeholder model and replace it, it may be that you can somehow reflect this “sensible cheating” in your posterior probabilities. However, there is a problem if you find the data suggest the model is OK and keep it. It now enters your inference as if you would never have replaced it under any circumstances. This means that your posterior inference is too strong. See my comment on Gelman and Shalizi

http://onlinelibrary.wiley.com/doi/10.1111/j.2044-8317.2012.02065.x/abstract

Of course, frequentists can also cheat in the same way but there is increasing recognition that model search strategies have to be part of the full model story and reflected in standard errors etc. This particular point is reflected, for example, in the TRIPOD guidelines :Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): The TRIPOD Statement See http://annals.org/article.aspx?articleid=2088549

I agree with Huw regarding evidence order but I am not convinced that that changes the issue being discussed here.

As regards Anonymous, the issue is whether P1(lambda | E_1) has to be strictly updated to P2(lambda | E_1 & E_2) using Bayes theorem (where E_1 is the evidence at the time the prior distribution was mooted and E_2 is subsequent additional evidence), or whether in calculating P2 using Bayes theorem one is allowed to change P1. If one is allowed to do so, this takes us beyond any standard Bayesian account of how inference should proceed.
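The distinction can be made concrete with a hypothetical Beta–Binomial sketch (all numbers invented for illustration): strict updating carries the time-1 prior forward via Bayes theorem alone, while “back to the drawing board” substitutes a prior chosen with E_2 already in view, and the two posteriors disagree.

```python
# Strict updating: the prior fixed when only E_1 was available is carried forward.
a1, b1 = 2, 2              # P1(lambda | E_1), a Beta(2, 2) mooted before E_2
E2 = (9, 1)                # subsequent evidence E_2: 9 successes, 1 failure

strict = (a1 + E2[0], b1 + E2[1])            # Beta(11, 3), via Bayes theorem alone

# "Back to the drawing board": a new prior chosen with E_2 already in view.
a1_new, b1_new = 8, 1
revised = (a1_new + E2[0], b1_new + E2[1])   # Beta(17, 2)

mean = lambda ab: ab[0] / (ab[0] + ab[1])
print(round(mean(strict), 3), round(mean(revised), 3))  # 0.786 0.895
```

The gap between the two posterior means is exactly what makes the choice non-innocuous: replacing P1 is not a Bayes-theorem move, so something outside the standard account has to license it.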

This is what I said on temporal coherence in the article Deborah cited

“…temporal coherence. De Finetti was adamant that it is not the world’s time, in a sense of the march of events (or the history of ‘one damn thing after another’), that governs rational decision making but the mind’s time, that is to say the order in which thoughts occur or evidence arises. However, I do think that he believed there was no going back. You struck out the sequences of thought-events that had not occurred in your mind and renormalized. The discipline involved is so stringent that most Bayesians seem to agree that it is intolerable and there have been various attempts to show that Bayesian inference really doesn’t mean this. I am unconvinced. I think that de Finetti’s theory really does mean this and the consequence is that the phrase ‘back to the drawing board’ is not allowed.”

Stephen: Thank you so much for clarifying this for us.

Stephen is correct formally – the math of Bayes is based on the joint model being true, and once you change (or even check) either the prior or the likelihood, the math has flown. In formal deductive reasoning you are forced to follow the implications.

I believe the bigger mistake is thinking one can do induction deductively – that is as hopeless as trying to build an accurate 2D map of the world. It is also why the posterior probabilities _should_ not be taken literally or at face value, regardless of how many insist on doing that.

So between trying to do the impossible and something purposeful/productive I favour Andrew’s approach – especially as it mostly avoids the slow motion mathematics via his step 2.

Keith O’Rourke

Phanerono: I agree that it’s a mistake to try to do induction deductively, and that is why viewing learning as (deductive) probabilistic updating is so problematic, and yet many tout it as a great model for inductive learning.

But can you say more about your first sentence?

Maybe i should have asked Gelman when, if ever, a change in prior is objectionable.

You ask rhetorically: “What statistical procedure would allow you to change your data?” Therefore, in the interest of transparency, do you think that the ‘data’ (i.e. the evidence on which the prior probability is based) should be stated explicitly?

Yes I do. In fact a major task for any Bayesian trying to interpret the views of another Bayesian is to establish which data they hold in common. Guernsey McPearson had a post on this. See

http://www.senns.demon.co.uk/wprose.html#Priors

Thanks for this insightful post! I was struck by the analogy between adjusting one’s model in a Bayesian analysis, and running post-hoc frequentist analyses. As a non-expert, can I offer one possible suggestion about the ‘principle’ that differentiates when these moves are allowed and appropriate?

It seems to me that both post-hoc frequentist analyses and updating “priors” (or models) are problematic exactly when you want the results of the analysis to trivially translate into beliefs or decisions. On the other hand, if your main goal is insight, then it can be sensible to do both of these. One trouble with many frequentist-style papers is that we are supposed to come away believing some new (usually obviously oversimplified) “result”, rather than to have used some statistical analysis over data to come to a richer understanding. This seems to me totally analogous to a subjective Bayesianism, which posits that (a) priors are beliefs, or a quite good model of beliefs, and (b) posteriors are beliefs conditional on data. Again, my beliefs are to be read off the model.

Cases where frequentist models get you insight rather than conclusions are perhaps rare in scientific practice, but totally conceivable. A common everyday case is post-hoc “what are the odds?” estimations. A very surprising coincidence happens, and you (after some calculation) say “wow! That was as likely as being struck by lightning!” One shouldn’t take this sort of calculation as literal truth, but rather as a way of grappling with one’s experience of surprise, and one’s understanding of the one world.

Of course, subjective Bayesianism is a philosophical position, while the problems with frequentist significance-worship are more of an embedded practice among scientists. But maybe these are minor differences in this context. This principle also captures why someone like Andrew Gelman can be opposed to p-hacking, but in favor of model revision: the former uses practices suitable for insight in the service of belief revision. I’m not sure how it relates to Senn’s arguments, except that it seems to me that in actual scientific practice, I do in fact change my model structure (and priors) in light of data all the time. But I hope I’m squarely on the ‘insight’ side of things anyway.

David Landy: Frequentists, or as I prefer, error statisticians, do not output answers to “what are the odds” questions, or even belief questions, but make statistical inferences and supply enough information to assess how well or poorly warranted they are. I don’t see what “richer understanding” I get when someone goes back and tinkers with things. That is why error probabilities insist on picking up on such biasing selection effects. A report on your beliefs about a phenomenon might be insightful for some purposes, but in science they need to be informative about the phenomenon. Not just about you.

Thanks for responding! As a workaday scientist (though in truth, one who these days does far more Bayesian than frequentist analysis), I disagree with your assertion that frequentists don’t calculate ‘what are the odds’ questions–though I may have been glib. It’s true that sometimes we make statistical inferences using the computations involved in some test. However, vastly more of our statistical calculations are frankly exploratory. We calculate confidence intervals all over the place to get a sense of variability in the data. We also run lots of exploratory tests from models built post-hoc, to understand how surprising our data would have been, had we been a person with this or that sort of model in mind. These are exactly analogous to a ‘what are the odds’ question–at least the way they are actually done. Although these ‘tests’ are very useful for understanding a phenomenon, they would be very bad to use as a straightforward calculation of probability under a null! One takes them as useful guides, considering the context in which they were calculated, and in that way they are quite useful. So I actually do think it’s ‘about me’, and I think in that way I’m reflective of a fair sweep of practice.

It strikes me that what I used to do with these kinds of pseudo-tests is a lot like what I now tend to do with exploratory Bayesian model-building exercises —I use them to get a good understanding of whether I’m adequately conceptualizing a situation I care about. Of course, in frequentist approaches (again, speaking as a practitioner), one has to make a careful separation between calculations used in this descriptive way, and ones intended as proper tests (which must be carefully planned, documented, and adhered to beforehand). Usually one sets aside data sets in advance for particular purposes (‘exploratory’ or ‘confirmatory’). It seems to me that workaday Bayesians I know have no analogous injunction: you just fiddle with the model until it feels right. In my previous post I’m suggesting that this is appropriate as long as the goal is insight; if the goal is posteriors that accurately reflect beliefs, model alteration should be forbidden just as post-hoc analyses are forbidden in frequentist approaches intended to accurately provide statistical inference proper.

(By the way, I don’t understand totally the intended difference between ‘error statistician’ and standard frequentist practice, so I’m sticking for now with the latter term as a way to pick out what I see happening in best practice situations, without making a presumption about what the error statistician would do).

“..you just fiddle with a model until it feels right…” is exactly what many people worry about.

A bit off topic, but statistical discussants tend to put far too much weight on the very small part of science that involves formal statistics. Scientists explore, model, theorize, self-correct, and gain deep causal knowledge without formal statistics, in the vast majority of fields. (Or they may just jump in and out of a split-off statistical question that arises and go back to the problem at hand.) The idea that as soon as one enters a statistical domain, somehow all of this type of information has to be wrapped up into a formal probability over an exhaustive set of hypotheses (which is what assigning a probability requires), is mistaken.

Yep, I do agree with what you say in reply. I’m very cautious about interpreting our statistical tests as anything other than exploring, modeling, theorizing, etc., the vast majority of the time. Sorry to be slightly off-topic, but I’ll bring it back: I think that the answer to the initial question is that yes, you get to modify your priors. What you don’t get to do is to assume your priors captured your belief, or that the posteriors match what you should believe in light of the data. The priors, model, and posterior hopefully together shed some light on what is going on, taken together with a process of creation on the part of the model creator.

dhlandy: Well I think I agree with what you wrote, and that the emphasis should be on articulating what’s meant by a good method to “shed some light on what is going on”. I’ve always understood error statistical methods, taken broadly, to be a piece-meal approach of that sort. However, on the way, there are still “inferences”. You want to build a strong argument from coincidence such that, taken together, some claims are well warranted and that you’d learn much less being skeptical of them (on grounds that they are fallible). I take it you’re new to the site, but my interest isn’t in defending any existing statistical repertoire. As a philosopher of statistics, I have my own.

It really does not seem there is any coherent approach that we can call “Bayesian.” Now that many are saying that priors do not need to be “prior”, the whole philosophical basis preached by Howson, Kadane, and some others seems to be tossed. At least it seems so to me. I can see value in models that are validated against real data collected in a manner that supports inferences without hesitation. But I perceive that the walls are coming down on a popular form of Bayesian rhetoric because of the recognition of this need to validate against experience. My read of Gelman and Shalizi and Senn supports that view. Another quick note is that post-hoc changes to a research plan using error stat models like a significance test should alter the error probabilities (as in p-hacking, cherry-picking). If not, we say that the error probabilities are wrong (and not useful because they are wrong). Is there a correction method when the prior is changed to provide a better “fit” in a Bayesian model, such that we might say a mistake was made if this was not done?

John: Thanks for your comment. I was thinking that Senn was heading in that direction, suggesting penalizing the posterior in various ways. But you have to have a principled ground. That is what Bayesianism lacks–it can’t correct for something that is not picked up on by the formalism–unless it is embedded in a different framework (e.g., error statistical). But the need to have a Bayesian posterior over the possibilities also has to go by the board, it seems; the requirement that probabilities sum to one is standing in the way (among other things). I guess Howson and Urbach and Kadane-style Bayesianism have long been dead except in philosophical circles. I’ve often suspected Bayesianism would be a victim of its own success (in the sense that people without the care that comes from near-religious fervor couldn’t keep it up), but holdovers from Savage and company have still left very deep confusions: still pervasive is the idea that I’m studying beliefs about the world.

Even more important is the failure to understand why frequentist error statistical ideas (e.g., the methodology of N-P statistics, without the behavioristic construal) shouldn’t be thrown out as irrelevant for inference. People think the choice is Bayes-1 or Bayes-2 etc. or P-values (a made-up NHST animal), and real frequentist ideas are skipped over. I went through numerous Bayesian texts in writing my book recently, and nearly all of them think that citing Cox’s 1958 example (measuring tools with different precisions), & optional stopping, & a couple of Welch howlers refute frequentism. Too many people have bought the silliness: taking account of error probabilities = taking account of intentions, and data enter only through likelihood ratios (LP).

John and Mayo: I appreciate this conversation, which captures something I’ve felt for a while; namely that the rhetoric about the general applicability of Bayesian calculations regardless of design and experimenter considerations just doesn’t make sense, in light of the way data analysis and understanding actually work in practice. Maybe it would in a pure subjective Bayesian approach, but given that in practice we, e.g., change our priors and our data models (and that I think in many cases we ‘should’–that is, that it’s part of a sensible practice), it seems we need principles for when that sort of thing is a necessary and acceptable part of scientific exploration, and when it should cause our inferences to be treated skeptically. My baseline rule of thumb is that you really shouldn’t make any inferences without some sort of hypothesis testing apparatus (and maybe not even then), and that you shouldn’t ever take the posterior distribution seriously as capturing belief.

Another excellent post and series of comments. I agree with Stephen that changing one’s priors (instead of updating them) is not a permissible procedure, but to the extent different people might start with different priors, so what?

Enrique: You seem to be saying something contradictory: I can’t change them, but I sort of can, because after all, someone else might have had the prior I now want to change to, and in fact even I might have had it.

If we’re voting then – yes, you can change your prior.

In a modelling theory I value a nice set of objects and a combination of formal and informal rules for manipulating these objects. Bayesian probabilistic modelling as given in BDA (for example) satisfies these requirements (and gives extensive discussion of how priors enter the modelling process). It doesn’t solve the philosophical problem of induction, but neither do differential equations (as far as I’m aware).

RE: the ‘informal saving the formal’ comment above. I think this is always the case, even in mathematics. Perhaps the initial attempts at formalising Bayesian inference don’t work, but then the question (for those interested in this sort of thing) is how to improve the formal account to capture advances in the informal practice. Again, this is the case in mathematics – mathematicians didn’t stop doing mathematics when particular foundational efforts failed.

I’m not sure what you mean by voting.

It would be important to know what the prior probabilities meant, at least if one is using them in a probabilistic updating. Since I’ve heard Gelman deny he performs such Bayesian updating, this issue may not matter so much to him.

The thing is that people keep saying what we really, really, really, really want is a posterior probability in parameters or hypotheses in statistics. And the way to get them is by a prior distribution which is either a distribution of subjective beliefs (perhaps gained through betting elicitation), or one of the various conventional, intersubjective, default accounts, or some combination (e.g., depending on whether it’s an “interest” parameter or a “nuisance”).

If the results are very different from an assessment of how well warranted, or well tested the resulting claims are, then we may agree that quantifying “uncertainty” (in some sense) is a very different goal from quantifying how precisely warranted, how severely tested, how good the evidence is, etc. (Royall was perhaps right about this much, even though his “evidential” output by comparative likelihood ratios, ignoring error probabilities, is not up to the testing job either.)

But this counts against the “what we really, really, want” refrain, so people should quit repeating that what we want is a posterior until they can demonstrate why. If they really thought about it, my guess is many would be surprised to find it’s not what they want at all. What they want is a way to ensure that mistaken interpretations are rooted out efficiently and that whatever hypotheses or models are claimed to have withstood stringent tests really are reliable.

I wouldn’t say either description of ‘what we really want’ quite captures what I’m after.

One way of thinking about it (there are a few – this is fine by me too) is that a prior defines an initial ensemble of models under consideration. The implied prior predictive distribution can be interesting in its own right even without a Bayesian update. Of course a Bayesian update is also useful as one way of passing from (or ‘filtering’) the prior collection of models to obtain the posterior collection.

I’m fine to keep this set of concepts while acknowledging we may want to consider other ways of improving the ensemble, including revising the initial collection.

Omacalren: obviously I didn’t mean that’s all we want in scientific learning, but it describes a rival philosophy/theory to the Bayesian goal of a posterior. It captures what we want in formal statistical inference in science, in a short capsule.

So what are you after in statistical inference?

> So what are you after in statistical inference?

This is probably too difficult a question to answer here.

However, as explained in a number of comments above and below, I want to reaffirm that using Bayes isn’t so bad as a first approximation provided you understand what you’re doing. It does raise some interesting further philosophical points (to me), though.

As vl below (for one example) mentions, and as is implicit or explicit in a number of other comments here, hierarchical Bayesian models address your main question directly in at least one way, as follows.

In a model with a prior and a hyperprior, the prior is very explicitly updated by the data in a manner completely consistent with Bayes. Furthermore, this addresses Senn’s point somewhat – the posterior inference is in fact weakened when one incorporates a hyperprior and marginalises over the hyperparameter. So there is no real controversy (to first approximation!) here.

The above also emphasises that the prior is part of the model – in this case the ‘prior’ sitting between the likelihood and the hyperprior could equally well be considered ‘part of’ the likelihood or ‘part of’ the (hyper)prior, depending on how you like to multiply conditional densities together (but always associatively!).
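The point that marginalising over a hyperprior weakens posterior inference can be seen in a tiny conjugate sketch. The normal-normal setup and all numbers here are my own illustration, not from the comment: adding hyperparameter uncertainty widens the effective prior variance, which (weakly) widens the posterior.

```python
# Illustrative conjugate normal model (all numbers made up):
#   y_i ~ N(theta, sigma^2), theta ~ N(mu, tau^2).
# Fixed-prior case: mu = 0.  Hierarchical case: mu ~ N(0, s0^2),
# so the marginal prior on theta is N(0, tau^2 + s0^2).
sigma2, tau2, s02, n, ybar = 1.0, 0.5, 0.5, 10, 0.8

def posterior(prior_var):
    """Posterior mean and variance of theta given n obs with mean ybar."""
    post_var = 1.0 / (1.0 / prior_var + n / sigma2)
    post_mean = post_var * (n * ybar / sigma2)
    return post_mean, post_var

m_fixed, v_fixed = posterior(tau2)          # prior variance tau^2
m_hier,  v_hier  = posterior(tau2 + s02)    # marginal prior variance

# Marginalising over the hyperparameter widens the posterior:
print(v_fixed < v_hier)  # True
```

The same algebra shows why the ‘prior’ and ‘hyperprior’ factor interchangeably: only the product of conditional densities matters to the posterior.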

The more philosophical issue is, I suppose, when to truncate the hierarchy. I see this as corresponding to the fact that the formal instantiation of your model is an approximate representation of your informal or semi-formal concept of ‘what you really want to represent’. If you go far enough into the hierarchy then the approximation error decreases and you don’t have to model check so much. If you truncate early then you need to model check, to verify that the formal is an approximate representation of the informal.

[It occurs to me that even the axiomatisation of arithmetic requires either an infinite string of axioms, a statement of second-order logic, or an axiom schema to capture the principle of mathematical induction (surely less controversial than scientific induction!). So the occurrence of a hierarchical structure in the presence of ‘induction’ of some sort doesn’t seem unusual.]

An interesting trade-off in practice seems to relate to Popper’s whole ‘theories that explain everything explain nothing’ idea – the most general hierarchical model would tend to give you much more uncertainty and hence be ‘least wrong’ but not very informative; the simplest model would likely be wrong but informative if true. I would say that hierarchical Bayes is a nice, practical way to walk the line more ‘continuously’ – decide for yourself where to trade off the probable and the informative.

Is it sensible to view “updating the prior” as model selection? Perhaps it’s formal model selection (I have a loss function that puts infinite loss on the areas of the parameter space that I think are weird) or informal (that solution doesn’t smell right), but I don’t see how this is different to how I would build a mathematical model, a frequentist procedure, or a Bayesian model: sequentially and with an eye to what I’m trying to actually do.

There is that lovely Le Cam paper (Maximum Likelihood: An Introduction) where he builds a set of principles (Principle 0: Don’t trust principles) that basically do this sort of sequential “polishing” of the model.

The relevant principles are here (although all 9 are worth reading – https://www.stat.berkeley.edu/~rice/LeCam/papers/tech168.pdf)

Principle 4. If satisfied that everything is in order, try first a crude but reliable procedure to locate the general area in which your parameters lie.

Principle 5. Having localized yourself by (4), refine the estimate using some of your theoretical assumptions, being careful all the while not to undo what you did in (4).

Dan: I always liked Le Cam, although I never met him. Here’s a post on him:

https://errorstatistics.com/2013/11/18/lucien-le-cam-the-bayesians-hold-the-magic/

All models are wrong. Some are useful. “Bayesian inference” is also a model – a model for inference.

Richard: So? Does this mean “everything goes”?

It means, you have to use your poor head and do the best you can. And understand what you are doing.

Richard: Do we need whole university depts. and scads of texts to merely tell us to do our best? I don’t think so.

I am trying to reconcile these discussions with my experience of practicing medicine and modelling the reasoning process with probability and set theory (described in my MD thesis). My conclusion was that navigating through patient care had to be based on a new theorem deduced from probability axioms via the expanded form of Bayes rule. This new theorem exploits system independence (as in simultaneous equations) as well as statistical independence.

Priors, likelihoods, sensitivities and specificities based on a simple Bayes rule are inadequate models of diagnosis to someone practicing medicine. Using this simple Bayes model for diagnosis is like trying to navigate around the world by assuming it is flat. It is the wrong model. For example, it is possible to have tests where all the likelihood ratios are one but when they are used in combination during hypothetico-deductive reasoning by probabilistic elimination, they predict a diagnosis or outcome with certainty. This reasoning process is also similar to that used for scientific hypothesis testing and inference about the reliability of observed means or proportions of multiple observations (see http://blog.oup.com/2013/09/medical-diagnosis-reasoning-probable-elimination/).
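The claim that tests with individual likelihood ratios of one can jointly predict with certainty can be illustrated by a toy construction (my own, not taken from the linked thesis or post): under disease the two tests always agree, without it they always disagree, and each test is marginally a fair coin either way.

```python
# Hypothetical two-test construction: each test is individually useless
# (likelihood ratio 1), but the pair is decisive.
#   P(t1, t2 | D)     = 0.5 if t1 == t2 else 0.0
#   P(t1, t2 | not-D) = 0.5 if t1 != t2 else 0.0
def lik(t1, t2, disease):
    agree = (t1 == t2)
    return 0.5 if agree == disease else 0.0

# Marginal likelihood ratio of a single positive test:
lr_t1 = sum(lik(1, t2, True) for t2 in (0, 1)) / \
        sum(lik(1, t2, False) for t2 in (0, 1))
print(lr_t1)  # 1.0 -- the test looks worthless on its own

# Posterior probability of D after both tests come back positive (prior 0.5):
num = 0.5 * lik(1, 1, True)
den = num + 0.5 * lik(1, 1, False)
print(num / den)  # 1.0 -- jointly, the tests settle the diagnosis
```

The example turns entirely on the dependence structure between the tests, which a “simple Bayes rule” analysis of each test in isolation would miss.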

The first step in clinical medicine is to assess the reliability of the ‘facts’: the patient’s story, examination findings, a colleagues report of her or his findings, reports from laboratories etc. This is done usually by estimating the probability of replication (labs often do it formally as test result reproducibility). In science, this corresponds to estimating the reproducibility of single observation (e.g. seeing a new moon on some planet) or studies based on multiple observations (e.g. the probability of getting another mean in future within some credibility or other interval).

These discussions are predicated on the assumption that there are no other models to consider other than those based on simple Bayes rule and various related assumptions. Perhaps the discussions should include the possibility that other models (e.g. based on an expanded form of Bayes rule) may be required.

I for one have never assumed models based on simple Bayes’ rule were adequate for capturing what’s needed to arrive at reliable inferences or diagnoses.

This is a fine discussion and it obviates any comment from me. That said, my answer is the trivial one that of course you ‘can’ change your prior in light of the data, violating Bayesian orthodoxy (as Senn notes) but opening the door to many interesting analysis recipes (just as a Jew or Moslem ‘can’ eat pork…). But then, of course we should all fear manipulation of the prior to achieve a given posterior answer (prior hacking), just as we should fear P-hacking via regression-model selection (far more common than many wish to admit) or significance questing via alpha-level manipulation (very limited in practice thanks to institutionalization of cutoffs like 0.05).

A question this raises is what kind of (reform) Bayesian are you if you allow prior selection based on the data under analysis. I don’t see the option in Good’s scheme (“46656 Varieties of Bayesians”, The American Statistician, Dec. 1971) but it seems that prior-sensitivity analysis as advocated by most applied Bayesians invites the reader to do prior selection. In the extreme and to account for uncertainty about the prior, one would become empirical Bayesian and make frequentist corrections for the flexibility achieved (if one cared about calibration, as I and I think most statisticians do; see also Good IJ: “Hierarchical Bayesian and empirical Bayesian methods,” The American Statistician, Feb. 1987) – according to Lindley this would make you the least Bayesian of all. Fine with me, as I think the Bayesian/frequentist distinction in applied statistics is (or should be) about which tool fits the job at hand (which could be one, the other, both or neither).

I am entirely with Andrew on the Boxian point about the prior and likelihood (or in more modern terms, the penalty and estimating functions) being on equal logical footing, both being model components (subsets of the assumptions being made for deductive statistical inference). I think our agreement arises because we both have to make do with often voluminous but always questionable observational data, for which the idea of knowing the correct model is about as sound and clear as the idea of ‘knowing god’. In the kind of settings Andrew and I work in, typically little sincere effort has been given to producing models consonant with genuine, publicly verifiable external information. The notion that the prior is subjective while the likelihood is objective reflects the fact that prior components get attacked vociferously for this genuine shortcoming, whereas likelihood components get a free pass or are treated as if they rest on a solid foundation of study features. Yet in these settings the latter ‘objective’ treatment of likelihood is just a social fiction (like the idea that alpha=0.05 rests on some kind of universal loss function), and its shortcomings get dealt with in a wholly subjective manner in Discussion sections rather than in formal analyses.

I continue to hope that someday statistics can replace the overcharged ‘subjective/objective’ distinction with a continuum reflecting the amount of explicit evidence available to support the different model assumptions (whether they are packaged in the prior model or the data-generating model).

Sander: Thanks for your comment. I agree with around ~~80%~~ make that 85% of it, except for the all important issue of blurring prior probability distributions with model assumptions in a non-Bayesian framework. You raise an important question: “what kind of (reform) Bayesian are you if you allow prior selection based on the data under analysis” [to matter]? [Possibly one who cares about error probabilities of the procedure.]

The prior is supposed to express evidence or beliefs about the distribution of the parameters (even when they are regarded as fixed). Where is it? The subjectivist says it’s in me and can be elicited by betting behavior. Nonsubjective (conventional) Bayesians like Berger will still do that for “interest” parameters. This entity, whether it’s the uncertainty “in me” about the values of parameters or a convention to give heaviest weight to data, or even to match alpha/beta, at the end of the day, remains cloaked in mystery (unless maybe when it’s a genuine frequentist prior). If you give us something that is supposed to measure agents’ subjective degrees of belief, you shouldn’t be surprised when it is taken as subjective (quite aside from your possibly having an objective way to elicit it, e.g., through betting behavior, which we usually don’t). Likelihoods and sampling distributions are rather clear, with restricted latitude, open to simulations and tests. Priors are touted as the best way to bring in information, which leads to Senn’s problem. If the prior is a vague shell to be firmed up by the data, then it appears the data or likelihoods are counted twice. And all for what? I agree that it should be a matter of evidence supporting model assumptions–your last point.

Thanks Mayo. Let’s see if we can whittle away at “the all important issue of blurring prior probability distributions with model assumptions in a non-Bayesian framework”.

In a non-Bayesian framework, prior probability distributions have several interpretations, among them:

(1) as penalty functions imposing a loss structure, as in Stein estimation, ridge regression, and related shrinkage procedures, and

(2) as genuine random-effect distributions that give physical frequencies with which target parameter values occur (as in problems such as predicting the long-term yield of a randomly chosen cow given its current milk yield and the distribution of long-term yields in the sampled population, or the average diastolic blood pressure of a randomly chosen patient given one blood-pressure measurement on the patient and the distribution of blood pressures in the sampled population).

In all the applications I know of, these two uses of priors have isomorphic mathematical structures (and are isomorphic to the structure of Bayesian models) so their difference must be in their physical meaning.

For (1), the penalty chosen is in the same category as the loss function for a Wald-type decision rule, so this is an assumption that the analysis audience/customer has that loss function. Example: In my field the loss associated with inferring that a relative risk RR is under 2 is almost the same for all true RR>10, leading to very light-tailed (e.g., lognormal) shrinkage of RR estimates; I can imagine however that this penalty would be wrong when ‘black swan’ catastrophes are rationally feared (e.g., when engineering space shuttles). These sorts of assumptions are indeed qualitatively distinct from those in the classical likelihood, in that they are not deduced from a model of the process under study, but rather, like alpha-levels, are supposed to be derived externally from a richer specification that involves utilities. Some would say those utilities are ‘subjective’ but I think that blurs other distinctions, and in any event ‘subjective’ carries too much psychosocial baggage.

For (2), the target parameter has a genuine frequency distribution, so here I fail to see the blurring in what Andrew or I said earlier. We are simply deploying our data about the sampled population to improve our predictive distribution (shrink the distribution of our errors) via hierarchical regression (aka empirical Bayes, BLUP, etc.). That improvement is demonstrable in pure frequentist terms in sampling experiments. There is a connection to (1) via the choice of error measure, but since it may not be relevant to your objections I will leave that aside for now, and simply hope you agree that even a pure frequentist (or error statistician) should welcome prior distributions when those have a physical sampling-frequency basis (as in empirical Bayes). There is of course prior information entering this analysis but in the frequentist way, via a (compound or hierarchical) sampling model.
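A minimal sampling experiment of the kind described (all numbers are my own illustration): when target parameters really are drawn from a population distribution, shrinking each noisy estimate toward the population mean demonstrably reduces frequentist risk.

```python
import numpy as np

# Interpretation (2): parameters (e.g. cows' long-term yields) genuinely
# vary across the population with variance tau^2; each is measured once
# with noise variance sigma^2.  Compare the raw estimate with the
# (oracle) shrinkage estimate over repeated sampling.
rng = np.random.default_rng(0)
tau2, sigma2, k, reps = 1.0, 4.0, 50, 2000

mse_raw, mse_shrunk = 0.0, 0.0
for _ in range(reps):
    theta = rng.normal(0.0, np.sqrt(tau2), k)   # true parameter values
    y = rng.normal(theta, np.sqrt(sigma2))      # one noisy measurement each
    shrink = tau2 / (tau2 + sigma2)             # shrinkage toward the mean 0
    mse_raw += np.mean((y - theta) ** 2)
    mse_shrunk += np.mean((shrink * y - theta) ** 2)

print(mse_shrunk < mse_raw)  # True: shrinkage wins on average
```

In practice the shrinkage factor is itself estimated from the data (the empirical-Bayes step); using the known population values here just keeps the sketch short.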

If you take no issue with these non-Bayesian uses of priors, that leaves our divergence in the realm of the applied-Bayesian view of priors, which attempts to emulate (2) to the extent the Bayesian analysis deploys explicit information to build the prior in a public way, but which decays toward merely formulating a closing opinion (maybe someone’s, maybe no one’s) to the extent the priors are only elicitations. Nonetheless, a frequentist analysis exhibits the same bifurcation: It attempts to emulate a classical experimental analysis and succeeds to the extent it deploys explicit information to build the sampling model in a public way, but decays toward merely formulating a closing opinion to the extent the sampling model was taken off a convenient shelf (package) and perhaps tweaked a bit based on expert elicitations.

Ah but in the last case can’t the frequentist get saved from artifacts of double-counting via big-data-based fitting with calibration correction (e.g., cross-validation)? In terms of calibration alone, yes, but then the Bayesian can deploy the same fix to achieve calibration – provided she can ‘cheat’ and modify the prior (as well as the sampling model) based on the data. We can all do that as long as we don’t worry about being excommunicated from the highest church of Bayes (P.S. I tend to doubt Bayes would have qualified for this church, any more than Jesus would have qualified for a large number of denominations claiming to represent his views – including possibly Bayes’s).

“Ah but in the last case can’t the frequentist get saved from artifacts of double-counting via big-data-based fitting with calibration correction (e.g., cross-validation)? In terms of calibration alone, yes, but then the Bayesian can deploy the same fix to achieve calibration – provided she can ‘cheat’ and modify the prior (as well as the sampling model) based on the data. We can all do that as long as we don’t worry about being excommunicated from the highest church of Bayes”

I think this is where the modeling using Bayes’ rule and error stats converge into the same philosophical house, which is consistent with a common understanding of what constitutes rigorous science. You place confidence in a model to the extent it has given reason to trust it, and measure the error from it in a manner that provides expectations for future performance. You understand how you might be misled.

John: No time to do justice to your comment now, but I would like to say one thing: My position is that we should run far away from any account that requires breaking the rules and cheating to be good*. On the contrary, I advocate placing front and center the admonitions against trickery and delusion. It must be very easy for people to hold statistical inferences and statistically-based claims accountable. Show me your P-value (if that’s what you’re using) passes an audit: (a) did any cherry-picking, p-hacking, barn hunting, optional stopping, multiple modelling, etc. enter to render the reported P-value different from the approximate actual one? (b) Violated model assumptions? After the audit comes the question of whether it’s merely an isolated effect or one that followed Fisher’s requirement to show you know how to bring about genuine effects. Next, fallacies of rejection: (i) alleged discrepancy larger than warranted? (ii) effect is real but it fails to warrant the substantive research claim; and fallacies of nonsignificance: which discrepancies from the null are well and poorly ruled out?

In short, rather than a wink and a nod (we know how to do it right by breaking the rules that we wrote down), the rules should be geared to make it easier to be responsible, and for others to hold you responsible. Statistical reports should put potential flaws up front, demonstrate they’ve bent over backwards to give their account a hard time, publish papers on how not to go about investigating such and such, and give prizes to papers that show the clever new method you’ve found to get around obstacles of research.

*Senn alludes somewhere to “being good rather than perfect”, and I always remember Good saying (quoting someone else) something like: a good/lucky Bayesian can do better than a good frequentist, but a bad Bayesian gets clobbered. Our statistical philosophy should match a post-crash mentality to keep the experts and robots accountable.

Sander:

I will study your comment carefully later, but the issue isn’t whether Bayesian priors can have non-Bayesian interpretations*, it’s whether the Bayesian modes of inference now on offer (which I admit may be radically different from anything Gelman recommends since he says he opposes reports of Bayesian posteriors as well as Bayes ratios and factors) are preferable or even good ways to carry out inductive inference in science. That is a modest but pressing concern. To me, adequate inductive inference in science requires assessing and controlling error probabilities associated with inferences (be they tests or estimates or something else). The “Bayesian reforms” and Likelihoodist reforms being touted as methods to replace frequentist (error statistical) hypotheses testing and confidence intervals do not satisfy this minimal requirement. In fact many of the methods are advertised as not having to worry about cherry picking, optional stopping, significance seeking and the rest. Having knocked down straw men versions of frequentist methods, they dangle these goodies which permit you to pick an alternative hypothesis plus a prior to find comparative support wherever you like. If in addition to that latitude, priors may be changed at will, it’s just further evidence of lack of error control.

*On the point of getting frequentist matching, you might know this article by Don Fraser in which he shows the limits to which one can expect Bayesian posteriors to even give what he calls “quick and dirty confidence”.

https://errorstatistics.files.wordpress.com/2014/06/fraser_is-bayes-posterior-just-quick-and-dirty-confidence.pdf

When I saw the title of Deborah’s post the other day, my immediate reaction was, “Of course one can change one’s prior.” And why not? To me, a frequentist, injecting “feelings” into one’s analysis (as usual, I’m excluding empirical Bayes) makes the whole analysis informal. If one’s feelings change, then fine, recompute the prior. Do I think it’s desirable? No, but since I disagree with setting subjective priors in the first place, personally I don’t think any of it is desirable. I’m merely saying I don’t see setting a new prior as violating the Bayesian philosophy.

For that matter, there is nothing to prevent one from setting multiple priors right from the beginning. “Me, I think that theta has a normal distribution, but my friend Deborah often uses gamma priors, and you know what, she might be right, so while I’m at it, I’ll set two priors and compute two posteriors.” As far as I can tell, neither one of these actions would be counter to the Bayesian philosophy. But that qualifier (“as far as I can tell”) must be kept in mind. For instance, I’ve never understood the statement made by some that the Bayesian approach is “coherent.” Maybe revising one’s prior, or setting multiple priors, violates that.

When I was recently reading the part of Matloff’s excellent stat textbook wherein he speaks of Bayesian priors as an agent’s “feelings,” I couldn’t help but find myself writing lyrics* to the song with that title, which I thought were pretty good, but Matloff seemed to think the beats were off. I don’t mean to interject jocularity into the discussion, but his use of “feelings” in a textbook cracked me up.

*Given my poetic background

This is a very interesting posting and high quality discussion. Because so much is already written, I’m not sure whether I have anything to say that can’t be found in the previous contributions already (which I read, but potentially too quickly). Particularly the posts of Stephen Senn and Sander Greenland had much of what I think about this, but anyway, here is a sketch of my view:

1) Regarding de Finettian subjective Bayes, the whole probability calculus is derived based on coherence. If you are allowed to change your prior based on the data, coherence can be violated (how to elaborate this depends on the exact way coherence and betting systems are formalised, and it is probably possible to formalise it in such a way that you can’t directly show that coherence is violated anymore, but it rather becomes unclear how to preserve and check coherence under such a move, which is not progress as far as I’m concerned; anyway, that’s a distraction).

2) I think that the same holds for axiomatic “objective” Bayes, but again how to elaborate this depends on the exact formalisation and may be somewhat opaque; as far as I can see, most ways of introducing “objective Bayes” rely on a clear distinction between information available before the data come in (or are looked at) and information in the data, a distinction which is blurred if the prior is modified based on the data.

3) As written before by others, I think that both subjective as well as (axiomatic) objective Bayes should be treated as models, and sometimes models turn out to be bad or unhelpful, in which case people may want to modify them. This will often violate the letter of the law, including changing the prior. Is this a good/correct thing to do or not?

4) On one hand I think that the dogmatic answer “no, it’s forbidden” is not good enough. On the other hand, one shouldn’t forget that there were some reasons in the first place that led to “the letter of the law” as it is. Violating this kind of rule has implications.

4.1 Firstly, it may invalidate some of the reasons people have for doing Bayesian analyses. If you think that coherence is a major feature and advantage of Bayesian statistics, you had better not change your prior based on the data. When doing that, the issue rather becomes how things can be improved by sacrificing coherence and whether that’s worthwhile. This needs to be argued but often isn’t (although it isn’t so much of a problem for Bayesians who don’t worry much about coherence).

4.2 The major problem is one that has an analogue in frequentist/error statistics. Usually, when changing the prior based on the data, something is done that theoretically is not very well understood. The theory assumes that the prior is left alone. I think that a proper analysis of what is implied by changing the prior based on the data would be to look at well-defined combined procedures that formalise how the prior is potentially changed in the light of the data, and would then look at the Bayesian and/or frequentist properties of such a combined procedure. I haven’t seen anything of this kind yet. In error statistical analyses, the analogue would be to analyse combined procedures involving, for example, model misspecification tests first and model-based inference conditionally on the results of the model checking. There is a bit of work in this direction, although chances are that many things of this kind that people do are not so well understood. A Gelman-style Bayesian could potentially change both the (parameter) prior and the sampling model based on the data, which gives more flexibility but makes the understanding of the resulting combined procedures even more difficult. (I don’t have “technical” objections to what Gelman wrote here, but he didn’t mention that if indeed all aspects of the model including the prior are potentially changed in the light of the data, it becomes a harder task to understand exactly what this does to the analyses, compared with the task of the frequentist or error statistician who may be prepared to fiddle with the sampling model but doesn’t have a prior to worry about on top of it.)

5) I think that there is no particular philosophical problem about changing the prior based on the data following Gelman’s Bayesian philosophy (as in the Gelman and Shalizi paper, which in some work with Gelman we call “falsificationist Bayes”), because the interpretation of the prior is neither de Finetti-style “subjective belief” with coherence nor (axiomatic) objective. This of course implies that posteriors also cannot be interpreted according to these traditional approaches; on the other hand one could define error probabilities of the resulting procedures. The problem is rather the one outlined in 4.2, namely that it’s hard to understand what the consequences are of modifying the priors.

As in frequentist misspecification testing, arguments will often involve intuition of the kind “making some data-based model changes without explicitly modelling and analysing how this is done is far less bad than working with a really bad model without realising it and changing it”. Chances are this is usually OK (in the sense of “better than not doing it”) where a bad model is indeed changed into a much better one, but harmful when it becomes a habit indulged too easily and routinely.

Christian:

Thank you for adding your comment. I actually think you’ve made some very distinct points, which I’m glad to have, especially in that you take changing the prior based on the data to be problematic for conventional (nonsubjective, default) Bayesians as well as for subjective Bayesians. With conventional Bayesians, I was less sure, because the prior is often regarded as some kind of undefined entity whose use is to provide a posterior, maximizing the input of the data in some sense. (J. Berger does claim they also somehow convey the beliefs of a group.) But I find conventionalist Bayesians to be schizophrenic in that they also suggest the conventional prior is a stopping-off point until you have more information. So who’s to say the data didn’t just supply that? On the other hand, Berger is very concerned about “double counting”, even though I don’t think he ever defines pejorative double counting, nor says why it’s pejorative. I know he has that paper with Bayarri criticizing Gelman’s posterior predictive p-value as guilty of double counting. Unfortunately, I’m unable to really keep the distinctions between 5 or 6 different Bayesian P-values straight.

“In error statistical analyses, the analogue would be to analyse combined procedures involving, for example, model misspecification tests first and model-based inference conditionally on the results of the model checking. “

Maybe, but we’re not in the business of supplying a posterior, which has to sum to 1 and which would change as the “catchall hypothesis” changes (as Barnard points out); and we’re not trying to give “science-wise” error probabilities, but just to arrive at an adequate model for making statistical inferences with approximately correct error probabilities in the case at hand. So we are positioned to split things off. But I agree that the way some people do model testing could introduce changes that would need to be taken account of.

“one shouldn’t forget that there were some reasons in the first place that led to “the letter of the law” as it is. Violating this kind of rule has implications.

4.1 Firstly, it may make some reasons invalid that some people have for doing Bayesian analyses. If you think that coherence is a major feature and advantage of Bayesian statistics, you better [not] change your prior based on the data.”

Exactly.

I am overcoming my general distaste for blogcentric discussion to contribute a few remarks.

The essence of subjective Bayesianism is *coherence*, which means just what it says: making sure that my expressions of uncertainty in different, real or hypothetical, situations fit properly together, according to the precepts of probability theory. For me, an ideal subjectivist analysis would involve considering what it might be reasonable for me to believe in a variety of particular cases (e.g. after seeing data of some extreme kind), and then adjusting these hypothetical posteriors, as required, to be mutually coherent (by ensuring that they could all be derived from a common prior), at the same time without grossly violating my informal assessments. With that working prior I can now analyse the data I actually have. In this approach the prior is constructed, not regarded as given in advance. (As an aside, this construction is not available to “objective” Bayesians, who do not require coherence.)
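[Editorial aside: Dawid’s construction can be made concrete in the simplest conjugate setting. The following is a toy sketch with invented numbers, not anything Dawid proposed, assuming a beta-binomial model: elicit posteriors for imagined extreme datasets, back out the prior each one implies (Senn’s “downdating” run in reverse of the usual update), and check that the elicitations cohere by deriving from a common prior.]

```python
# Toy sketch (invented numbers) of constructing a prior from hypothetical
# posteriors in the conjugate beta-binomial setting.

def implied_prior(post_a, post_b, k, n):
    """Given an elicited Beta(post_a, post_b) posterior after imagining
    k successes in n binomial trials, back out the Beta(a, b) prior it
    implies -- 'downdating' rather than updating."""
    return post_a - k, post_b - (n - k)

# Elicit posteriors for two extreme hypothetical datasets:
prior_1 = implied_prior(post_a=11, post_b=4, k=9, n=12)   # imagined 9/12 successes
prior_2 = implied_prior(post_a=3, post_b=12, k=1, n=12)   # imagined 1/12 successes

# Coherence requires the hypothetical posteriors to derive from a common
# prior; here both imply a Beta(2, 1), so these elicitations fit together.
print(prior_1, prior_2)
```

Had the two elicitations implied different priors, the subjectivist would adjust the hypothetical posteriors until they agreed, and only then analyse the actual data with the common working prior.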

I emphasise “ideal” above, because in all but the most critical applications, and especially in those many cases where the posterior is fairly insensitive to the prior, this amount of effort will simply not be worthwhile, and a simpler working prior will do.

Now, however much effort, large or small, has been put into assessing/constructing a working prior (indeed, the whole working prior-cum-statistical-model configuration), it can never be more than an approximation to something deeper, perhaps forever out of reach. And even that deeper something would only be a description of my psychological state. How does/should it relate to the outside world?

I believe we should regard any asserted prior-cum-model like any other tentative theory of the world, and (I hope you will like this, Deborah!) subject it to severe testing, along Popperian lines. I have described one approach to this in my paper: Dawid, A. P. (2004). Probability, causality and the empirical world: A Bayes-de Finetti-Popper-Borel synthesis. Statistical Science 19, 44-57. DOI:10.1214/088342304000000125.

What to do if the model fails the test? Abandon it, and look for something better. This might be a matter of “changing the prior”, but is more likely to require changing the statistical model. How to do this? I don’t know — I think we just don’t have any useful formal theory to help us here. Am I downhearted? No! The best we can do is do the best we can. And that’s what we should do.

Philip

Thanks Philip:

I sometimes put this in terms of censorship versus better information processing, always favoring the latter – “The best we can do is do the best we can. And that’s what we should do.”

Rules/formalisms can block inquiry (Peirce’s first rule for what not to do in scientific inquiry).

The challenge remains in making it clear to others what you [fallibly] did, why, and how [you think] that is likely to be less wrong empirically and, if persisted in, increasingly less wrong with further inquiry.

Keith O’Rourke

apdawid: I appreciate your overcoming your general distaste for blogcentric discussion! I hope that having changed your prior for contributing that you will do so again in the future*.

“For me, an ideal subjectivist analysis would involve considering what it might be reasonable for me to believe in a variety of particular cases (e.g. after seeing data of some extreme kind), and then adjusting these hypothetical posteriors, as required, to be mutually coherent…With that working prior I can now analyse the data I actually have. In this approach the prior is constructed, not regarded as given in advance. (As an aside, this construction is not available to “objective” Bayesian, who do not require coherence).”

I thought such a construction was what IJ Good recommended be completed pre-data (except for rare type II violations). You agree, though, that what you’re recommending differs from the standard subjective Bayesian story? OK, so let’s see if this is right: we first consider a variety of extreme outcomes, find posteriors reflecting what you’d believe were you to have observed any of them, making sure they could all arise from a common prior somewhat in sync with prior beliefs, and then you analyze the actual data, which may or may not have been the basis for the imaginary variations of data? In this way you escape Senn’s problem (I think). Wouldn’t the model also be considered, thereby not being so different from the “objective” Bayesian who you say is precluded from using your construction method? Senn has alluded to your ‘changing’ priors that were first entertained in the case of a type of selection. https://errorstatistics.com/2013/12/03/stephen-senn-dawids-selection-paradox-guest-post/

Yes, I do like that you’d then “subject it to severe testing”, but you’d need a distinct statistical account for testing such constructions, and one wonders why you didn’t start there to begin with. I will read your paper. Thank you very much for your comment.

*Or even better, writing a guest post.

I attempt some notes at https://djmarsay.wordpress.com/notes/interpreting-bayes-rule/ . Briefly, I attempt to distinguish between the mathematical and pseudo-scientific aspects of Bayes’ Rule. Mathematically, one should not apply Bayes’ rule without attending to the validity of its assumptions. (A good general rule!) In this case, one should not continue with inferences based on a prior (or other model aspects) once one has good reason to doubt them. Of course, there are dangers in just changing one’s prior (or model) and carrying on. The general solution, I think, is to try to engineer a situation in which one can apply the theory with a clear conscience, in this case by gathering more data. If this is not possible then one has non-Bayesian uncertainty.

To me the most important attribute of a method is the honesty of its practitioners about its limitations. Keep up the blog!

I would like to try to answer the question “Is it legitimate to change one’s prior based on the data?” from a scientific point of view. By “scientific” I mean here a fallible but ideally objective process that is corrected by empirical evidence. For science to work, the claims that one makes should be well justified and reported in such a way that others can judge them for themselves.

In the same sense, there exists a “metaprior” that must be well justified and communicated in a way that others can understand it. No other restrictions need apply to the metaprior. To perform a coherent statistical analysis, a formal prior is then extracted from the metaprior. The prior is formal in the sense that it is a well-defined mathematical object. As the prior is much more restricted than the metaprior, all sorts of approximations and mathematical conveniences come into play at this point. As long as the metaprior is well documented, anybody can judge for themselves whether the prior is a good representation of the metaprior. All of the above applies equally to the likelihood function. It too is based on choices and assumptions made by the scientist: choices and assumptions that must be recorded and communicated.

If this is what Stephen Senn means by “the informal has to rescue the formal” – that behind the formal statistical process is an informal one – then I agree with him. The formal mathematical language is precise, but behind it there is an informal understanding. There is nothing to be ashamed of here; progress in mathematics is often the formalization, or making precise, of useful ideas.

But back to the original question. If at some point (perhaps when looking at the posterior or learning something new that needs to be incorporated into the model) we realize that the prior is not good, we go back to the metaprior and produce a new prior that better reflects reality. Since the metaprior is documented we can assure ourselves that we are not cheating. In short: changing the prior based on only the data is wrong, but formalizing the metaprior in a different way is totally OK.

To sum up: I agree with Lindley that coherence is essential, I agree with Kyburg that the metaprior should not be changed lightly, I agree with Gelman that the model is a placeholder conditional on many things, I agree with Senn that every posterior inference is too strong (but we strive to make them better), I agree with Savage that the metaprior (and the prior) reflects our beliefs about the world, I agree with Mayo that “statistical reports should put potential flaws up front”, and I mostly agree with everything Good and Dawid said. 🙂

Can we agree on an example where it would be illegitimate *not* to change your priors based on the data? I made some attempts on my blog, as above.

Henrik: You say Bayesian coherence is essential. Why? Any method that takes error probabilities into account is “incoherent” by Bayesian lights, so all of frequentist error statistics is “incoherent” and happy to be so. Otherwise you fail to distinguish directly between cases where there has been optional stopping, cherry-picking, and data-dependent selection effects of all sorts. More formally, this corresponds to our rejecting the likelihood principle (LP), which follows from likelihoodist inference and inference by Bayes’ theorem. P-values, type 1 and 2 error probabilities, and power all violate the LP. So, despite the nice-sounding name, “coherence” is scarcely desirable, at least from a frequentist point of view. Ironically, most of today’s Bayesianism is also incoherent; e.g., conventional (or default, or reference, or O) Bayesianism is incoherent because the priors for the same hypotheses will change with the choice of model. However, even though good error control entails violating the LP, the converse does not hold. That is, violating coherence as a conventionalist Bayesian does not mean you satisfy error statistical goals. Of course, in practice even subjective Bayesians fail to be coherent, but that is a matter of human limitation rather than being packed into the formalism, which is supposed to be coherent.

I take it you’re new to the blog, so if you’re interested in the likelihood principle, search LP or SLP (strong likelihood principle) on this blog.

Thank you for the welcome! I have been following this blog for a while, but this is the first time I open my mouth. I have been chewing on the LP, the tacking paradox and other topics, but I have not been able to form a coherent opinion yet.

I came to statistics from mathematics. If you are not coherent, then you have nothing. On the other hand, I don’t think any dogma (or paradigm, if you will) will save you from cherry-picking, data-dependent selection effects, and all sorts of cheating. I make no claim to know what “most of today’s Bayesianism” is up to. However, by reporting the reasoning that led up to the model (prior, likelihood and other assumptions), I declare my baggage and from there on embark on a journey in a vessel with a formal, coherent structure. I don’t have to be 100% coherent in person to use coherent tools or produce a coherent result.

Henrik: Don’t confuse ordinary consistency with the very special notion of Bayesian coherency of inference, as based on the LP: the import of the data (in model-based inference) comes through the likelihoods alone.

If one coheres to mathematical Bayesianism, then one will not claim that anything that goes beyond the mathematics – as some ‘Bayesians’ do – is valid. It is those who advocate Bayesian methods as being universal, rather than conditional on the axioms, who are not adhering to the ‘true’ Bayesianism, as I understand it.

What is true Bayesianism?

John, As a mathematician I consider practice that goes beyond that warranted by the mathematics not to be true (to the mathematics). An analogue is Geometry. ‘True’ Geometry makes no claims about physics, although for centuries the mathematics and the physics were confused. The papers at http://www.economics-ejournal.org/special-areas/special-issues/special-issue-on-radical-uncertainty-and-its-implications-for-economics – including one of mine – claim that this distinction is also important in economics, e.g. in 2008. Comments are invited there on these broader issues. (My paper also has an Einstein quote about Geometry that may apply to Bayesianism.)

The problem here is that you’re taking toy models too literally. Toy models are useful to learn from, but they can be problematic if you mistake them for the reality of statistical practice. The toy model that is being taken too literally here is a single-stage prior + likelihood leading to a posterior.

This is an inadequate model of almost anything but the cleanest of experiments. In reality, you need to think about things like experiment level variation or population level variation or biases particular to your study. So in a practical model, you’d likely arrive at a conclusion looking across multiple datasets, with a prior inferred from data and a hyperprior on top. The parameters for the prior will be adjusted by the data and the inference is still 100% coherent.

What happens when the top level prior is reassessed is that there is an implicit hyperprior above the top level prior of the model that was not explicitly incorporated but understood to exist. Thus one has to manually adjust the prior due to the model simplification resulting in a misspecification. This isn’t some deep philosophical issue, just plain old practicalities of models as an approximation.
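[Editorial aside: the hierarchical point above can be given a minimal sketch. All numbers below are invented, and the within- and between-group variances are assumed known for simplicity: group effects theta_j ~ N(mu, tau²), data y_j ~ N(theta_j, sigma²). The parameters of the “prior” N(mu, tau²) are informed by the data across groups, yet the joint inference stays fully coherent.]

```python
# Minimal normal-normal hierarchical sketch (invented numbers, variances
# assumed known). The prior's location mu is itself inferred from the data.

sigma2 = 1.0                      # assumed known within-group variance
tau2 = 4.0                        # assumed known between-group variance
y = [2.8, 1.9, 3.5, 2.2, 0.6]     # observed group means (invented)

# With a flat hyperprior on mu, its posterior mean is the grand mean:
mu_hat = sum(y) / len(y)

# Each group's posterior mean shrinks its observation toward mu_hat,
# weighting by the precisions 1/sigma2 and 1/tau2:
w = (1 / sigma2) / (1 / sigma2 + 1 / tau2)
theta_post = [w * yj + (1 - w) * mu_hat for yj in y]
print(mu_hat, theta_post)
```

The shrinkage weight `w` shows the sense in which the prior is “adjusted by the data”: the effective prior mean `mu_hat` is estimated from the very observations it then shrinks, with no incoherence anywhere.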

Here’s a link to Dawid’s article: Dawid, A. P. (2004). Probability, causality and the empirical world: A Bayes-de Finetti-Popper-Borel synthesis. Statistical Science 19, 44-57.

https://errorstatistics.files.wordpress.com/2015/06/dawid-20041.pdf

It would seem to me that the reason for sticking to the rules of coherence is to make the resulting Bayesian posterior probabilities as helpful as possible. The test for this is to examine how well calibrated and informative they are (in the sense that a method that always gives a probability of 0.5 that is well calibrated is not very informative). So, this means conducting empirical studies to assess the effect of using different priors and different methods of arriving at them.
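[Editorial aside: one hedged way to picture such a calibration study. The setup below is a toy of my own, not anything proposed in the comment: two point hypotheses with equal prior probability, binomial data, and a check that stated posterior probabilities in a given bin match the observed frequency with which H1 is true.]

```python
# Toy calibration check (invented setup): H1: theta = 0.6 vs H0: theta = 0.4,
# prior probability 0.5 each. Simulate from the prior, compute posteriors,
# and compare stated probabilities with observed frequencies.
import random

random.seed(1)

def posterior_h1(heads, n, p1=0.6, p0=0.4, prior1=0.5):
    """P(H1 | data) for binomial data under two point hypotheses."""
    l1 = p1 ** heads * (1 - p1) ** (n - heads) * prior1
    l0 = p0 ** heads * (1 - p0) ** (n - heads) * (1 - prior1)
    return l1 / (l1 + l0)

n, trials = 20, 20000
stated, correct = [], []
for _ in range(trials):
    h1_true = random.random() < 0.5            # draw hypothesis from the prior
    theta = 0.6 if h1_true else 0.4
    heads = sum(random.random() < theta for _ in range(n))
    p = posterior_h1(heads, n)
    if 0.6 <= p <= 0.9:                        # one probability bin
        stated.append(p)
        correct.append(h1_true)

# If well calibrated, H1 should be true about as often as the stated
# posterior probabilities in this bin claim:
print(sum(stated) / len(stated), sum(correct) / len(correct))
```

Here calibration holds because the simulating prior matches the analysis prior; rerunning the simulation with a mismatched prior would show the miscalibration such empirical studies are meant to detect.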

As far as I know, Dawid’s instrumentalist view is shared by all mathematicians who have thought about it at all deeply, and I regard it as ‘true Bayesianism’. It leaves open Bayes’ own question, as to when the instrument is appropriate. Some maintain always. In doing so they go beyond ‘true Bayesianism’. On my blog ( djmarsay.wordpress.com ) I comment on all the pseudo-mathematical arguments for this untrue dogma that I am aware of. I would be happy to collect more.

Dave:

If I am understanding you correctly, the main point is folks mistaking the representation with what it is fallibly trying to represent (see my first comment above.).

This happens all over the place. I remember my finance prof in MBA school responding to my comment “that’s how that model says things are revalued” with “What model? That is how things are revalued!”

Kass, R. (2011), Statistical Inference: The Big Picture, perhaps makes this point clearly (as did, I think, Dawid) – the relevant excerpt is in my talk abstract here – https://carleton.ca/math/2015/precise-answers-wrong-questions/

Keith O’Rourke

Keith, I agree that from a pragmatic view Kass is making much the same basic point, and he usefully goes into much more detail (although I haven’t checked the details). But his basic approach is quite different. Dawid and I are trying to establish logically sound foundations, with no hidden assumptions. Kass – as I read it – is more about fixing contemporary practice to overcome some manifest problems. Both are valuable.

Interesting – from a quick glance at your page on Ramsey: “Actual samples are never actually random: but the theories that describe them may be random, and may be a good match.”

This seems to be taken almost literally from Peirce, as do “a psychological phenomenon to be defined and measured by a psychologist. But this sort of psychology goes a very little way and would be quite unacceptable in a developed science” and “We are all convinced by inductive arguments, and our conviction is reasonable because the world is so constituted that inductive arguments lead on the whole to true opinions.”

It is well known that Ramsey read and acknowledged Peirce widely.

Anyway, I’ll keep on eye on your web site – thanks.

Keith, I haven’t read enough Peirce to be confident in interpreting him, but it seems to me that he is concerned with the non-controversial cases, otherwise warning of the dangers of applying the theory. Looking at his theory of probable inference, though, his notion that induction only seems to work because we only attend to things for which it works seems quite credible. (Studies in Logic, pp. 175-8).

> non-controversial cases,

Hmm, this might suggest otherwise https://en.wikisource.org/wiki/A_Neglected_Argument_for_the_Reality_of_God

But it is difficult to get a sense of Peirce’s overall interests.

Interesting. Your link refers to what I might call a meta-theory of probability. He claims that science has been very successful at uncovering truths and hence (by naïve induction) we should expect the result of induction – as used in science – to be correct. (Keynes says something similar.) But it seems doubtful that induction (and statistics) have been much good at uncovering truths in economics, so his argument does not apply. It seems to me that Peirce fell foul of his own critique: that we only attend to things for which our methods work.

I regard his probability theory as reasonable, his broader thoughts (as in your link) much less so. The challenge is to try to characterise situations in which his beliefs are particularly misleading.

Dave: Peirce certainly rejected the “straight rule” of induction as a poor way of learning from error. He defined induction as severe testing (very much as I do) and argued that success in science owes to the self-correcting property of induction—a property it enjoys provided there are not blocks placed in the way, obstructing learning. He lists those blocks–reliance on belief and authority, violating predesignation (allowing biasing selection effects) and violating randomization (or similar method). He argued very explicitly against subjective probability, even using that term!

> argued very explicitly against subjective probability, even using that term!

OK, but he was also clear he was wrong about everything…

Not sure what he would think about Boxian Bayes and priors as placeholders rather than blocks to inquiry.

Also, his phrase that you can’t start inquiry from anywhere but where you find yourself when you start seems to underline that one cannot escape prior expectations.

Oh please, it’s silly to build a philosophy for Peirce on such tautologies and ironies.

Deborah, There seem to me to be at least two Peirces, whom I can’t reconcile. In his probable inference he seems much like yours. In Keith’s link he seems to go far beyond his own advice. Maybe I need to work harder at interpreting him.

By the way, I see mathematical probability theories as being somewhat like Kolmogorov’s, in which the interpretation of the probability measure is left open, much as the interpretation of points and lines in Euclidean Geometry is not a part of the mathematics itself.

I see no harm in using subjective probabilities, as long as you interpret the results accordingly. On the other hand, I often delight in being irrational to the extent that my actions are incompatible with any possible probability measure. This is okay, as long as we see probability theory as just an instrument that we can use when appropriate and ignore otherwise.

> should expect the result of induction – as used in science – to be correct.

No – definitely not correct but rather eventually less wrong if good inquiry is adequately persisted in long enough. Also see Mayo’s comment below.

Keith O’Rourke

Keith, See my reply to Mayo: we seem to be talking about different Peirces. In your link he argues for the reality of God. As far as I can see one could have adapted his argument to argue for the reality of Euclidean points and lines. Or was that enquiry not good, or not persisted in long enough?

If we take Mayo’s Peirce seriously then it is hard to see how even the best enquiry persisted in for ever could ever result in Bayes’ rule producing a probable inference in favour of any theory. The most one could do is make a probable inference relative to the gamut of theories that we deem possible. But then we deduce, not that God is Real but that a belief in God is a logical consequence of our bounded imagination (or something like that). Perhaps I am misunderstanding him?

From Henrik M (similar things were written by others):

“If at some point (perhaps when looking at the posterior or learning something new that needs to be incorporated into the model) we realize that the prior is not good, we go back to the metaprior and produce a new prior that better reflects reality. Since the metaprior is documented we can assure ourselves that we are not cheating. In short: changing the prior based on only the data is wrong, but formalizing the metaprior in a different way is totally OK.”

The key question here is what it means to say that the “prior is not good”. According to standard Bayesian reasoning, the prior formalises belief/information *before* seeing the data and the function of the data is to get the posterior from the prior. If this is so, it’s not so clear how the data can contribute anything to the issue of whether the prior is “good” or not, unless the prior is interpreted in some kind of frequentist manner as “real” parameter generating mechanism.

I do realise that at times people come up with foolish formalisations of their prior belief/information and that it may happen occasionally that they only realise this after seeing the mess that is their posterior after seeing the data. But allowing changes all too easily opens all kinds of floodgates, I think… plenty of opportunity for “forking paths”. I don’t think Henrik’s suggestion can really make sure that we are “not cheating” – if Henrik’s metaprior allows a lot of flexibility, there are many “researcher degrees of freedom”.
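[Editorial aside: the “forking paths” worry can be given a toy quantitative form. The construction below is my own, with invented settings: a normal mean with n = 25 observations and equal prior odds. If, after seeing the data, one is free to pick the “prior” under H1 that most flatters it, the claimed posterior for H1 is systematically inflated even when H0 is true.]

```python
# Toy forking-paths illustration (invented setup): under a true null,
# compare a prior fixed in advance with a "prior" chosen post hoc to
# maximize the apparent evidence for H1.
import math
import random

random.seed(7)

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

n = 25
fixed, cherry = [], []
for _ in range(5000):
    xbar = random.gauss(0, 1 / math.sqrt(n))       # H0 (mu = 0) is true
    m0 = normal_pdf(xbar, 0, 1 / n)                # marginal likelihood under H0
    m1_fixed = normal_pdf(xbar, 0, 1 + 1 / n)      # H1 prior fixed in advance: mu ~ N(0, 1)
    m1_cherry = normal_pdf(xbar, xbar, 1 / n)      # "prior" placed post hoc at mu = xbar
    fixed.append(m1_fixed / (m0 + m1_fixed))       # posterior P(H1), equal prior odds
    cherry.append(m1_cherry / (m0 + m1_cherry))

print(sum(fixed) / len(fixed), sum(cherry) / len(cherry))
```

The post-hoc “prior” can never report a posterior for H1 below 0.5, whatever the data, which is exactly the kind of researcher degree of freedom a well-documented, pre-data metaprior is supposed to rule out.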

Christian: What do you say of Dawid’s explanation of how the data enter in changing the prior: I consult my belief in some hypothesis H given x, and adjust my prior accordingly.

It seems to me that I’d get different priors if I arrived at them through the typical betting elicitation as opposed to what Dawid and Good describe. (I may be wrong as to which is the more typical way to obtain subjective priors, I’d be glad to know). Dawid had written: “For me, an ideal subjectivist analysis would involve considering what it might be reasonable for me to believe in a variety of particular cases (e.g. after seeing data of some extreme kind), and then adjusting these hypothetical posteriors, as required, to be mutually coherent (by ensuring that they could all be derived from a common prior), at the same time without grossly violating my informal assessments.”

Now let’s say I believe strongly in hypothesis H and when I see the data x disagrees with H, I find that I still feel fairly strongly in H, maybe my belief goes down a little bit. In attempting to appear fair and deny I’d obstinately believe H no matter what, I could say that “If the data had been x’ which disagrees with H much more than the observed x does, I would have a much lower posterior”, and then reflect this in my prior. But in fact, even if I’d seen x’, I would still believe strongly in H, but I didn’t have to admit that since x had been observed. So the prior I arrive at through Dawid’s method is likely to be different from the one I would obtain by playing a betting game prior to seeing the data.

But Dawid is up front in saying it’s only about inner coherence: you just nip and tuck so that it all fits together and could be reconstructed Bayesianly. He admits: “that deeper something would only be a description of my psychological state.” We are modeling Dawid’s psychological state! When it comes to communicating evidence for H, we’d be very distrustful if we knew that’s how he arrived at his reported belief in H, because we know he’d be able to work it out so H looks good (practically) no matter what. Although, if he broke down the likelihoods and prior, we could see how much work that prior is doing in saving H. Maybe this is what Savage recommends (e.g., in the 1962 Savage forum.)

I’m not sure whether I interpret your interpretation of Dawid correctly, but if I do, I don’t agree with it. I think that Dawid’s ideal subjectivist would assign the prior before seeing the data that will generate the posterior, without adapting the prior post-data. Your interpretation reads as if you think taking into account the data already fits into his “ideal” description, but this may be my mistake interpreting your text. By “hypothetical posteriors” he means thinking about what kind of posteriors his prior information would have generated if this had been a result of proper Bayesian analysis. And actually this, I think, agrees with what you get if you ask him to bet before data. So I don’t think this is so fundamentally different from de Finettian subjective Bayes.

Dawid may be willing to “reject” models at some point if he is poorly calibrated (as mentioned in another posting), potentially violating coherence, but this is a different story. I’d guess that in most such cases he wouldn’t advocate changing the prior post-data, but rather choosing the next prior before seeing the next bunch of data in a way different from using the posterior from his previous Bayesian data analysis (which an ideal coherent Bayesian would be committed to do).

Christian: I am reasonably happy with your interpretation of my views. Philip

Philip: I read your paper. Two things: (1) you still need statistical rules for falsification (as Popper allowed, referencing Fisher); that is why it became methodological falsification. I take it you know this. (2) How are the models subjected to severe tests? Unless I’m missing it, I do not see that they are, and the reason is the restriction to isolated observations (positivist style), which, as Popper would (and did) say, aren’t of scientific interest, being improbable occurrences rather than genuine effects. But, as Peirce emphasized, and empiricist Popper never recognized, data combined appropriately give a more reliable result than the individual observations (as in statistical inference). Only such genuine effects can serve as the empirical basis for a falsification, and thus for some corroboration of the model (when it passes and is not falsified).

Christian: Thank you. Given Dawid was responding to the query (about changing priors based on data), and given there were 2 equivocal sentences in his comment, I interpreted him to mean he’d go through the repertoire, after having seen the data–in those special cases where the posterior was not in sync with his belief.

These sentences were: “With that working prior I can now analyse the data I actually have. In this approach the prior is constructed, not regarded as given in advance.”

But now I gather, given he’s approved of your reading, that this is not quite right, and the prior is given in advance after the imaginary results exercise. Yet, if he chooses the new prior, planned for experiment 2, ignoring the posterior from experiment 1, is this really different (from having changed the prior from E-1)? It appears one starts over again rather than updating. Experiment E-2 might be essentially the same experiment only now with the priors expected to give posteriors more acceptable than before. And should it happen again (that the posterior doesn’t reflect his belief, maybe because the results of experiment 2 were even further away from what was expected than in experiment 1), I take it he could do it again. But I get the idea, I think.

Deborah–

My original post involved two quite distinct points:

1. When trying to be fully coherent, it is helpful to consider the various things you would want to cohere, and construct your prior accordingly. This is a purely mental exercise–no actual data involved.

2. But just because one has come up with a tentative probabilistic description of the world (perhaps, but not necessarily, by following path 1), Nature is under no obligation to conform. We need to be open to signals of this nonconformity in extensive data, and be prepared to adjust our description accordingly. I don’t think we have a fully self-consistent theory to guide this process.

I make no fundamental distinction between “statistical model” and “prior”, and ask only how well their combination serves as a description of observed data-sequences. I suspect it will more often be inadequacies of the model component, rather than of the prior, that will lead to a poor fit. We can focus our tests on that aspect if desired.

Philip

Am I right in thinking, from what Christian wrote,

” I’d guess that in most such cases he wouldn’t advocate changing the prior post-data, but rather choosing the next prior before seeing the next bunch of data”

that once the description is adjusted accordingly, and if the prior was changed in an important way, past and current data would only be used to adjust that prior to a new prior, so that in a formal sense there would be no current posterior, just a new prior?

I guess it would depend, but that would appear to be the high road.

Keith O’Rourke

This was essentially my concern, but I just assumed starting over again was kosher on the type 2, meta-level of Bayesian rationality. This is what Jack Good told me for years, even though it would have been better to have it all thought out in advance. I’m sorry for my more flippant response earlier, which began as a jokey reaction to reading (in a published remark!) that if a Bayesian principle is violated, this actually confirms the principle by Bayesian lights.

I feel my distaste for blogcentric discussion returning.

Signing off…

Philip

I think “Bayesian inference” is the phlogiston of modern science, in the sense that there is no possible way to characterize it given the incoherent representations. It seems that the subjective approach has little support (judging from comments on this blog), and that the idea of validating models has plenty of support. Gelman et al.’s text is the most popular, so why not define Bayesian inference as modeling uncertainty, with requirements to validate the model, measure error, etc., and find another name for the other, incompatible approaches?

John: I think Dawid is espousing a contemporary subjective Bayesian view, and my earlier kvetch was written too quickly (at a place where I was about to lose connection, so I had no time to fix it.) That said, I want to urge, in the most constructive way I can, that we not construe the goal of statistical inference as modeling or quantifying “uncertainty”–whatever that is. I can see calling events uncertain, but even there I’ve always felt it was murky and ill-defined. Events may be assigned a probability by a statistical model, and that calls for the means to test, reject, or corroborate hypothesized statistical models. The interest in finding things out in science involves warranting and critically evaluating general claims and models, and that is not a matter of describing how uncertain I feel about them.

John:

> the phlogiston of modern science

If you have not, you might want to read http://www.springer.com/us/book/9789400739314

Keith O’Rourke

john byrd: The problem with what you’re suggesting is that the term “uncertainty” is ambiguous. De Finetti and the subjectivists tried to make it precise, and in this respect I still think they were more successful than Bayesians advocating interpretations other than the subjectivist one (one may think that this is not appropriate for science, as Mayo and many others do, but that’s a different aspect). There may be promising alternative attempts (including Gelman’s), but I think we are not quite there yet.

Apart from that I don’t think that the term “Bayesian inference” should be reserved for one particular way of using Bayes’ theorem for making inference, or that the term should be awarded by majority vote. What’s Bayesian should be called Bayesian; if you want to be more precise, come up with more precise terms. (My current favourite is “falsificationist Bayes” for what Gelman is up to.)

Keith: I think in this framework the terms “prior” and “posterior” only make sense relative to certain data. The idea is that at any point in time there are distributions encoding your beliefs.

These are “priors” in the sense that if you then encounter new data, you use your current belief distribution as prior. On the other hand, if your current belief distribution is a result of Bayesian updating of a former prior by some data that you’ve already seen, it’s in this sense a posterior. There is no contradiction between the same distribution being a prior relative to future data and a posterior relative to past data.
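Christian’s point can be illustrated with a toy conjugate model (a hypothetical sketch; the counts are invented): the distribution reached after one batch is a posterior relative to that batch and, unchanged, the prior for the next.

```python
# Beta-Binomial sketch: conjugate updating, so each update just adds counts.
def update(belief, successes, failures):
    """Return the Beta(a, b) parameters after observing the given counts."""
    a, b = belief
    return (a + successes, b + failures)

prior = (1, 1)                              # Beta(1, 1): belief before any data

after_batch1 = update(prior, 7, 3)          # posterior relative to batch 1
after_batch2 = update(after_batch1, 2, 8)   # ...now used as prior for batch 2

# Updating on all the data at once yields the identical distribution.
assert after_batch2 == update(prior, 9, 11)
print(after_batch2)  # (10, 12)
```

The same pair of parameters is a posterior relative to the data already seen and a prior relative to the data to come; no contradiction arises.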

Can you really say that all these disparate approaches to, say, establishing priors are based on Bayes’ theorem or utilize the theorem?

Bayes’ theorem isn’t about establishing priors, but about how to use them once you have them.

Christian: I’m unable to post under you. The reason Bayesianism, in any form, is at odds with falsificationism (and we really need to write this as methodological falsificationism, since there is no other type, except perhaps when the hypothesis is “all swans are white” and you can unproblematically observe a non-white swan) is that a Bayesian probability requires an exhaustive set of hypotheses, possibly with a “Bayesian catchall factor” left over. So you never get anything new directly from Bayes’ theorem, which is deductive.

So far as I can tell, even Gelman’s posterior predictive checks depend on a full set of possibilities in order to get the posterior. But this needn’t preclude identifying a problem with the model, assuming the priors.
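For concreteness, the mechanics of a posterior predictive check can be sketched in a toy Beta-Binomial model (purely illustrative; a generic sketch, not Gelman’s actual procedure or priors):

```python
import numpy as np

rng = np.random.default_rng(42)
y = np.array([1] * 9 + [0])                # observed: 9 successes in 10 trials

# Conjugate posterior under an (assumed) Beta(1, 1) prior
a, b = 1 + y.sum(), 1 + (1 - y).sum()      # Beta(10, 2)

theta = rng.beta(a, b, size=5000)          # posterior draws for the rate
y_rep = rng.binomial(n=len(y), p=theta)    # replicated success counts

# Posterior predictive p-value: how often replications match or exceed
# the observed count
ppp = (y_rep >= y.sum()).mean()
print(ppp)
```

An extreme ppp (near 0 or 1) would flag a misfit between model-plus-prior and data; note that the whole exercise presupposes the posterior, i.e., the full set of possibilities.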

In any event, the methodological falsification rule must be statistical. In statistics, and most everywhere else, the method for falsifying depends on a rule for doing TWO things (at minimum): (1) determining there’s a genuine anomaly for claim H and whatever other assumptions are used, and (2) pinpointing the source of the anomaly to H. Otherwise there is no falsification of H (there’s just a disjunction of something wrong somewhere).

Consider first (1) just finding a genuine anomaly without pinpointed blame, i.e., without yet solving the Duhemian problem. As Popper and Fisher rightly point out, an isolated misfit counts for nothing, one requires a genuine effect. Thus, in any interesting case, and certainly in the kinds of models Gelman has in mind, a falsification requires an inference to a genuine effect (which may be anomalous for H). Thus one needs an account of statistical inference just to get evidence of a genuine anomaly.

Error statistical methods are the prime methods for posing simple questions in order to ascertain if there’s a genuine effect.

What about the Duhemian problems in (2)? The N-P/Fisher trick is basically to get the p-value to hold apart from nuisance parameters. At least in a broad set of cases this can be done; it is behind the striving for “Neyman structure”, wherein the type 1 error is alpha regardless of nuisance parameters. Fisher accomplishes the same thing, basically, by conditioning on statistics sufficient for the nuisance parameters.

Now, as I understand it, the Bayesian deals with nuisance parameters by integrating them out, which requires having prior probability distributions over them. Does this accomplish the above two steps: finding a genuine anomaly, and pinpointing it as an anomaly for H? This I don’t know. (My comments on Gelman and Shalizi are at the current post: https://errorstatistics.com/2012/06/19/the-error-statistical-philosophy-and-the-practice-of-bayesian-statistics-comments-on-gelman-and-shalizi/).
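To make “integrating out” concrete, here is a hypothetical grid-approximation sketch for a normal model with mean mu (the parameter of interest) and standard deviation sigma (the nuisance), under an illustrative flat prior on mu and a 1/sigma prior on sigma; every choice here is for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=2.0, size=30)   # simulated observations

mu_grid = np.linspace(-3.0, 5.0, 161)            # parameter of interest
sigma_grid = np.linspace(0.1, 6.0, 120)          # nuisance parameter
M, S = np.meshgrid(mu_grid, sigma_grid, indexing="ij")

# Log joint posterior: normal likelihood, flat prior on mu, 1/sigma on sigma
resid_sq = ((data[None, None, :] - M[..., None]) ** 2).sum(axis=-1)
log_post = -data.size * np.log(S) - resid_sq / (2.0 * S**2) - np.log(S)

post = np.exp(log_post - log_post.max())
marginal_mu = post.sum(axis=1)                   # integrate sigma out (grid sum)

mu_hat = mu_grid[np.argmax(marginal_mu)]         # mode of the marginal
print(mu_hat)  # the grid point nearest the sample mean
```

The marginal over mu no longer mentions sigma at all; whether this marginalizing machinery also finds and pinpoints anomalies, as steps (1) and (2) require, is exactly the open question.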

Beyond this one wants a warranted improved model, and as pure significance tests warn: one is not entitled to infer some corrected form of the model that “takes care of” the observed anomaly. The pure significance test does not exhaust the space of alternatives by any means.

Adequacy of model assumptions is a separate task. Any move from the genuine statistical effect to a substantive research hypothesis is another step.

This entire piecemeal approach fits wonderfully into the error statistical philosophy, because we never have to carry around an entire framework or plan for numerous eventualities ahead of time. We don’t need a Bayesian catchall factor (everything other than H).

Christian and others: A link from Gelman’s blog today sent me to an exchange between you and him, and I jotted down some thoughts on my Rejected Posts blog. I’d like to know what you think.

http://rejectedpostsofdmayo.com/2015/06/29/on-what-evidence-based-bayesians-like-gelman-really-mean-rejected-post/

Have we not learned that, for many people using what they call a Bayesian approach, there is no need for a prior to be “ante-data under analysis” and no need for an exhaustive set of hypotheses, each with a well-thought-out prior? There is sophisticated modeling, and an attempt to validate the models, with no need to adhere to the original theorem. That is my perception of the state of affairs. Is this incorrect?

Mayo, today you linked to Andrew’s blog. My comment is that a uniform prior over theta is hardly uninformative: the probability that theta=0 is allegedly 0. If Andrew doesn’t accept this prior, why should he accept the posterior? This seems a good example where it is hard (perhaps impossible) to conceive of a meaningful prior.
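A toy contrast (hypothetical numbers throughout) makes the point: as the window around theta = 0 shrinks, a continuous uniform prior leaves no mass there, while a spike-and-slab mixture keeps a lump:

```python
# P(|theta| < eps) under two priors on [-1, 1]; the 0.5 spike weight is an
# arbitrary illustration.

def near_zero_uniform(eps):
    # Uniform(-1, 1): mass 2*eps / 2, vanishing as eps -> 0
    return min(eps, 1.0)

def near_zero_spike_and_slab(eps, spike=0.5):
    # Mixture: point mass at 0 plus a Uniform(-1, 1) slab
    return spike + (1 - spike) * min(eps, 1.0)

for eps in (0.1, 1e-3, 1e-9, 0.0):
    print(near_zero_uniform(eps), near_zero_spike_and_slab(eps))
```

At eps = 0 the uniform prior’s probability for the exact point theta = 0 is exactly zero, which is the sense in which such a “noninformative” prior is anything but noninformative about a point null.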

Mayo: The falsification itself in “falsificationist Bayes” wouldn’t be Bayesian, it would rather follow a model misspecification test logic, not so different from what you are advocating (by the way, Dawid’s rejection of models that are badly calibrated isn’t itself Bayesian either). What is Bayesian is the use of the model (involving prior and sampling model) as long as it’s not rejected/”falsified”.

john byrd: They still have a prior and use Bayes’s theorem to get a posterior from it. True, they may do un-Bayesian things to arrive at the prior and change models at some point (some of them may do it all too easily but it seems too dogmatic to demand that it’s *never* done), but still.

Phil Dawid also wrote a paper entitled ‘The well-calibrated Bayesian’ in 1982 (see http://fitelson.org/seminar/dawid.pdf) and pointed out that there was a serious conflict between calibration and coherence when applying Bayes rule. He was also the examiner for my 1987 MD thesis in which I proposed a new theorem based on the extended ‘alternative’ form of Bayes rule that did not appear to have this problem. He accepted it and I was awarded the MD.

This theorem of ‘hypothetico-deductive probabilistic elimination’ can also be used to combine ‘personal’ probabilities with those generated by observations of nature in the same way as Bayes rule. I have used it mainly as a basis for diagnostic reasoning but it also has potential for wider application (see http://blog.oup.com/2013/09/medical-diagnosis-reasoning-probable-elimination/).

The structure of the Oxford Handbook of Clinical Diagnosis that teaches traditional diagnostic reasoning to students and young doctors is based on its application. The theorem and its proof is described in the final chapter.

After many years of trying to apply Bayes rule to diagnosis and scientific reasoning, I find there are still major problems, as exemplified by the above discussions. It seems that simple Bayes rule is not able to model diagnostic or scientific reasoning satisfactorily. It may be time to consider other models of probability.

Huw: I wouldn’t place “diagnostic” and “scientific” reasoning together. That is why “calibration” for a Bayesian is different from the goal of having reported error probabilities (of methods associated with inferences) match actual ones. In some cases they may boil down to the same thing (e.g., in dealing with ordinary events), but not in general, and not in science. I always find it ironic that the Bayesian leans more toward a long-run accountant mentality than even the behavioristic frequentist (whom he chastises for stressing performance).

In one of the 3-Year Memory Lane items I just posted, I noticed that I mention a remark of Peter Gruenwald’s that is relevant to this post.

“Peter Gruenwald asked the same question I often ask: “Where are the philosophers?” [on a variety of issues in contemporary statistical science]. He raised the problem that arises when Bayesians are led to revise their priors on the grounds that they do not like the resulting posteriors. To avoid Bayesian inconsistency, he says, requires “non-Occam priors.” This should be understood, he suggests, in terms of what he calls “luckiness,” an idea he has found in Kiefer’s conditional frequentist inference.”

https://errorstatistics.com/2012/06/26/deviates-sloths-and-exiles-philosophical-remarks-on-the-ockhams-razor-workshop/

An individual who sometimes posts here sent a link to a blog of his wherein he discusses the question of this post. I will link it here, but recommend that any discussion on it take place on that person’s blog rather than here:

http://www.bayesianphilosophy.com/bayes-is-the-sum-and-product-rule/