I blogged this exactly 2 years ago here, seeking insight for my new book (Mayo 2017). Over 100 (rather varied) interesting comments ensued. This is the first time I’m incorporating blog comments into published work. You might be interested to follow the nooks and crannies from back then, or add a new comment to this.
This is one of the questions high on the “To Do” list I’ve been keeping for this blog. The question grew out of discussions of “updating and downdating” in relation to papers by Stephen Senn (2011) and Andrew Gelman (2011) in Rationality, Markets, and Morals.[i]
“As an exercise in mathematics [computing a posterior based on the client’s prior probabilities] is not superior to showing the client the data, eliciting a posterior distribution and then calculating the prior distribution; as an exercise in inference Bayesian updating does not appear to have greater claims than ‘downdating’.” (Senn, 2011, p. 59)
“If you could really express your uncertainty as a prior distribution, then you could just as well observe data and directly write your subjective posterior distribution, and there would be no need for statistical analysis at all.” (Gelman, 2011, p. 77)
But if uncertainty is not expressible as a prior, then a major linchpin for Bayesian updating seems questionable. If you can go from the posterior to the prior, on the other hand, then perhaps seeing the data can also lead you to go back and change the prior.
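To make the contrast between updating and "downdating" concrete, here is a minimal sketch. It assumes a conjugate Beta-binomial model, which is my choice for illustration and not something Senn or Gelman specify; the function names `update` and `downdate` are hypothetical. In this toy setting the two directions are mirror-image arithmetic, which is roughly Senn's point about the mathematics.

```python
# Toy Beta-binomial model (an assumption made for illustration only).
# A Beta distribution is represented as a tuple (a, b).

def update(prior, successes, failures):
    """Bayes update: Beta(a, b) prior plus binomial data gives a Beta posterior."""
    a, b = prior
    return (a + successes, b + failures)

def downdate(posterior, successes, failures):
    """Recover the Beta prior implied by an elicited Beta posterior and the data."""
    a, b = posterior
    prior = (a - successes, b - failures)
    if prior[0] <= 0 or prior[1] <= 0:
        # The elicited posterior is "weaker" than the data alone would allow.
        raise ValueError("no proper Beta prior is consistent with this posterior")
    return prior

prior = (2, 2)        # client's stated prior
data = (7, 3)         # 7 successes, 3 failures
posterior = update(prior, *data)            # (9, 5)
implied_prior = downdate(posterior, *data)  # back to (2, 2)
print(posterior, implied_prior)
```

Note that `downdate` can fail: an elicited posterior such as Beta(5, 2) cannot be reached from any proper Beta prior given these data, which is one way an elicited posterior could expose an incoherence.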
Is it legitimate to change one’s prior based on the data?
I don’t mean update it, but reject the one you had and replace it with another. My question may yield different answers depending on the particular Bayesian view. I am prepared to restrict the entire question of changing priors to Bayesian “probabilisms”, meaning that the inference takes the form of updating priors to yield posteriors, or of reporting a comparative Bayes factor. Interpretations can vary. In many Bayesian accounts the prior probability distribution is a way of introducing prior beliefs into the analysis (as with subjective Bayesians) or, conversely, of avoiding the introduction of prior beliefs (as with reference or conventional priors). Empirical Bayesians employ frequentist priors based on similar studies or well-established theory. There are many other variants.
S. SENN: According to Senn, one test of whether an approach is Bayesian is that while “arrival of new data will, of course, require you to update your prior distribution to being a posterior distribution, no conceivable possible constellation of results can cause you to wish to change your prior distribution. If it does, you had the wrong prior distribution and this prior distribution would therefore have been wrong even for cases that did not leave you wishing to change it.” (Senn, 2011, p. 63)
“If you cannot go back to the drawing board, one seems stuck with priors one now regards as wrong; if one does change them, then what was the meaning of the prior as carrying prior information?” (Senn, 2011, p. 58)
I take it that Senn is referring to a Bayesian prior expressing belief. (He will correct me if I’m wrong.)[ii] Senn takes the upshot to be that priors cannot be changed based on data. Is there a principled ground for blocking such moves?
I.J. GOOD: The traditional idea was that one would have thought very hard about one’s prior before proceeding; that’s what Jack Good always said. Good advocated his device of “imaginary results”, whereby one would envisage all possible results in advance (1971, p. 431) and choose a prior that one can live with whatever happens. This could take a long time! Given how difficult this would be in practice, Good allowed
“that it is possible after all to change a prior in the light of actual experimental results” [but] “rationality of type II has to be used.” (Good 1971, p. 431)
Maybe this is an example of what Senn calls requiring the informal to come to the rescue of the formal? Good was commenting on D. J. Bartholomew [iii] in the same wonderful volume (edited by Godambe and Sprott).
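A loose illustration of the “imaginary results” idea, and only that: the sketch below assumes a binomial experiment with a Beta prior (my assumption, not Good's own setup), enumerates every possible outcome in advance, and reports the posterior mean each would produce. The function names are hypothetical.

```python
# Enumerate every possible result of n binomial trials and see what a
# candidate Beta(a, b) prior would commit us to in each case.
# The Beta-binomial setting is assumed purely for illustration.

def posterior_mean(a, b, successes, n):
    """Mean of the Beta(a + successes, b + n - successes) posterior."""
    return (a + successes) / (a + b + n)

def imaginary_results(a, b, n):
    """List of (successes, posterior mean) for every outcome 0..n."""
    return [(k, posterior_mean(a, b, k, n)) for k in range(n + 1)]

n = 10
candidates = {"flat Beta(1, 1)": (1, 1), "dogmatic Beta(50, 50)": (50, 50)}
for label, (a, b) in candidates.items():
    table = imaginary_results(a, b, n)
    lowest, highest = table[0][1], table[-1][1]
    print(f"{label}: posterior mean ranges from {lowest:.3f} to {highest:.3f}")
```

With ten trials the flat prior lets the posterior mean range from about 0.083 to 0.917, while the Beta(50, 50) prior confines it to roughly 0.455 to 0.545 whatever happens. Someone who could not live with the second range after ten straight successes would, on Good's advice, know that before seeing any data.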
D. LINDLEY: According to subjective Bayesian Dennis Lindley:
“[I]f a prior leads to an unacceptable posterior then I modify it to cohere with properties that seem desirable in the inference.” (Lindley 1971, p. 436)
This would seem to open the door to all kinds of verification biases, wouldn’t it? This is the same Lindley who famously declared:
“I am often asked if the method gives the right answer: or, more particularly, how do you know if you have got the right prior. My reply is that I don’t know what is meant by ‘right’ in this context. The Bayesian theory is about coherence, not about right or wrong.” (Lindley 1976, p. 359)
H. KYBURG: Philosopher Henry Kyburg (who wrote a book on subjective probability, but was or became a frequentist) gives what I took to be the standard line (for subjective Bayesians at least):
“There is no way I can be in error in my prior distribution for μ – unless I make a logical error – … . It is that very fact that makes this prior distribution perniciously subjective. It represents an assumption that has consequences, but cannot be corrected by criticism or further evidence.” (Kyburg 1993, p. 147)
It can be updated, of course, via Bayes’ rule.
D.R. COX: While recognizing the serious problem of “temporal incoherence” (a violation of diachronic Bayes updating), David Cox writes:
“On the other hand [temporal coherency] is not inevitable and there is nothing intrinsically inconsistent in changing prior assessments” in the light of data; however, the danger is that “even initially very surprising effects can post hoc be made to seem plausible.” (Cox 2006, p. 78)
An analogous worry would arise, Cox notes, if frequentists permit data-dependent selections of hypotheses (significance seeking, cherry picking, etc.). However, frequentists (if they are not to be guilty of cheating) would need to take into account any adjustments to the overall error probabilities of the test. But the Bayesian is not in the business of computing error probabilities associated with a method for reaching posteriors. At least not traditionally. Would Bayesians even be required to report such shifts of priors? (A principle is needed.)
What if the proposed adjustment of the prior is based on the data and resulting likelihoods, rather than on an impetus to ensure one’s favorite hypothesis gets a desirable posterior? After all, Jim Berger says that prior elicitation typically takes place after “the expert has already seen the data” (2006, p. 392). Are the experts instructed to try not to take the data into account? Anyway, if the prior is determined post-data, then one wonders how it can be seen to reflect information distinct from the data under analysis. All the work to obtain posteriors would have been accomplished by the likelihoods. There’s also the issue of using data twice.
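The worry about using data twice can be sketched numerically. The example below again assumes a toy Beta-binomial model (my assumption, not Berger's or Cox's), and it takes the extreme case in which the post-data "prior" simply absorbs the data; the names are hypothetical.

```python
# Toy illustration of double-counting: a prior shaped by the data and then
# updated on the same data. Beta-binomial model assumed for illustration.

def update(prior, successes, failures):
    """Bayes update of a Beta(a, b) prior with binomial data."""
    a, b = prior
    return (a + successes, b + failures)

def beta_sd(params):
    """Standard deviation of a Beta(a, b) distribution."""
    a, b = params
    return (a * b / ((a + b) ** 2 * (a + b + 1))) ** 0.5

data = (7, 3)                      # 7 successes, 3 failures
flat = (1, 1)

once = update(flat, *data)         # data used once: Beta(8, 4)

# Extreme case: the prior "elicited" after seeing the data already contains it.
post_data_prior = update(flat, *data)
twice = update(post_data_prior, *data)   # same data counted again: Beta(15, 7)

print("used once: ", once, round(beta_sd(once), 3))
print("used twice:", twice, round(beta_sd(twice), 3))
```

The second posterior is as tight as if twenty observations had been collected, although only ten exist. Real post-data elicitation would presumably be far less blatant, but the direction of the distortion is the same.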
So what do you think is the answer? Does it differ for subjective vs conventional vs other stripes of Bayesian?
[i] Both were contributions to the RMM (2011) volume: Special Topic: Statistical Science and Philosophy of Science: Where Do (Should) They Meet in 2011 and Beyond? (edited by D. Mayo, A. Spanos, and K. Staley). The volume was an outgrowth of a 2010 conference that Spanos and I (and others) ran in London (LSE), and conversations that emerged soon after. See full list of participants, talks and sponsors here.
[iii] At first I thought Good was commenting on Lindley. Bartholomew came up in this blog in discussing when Bayesians and frequentists can agree on numbers.
Gelman, A. 2011. “Induction and Deduction in Bayesian Data Analysis.”
Senn, S. 2011. “You May Believe You Are a Bayesian But You Are Probably Wrong.”
Discussions of and responses to Senn and Gelman can be found by searching this blog.
Berger, J. O. 2006. “The Case for Objective Bayesian Analysis.” Bayesian Analysis 1 (3): 385–402.
Cox, D. R. 2006. Principles of Statistical Inference. Cambridge, UK: Cambridge University Press.
Mayo, D. G. 2017. Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars. Cambridge.