“What ever happened to Bayesian foundations?” was one of the final topics of our seminar (Mayo/Spanos Phil6334). In the past 15 years or so, not only have (some? most?) Bayesians come to accept violations of the Likelihood Principle, they have also tended to disown Dutch Book arguments, and the very idea of inductive inference as updating beliefs by Bayesian conditionalization has evanesced. In one of Thursday’s readings, by Bacchus, Kyburg, and Thalos (1990)[1], it is argued that under certain conditions it is never a rational course of action to change belief by Bayesian conditionalization. Here’s a short snippet for your Saturday night reading (the full paper is at https://errorstatistics.files.wordpress.com/2014/05/bacchus_kyburg_thalos-against-conditionalization.pdf):
“We will argue that to change one’s beliefs always by [Bayesian] conditionalization on evidence is to determine once and for all the impact or import of evidence. For the temporarily irrational believer, this is epistemically fatal.
If a believer starts out doxastic life with an unreasonable set of beliefs, there is no telling when, if ever, that believer may achieve rationality just by conditionalizing on new evidence. Consider an agent who believes an outright contradiction, and suppose that this agent is a perfect logician. If this believer is in possession of contradictory beliefs, then she will know this fact about herself. Now if [Bayesian] conditionalization is the truth about rational change of belief, then such a believer has no rational way of simply ‘converting’ to rationality. So in the case of this believer, we are inclined to say that conditionalization is never a rational way to change her belief. The exceedingly rational option, and the only rational one available to her in our view, is just ‘conversion’ to rationality.
What is that you say, gentle reader? You think that it is just not possible for someone to believe a contradiction? All right. But surely you believe that it is possible that someone be in possession of a distribution, call it P, over an algebra of beliefs which, though it does not yield a contradiction outright, is nonetheless incoherent—in the technical sense that it violates the probability axioms[2]. Now this unfortunate believer can never come to have coherent beliefs merely by conditionalization. How is this so? Let P’ be any member of the set of probability distributions over the set of sentences in our poor believer’s body of belief which are coherent. But since P’ is coherent and P is not, it can never be that
(*) P’(A) = P(A | ΛEi) = P(A & ΛEi)/P(ΛEi),
where ΛEi names the set of all those propositions which our unhappy agent ever does (or can, if you like) come to learn and upon which she conditionalizes; for P is just incoherent, by hypothesis, and if (*) were true, then our hypothesis would be false and the example altered. Hence the incoherent conditionalizer can never achieve coherence.
Now we should think that if one advocated coherence (in the sense that one championed the probability axioms in one’s own doxastic life and enjoined them upon others), then one would and ought to say that in this case it is never a rational change of belief to change belief by conditionalization. We do not tout the probability axioms in the same way, but even so we say this: the only rational course of action for a believer who believes irrationally and knows himself to believe irrationally is to ‘convert’ to rationality.”
Share your thoughts. This calls to mind a remark of Stephen Senn’s:
“A related problem is that of Bayesian conversion. Suppose that you are not currently a Bayesian. What does this mean? It means that you currently own up to a series of probability statements that do not form a coherent set. How do you become Bayesian? This can only happen by eliminating (or replacing or modifying) some of the probability statements until you do have a coherent set. However, this is tantamount to saying that probability statements can be disowned and if they can be disowned once, it is difficult to see why they cannot be disowned repeatedly but this would seem to be a recipe for allowing individuals to pick and choose when to be Bayesian.” (Senn, p. 59)
A blogpost on Senn’s article when it first appeared is here; you can search for U-Phil contributions on Senn’s article, e.g., here.
[1] Philosopher Henry Kyburg, Jr. was an old friend (and important supporter when I was just starting out). He had his own Kyburgian frequentist philosophy. Kyburg (1993, 146) shows that for any body of evidence there are prior probabilities in a hypothesis H that, while non-extreme, will result in two scientists having posterior probabilities in H that differ by as much as one wants, thereby turning the tables on popular convergence claims.
[2] Recall that violations of the Likelihood Principle lead to incoherence: “if we have two pieces of data x* and y* with [proportional] likelihood function … the inferences about m from the two data sets should be the same. This is not usually true in the orthodox [frequentist] theory and its falsity in that theory is an example of its incoherence” (Lindley 1976, p. 361).
Bacchus, F., Kyburg Jr, H.E., and M. Thalos (1990). “Against conditionalization”, Synthese 85: 475-506.
Kyburg Jr, H.E. (1993). “The Scope of Bayesian Reasoning”, PSA 1992, vol. 2: 139-152.
Lindley, D. V. (1976). “Bayesian Statistics”, in Foundations of Probability Theory, Statistical Inference and Statistical Theories of Science, Volume 2, edited by W. L. Harper and C. A. Hooker. Dordrecht, The Netherlands: D. Reidel: 353-362.
Senn, S. (2011). “You May Believe You Are a Bayesian But You Are Probably Wrong”, Rationality, Markets and Morals 2: 48-66.
I get really impatient with all of this philosophical wanking, both pro- and anti-subjective-Bayesian. If I discover that two claims I hold are contradictory, I do not immediately hold that all claims are simultaneously true or false, contra the principle of explosion. Is exception-handling really so hard to understand?
Corey: Sorry if you find yourself wanking philosophical (whatever that means) on a Saturday night. Nobody said anything about explosion.
Mayo: In this case, the wanking consists of going on at length about attaching (or refusing to attach) the label “rational” to various things. “Rational” is just a word.
Corey: All words are just words. Does that make it wrong to reason at length about every subject?
I actually find myself agreeing with James.
James: You’re trying to assert that my argument proves too much, but I didn’t say “reason[ing] at length” was the problem here. Let me restate my position less concisely: it is silly to go on at length about whether to attach a certain label to a process because the label does not, in and of itself, add anything to our understanding of the process.
Corey: I just don’t see the relevance of that concern here. The topic is updating by Bayesian conditionalization.
Mayo: Assuming the authors of the paper are responding to an actual position about what counts as “rationality” asserted by some subjective Bayesians somewhere, the topic appears to be a refutation of that position.
Hence the appearance in your snippet of such quotes as, “…we are inclined to say that conditionalization is never a rational way to change her belief. The exceedingly rational option, and the only rational one available to her…” and “…one would and ought to say that in this case it is never a rational change of belief to change belief by conditionalization… we say this: the only rational course of action for a believer who believes irrationally and knows himself to believe irrationally is to ‘convert’ to rationality.”
My point is that all of this blather about saying, or being inclined to say, or being willing and obliged to say that conditionalization is not rational adds no information to the knowledge base about conditionalization. It’s just cruft surrounding the one actually useful part of the snippet, which is the observation that an “incoherent conditionalizer can never achieve coherence” (in which the term “coherence” has a technical mathematical definition). And I really hate wasting my time reading cruft.
Wait, wait, wait one cotton-pickin’ minute here.
Let E = {e11, e12, e21, e22} be a sample space. For m in {11, 12, 21, 22}, let e[m] refer to the singleton set {e[m]}; let D1 be the event (e11 U e12) and let D2 be the event (e21 U e22). Define a measure M such that for all m, M(e[m]) = 0.25*K > 0. Since M(E) = K, this is not a coherent probability measure unless K = 1.
Now define the “conditional” measure M(e[m] | D[n]) = M(e[m] U D[n]) / M(D[m]). Then for any n and K, M( . | D[n]) is a probability measure.
So even the principal claim of the snippet (and the title of the post) is wrong.
Here is a version of the above comment with fewer errors.
Let E = {e11, e12, e21, e22} be a sample space. Let D1 be the event {e11, e12} and let D2 be the event {e21, e22}. For m in {11, 12, 21, 22}, let e[m] refer to the singleton set {e[m]}. Define a measure M such that for all m, M(e[m]) = 0.25*K > 0. Since M(E) = K, this is not a coherent (i.e., probability) measure unless K = 1.
Now define the “conditional” measure such that M(e[m] | D[n]) = M(e[m] U D[n]) / M(D[n]). Then for any n and K, M( . | D[n]) is a probability measure.
So even the principal claim of the snippet (and the title of the post) is wrong.
(All of those ‘U’s should be ‘∩’s.)
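For concreteness, here is a minimal numerical check of the corrected example (a sketch only: K = 2 is an arbitrary non-unit scale, the e[m] are represented as string labels, and the ∩ fix is applied):

    # Verify that conditioning the incoherent measure M yields a probability measure.
    K = 2.0
    E = ["e11", "e12", "e21", "e22"]
    M = {e: 0.25 * K for e in E}               # M(e[m]) = 0.25*K for each singleton
    D = {1: {"e11", "e12"}, 2: {"e21", "e22"}}

    print(sum(M.values()))                     # 2.0: M(E) = K, incoherent unless K = 1

    def conditional(m, n):
        # M(e[m] | D[n]) = M(e[m] ∩ D[n]) / M(D[n])
        numerator = M[m] if m in D[n] else 0.0   # e[m] ∩ D[n] is {e[m]} or empty
        denominator = sum(M[e] for e in D[n])    # M(D[n]) = 0.5*K
        return numerator / denominator

    for n in (1, 2):
        print(n, sum(conditional(m, n) for m in E))   # each sums to 1.0, for any K > 0

Whatever the value of K, the conditional measure sums to 1, because K cancels in the ratio.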
OK, will get back to this issue later in May.
I cannot get the Kyburg et al. link to work. Also, is his paper 1990 or 1993? Thanks,
John: Sorry, try the new link. The Bacchus, Kyburg, and Thalos paper is 1990. Such a fussbudget!
(Just kidding, my excuse is that I have no Elbian help–they’re all out at the Elba Room on Saturday night.)
Subjective Bayes assumes people to be coherent from the beginning, so one may wonder whether it’s fair to expect it to lead people who start off incoherent to coherence.
My interpretation is that a big problem with subjective Bayes as a model for rationally dealing with uncertainty is that it mixes up being descriptive and normative. The initial probability assignments are interpreted as describing a person’s beliefs. Such initial assignments need to be coherent, however, which means that the person’s beliefs need to be changed in advance if they are not. Its advocates implicitly seem to assume that there is a true coherent version of every incoherent belief.
I pondered my counter-example to the claim for a bit and came up with the most general version:
Consider any non-negative set function on an arbitrary sigma-algebra. Suppose that the set function is either not a measure, or is a measure but does not have total measure 1. Suppose further that the projection (http://en.wikipedia.org/wiki/Projection_(mathematics)) of the set function onto a given sigma-subalgebra is a measure with finite total measure. Then the full set function is incoherent when used as a probability measure for a Dutch book challenge, but (the equivalent of) conditioning on the sigma-subalgebra yields a correctly defined (i.e., coherent) probability measure.
That “the incoherent conditionalizer can never achieve coherence” is a straight-up false claim!
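In the simplest (scaled-measure) case above, the mechanism is just that the scale factor cancels. If M = K·P for a genuine probability measure P and some K ≠ 1, then M is incoherent, yet for any event D with P(D) > 0,

    M(A | D) = M(A ∩ D)/M(D) = K·P(A ∩ D)/(K·P(D)) = P(A | D),

which is a perfectly coherent conditional probability.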
The fallacy of cherry picking in American courtrooms: Schachtman’s blog: http://schachtmanlaw.com/the-fallacy-of-cherry-picking-as-seen-in-american-courtrooms/
Regarding the Lindley quote, methods that violate the Likelihood Principle need not be incoherent from a Bayesian perspective, at least as I understand Bayesian uses of “incoherent.” Suppose, for instance, that I would arrive at different conclusions about the fairness of a coin that yields 9 heads and 3 tails in 12 tosses depending on whether the stopping rule is binomial (toss a fixed 12 times) or negative binomial (toss until the third tail). Then you could model me as a Bayesian agent whose prior probability for fairness depends on what experiment I’m contemplating. That looks a little weird from a Bayesian perspective, and it violates the requirement that my priors be sharp, context-independent degrees of belief that change only by conditioning on new evidence. But I don’t see that it makes me subject to a Dutch book. It might just make me a permissivist of a somewhat odd sort.
(I owe this point to Julia Staffel.)
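To make the LP pair in Greg’s example concrete (a sketch, assuming the binomial experiment fixes n = 12 tosses while the negative binomial one stops at the third tail), the two likelihoods differ only by a constant factor in the heads-probability p:

    from math import comb

    # 9 heads, 3 tails; p = probability of heads
    for p in (0.3, 0.5, 0.7):                      # arbitrary test values
        binom = comb(12, 9) * p**9 * (1 - p)**3    # fixed n = 12 tosses
        negbin = comb(11, 9) * p**9 * (1 - p)**3   # stop at the 3rd tail (last toss is a tail)
        print(p, binom / negbin)                   # 4.0 every time: comb(12,9)/comb(11,9)

The ratio is constant in p, which is what makes the two data sets an “LP pair” in the sense discussed further down the thread.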
Greg: Of course, as Bacchus, Kyburg, and Thalos (1990), Howson and Urbach, and others point out, we can avoid a sure-loss (Dutch book) set of bets by mere deduction, without any kind of belief assignments. So it stands to reason that one who is incoherent by Bayesian lights can avoid one as well. The question they raise is whether you can get coherent by Bayesian means.
Of course, someone with incoherent degrees of belief can avoid Dutch book by not using his or her degrees of belief as betting odds. I’m saying that someone who violates the Likelihood Principle could be regarded as having degrees of belief that are not susceptible to a Dutch book even if he or she does use them as betting odds. “Incoherent” seems to be the wrong word for such a person, even from a Bayesian point of view.
(This point is tangential with respect to the post. I agree with the main claim that actual agents often need to update their beliefs in non-Bayesian ways.)
Greg: But “actual agents often need to update their beliefs in non-Bayesian ways” isn’t the main claim. (Look at the title of the post and the punchline of the snippet — the key phrase is “can only”.)
True
Mayo: “The question [Bacchus et al.] raise is whether you can get coherent by Bayesian means.”
In what way is my counter-example *not* dispositive?
I find it kind of silly to talk about an incoherent conditionalizer… it seems to me that such a concept is undefined. I think this is Christian’s point. Although if you were to attempt to define it, it seems to me that the “never” part of Bacchus’s argument is likely to be too strong – which I think is Corey’s point (although I must confess I don’t understand measure theory).
I would maintain the concept is undefined. Subjective Bayes is essentially about an audit of your beliefs, i.e. don’t simultaneously insist you want both more and less of something at the same time. Bayes’ theorem is just a special case of the fundamental theorem of prevision, which is the general tool for identifying incoherent probability assertions.
The concept of an “update” is really shorthand for a probability that was always specified becoming relevant under new evidence. If the original probability specification contained some sort of contradiction, this needs to be corrected by deleting and/or replacing part of the specification, not by conditioning.
Greg Gandenberger:
Interesting point about the Lindley optimal stopping example.
A Bayesian shouldn’t talk about the fairness of a coin, but rather identify a consistent set of decision preferences. Usually this would be done by extending the discussion to additional throws. I haven’t thought it through in detail, but I would speculate that in order to avoid incoherence it would be necessary to make probability specifications that are not exchangeably extendable on these additional throws. In short you would need to make a pretty crazy specification in order to remain coherent…
David: I take it you mean “optional stopping”?
I had a question about the irrelevance of stopping rules from a Bayesian perspective – particularly how stopping rules affect the likelihood of the data.
A stopping rule biases the distribution of the data towards certain end goals. So, in a way, the data aren’t independent of each other. Doesn’t this mean that the likelihood calculation needs to be changed in some way to account for this lack of independence?
If the above is true, then it must be the case that if two sets of identical data, drawn from the same underlying statistical population (or type of experiment), are produced with and without a stopping rule, they cannot have the same likelihood; and conversely, if two sets of identical data have the same likelihood, and are produced with and without a stopping rule respectively, then they could not have been drawn from the same underlying statistical population (or type of experiment).
So, the stopping rule is going to be reflected both in the likelihood function, and in the priors as suggested by Greg Gandenberger.
If this is the case, then it is not the case that the stopping rule is irrelevant to a Bayesian.
yes – a typo, should be optional stopping
Likelihoods enter the Bayesian calculation as ratios and thus the constant factor cancels out. The reason the two data sets are “LP pairs” (as I call them) is precisely that they have proportional likelihoods over the parameter. I’m sure Greg knows this. As for the stopping rule altering the priors, well if it does, it is a violation of the LP. (And ask yourself why it should enter. See discussion in Mayo and Kruse.) I assume, by the way, that we are restricting the discussion to the relevant example. There are some extremely curious cases where the stopping rule is “informative”, but that’s irrelevant here. See many other refs to stopping rules on this blog, and Berger and Wolpert’s Likelihood Principle.
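To spell out the cancellation in the coin example (writing π for a prior density over p, L(p) = p^9(1-p)^3, and c for the constant factor relating the two likelihoods):

    binomial posterior:       π(p | x) = π(p)·L(p) / ∫ π(q)·L(q) dq
    neg. binomial posterior:  π(p)·c·L(p) / ∫ π(q)·c·L(q) dq = π(p)·L(p) / ∫ π(q)·L(q) dq

The constant c cancels, so proportional likelihoods yield identical posteriors, whatever the stopping rule.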
For some reason, WordPress stopped informing me of comments a few days ago; hence I was unaware of your comment and only found it by accident. I need to figure out how to restore the notification on the blog. Quite annoying.