Bayesian philosophers vs Bayesian statisticians: Remarks on Jon Williamson

While I would agree that there are differences between Bayesian statisticians and Bayesian philosophers, those differences don’t line up with the ones drawn by Jon Williamson in his presentation to our Phil Stat Wars Forum (May 20 slides). I hope Bayesians (statisticians, or more generally, practitioners, and philosophers) will weigh in on this. 

1. Are the wars mostly in statistics, not philosophy? According to Williamson, Bayesian philosophers, while mostly subjective, just “see Bayesianism as being concerned with belief, and so not a rival” to frequentist statistics, while Bayesian statisticians do “see Bayesianism as a rival to frequentist statistical inference–‘statistics wars’.” [SLIDE 6]. So, in his view, philosophers of statistics don’t see a war between frequentists and Bayesians whereas statisticians do. I think, if anything, it is the reverse. Bayesian statisticians are more eclectic, and much less inclined to see a war between frequentists and Bayesians. When they do so, they are, largely, wearing philosophical hats. Granted, they often do not recognize that they’re making philosophical presuppositions when waging a war with frequentists (error statisticians in my terminology), but by and large they are happy to get on with the job. By contrast, philosophers who accept a frequentist, error statistical view are generally seen as exiles, under the presumption that only Bayesianism gives a sound (coherent) underlying philosophy.[i] (That is why the alt name for my blog is “frequentists in exile”.)

I’m not saying, by the way, that the main stat wars are between frequentists and Bayesians–there are many other battles as well. I’m just addressing some of the points Williamson makes.

I also find it surprising that, according to Williamson, a Bayesian statistician appears to be much more of a true-blue subjectivist (in the manner of de Finetti) than is the Bayesian philosopher. Again, I would have thought it was the opposite. [SLIDE 6] Statisticians seem much more eclectic than philosophers in their interpretations of probability (see Gelman and Shalizi 2013). Williamson avers that Bayesian statisticians “often doubt the existence of non-epistemic probabilities (following Bruno de Finetti)”. “Non-epistemic probabilities”, he says, are “generic frequencies or single-case chances”. Doubting the existence of frequencies and chances, the Bayesian statistician, he claims, does not seek to “directly calibrate one’s credences [degrees of belief] to nonepistemic probabilities (generic frequencies or single-case chances).”

But certainly the large class of non-subjective, default, reference, and empirical Bayesians go beyond probability as subjective degrees of belief, and increasingly separate themselves from traditional subjective and personalist philosophies. Being calibrated to frequencies in some sense, I thought, was one of the main advantages of non-subjective, objective, or default Bayesianism in statistical practice. This is so, regardless of their metaphysics on chances or propensities: it suffices to allude to relative frequencies (actual or hypothetical) stemming from modeled phenomena or deliberately designed experiments.

Having a battle about where the wars are–statistics or philosophy of statistics–may not be productive, but I think it’s very important to understand the nature of the debates. In fact, one of the most serious casualties of the statistics wars from the philosophical perspective is in obscuring the roles of statistical methods (and other formal and quasi-formal methodologies) in addressing the epistemological problem of how to generate, learn and generalize from data. In other words, the wars have confused the value of statistics for philosophy.

2. Philosophy of confirmation vs philosophy of statistics. Now there is a radical difference to which Williamson’s discussion points, and it is between a given project–it might be called Bayesian confirmation theory, Bayesian epistemology, inductive logic or the like–and what I would call philosophy of statistics, or the philosophy of inductive-statistical inference. Bayesian confirmation theorists, since Carnap, have a tradition of building an account based on a restricted language: statements, propositions, and first order logics. By contrast, statisticians and statistical philosophers refer to probability models, continuous random variables, parameters and the like. The philosophical project is essentially to justify a mode of inductive inference, basically, Carnap’s straight rule or a version of enumerative induction: if k% of A’s have been B’s then believe the next A will be B to degree k. Perhaps a stipulation that the observations satisfy a condition of randomness or exchangeability is added.  

An example from Williamson (SLIDE 7) is this: Suppose your evidence E is: 17 of a random sample of a hundred 21-year-olds develop a cough. That the sample frequency is 0.17 is evidence that the chance/frequency is ≈ 0.17. A case of what he calls a direct inference would be to take 0.17 as how confident one should be, or how strongly one should believe, the statement A: that Cheesewright, who is 21, gets a cough. In the confirmation philosopher’s project, there is a restriction to a finite language, set of predicates, and assignments of probabilities to chosen elements. A statistician, instead, might appeal to Bernoulli trials and a Binomial model of experiment to reach such a (direct) inference to the probability of an event occurring on the next trial.
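To make the statistician's route in the cough example concrete, here is a minimal sketch. It assumes the hundred observations are independent Bernoulli trials with a common success probability (a Binomial model). The function name `direct_inference` and the choice of a Wilson interval to express "≈ 0.17" are illustrative assumptions, not anything taken from Williamson's slides.

```python
import math

def direct_inference(successes, n, z=1.96):
    """Sample frequency and an approximate 95% Wilson interval
    for the success probability of a Binomial model.

    Hypothetical helper for illustration only.
    """
    p_hat = successes / n                      # observed relative frequency
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom  # Wilson interval midpoint
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return p_hat, (centre - half, centre + half)

# Evidence E: 17 of 100 sampled 21-year-olds develop a cough.
p_hat, (lo, hi) = direct_inference(17, 100)
print(f"sample frequency: {p_hat:.2f}")
print(f"approx. 95% Wilson interval: ({lo:.3f}, {hi:.3f})")
```

On the direct-inference reading, the sample frequency would be carried over as the degree of belief in A (that Cheesewright gets a cough); the interval is only a reminder that the chance is estimated, not known.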

Philosophers of confirmation sought an a priori (non empirical) way to justify such an inference–to solve the traditional problem of induction–and any reference to a probability model, or even slipping in that it’s a random sample, makes an empirical assumption. So they shied away from tackling the inductive problem using models. By the way, I don’t view inductive inference as inferring probabilities of claims, but making inferences that go beyond the data–they are ampliative. They are qualified using probabilities, but these are not posteriors in hypotheses. Even falsification requires inductive inferences in this sense (see SIST, excursion 2 Tour II, p. 83).

I don’t think philosophers still consider that an a priori justification of inductive inference is possible or even desirable. Thus, the impetus to restricting the philosophical account of inference to specially crafted first order languages goes by the board, and we can freely talk about design-based or model based probabilities. Unlike what the confirmation project typically supposes, appeals to such models may be warranted or, alternatively, falsified. Showing how is part of what is involved in solving the problem of induction now, or so I argue (2018, pp 107-115).

3. Do statisticians not move from general probabilities to specific assignments? It’s interesting that Williamson claims that “statisticians tend not to appeal to direct inference principles” that move from population probabilities and frequencies to degrees of probability, belief, or support in a particular case. The reason, if I understand him, is that they tend not to believe in these non-epistemic, frequentist probability notions–taking us back to point 1 above.

I find Bayesian statisticians/practitioners highly interested in assigning degrees of belief, credence, or plausibility to events–whether we consider that tantamount to assigning beliefs to statements about events, or to events defined in a model. In fact, Bayesian statisticians see as a key selling point of their methodology that it offers a way to assign probabilities to particular events and hypotheses, whereas the frequentist error statistician generally only speaks of the performance properties of methods in repetitions. The error statistician is also largely interested in inverse inference from data to claims about aspects of the data generating method (rather than direct inference).

The latest move (by both Bayesian and frequentist practitioners) to embrace what I call a “screening model” of statistical significance tests would seem to be an example of practitioners performing direct inferences. (See SIST, excursion 5 Tour II.) Here the probability of a particular hypothesis is given by considering it to have been (randomly?) selected from a universe or urn of hypotheses (where it’s assumed some % are known to be true).
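The arithmetic behind such a screening computation can be sketched as follows. This is only an illustration under stated assumptions: `prevalence` is taken to be the fraction of hypotheses in the urn for which the effect is real, and the values of `alpha` and `power` are hypothetical numbers chosen for the example, not figures from any particular study.

```python
def prob_real_given_significant(prevalence, alpha, power):
    """P(effect is real | test is statistically significant),
    treating the hypothesis as a random draw from an urn.

    Hypothetical helper; all inputs are assumed known, which is
    exactly what the screening model takes for granted.
    """
    true_pos = power * prevalence          # real effects that reach significance
    false_pos = alpha * (1 - prevalence)   # null effects that reach significance
    return true_pos / (true_pos + false_pos)

for prev in (0.1, 0.5):
    p = prob_real_given_significant(prev, alpha=0.05, power=0.8)
    print(f"prevalence={prev:.1f} -> P(real effect | significant)={p:.3f}")
```

The result attaches to the particular hypothesis only by way of its membership in the assumed urn, which is why this looks like a direct inference.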

Williamson himself appeals to confidence intervals in illustrating a direct inference from a confidence level to a degree of belief in a particular interval estimate. A popular reconciliation, which I think he endorses, makes use of frequentist matching priors. Fisher’s fiducial approach essentially tried to do this without appealing to priors. So again it’s not clear why he takes statisticians as uninterested in direct inference. True, contradictions result from probabilistic instantiation, as noted in my “casualties”, and in Williamson’s presentation. (His solution is to drop Bayesian conditioning and start over with new maximum entropy priors.) The error statistician will also take error probabilities of methods as qualifying a particular inference. But what they qualify is not its degree of believability, but rather how well tested or corroborated it is.

The upshot: I consider statistical methods far more relevant to the philosophers’ epistemological projects than I find in Williamson’s portrayal, at least based on his May 20 presentation. Philosophers of confirmation and formal epistemology shortchange themselves by keeping their projects separate from the ones that empirical statistical (and other formal and quasi-formal) methods supply. In the reverse direction, the foundational and methodological problems of these methods cannot be so readily swept aside as simply directed at a problem that is outside of those of the philosophers. In my view, this thinking has stalled progress in both arenas for the past 25 years.

Please share your comments and questions.


[i] Admittedly, there is a program of Bayesian epistemology that might be seen as doing ordinary epistemology (discussions about knowledge and beliefs) employing formal probabilities. But this is not Williamson’s project.


Categories: Phil Stat Forum, stat wars and their casualties

11 thoughts on “Bayesian philosophers vs Bayesian statisticians: Remarks on Jon Williamson”

  1. Christian Hennig

    Just to add an observation on what goes on on the statistical side: What I find most striking is that there is a lot of Bayesian practice that isn’t in line with any Bayesian philosophy at all, in that priors are proposed for reasons of convenience or certain statistical properties such as penalisation of overfitting, but not motivated along any philosophical lines. They are neither “objective” in any well defined sense nor do they make clear reference to background knowledge. This happens while the authors, at least occasionally, make side remarks about how Bayesian philosophy is superior or frequentist methodology is inferior. For this reason it may be hard, from the philosophical side, to categorise what goes on in statistics. Sure, one finds the odd proper subjectivist or objectivist and people like Gelman, Frigessi, or others, who give proper explanations of how their priors are related to knowledge, even if in an either eclectic or “falsificationist Bayesian” sense. But to me that seems to be the minority. The majority has somehow heard that Bayes is cool and the age of frequentism is over, but not many traces of that coolness can actually be found in their work (unless the complexity of setting up ingenious computational schemes counts as cool), and the difficulty to translate existing information into a prior, which to me is the hardest thing about Bayesian statistics, is more often than not sidestepped.

  2. Deborah:

    I agree with what you wrote. I took a look at the presentation from Williamson and it just seems very out of date. There’s been a lot of changes in Bayesian statistics since Reichenbach (1935), Jeffreys (1939) and Jaynes (1957). Writing about Bayesian statistics with these references would be like writing about the practice of democracy with references of Aristotle, Hobbes, Locke, and nothing since then. It’s fine as a way to reflect upon ancient conceptions of statistics or political science, but not so relevant when addressing current practice.

  3. Pedro

    There is certainly an important difference between, say, the frequentist *interpretation of probability* and *frequentist statistics*. It’s easier to illustrate this difference with some examples.

    – A.W.F. Edwards thinks that probabilities can only ever refer to frequencies, but he vehemently rejects most of what we understand by “frequentist statistics” today.

    – In philosophy, Wesley Salmon held a frequentist interpretation of probability and he seemed to have nothing against frequentist statistics, but he’s still a Bayesian who speaks of “priors”, “confirmation of hypotheses”, etc.

    – Ramsey, a Bayesian, says in the 1920s that multiple interpretations of probability are possible, and that the one that is most useful to “logicians” will not necessarily be the most useful to statisticians or physicists. As an economist, Ramsey was trying to give an abstract *formal model* of beliefs, but that doesn’t mean we need to use this formal apparatus when we are actually making inferences (compare: it might be useful to describe people’s behavior as if it follows a utility function, but it doesn’t mean we need to think about our utility function every time we act).

  4. Thank you very much for your comments Deborah. To take them in turn:

    1. I would put those interested in the foundations of Bayesian statistics in the “Bayesian statistics” camp rather than the “Bayesian philosophy” camp. I get the impression that rather few philosophers work actively on foundations of Bayesian statistics these days. Most philosophers who use Bayesianism use it to theorise about strength of belief in formal epistemology or confirmation and induction in the philosophy of science. I do think there is a split between the two camps, with Bayesian philosophy tending towards direct inference and Bayesian statistics tending towards long-run calibration to frequencies, if anything. In the talk, I focussed on a problem for Bayesian philosophy (the inability of standard Bayesianism to accommodate direct inference). I understand that this might be a bit unsatisfying for someone like you who is more interested in problems for Bayesian statistics, but I do think the philosophical problem is an interesting one.

    2. The focus on logical languages when dealing with the logic of induction allows one to sidestep difficulties that arise when applying principles like the maximum entropy principle to continuous domains. It’s not really essential here.

    3. I do think the Bayesian philosophy camp and the Bayesian statistics camp are interested in very different sets of questions. Of course there is some overlap, and I do think there can be useful interactions. For example, the non-standard version of objective Bayesianism that I put forward can be applied to statistical matching. I don’t see it as a panacea for all statistical problems, however.

    Christian – I agree with your comment – thank you for that. I do think a ‘pragmatic’ approach to statistics is dominant, with decreasing interest in conceptual foundations.

    Andrew – my focus was on Bayesian philosophy, not Bayesian statistics. I’m afraid you’ll have to look elsewhere for detailed accounts of current statistical practice.

    Pedro – I think you make a very good point there and I agree that it is useful to separate some of the philosophical discussion from questions of statistical inference.

    • Jon
      Thanks for your comment. You wrote: “I do think there is a split between the two camps, with Bayesian philosophy tending towards direct inference and Bayesian statistics tending towards long-run calibration to frequencies.” As I say in my post, I just don’t see it. I see direct inference among statistical practitioners as well as calibration to performance. You may be right that nowadays “Most philosophers who use Bayesianism use it to theorize about strength of belief in formal epistemology or confirmation and induction in the philosophy of science.” But I regard formal and quasi-formal methods in science as relevant–don’t you?

      I do mention (in the footnote) those philosophers who are strictly using probability to do analytic epistemology as an exception, where terms like “reliable” and “belief” might be used in setting out concepts without cashing them out (beyond a conditional probability formalism). I would not have placed you in that camp. It’s not that I find that unsatisfying (although even there I think it would be valuable to cash out those terms). What is unsatisfying to me would be philosophers of science giving up on the exciting project of explaining how we learn from limited data and error prone methods–and how we can do it better. That’s the real problem of induction. I take it that Carnap, Reichenbach, Hacking, Kyburg, Rosenkrantz, Salmon, Fetzer, Giere, Glymour–just to name some off the top of my head in no particular order–are/were interested in solving problems of explaining and justifying actual scientific inference–don’t you? Their views on how to do this shifted over time, as makes sense. That these philosophers may have given up on various early attempts doesn’t mean that wasn’t or isn’t their aim. (Carnap moved to a subjective construal after the continuum of inductive methods, Hacking declared in 1980 that there was no such thing as inductive logic in contrast to his view in 1965.) Granted, the philosopher’s concern is with a cluster of deep, long-standing issues about knowledge, about concepts of evidence, about paradoxes and problems in understanding methods and inferences that are largely unquestioned in applied practice. Granted, as well, it is often difficult to clearly identify to non-philosophers the philosophical and foundational issues underlying practice that typically go unattended by practitioners–and why they are so important and interesting!
      I still hold that the resources of philosophy of science (coupled with the relevant scientific knowledge) offer valuable tools for advancing solutions to problems of evidence and inference that are relevant for both the philosophical and the applied problems of knowledge, induction and inference.
      Post-positivist philosophers of the 80s, 90s and beyond pledged on their honor to be relevant to scientific practice. I take your work on objective Bayesianism to be in that spirit, even if it’s also engaging with formal epistemology, so I was and am a bit surprised at where you seemed to be drawing the division of labor, but I think I may be misunderstanding this part of your presentation.

      • Thanks very much Deborah.

        I certainly didn’t want to imply that one shouldn’t work on the foundations of statistical inference, or that I’m not interested in these foundations. It’s just that this particular talk was focussing on something else: a tension in Bayesianism as philosophers use it. I think the version of objective Bayesianism that I put forward can help to resolve that tension. But I don’t want to claim that it provides a general solution to the problem of statistical inference – at best it can help with certain specific problems.

        In general I’m a bit doubtful as to whether there is a general solution to the problem of statistical inference. Partly, that’s because of a tendency for statistical inference to over-reach. For example, I don’t think that causal inference is reducible to statistical inference – a topic that I hope to talk about at your September conference.

    • Andrew Gelman


      I don’t see the point of a “Bayesian philosophy” that refers to outmoded ideas of Bayesian statistics. That sounds like a “physics philosophy” that focuses on the problems with Ptolemaic astronomy or a “chemistry philosophy” that focuses on problems with classifying elements as earth/air/fire/water. It could be of historical interest but it would be a good idea to emphasize that you’re talking about history not current practice. Perhaps instead of calling it Bayesian philosophy you could call it “the philosophy of mid-twentieth-century Bayesianism” in the same way that someone might work on the philosophy of medieval physics or the philosophy of ancient geography or whatever. Understanding historical modes of thought can be valuable; I just recommend you label it as such.

      • Andrew: Thank you so much for your comment. I don’t know why it didn’t go up automatically–I’ll check the settings.
        Science–whatever the focus–being relevant to practice, I entirely agree. But, as I often have said, I agree that science proceeds without philosophy and that it’s only wrt certain problems and circumstances that appealing to underlying foundations and philosophical puzzles are important. I claim that statistical and other formal practices today offer many examples where philosophical illumination is much needed. There’s a kind of middle ground here–we might call it “applied philosophy of science/statistics”. Here a philosophical scrutiny is not trying to do applied stat, but is examining concepts, arguments and methods in such a way as to be relevant to illuminating debates and assumptions that generally go unattended in practice. However, this is not typically recognized by practitioners, and I would admit it takes a certain philosophical perceptiveness to appreciate this. You are among the few statisticians to explicitly talk about philosophical issues. You will recall in discussing my book on your blog, some said well she’s just doing philosophy. Yes it’s philosophy of statistics but it is relevant to understanding and making progress on a host of problems of inference both in statistics and in philosophy and logic. I’ve noticed Colopy doing a series of philosophy and data science that we were both interviewed in at different times. That’s novel for stat.
        I’m not exactly sure what Jon’s position is, but I’ve tried to explain it in my post.

      • Is the problem to do with my citing Jaynes and Jeffreys? I was citing them as people whose work on objective priors was very influential in their respective fields. I wasn’t saying that they represent the state of the art.

        The talk wasn’t about understanding historical modes of thought. I was arguing that Bayesians who advocate direct inference face a big problem: plausible uses of direct inference principles lead to inconsistency in a standard Bayesian framework. On the other hand, a non-standard version of objective Bayesianism can accommodate direct inference perfectly well. The state of the art with regard to objective priors – interesting though it is – has no bearing on this overall argument.

  5. rkenett

    The talk by Jon Williamson and related discussion are apparently based on the assumption (yet to be tested) that everyone involved is interested in solving problems of explaining and justifying actual scientific inference. The approach based on such an assumption should be both top down and bottom up, i.e. start with problems. Starting this way gets you to consider information quality. This should interest practitioners and philosophers alike. Does it?
