Andrew Gelman says that as a philosopher, I should appreciate his blog today in which he records his frustration: “Against aggressive definitions: No, I don’t think it helps to describe Bayes as ‘the analysis of subjective beliefs’…” Gelman writes:

I get frustrated with what might be called “aggressive definitions,” where people use a restrictive definition of something they don’t like. For example, Larry Wasserman writes (as reported by Deborah Mayo):

“I wish people were clearer about what Bayes is/is not and what frequentist inference is/is not. Bayes is the analysis of subjective beliefs but provides no frequency guarantees. Frequentist inference is about making procedures that have frequency guarantees but makes no pretense of representing anyone’s beliefs.”

I’ll accept Larry’s definition of frequentist inference. But as for his definition of Bayesian inference: No no no no no. The probabilities we use in our Bayesian inference are not subjective, or, they’re no more subjective than the logistic regressions and normal distributions and Poisson distributions and so forth that fill up all the textbooks on frequentist inference.

To quickly record some of my own frustrations*: First, I would disagree with Wasserman’s characterization of frequentist inference, but as is clear from Larry’s comments on my reaction to him, I think he concurs that he was just giving a broad contrast. Please see Note [1] for a remark from my post: Comments on Wasserman’s “what is Bayesian/frequentist inference?” Also relevant is a Gelman post on the Bayesian name: [2].

Second, Gelman’s “no more subjective than…” evokes remarks I’ve made before. For example, in “What should philosophers of science do…” I wrote:

Arguments given for some very popular slogans (mostly by non-philosophers), are too readily taken on faith as canon by others, and are repeated as gospel. Examples are easily found: all models are false, no models are falsifiable, everything is subjective, or equally subjective and objective, and the only properly epistemological use of probability is to supply posterior probabilities for quantifying actual or rational degrees of belief. Then there is the cluster of “howlers” allegedly committed by frequentist error statistical methods repeated verbatim (discussed on this blog).

I’ve written a lot about objectivity on this blog, e.g., here, here and here (and in real life), but what’s the point if people just rehearse the “everything is a mixture…” line, without making deeply important distinctions? I really think that, next to the “all models are false” slogan, the most confusion has been engendered by the “no methods are objective” slogan. However much we may aim at objective constraints, it is often urged, we can never have “clean hands” free of the influence of beliefs and interests, and we invariably sully methods of inquiry by the entry of background beliefs and personal judgments in their specification and interpretation.

There are indeed numerous choices in collecting, analyzing, modeling, and drawing inferences from data, and in determining what inferences they warrant about scientific claims of interest. But why suppose that this introduces subjectivity into an account, or worse, means that all accounts are in the same boat as regards subjective factors? They most certainly are not. An account of inference shows itself to be objective precisely in how it steps up to the plate in handling these choices and methodological decisions.

While it is obvious that human judgments and human measurements are involved, it is too trivial an observation to help us distinguish among the very different ways judgments should enter, and how threats of bias and unwarranted inferences may nevertheless be avoided. The issue is not that a human is doing the measuring, **the issue is whether what is being measured is something we can reliably use to find things out, i.e., solve some problem of inquiry**. This last sentence needs unpacking; there are three distinct points.

(1) *Relevance*: The process should be relevant to learning what is being measured. Having an uncontroversial way to measure something is not enough to make it relevant to solving a knowledge-based problem of inquiry.

(2) *Reliably capable*: The process should not routinely or often declare a problem solved when it is not (or solved incorrectly)–whatever the nature of the problem. The process should be capable of controlling reports of erroneous solutions to problems with some reliability.

(3) *Able to learn from error*: If the problem is not solved (or poorly solved) at a given stage, the method will enable pinpointing the reason why, or set the stage for finding this out.

I think it is point (1) that is overlooked by many, and I would like to call your attention to it. It is common for “conventional” or, as many of them prefer, “O-Bayesians” (e.g., Bernardo) to declare that their methods are just as objective as frequentist methods because they (only!) assume the statistical model and the data. For example, here. Point (2), of course, gets at severity, and (3) directly ties to what I have called solving Duhemian problems. The issue, as I see it, is not a matter of dubbing all of a domain objective or not, but of clarifying the criteria to discern the truth about some criticisms. For starters, why should the fact of discretionary choices show methods fall down on these jobs of objective inquiry?

We have discussed the dirty hands argument a few times on this blog. (See for example Objectivity (#4) and the “Argument From Discretion”.) There are a number of ways in which the argument takes root—all, I claim, are fallacious. To try to give these arguments a run for their money, I’ve tried to see why they look so plausible. One route is to view the reasoning as follows:

- A variety of human judgments go into specifying experiments, tests, and models.
- Because there is latitude and discretion in these specifications, they are “subjective”.
- Whether data are taken as evidence for a statistical hypothesis or model depends on these subjective methodological choices.

Therefore statistical inference and modeling is invariably subjective, if only in part.

To avoid loaded terms, call the methodological choices “discretionary”. Because of the discretionary choices in inquiry, we invariably get our hands dirty. Therefore, our conclusions cannot be pristine or objective.

The discretionary choice of a very insensitive test for detecting a positive discrepancy d’ from a 0 null, for example, results in a test with low probability of finding statistical significance even if a discrepancy as large as d’ exists. But that does not prevent me from determining, objectively, that an insignificant difference in that test fails to warrant inferring the discrepancy is less than d’. The inference would pass with low severity. We call this identifying a fallacy of negative or insignificant results. But notice: it’s error statistical reasoning that enters to correct an application of an error statistical method. *These methods are self-correcting!*
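To make the point about insensitive tests concrete, here is a minimal numerical sketch. It assumes a one-sided Normal test of a 0 null with known standard deviation, and every number in it (the discrepancy 0.5, the sample sizes, the observed means) is hypothetical and chosen only for illustration. The severity for inferring "discrepancy < d’" from an insignificant result is taken here to be the probability of a larger observed mean than the one obtained, computed under a discrepancy of exactly d’.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal(0, 1)

def power(d_prime, n, sigma=1.0, alpha=0.025):
    """Probability that the one-sided test rejects the 0 null
    when the true discrepancy is d_prime."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STD_NORMAL.cdf(z_alpha - sqrt(n) * d_prime / sigma)

def severity_less_than(d_prime, xbar_obs, n, sigma=1.0):
    """Severity for inferring 'discrepancy < d_prime' from an observed mean:
    P(sample mean > xbar_obs; discrepancy = d_prime)."""
    return 1 - STD_NORMAL.cdf(sqrt(n) * (xbar_obs - d_prime) / sigma)

d_prime = 0.5  # hypothetical discrepancy of interest

# Both observed means give the same insignificant z-score of 1.2.
insensitive = (power(d_prime, n=4),
               severity_less_than(d_prime, xbar_obs=0.6, n=4))
sensitive = (power(d_prime, n=100),
             severity_less_than(d_prime, xbar_obs=0.12, n=100))

print("n=4:   power %.3f, severity %.3f" % insensitive)
print("n=100: power %.3f, severity %.3f" % sensitive)
```

In this toy setup the insignificant result from the small test licenses "discrepancy < 0.5" only with severity below one half, while the same z-score from the larger test licenses it with severity close to one. The assessment uses nothing beyond the error probabilities of the test itself.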

In this connection, see the blogpost (in which Gelman also figures): P-values can’t be trusted except when used to argue that P-values can’t be trusted!

Accounts that boast great flexibility and latitude do not enjoy this self-critical feature. Ironically, critics often make use of error-statistical reasoning in making out their criticisms of error statistical methods, while at the same time endorsing methods whose great flexibility and latitude frees them from error statistical constraints!

And what really frustrates me is the confusion (subliminal perhaps) between an account that recognizes how biases can color inference, and one that allows biases to enter into the inference. This calls to mind the discussion that sprung up in relation to Nate Silver. Whether or not he meant it, what he said, and said more than once, is that we should be Bayesian because it lets us explicitly introduce our biases into the data analysis! See for example:
(8/6) What did Nate Silver just say? Blogging the JSM
(8/9) 11th bullet, multiple choice question, and last thoughts on the JSM.

We take a different tack. Your analysis might be a product of your desire not to find an effect, of insufficient funds to collect a larger sample, of ethics, or of the inadvertent choice of a bureaucrat. But my critical evaluation of what the resulting data do and do not indicate need not itself be a matter of desires, economics, ethics, or what have you. If I were not skeptical enough already, knowledge of a researcher’s self-interest in a result may well motivate me to scrutinize his claims all the more, but that reflects a distinct interest—an interest in not being misled, an interest in finding out what the case is, and others of an epistemic nature.

Note: there’s a big difference here, typically overlooked, between using “background beliefs” to alter an inference, and “using” them as a motivation to scrutinize an inference. I had a long exchange with Gelman once on this issue. He was criticizing frequentists for ignoring background, and I don’t think I got through to him that we (error statisticians) use background to scrutinize (and improve) all stages of inquiry. Without recognizing this, he perhaps inadvertently adds fuel to the fire against the frequentist error statistician’s use of background to promote objective scrutiny. See especially “How should prior information enter in statistical inference”, “background knowledge: not to quantify but to avoid being misled by subjective beliefs”, and one of my deconstructions of Gelman: “Last part (3) of the deconstruction: beauty and background knowledge”.

There are parallels between learning from statistical experiments and learning from observations in general. The problem in objectively interpreting observations is that observations are always relative to the particular instrument or observation scheme employed. But we are often aware not only of the fact that observation schemes influence what we observe but also of how. How much noise are they likely to produce, and how might we subtract it out? That’s the core strength of the error statistical approach.

The result of a statistical test need only be partly determined by the specification of the categories (e.g., when a result counts as statistically significant); it is also determined by the underlying scientific phenomenon, as modeled. What enables objective learning to take place is the possibility of devising means for taking account of the influence of test specifications. Frequentist error probabilities enable us to do this by letting us evaluate and control the capabilities of our tools to find flaws in attempted solutions to problems. That’s the basis for ensuring that before inferring a claim H, we have not only “sincerely tried to find flaws” (as Popper put it), but that we have successfully probed them. Any statistical account which cannot make use of error probabilities associated with methods is one that forfeits this critical self-control.

[1] Comments on Wasserman’s “what is Bayesian/frequentist inference?”

I wrote: “But I do have serious concerns that in his understandable desire (1) to be even-handed (hammers and screwdrivers are for different purposes, both perfectly kosher tools), as well as (2) to give a succinct sum-up of methods, Wasserman may encourage misrepresenting positions. Speaking only for “frequentist” sampling theorists, I would urge moving away from the recommended quick sum-up of “the goal” of frequentist inference: “Construct procedures with frequency guarantees”. If by this Wasserman means that the direct aim is to have tools with “good long run properties”, that rarely err in some long run series of applications, then I think it is misleading. In the context of scientific inference or learning, such a long-run goal, while necessary is not at all sufficient; *moreover, I claim, that satisfying this goal is actually just a byproduct of deeper inferential goals* (controlling and evaluating how severely given methods are capable of revealing/avoiding erroneous statistical interpretations of data in the case at hand.) (So I deny that it is even the main goal to which frequentist methods direct themselves.) Even arch behaviorist Neyman used power post-data to ascertain how well corroborated various hypotheses were—never mind long-run repeated applications (see one of my Neyman’s Nursery posts).”

[2] See also Gelman’s post, “What is a Bayesian”: http://andrewgelman.com/2012/07/31/what-is-a-bayesian/

*I’m posting this quickly for timeliness; I’m bound to make corrections. If significant, I’ll call it draft (ii).

Mayo:

I really do think that Bayesian priors are objective in the same sense that classical likelihood functions are objective. In both cases they depend upon some mixture of scientific judgment, feasibility, and convention, but in both cases, they can and are assigned based on hard data. See chapter 1 of BDA for several examples.

And, no, I don’t think it’s at all accurate to limit Bayesian inference to “the analysis of subjective beliefs,” just as I don’t think it would be accurate to limit classical statistical inference to “the analysis of simple random samples.” And I certainly don’t appreciate a highly restrictive definition of Bayesian inference to be used by someone as part of a criticism of Bayes! That was the main point of my post, that Larry’s definition was unnecessarily limiting (maybe describing what Jay Kadane does, but not describing Bayesian methods in general).

Andrew: Yes, that’s what you’ve written, and while I appreciate the very prompt response, I’m asking you to slow down, and think, rethink, and think again about the full dimensions of a relevant notion of scientific objectivity (as in my post).

Mayo:

I agree that there are several dimensions of subjectivity, but I don’t at all agree that “Bayes is the analysis of subjective beliefs.” That’s a great fit to what Jay Kadane does, not what I and many others do. And I think it makes more sense to trust my definition of Bayes than to trust Larry’s definition. After all, I actually do applied Bayesian statistics all the time–that should count for something.

Andrew: I think we can agree that:

(1) Wasserman’s crude definition: “Frequentist inference is about making procedures that have frequency guarantees but makes no pretense of representing anyone’s beliefs” is unhelpful and could very well include your Bayesian, and

(2) Andrew Gelman’s Bayesianism differs non-trivially from others both in methodology (e.g., testing models including priors), and philosophy.

I sometimes have referred to it as “GelmanBayes” to avoid confusion and direct readers to your post on the name game in my footnote #2.

However, Wasserman is quite right to distinguish, as he does, merely using Bayes rule from being a Bayesian (e.g., in his review of Silver). Some kind of distinction on aims and methods is useful: I consider at times the “3 P’s”–probabilism, performance, and probativeness. Because “frequentist” is a hopelessly unclear notion (in most contexts), I introduce “error statistics” and give it clear meanings (both for method and for philosophy).

You should be just as dismissive of overly crude characterizations of frequentist statistics (and frequentist statistical inference methods) as you are of oversimple conceptions of Bayesian statistics.

Mayo:

You write: “You should be just as dismissive of overly crude characterizations of frequentist statistics (and frequentist statistical inference methods) as you are of oversimple conceptions of Bayesian statistics.”

But I am! See here!

Andrew: This is not nearly sufficiently dismissive because it entertains the possibility that any frequentist anywhere ever advocates rejecting a hypothesis on grounds of an improbable event. This is just like the Kadane howler with which I began this blog (https://errorstatistics.com/2012/09/08/return-to-the-comedy-hour-on-significance-tests/). (Never mind the improbability of the event has nothing to do with the hypothesis.)

Frequentists, from Fisher on, have vehemently denied such a thing, even in cases that don’t violate the most basic requirement of a test statistic–as this case does. What I mean about violating the basic requirement of a test statistic T is that the improbability of {T > t} UNDER Ho must track Ho. It must be improbable because of Ho adequately describing the data generation. P(T > t; Ho) for frequentists must be an analytic statement: “computed under Ho, {T > t} has probability p”. Same for other error probabilities.
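A small sketch may help show what it means for P(T > t; Ho) to be computed under Ho and to track it. The coin-tossing setup below is a hypothetical illustration, not an example from the exchange: T counts heads in 20 tosses, and the tail probability of {T > 14} is recomputed under different hypothesized values of the heads probability. For contrast, the chance of rolling two sixes is the same whatever is hypothesized about the coin.

```python
from math import comb

def tail_prob(t, n, p):
    """P(T > t; p) for T ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(t + 1, n + 1))

N_TOSSES = 20   # hypothetical experiment size
T_OBS = 14      # the event of interest is {T > 14}

# The same event has very different probabilities under different hypotheses.
p_under_null = tail_prob(T_OBS, N_TOSSES, 0.5)   # computed under Ho: p = 0.5
p_under_alt = tail_prob(T_OBS, N_TOSSES, 0.8)    # computed under p = 0.8

# An event unrelated to the coin: its probability ignores the hypothesis.
p_double_six = 1 / 36

print("P(T > 14; p = 0.5) = %.4f" % p_under_null)
print("P(T > 14; p = 0.8) = %.4f" % p_under_alt)
print("P(two sixes) = %.4f under either hypothesis" % p_double_six)
```

The first two numbers differ because the event is tied to the hypothesis about the coin. The third cannot bear on that hypothesis at all, however improbable it is.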

Imagine telling a scientist that an improbable outcome x in a game of chance falsifies H:General Relativity. The scientist would say the person hadn’t a clue about what it means for x to falsify H. Likewise in this howler.

“Not sufficiently dismissive”??? I wrote, “I don’t like this cartoon . . . I think the lower-left panel of the cartoon unfairly misrepresents frequentist statisticians. Frequentist statisticians recognize many statistical goals. . . . The error represented in the lower-left panel of the cartoon is not quite not a problem with the classical theory of statistics . . . but perhaps it is a problem with *textbooks* on classical statistics . . . Still, I think the cartoon as a whole is unfair in that it compares a sensible Bayesian to a frequentist statistician who blindly follows the advice of shallow textbooks. . . .”

I have seen frequentist statistical work that makes some version of the mistake shown in the xkcd cartoon but I agree that nobody would make that *particular* mistake as it is so extreme. As I wrote, I didn’t like the cartoon (even though, as many commenters point out, a cartoon is supposed to be . . . well, cartoonish).

Also, regarding your comment, I don’t know if one should characterize Fisher as a “frequentist.” “Anti-Bayesian” might be more accurate.

Andrew: right, not sufficiently dismissive in the sense that it doesn’t point up exactly why it’s a howler.

Fisher certainly used probability to assess and control error probabilities, be they p-values, or “sensitivity” measures. Surely you don’t think that all Fisher did, statistically, was be “anti”. And, since you bring him up, “tail areas” arise for him to avoid irrelevant (or wrong direction) test statistics.

Frequentism is about working out the objective consequences of a set of crisply stated subjective assumptions. Bayesianism, on the other hand, is about working out the objective consequences of a set of crisply stated subjective assumptions. Why is one of these more subjective than the other?

Konrad: Ugh! Read my post. Carefully.

The desire to introduce bias stems from the bias-variance tradeoff. Being unbiased means larger errors and statistically inefficient methods.

I don’t think “relevance” is _always_ an ideal either, due to Stein’s paradox. This is particularly true in large p scenarios (which are increasingly common).
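For readers unfamiliar with the bias-variance tradeoff and Stein’s paradox, the following is a minimal simulation sketch. It assumes a vector of 10 independent Normal observations with unit variance and compares the unbiased estimate (the observations themselves) with the plain James-Stein shrinkage estimate. The true mean vector, the dimension, and the number of repetitions are hypothetical choices, and the function names are mine.

```python
import random

def james_stein(x):
    """Shrink the observed vector toward the origin (requires len(x) >= 3)."""
    p = len(x)
    norm_sq = sum(v * v for v in x)
    shrink = 1 - (p - 2) / norm_sq
    return [shrink * v for v in x]

def squared_error(estimate, truth):
    """Total squared error summed over all coordinates."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))

def simulate(theta, reps=2000, seed=0):
    """Average total squared error of the unbiased and shrunken estimates."""
    rng = random.Random(seed)
    unbiased_total = shrunk_total = 0.0
    for _ in range(reps):
        x = [rng.gauss(t, 1.0) for t in theta]  # one noisy observation per mean
        unbiased_total += squared_error(x, theta)
        shrunk_total += squared_error(james_stein(x), theta)
    return unbiased_total / reps, shrunk_total / reps

unbiased_risk, shrunk_risk = simulate([1.0] * 10)
print("unbiased: %.2f   James-Stein: %.2f" % (unbiased_risk, shrunk_risk))
```

The shrunken estimate is biased toward zero in every coordinate, yet its total squared error comes out lower on average. Note that the gain is in the total across coordinates, which is one reason the result bears on what counts as "relevant" to any single coordinate.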

However, I’m not sure it’s possible to put forth philosophical principles that are equally applicable to any use of statistics. I’d view “relevance” differently in a court case vs. an exploratory hypothesis generating study vs. a purely predictive / descriptive model.

vl: I was alluding to “bias” in the general fashion used in this context. Doubtless one can put forward a “philosophical” principle that sanctions any use of statistics one wants (‘anything goes’ will do), but such a principle will be at odds with the severity principle. That’s precisely my point.

You miss my statement about relevance which was that arriving at M through the cleanest hands in the world (e.g., a globally agreed upon model) does not make M relevant for finding things out.

Let me try again: in statistics (whether Bayesian or frequentist) we make claims of the form “assumptions X lead to conclusions Y” (which I’ll abbreviate as X->Y). In scientific papers we often see claims that are simply of the form Y, but these are application domain claims (telling us what to believe about the world) rather than pure statistical claims (which tell us what we can conclude from our data and assumptions, and hence are conditional on those data and assumptions).

Now, if you accept that statistical claims are X->Y claims, then your argument is that we have discretion over the choice of X and hence over which claims of this form we choose to write down. But this remains equally true in both Bayesian and frequentist frameworks (or at least if you claim a difference you have not substantiated it). You then argue for restrictions on the choice of X and imply (I think) that Bayesian approaches inherently have fewer such restrictions. My question, assuming you think that they do, is why?

Konrad:

Statistical inference also detaches a claim Y, or should. Else it’s not a statistical inference. To warrant such an inference, therefore, demands either showing X is approximately satisfied, or that we can “subtract out” or take account of threats from violations, and so on. The resulting Y’ that is warranted may (and generally will) differ from the initial Y. But one needs to be clear on the aim of the inference and the rationale for detaching.

The big difference is being constrained by what is the case wrt the problem of interest. The error statistician is required to report, at least approximately, whether formally or informally, the error-probing capabilities of the given methods used.

The issue isn’t having more restrictions or less, but constraints of a relevant kind. If the constraints are merely inner coherency, then it won’t cut it for stringent testing to pinpoint flaws and design more reliable inquiries. If one is allowed to change one’s prior to get the desired posterior, subject to coherency, that’s not sufficient. See “updating and down-dating” https://errorstatistics.com/2012/01/26/updating-downdating-one-of-the-pieces-to-pick-up-on/

I don’t say that there aren’t practitioners who employ Bayesian computations as technical devices for purposes of achieving an error statistical end. Gelman himself says he rejects the Bayesian inductive philosophy. He may well be an error-statistical Bayesian, I don’t know.

I can’t do justice to these issues in a comment, I hope you’ll search the blog for more.

That post on up/downdating illustrates the extent to which we are referring to completely different things under Bayesian approaches. While I don’t deny the existence of subjective Bayesians, I have never come across an analysis applied to an actual scientific question which included anything remotely resembling “an initial period in which the subjective beliefs of the client are established” (for one thing, in scientific applications there are as many “clients” as there are readers of the work, and the researchers don’t have access to the clients at any stage of the project). Can I take it, then, that you would only describe _subjective_ Bayesian approaches as subjective?

“If one is allowed to change one’s prior to get the desired posterior, subject to coherency, that’s not sufficient.” – In almost all useful applications, the likelihood function has a far greater effect on the posterior. Is it ok to change one’s likelihood function to get the desired posterior? If yes, what is the justification for treating different parts of the model differently? If no, do you still claim a difference in subjectivity between Bayesian and frequentist approaches?
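The claim that the likelihood usually outweighs the prior can be illustrated, under strong simplifying assumptions, with a conjugate toy model. The sketch below assumes Binomial data with a Beta prior on the proportion. The two priors and the data counts are hypothetical, picked only to show how the gap between posteriors shrinks as data accumulate. It does not address whether this holds in "almost all useful applications."

```python
def posterior_mean(a, b, successes, trials):
    """Posterior mean of a Binomial proportion under a Beta(a, b) prior."""
    return (a + successes) / (a + b + trials)

# Two deliberately different priors: prior means 0.5 and 0.8.
PRIORS = {"flat Beta(1, 1)": (1, 1), "opinionated Beta(8, 2)": (8, 2)}

def posterior_gap(successes, trials):
    """Largest difference in posterior mean across the priors above."""
    means = [posterior_mean(a, b, successes, trials)
             for a, b in PRIORS.values()]
    return max(means) - min(means)

small_sample_gap = posterior_gap(3, 10)       # 30% successes, little data
large_sample_gap = posterior_gap(300, 1000)   # 30% successes, ample data

print("gap with n=10:   %.3f" % small_sample_gap)
print("gap with n=1000: %.3f" % large_sample_gap)
```

With ten observations the choice of prior moves the posterior mean by about 0.2. With a thousand it moves it by less than 0.01, so in this toy case the data model is doing nearly all the work.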

Konrad: It would not be OK. The frequentist does not change her likelihood to get the desired posterior because, for starters, she’s not computing the posterior (unless of course it’s an ordinary computation of frequentist probabilities). But changing it to get the desired numerical frequentist output would be barred because it would alter the error probabilities.

The guys are getting all aggressive, jumping on Larry, over at Gelman’s blog. They are entirely overlooking that Wasserman’s reaction was in the spirit of my deconstruction of him and the metaphors/jokey analogies beginning with Wasserman’s reference to an Al Franken quip: https://errorstatistics.com/2013/12/27/deconstructing-larry-wasserman/

The problem with the terms “objective/subjective” is that they are loaded for most people, and are often used in a mixed descriptive/normative way. In science, many people whose methods are branded “subjective” will try to defend themselves (I have some admiration for de Finetti’s rather proud use of this term). I personally don’t think that “our methods/approach are more objective/less subjective than somebody else’s” is a very helpful discussion, although I acknowledge that Mayo has done much in order to give her use of “objectivity” a proper and clear explanation. Still, as long as not everybody has signed up to this explanation, discussing how objective/subjective the approaches are has a high potential of being fruitless.

Christian: I get your point but if I thought this was so, I would consider that doing philosophy of science was fruitless. Sure I admit it may be a waste of time…but someone may be listening, and thereby have a way to answer charges of “it’s all subjective”. If not, there’s always the next generation.

I’d think that philosophy of science has some fruitier things to do… I’m not even saying that discussing the *issue* is fruitless, but one could use terms that are more precise and less prone to misunderstanding and diverging interpretations (such as reproducibility, transparency, openness to empirical falsification etc.).

Christian: I’m not afraid of articulating “objectivity”, and have published papers on it. I think it’s an extremely important concept, integral to science, and as I said, I’m focusing on “objective methods of inquiry”. Falsifiability fails as a criterion for science; severity (as I use it) is stronger and yet more applicable. The other two terms you give are vague.

Objectivity is not some deep, dark, mysterious concept. I think we all can agree on what’s not an objective method of inquiry for finding something out: using a method so flexible that, regardless of the data, you can claim H, even if H is false. Being wobbly on objectivity is often a way to avoid living up to it.

Mayo: But as you argue the long run is not sufficient 😉

Better vocabulary and finer distinctions between what you and others mean by objective versus subjective likely would hasten some fruit.

My guess at a Peircian view of what you are pointing to as objective is that which forces inquiry with continued effort to lead to what’s being warranted becoming less and less dependent on the individual enquirers involved.

I do see this happening in Andrew’s repeated work (unfortunately I think it is rare for statisticians to repeatedly work on the same substantive question – perhaps David Cox’s work on Badgers and floods?)

By the way, my nasty definition of being frequentist is simply refusing to make explicit use of prior probabilities. Would refusing to make use of (un-calibrated?) posterior probabilities be less nasty?

Keith:

To delimit the question, I suggest the focus here should be on the objectivity of (empirical) methods for inquiry, inference, or solving a problem when there is the threat of error, and limited information.

“My guess at a Peircian view of what you are pointing to as objective is that which forces inquiry with continued effort to lead to what’s being warranted becoming less and less dependent on the individual enquirers involved.”

I know what you mean, but the reason I’d reject this has to do with (1) relevance again. A Global Agreement on priors or whatever can be free of individual inquirers and yet not at all be an objective way to evaluate or do inference. Nevertheless, Peirce and I are in sync on severe testing as scientific induction.

In a paper with David Cox (2010), “Objectivity and conditionality in frequentist inference”, we propose:

“Objectivity in statistics, as in science more generally, is a matter of both aims and methods. Objective science, in our view, aims to find out what is the case as regards aspects of the world, independently of our beliefs, biases and interests; thus objective methods aim for the critical control of inference and hypotheses, constraining them by evidence and checks of error.” (Cox and Mayo, 276)

In my own work I define objectivity of a method in terms of severity which in turn requires being able to assess and control error probabilities. I have a distinct (new) definition of pseudoscience.

I don’t see why repeated work on the same problem is needed for making progress in learning (which we do quite often). I agree that building up a repertoire of mistakes, checks, precautions, and background theory is vital. But the problem can be of a general type.

On your last point: telling us what you refuse to do does not yet tell us what you do. A (frequentist) error statistician uses probability to assess and control the severity of tests.

You are right about (1), I left out the getting less wrong about/lessening doubt about what is the case for past/current enquirers – but he is hard to capture in one sentence.

The need for repeated work on same problem was just to facilitate the seeing of the progression.

> uses probability to assess and control the severity of tests

Can that be done for multiple parameters and arbitrary data models? I think Senn suggested not piece-wise and David’s position last known to me – was not yet.

Yes, just as far as one can in contemporary modeling (and it differs in different fields). Spanos’ work is a good example. See also Hendry’s contribution to our conference: http://www.rmm-journal.de/downloads/Article_Hendry.pdf

Or use the RMM link on the blog page. I don’t know what the Senn reference is.

But I should emphasize that it’s quite important to be able to report when and why severity is not achieved, or where it’s not clear it has been. (Think of high energy physics, which I happen to be reading about.) That’s the key for pinpointing where and why disagreements about interpretation exist, and for developing criteria for improved tests.

I wrote this on the Gelman blog: “The analogy to the ‘logistic and Poisson models that fill our textbooks’ doesn’t work. We can use our data to assess the propriety of such models. We can’t do so for subjective priors in Bayesian analysis.”

Norm! Glad to see you. Now can you please remind me where logistic and Poisson models enter—was that Gelman saying their use is just as subjective as priors? You’re right of course that these are testable.

You can’t miss it–it was right after the “No no no no.” 🙂 The full statement was “The probabilities we use in our Bayesian inference are not subjective, or, they’re no more subjective than the logistic regressions and normal distributions and Poisson distributions and so forth that fill up all the textbooks on frequentist inference.”

Norm: Right, that’s where I thought I first saw it, but then began scouring the rest of it…. Anyway, now I can say it for sure: you’re exactly right, and Gelman saying “they’re no more subjective” is misleading. That’s why it’s so important to articulate what we mean by objective/subjective method—at least to identify fairly extreme cases. If we don’t know what we’re asserting (beliefs, uncertainty, betting inclinations, past frequencies, conventional or default priors), it’s impossible to know if the requirements I set out for objectivity are met.

Deborah, your phrase, “If we don’t know what we’re asserting,” has relevance to the Bayesian issue in other ways. Consider this: The Bayesians assert that there is such a thing as “the” belief of the analyst, which is expressed in the subjective prior. But I suspect that if one were to ask that same analyst some time later (days, weeks, whatever, enough time to forget), with NO new information during the intervening time, to set a prior for the same data, then he/she would come up with a DIFFERENT prior! Hey, what happened to the notion of “the” belief of the analyst? Maybe there should be a prior describing the analyst’s (many, and likely random) priors?

Norm: What you say is almost exactly E.S. Pearson’s words in “Some Thoughts on Statistical Inference”:

“It seems to me that in many situations, if I received no more relevant knowledge in the interval and could forget the figures I had produced before, I might quote at intervals widely different Bayesian probabilities for the same set of states, simply because I should be attempting what would be for me impossible and resorting to guesswork. It is difficult to see how the matter could be put to experimental test.” (E.S. Pearson, p. 278, The Selected Papers of E.S. Pearson).

It’s great that you can find such quotes directly online (I’m traveling, no books, but know this one practically by heart).

As J. Berger and others report, this is in practice what happens from elicitation, which has mostly been regarded as a waste of time. So what about the policy cases you once wrote to me about, where enthusiastic and skeptical people are brought in to balance things? Maybe the two sides even out to a golden mean.

Norm:

I wrote a book on Bayesian data analysis which is over 600 pages long, and nowhere do I talk about “the belief of the analyst” or anything like that. So when you’re talking about “the Bayesians,” you’re not talking about me, nor are you talking about the 40,000 or so people who learned about Bayesian data analysis from our book. I think you have an old-fashioned view of Bayesian inference, which is fine–I’m sure that many of my views of things could use some updating too–but it’s not so great when you use this old-fashioned view to criticize “Bayesians” more generally.

I get the impression that Mayo does not like when people criticize various non-Bayesian ideas based on old-fashioned misunderstandings (which she calls “howlers”); I recommend you consider the same thing in your comments.

Andrew: You are right to say that “old-fashioned misunderstandings” are “howlers” (especially if mounted again and again with scant recognition by the critics of how the issue has been clarified/answered). If a criticism of a method is (a) based on a blatant misinterpretation (e.g., p-values as posteriors) OR (b) while not blatant, is a challenge that has been well-clarified by at least some people, with further criticisms taken seriously, then it’s a howler. I think Jeffreys’ tail-area challenge hasn’t been fully answered, which is why I tried to address it. As such, it isn’t really a howler, but will be (I hope). I modified this terminological point on my (Jan 18 post). I was carried away by the alliteration and the fact that it’s a comedy post.

But does either allegation (that Bayesian methods are intended to capture an agent’s degree of belief, or that frequentist methods aim to control long-run errors) fall under either (a) or (b)? I don’t see how. So I don’t consider Wasserman’s rough divide a howler (even if I’d reject it). The fact is, leading Bayesians and Bayesian texts and papers view it that way. Even quasi-Bayesian texts and articles advocate such an interpretation, usually supplemented with the identical 1 or 2 frequentist howlers (as to how a crass consideration of long runs alone can lead to goofy “confidence sets”), as if to strengthen the case to be Bayesian. These moves always surprise me, even if it’s only in the “philosophical” portion of texts, because the authors are quite sophisticated.

Even O-Bayesians often purport to be providing a stopgap measure until the full-blown subjective Bayesian ingredients are available, or as a “reference” to compare with your subjective beliefs. They then go on and generally do entirely reasonable things. (I’m reminded of “grace and amen Bayesians” in my deconstruction of Senn: https://errorstatistics.com/2012/01/15/mayo-philosophizes-on-stephen-senn-how-can-we-cultivate-senns-ability/ )

OK, well, I’m in an airport…let’s see if this goes through.

“leading Bayesians and Bayesian texts and papers view it that way”: S-Bayesians do; O-Bayesians do not. I don’t know what proportion of published Bayesian analyses in real applications (not counting psychological studies of people’s subjective beliefs) falls in each category, but 100% of those I am actually familiar with are in the latter.

There is a big difference between an agent’s _actual_ degree of belief (S-Bayes) and the degree of belief that would be induced in a (hypothetical) perfectly rational agent by initializing it with a precisely defined set of information (O-Bayes). For the S-Bayes agent it makes sense to refer to its degree of belief, because it has only one. For the O-Bayes agent it does not, because it can calculate as many degrees of belief in a given proposition as we have models to feed it (i.e. infinitely many).

Konrad: Do you see O-Bayesians these days claiming to capture “a (hypothetical) perfectly rational agent by initializing it with a precisely defined set of information”? I don’t, but usually see variations on Berger and Bernardo’s attempts to give priors that maximize the influence of the data (in some sense). But there’s a schizophrenia…..

The description above is my attempt to paraphrase Jaynes. I don’t know which description Berger or Bernardo would advocate, but the reference prior approach is (1) consistent with Jaynes’s description (Jaynes agreed with them on the importance of such approaches) and (2) not consistent with descriptions requiring priors to correspond to any person’s actual beliefs.

Mayo:

Matloff writes, “The Bayesians assert that there is such thing as ‘the’ belief of the analyst, which is expressed in the subjective prior.”

Fine. Except that he’s writing this in the context of a reply to me, author of an influential Bayesian book, and I have never asserted anything like what Matloff is attributing to “the Bayesians.” If he’d said “other Bayesians,” that would be fine. But it’s ridiculous to respond to me with a claim that doesn’t apply to anything I’ve ever done or said.

I understand the concept of “the enemy of my enemy is my friend,” and I understand that various Bayesians have said silly things that you, Mayo, find exhausting to argue against. I have a lot of sympathy for you on that point! But I don’t think it makes sense for you to reflexively agree with completely misinformed statements such as Matloff’s, simply because he is in some sense on your side in the debate.

Andrew: I agreed with Norm that “they’re no more subjective” is misleading. I’ve said this in my responses to you on this blog (including your first comment) and earlier in published comments. Nothing new. At times I reply to disembodied comments that I get sent, especially while traveling. But checking the order, that’s what I’m agreeing to. Did I miss something? (These horrible comments don’t always come in the right order. I’d like to revamp the blog but don’t have time now.)

Oh, I did also remark on the similarity between what Norm wrote and E.S. Pearson’s remark.

All: Wasserman, Christian Hennig and maybe others, over at Gelman’s blog, are (gently) reminding Gelman that if you’re going to get angry about someone viewing Bayesians as using probability to measure degrees of belief, it would be good to give us the intended, alternate meaning. My comment on Gelman and Robert (2013) bears on this:

https://errorstatistics.com/2012/12/03/mayo-commentary-on-gelman-robert/

Gelman and Robert: “Bayesians will … assign a probability distribution to a parameter that one could not possibly imagine to have been generated by a random process, … . There is no inconsistency in this opposition once one realizes that priors are not reflections of a hidden ‘truth’ but rather evaluations of the modeler’s uncertainty about the parameter.” (Gelman and Robert pp. 9-10; emphasis mine)

But it is precisely the introduction of “the modeler’s uncertainty about the parameter” that is so much at the heart of questions involving the understanding and justification of Bayesian methods.

Mayo, D. G. (2013): Discussion: Bayesian Methods: Applied? Yes. Philosophical Defense? In Flux, The American Statistician, 67(1): 11-15. (Commentary on A. Gelman and C. Robert, “‘Not only defended but also applied’: The perceived absurdity of Bayesian inference” (with discussion).)

Chances are you have already seen this one by Judea Pearl, which is my favourite over there:

http://andrewgelman.com/2014/01/16/22571/#comment-153285

Christian: Your assessment of the chances would be wrong; I had not seen it, so thanks for sharing. It’s excellent. I went there yesterday only because of the direct reference to my blogpost; I rarely go to Gelman’s blog any more because there’s an overly rude guy who loves to bully and badmouth me.