(Please see part 1 for links and references.)
A Bayesian, Gelman tells us, “wants everybody else to be a non-Bayesian” (p. 5). Despite appearances, the claim need not be seen as self-contradictory, at least if we interpret it most generously, as Rule #2 (of this blog) directs. Whether “a Bayesian” refers to all Bayesians or only to non-standard Bayesians (i.e., those wearing a hat of which Gelman approves), his meaning might be simply that when setting out with his own inquiry, he doesn’t want your favorite priors (be they beliefs or formally derived constructs) getting in the way. A Bayesian, says Gelman (in this article), is going to make inferences based on “trying to extract information from the data” in order to determine what to infer or believe (substitute your preferred form of output) about some aspect of a population (or mechanism) generating the data, as modeled. He just doesn’t want the “information from the data” muddied by your particular background knowledge. Otherwise he would have to subtract out all of this “funny business” to get at your likelihoods; that is, he would have to “divide away” your prior distributions before getting to his own analysis (p. 5). As in Gelman’s trial analogy (p. 5), he prefers to combine your “raw data,” and your likelihoods, with his own well-considered background information. We can leave open whether he will compute posteriors (at least in the manner he recommends here) or not (as suggested in other work). So perhaps we have arrived at a sensible deconstruction of Gelman, free of contradiction. Whether or not this leaves texts open to some charge of disingenuousness, I leave entirely to one side.
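To make the “dividing away” concrete, here is a minimal sketch. It is not taken from Gelman's article. It assumes the simplest textbook case: a binomial likelihood with a conjugate Beta prior, so that the reported posterior is also a Beta distribution. All names and numbers (`prior_a`, `prior_b`, `k`, `n`, the grid of values) are hypothetical and chosen only for illustration. Since the posterior is proportional to likelihood times prior, dividing the posterior kernel by the prior kernel should return the likelihood kernel.

```python
from math import isclose


def beta_kernel(p, a, b):
    """Unnormalized Beta(a, b) density evaluated at p."""
    return p ** (a - 1) * (1 - p) ** (b - 1)


def binomial_likelihood_kernel(p, k, n):
    """Binomial likelihood of k successes in n trials, up to a constant."""
    return p ** k * (1 - p) ** (n - k)


def divide_away_prior(p, post_a, post_b, prior_a, prior_b):
    """Recover the likelihood kernel as posterior kernel / prior kernel."""
    return beta_kernel(p, post_a, post_b) / beta_kernel(p, prior_a, prior_b)


# Hypothetical report: someone's favorite prior and their data summary.
prior_a, prior_b = 4.0, 2.0   # assumed Beta prior (illustrative only)
k, n = 7, 20                  # assumed data: 7 successes in 20 trials

# Conjugate update: the posterior they would report.
post_a, post_b = prior_a + k, prior_b + (n - k)

# Compare "posterior divided by prior" with the likelihood computed directly.
grid = [0.1, 0.3, 0.5, 0.7, 0.9]
recovered = [divide_away_prior(p, post_a, post_b, prior_a, prior_b) for p in grid]
direct = [binomial_likelihood_kernel(p, k, n) for p in grid]
ratios = [r / d for r, d in zip(recovered, direct)]

# The two agree at every grid point, up to floating-point error.
print(all(isclose(r, 1.0, rel_tol=1e-9) for r in ratios))
```

The sketch only works because both the prior and the posterior are reported in full. It says nothing about background knowledge that never took the form of a prior distribution in the first place.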
Now at this point I wonder: do Bayesian reports provide the ingredients for such “dividing away”? I take it that they’d report the priors, which could be subtracted out, but how is the rest of the background knowledge communicated and used? It would seem to include assorted background knowledge of instruments, of claims that had been sufficiently well corroborated to count as knowledge, of information about that which was not previously well tested, of flaws and biases and threats of error to take into account in future designs, etc. (as in our ESP examples 9/22 and 9/25). The evidence for any background assumptions should also be made explicit and communicated (unless it consists of trivial common knowledge).
Doesn’t Gelman’s Bayesian want all this as well? What form would all this background information take?
I see no reason to (and plenty of reasons not to) suppose that all the relevant background information for scientific inquiry enters by means of formal prior probability distributions, whether the goal is to interpret what these data, say x, indicate, or to make a more general inference given all the relevant background knowledge in science at the time, say, E. How much less so if one is not even planning to report posterior probabilities. Background information of all types enters in qualitatively to arrive at a considered judgment of what is known, and not known, about the question of interest, and what subsequent fruitful questions might be.
In my own take on these matters, even in cases of statistical inference, it is useful to distinguish a minimum of three models (or sets of questions or the like), which I sometimes identify as the (primary) theoretical, statistical, and data models. Recall the “three levels” in my earlier post. If one is reporting what data x from a given study indicate about the question of interest, one may very likely report something different from what one reports when stating, say, what all available evidence E indicates or warrants. I concur with Gelman on this. Background information enters in specifying the problem; in collecting and modeling data; in drawing statistical inferences from modeled data; and in linking statistical inferences to substantive scientific questions. There is no one order either; it’s more of a cyclical arrangement.
Does Gelman agree? I am guessing he would, with the possible exception of the role of a Bayesian prior, if any, in his analysis, for purposes of injecting background information. But I’m too unclear on this to speculate.
To this large extent, Gelman’s view on the proper entry of background knowledge is in sync with Sir David Cox’s position; although for a minute I thought Gelman was disagreeing with Cox (about background information), this analysis suggests not. Beyond what might be extracted from the snippet from the informal (Cox-Mayo) exchange, to which Gelman refers (p. 3), Cox has done at least as much as anyone else I can think of to show us how we might generate, systematize, and organize background information, and how to establish the criteria appropriate for evaluating such information.[i]
But maybe the concordance is not quite as rosy as I am suggesting here. After all, in the same article, Gelman gives a convincing example of using background information which leads him to ask:
“Where did Fisher’s principle go wrong here?” (p. 3)
To be continued . . . in part 3.
[i] I give just one old and one new reference:
Cox, D. R. (1958), Planning of Experiments. New York: John Wiley and Sons. (Republished 1992, Wiley Classics Library Edition.)
Cox, D. R., and C. A. Donnelly (2011), Principles of Applied Statistics, Cambridge: Cambridge University Press.