A couple of readers sent me notes about a recent post on (Normal Deviate)* that introduces the term “frequentist
pursuit”:
“If we manipulate the data to get a posterior that mimics the frequentist answer, is this really a success for Bayesian inference? Is it really Bayesian inference at all? Similarly, if we choose a carefully constructed prior just to mimic a frequentist answer, is it really Bayesian inference? We call Bayesian inference which is carefully manipulated to force an answer with good frequentist behavior, frequentist pursuit. There is nothing wrong with it, but why bother?
If you want good frequentist properties just use the frequentist estimator.” (Robins and Wasserman)
I take it that the Bayesian response to the question (“why bother?”) is that the computations yield that magical posterior (never mind just how to interpret them).
Cox and Mayo (2010) say, about a particular example of “frequentist pursuit”:
“Reference priors yield inferences with some good frequentist properties, at least in one-dimensional problems – a feature usually called matching. … First, as is generally true in science, the fact that a theory can be made to match known successes does not redound as strongly to that theory as did the successes that emanated from first principles or basic foundations. This must be especially so where achieving the matches seems to impose swallowing violations of its initial basic theories or principles.
Even if there are some cases where good frequentist solutions are more neatly generated through Bayesian machinery, it would show only their technical value for goals that differ fundamentally from their own.” (301)
Imitation, some say, is the most sincere form of flattery. I don’t agree, but doubtless it is a good thing that we see a degree of self-imposed and/or subliminal frequentist constraints on much Bayesian work in practice. Some (many?) Bayesians suggest that this is merely a nice extra rather than a necessity, forfeiting the (non-trivial) pursuit of frequentist (error statistical) foundations for Bayesian pursuits**.
*I had noticed this, but had no time to work through the thicket of the example he considers. I welcome a very simple upshot.
**At least some of them.
“I take it that the Bayesian response to the question (“why bother?”) is that the computations yield that magical posterior (never mind just how to interpret them).”
In this case, the actual Bayesian response was, “And what I [Chris Sims] did involved no ‘frequentist pursuit’ whatsoever. It is just straightforward infinite-dimensional Bayesian inference with the correct likelihood. The simple information-wasting Bayesian method I derived aimed at achieving simplicity by ignoring information, not at duplicating frequentist properties — it actually improves on Horvitz-Thompson without having started out with that objective.”
I can’t help you with a summary — I’m working on other things, so I haven’t gone through the arguments and counterarguments in detail.
Corey: I read his response, and subsequent discussion, but found it inexplicable (he was achieving simplicity by ignoring information?), as well as overly defensive. But I’m not sure what turns on this particular example.
What turns on the example is whether a single estimator can simultaneously be uniformly consistent for a function of the parameter and also be a natural Bayesian estimate, that is, a summary of the posterior for that quantity.
In low-dimensional settings it is well known that we can have both, so long as one doesn’t use a “silly” prior (for want of a better term) that rules out great swathes of parameter space. Going by the criterion of guaranteeing uniform consistency, there is therefore no reason to criticize plausible Bayesian analyses.
But in a (very) high-dimensional example, Wasserman and Robins maintain you cannot have both, at least without contriving a particular form of prior, so in general high-dimensional settings one is forced to choose between the properties. Sims counters that priors of this form are plausibly what one would use in this example anyway, without contrivance. In effect, he argues that one gets uniform consistency from the standard Bayesian analyses applied to this problem – though certainly not from all Bayesian analyses of it – and hence that no criticism of plausible Bayesian analyses follows.
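A minimal simulation sketch of the kind of setup at issue (the dimensions and the form of E[Y | X] below are invented purely for illustration, not taken from the papers under discussion): the covariates X are high-dimensional, the sampling probability pi(X) is known by design, and Y is observed only when R = 1. The Horvitz-Thompson estimator of E[Y] reweights each observed Y by 1/pi(X), and it is uniformly root-n consistent no matter how complicated E[Y | X] is.

# Illustrative sketch only; the values and functional forms are invented.
import numpy as np

rng = np.random.default_rng(0)
n, d = 5000, 20                       # sample size, covariate dimension

X = rng.uniform(size=(n, d))          # high-dimensional covariates
pi = 0.1 + 0.8 * X[:, 0]              # known sampling probabilities pi(X)
theta = np.where(np.sin(10 * X).sum(axis=1) > 0, 0.8, 0.2)   # complicated E[Y | X]

Y = rng.binomial(1, theta)            # outcomes
R = rng.binomial(1, pi)               # missingness indicators: Y observed iff R = 1

# Horvitz-Thompson estimate of psi = E[Y], using only (X, R, R*Y) and the known pi
psi_ht = np.mean(R * Y / pi)
print(psi_ht, theta.mean())           # HT estimate vs. the true target

The question the example poses is whether a posterior mean for this same quantity, under a prior one would plausibly have chosen beforehand, tracks such an estimator in high-dimensional settings.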
Guest: Thanks so much for this useful explication. And so, do you agree with him (Sims)?
Glad you found it helpful. I haven’t yet worked through Sims’s math in detail, so while I suspect I’ll agree with him, I’m not yet sure. Having W&R put the opposite argument gives pause for considerable thought.