Gelman responds to the comment[i] I made on my 8/31/13 post:
Popper and Jaynes
Posted by Andrew on 3 September 2013
Deborah Mayo quotes me as saying, “Popper has argued (convincingly, in my opinion) that scientific inference is not inductive but deductive.” She then follows up with:
Gelman employs significance test-type reasoning to reject a model when the data sufficiently disagree.
Now, strictly speaking, a model falsification, even to inferring something as weak as “the model breaks down,” is not purely deductive, but Gelman is right to see it as about as close as one can get, in statistics, to a deductive falsification of a model. But where does that leave him as a Jaynesian?
I was influenced by reading a toy example from Jaynes’s book where he sets up a model (for the probability of a die landing on each of its six sides) based on first principles, then presents some data that contradict the model, then expands the model.
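The spirit of that toy example can be sketched as a simple goodness-of-fit check (the counts below are made up for illustration and are not Jaynes’s actual data): posit a fair-die model, compare its predictions to observed frequencies, and reject the model when the discrepancy is too large to attribute to chance.

```python
# Hypothetical counts for 120 rolls of a die (illustrative data only)
counts = [12, 14, 15, 16, 28, 35]
n = sum(counts)
expected = n / 6  # the fair-die model predicts 20 rolls per face

# Pearson chi-square statistic measuring disagreement with the model
chi2 = sum((c - expected) ** 2 / expected for c in counts)

# 5% critical value for chi-square with 5 degrees of freedom
CRIT_5PCT_DF5 = 11.07
reject_fair_die = chi2 > CRIT_5PCT_DF5  # here chi2 = 21.5, so True
```

On these invented counts the fair-die model is rejected, and (as in Jaynes’s presentation) the natural next step is not to stop at the rejection but to expand the model, say, to allow a bias toward the heavier faces.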
I’d seen very little of this sort of reasoning before in statistics! In physics it’s the standard way to go: you set up a model based on physical principles and some simplifications (for example, in a finite-element model you assume the various coefficients aren’t changing over time, and you assume stability within each element), then if the model doesn’t quite work, you figure out what went wrong and you make it more realistic.
But in statistics we weren’t usually seeing this. Instead, model checking typically was placed in the category of “hypothesis testing,” where the rejection was the goal. Models to be tested were straw men, built up only to be rejected. You can see this, for example, in social science papers that list research hypotheses that are not the same as the statistical “hypotheses” being tested. A typical research hypothesis is “Y causes Z,” with the corresponding statistical hypothesis being “Y has no association with Z after controlling for X.” Jaynes’s approach—or, at least, what I took away from Jaynes’s presentation—was more simpatico to my way of doing science. And I put a lot of effort into formalizing this idea, so that the kind of modeling I talk and write about can be the kind of modeling I actually do.
I don’t want to overstate this—as I wrote earlier, Jaynes is no guru—but I do think this combination of model building and checking is important. Indeed, just as a chicken is said to be an egg’s way of making another egg, we can view inference as a way of sharpening the implications of an assumed model so that it can better be checked.
I still don’t see how one learns about falsification from Jaynes when he alleges that the entailment of x from H disappears once H is rejected. But put that aside. In my quote from Gelman 2011, he was alluding to simple significance tests–without an alternative–for checking consistency of a model; whereas, he’s now saying what he wants is to infer an alternative model, and furthermore suggests one doesn’t see this in statistical hypothesis tests. But of course Neyman-Pearson testing always has an alternative, and even Fisherian simple significance tests generally indicate a direction of departure. However, neither type of statistical test method would automatically license going directly from a rejection of one statistical hypothesis to inferring an alternative model that was constructed to account for the misfit. A parametric discrepancy, δ, from a null may be indicated if the test very probably would not have resulted in so large an observed difference, were such a discrepancy absent (i.e., when the inferred alternative passes severely). But I’m not sure Gelman is limiting himself to such alternatives.
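The severity reasoning for a parametric discrepancy can be sketched numerically. The numbers below are purely illustrative (none appear in the post): a one-sided test of H0: μ = 0 against μ > 0 with known σ = 1, n = 100, and an observed mean of 0.4. The claim “μ > δ” passes severely when, were μ no larger than δ, the test would very probably have produced a smaller observed difference than it did.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Illustrative setup: H0: mu = 0 vs. mu > 0, sigma known
sigma, n, xbar = 1.0, 100, 0.4
se = sigma / math.sqrt(n)  # standard error = 0.1

def severity(delta):
    """Severity for the claim mu > delta: the probability of observing a
    smaller sample mean than the one actually observed, were mu = delta."""
    return norm_cdf((xbar - delta) / se)

high = severity(0.2)    # about 0.977: "mu > 0.2" passes with high severity
weak = severity(0.39)   # about 0.54: "mu > 0.39" is barely warranted
```

The point mirrors the text: rejecting the null licenses inferring only those discrepancies that pass with high severity, not whatever alternative happens to fit the data best.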
As I wrote in a follow-up comment: “there’s no warrant to infer a particular model that happens to do a better job fitting the data x–at least on x alone. Insofar as there are many alternatives that could patch things up, an inference to one particular alternative fails to pass with severity. I don’t understand how it can be that some of the critics of the (bad) habit of some significance testers to move from rejecting the null to a particular alternative, nevertheless seem prepared to allow this in Bayesian model testing. But maybe they carry out further checks down the road; I don’t claim to really get the methods of correcting Bayesian priors (as part of a model).”
A published discussion of Gelman and Shalizi on this matter is here.
[i] My comment was:
“If followers of Jaynes agree with [one of the commentators] (and Jaynes, apparently) that as soon as H is falsified, the grounds on which the test was based disappear!—a position that is based on a fallacy—then I’m confused as to how Andrew Gelman can claim to follow Jaynes at all. ‘Popper has argued (convincingly, in my opinion) that scientific inference is not inductive but deductive…’ (Gelman, 2011, bottom p. 71). Gelman employs significance test-type reasoning to reject a model when the data sufficiently disagree. Now, strictly speaking, a model falsification, even to inferring something as weak as ‘the model breaks down,’ is not purely deductive, but Gelman is right to see it as about as close as one can get, in statistics, to a deductive falsification of a model. But where does that leave him as a Jaynesian? Perhaps he’s not one of the ones in Paul’s Jaynes/Bayesian audience who is laughing, but is rather shaking his head?”