Andrew Gelman had said he would return to explain why he sided with Neyman over Fisher in the big, famous argument discussed in my Feb. 16, 2013 post, “Fisher and Neyman after anger management?”, and I just received an e-mail from Andrew saying that he has done so: “In which I side with Neyman over Fisher”. (I’m not sure what Senn’s reply might be.) Here it is:

“In which I side with Neyman over Fisher” Posted by Andrew on 24 May 2013, 9:28 am

As a data analyst and a scientist, Fisher > Neyman, no question. But as a theorist, Fisher came up with ideas that worked just fine in his applications but can fall apart when people try to apply them too generally.

Here’s an example that recently came up.

Deborah Mayo pointed me to a comment by Stephen Senn on the so-called Fisher and Neyman null hypotheses. In an experiment with n participants (or, as we used to say, subjects or experimental units), the Fisher null hypothesis is that the treatment effect is exactly 0 for every one of the n units, while the Neyman null hypothesis is that the individual treatment effects can be negative or positive but have an average of zero.
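The contrast between the two nulls can be made concrete with a small simulation. This is my own sketch, not from Senn or Gelman, and all numbers are made up: under the Fisher (sharp) null every unit's effect is exactly zero, while under the Neyman (weak) null the unit effects are heterogeneous but average to zero, so a difference-in-means estimate is near zero in both cases.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Fisher's sharp null: the treatment effect is exactly 0 for every unit.
fisher_effects = np.zeros(n)

# Neyman's weak null: unit effects may be nonzero but average to zero.
neyman_effects = rng.normal(0.0, 1.0, size=n)
neyman_effects -= neyman_effects.mean()  # enforce an exact mean of zero

for name, tau in [("Fisher", fisher_effects), ("Neyman", neyman_effects)]:
    y0 = rng.normal(0.0, 1.0, size=n)      # potential outcomes under control
    y1 = y0 + tau                          # potential outcomes under treatment
    treated = rng.permutation(n) < n // 2  # randomize half to treatment
    est = y1[treated].mean() - y0[~treated].mean()
    print(f"{name} null: mean effect {tau.mean():+.4f}, estimate {est:+.4f}")
```

Both nulls imply the same zero average effect; they differ in what they say about the individual units, which is exactly what the disagreement is about.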

Senn explains why Neyman’s hypothesis in general makes no sense—the short story is that Fisher’s hypothesis seems relevant in some problems (sometimes we really are studying effects that are zero or close enough for all practical purposes), whereas Neyman’s hypothesis just seems weird (it’s implausible that a bunch of nonzero effects would exactly cancel). And I remember a similar discussion as a student, many years ago, when Rubin talked about that silly Neyman null hypothesis.

Thinking about it more, though, I side with Neyman over Fisher, because the interesting problem for me is not testing the null hypothesis, which in nontrivial problems can never be true anyway, but estimation. And in estimation I am interested in an average effect, not an effect that is identical across all people. I could imagine a model in which the variance of the treatment effect is proportional to its mean—this would bridge the Neyman and Fisher ideas—but this is not a model that anyone ever fits.
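The bridging model Gelman imagines can at least be written down. A minimal sketch, assuming (my parameterization, purely illustrative) normal unit effects with variance proportional to the absolute mean effect:

```python
import numpy as np

rng = np.random.default_rng(1)

def unit_effects(mu, c=0.5, n=5):
    """Unit-level effects with variance proportional to the mean effect mu."""
    return rng.normal(mu, np.sqrt(c * abs(mu)), size=n)

# When the average effect mu is 0, the variance is 0 too, so Neyman's
# null (effects averaging to zero) collapses to Fisher's sharp null
# (every effect exactly zero): the "bridge" between the two ideas.
print(unit_effects(0.0))  # all exactly 0.0
print(unit_effects(2.0))  # heterogeneous effects scattered around 2
```

The point of the construction is that the two nulls only disagree away from zero; at the null value itself they coincide.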

So, just to say it again: if it’s a pure null hypothesis, sure, go with Fisher. But if you’re inverting a family of hypothesis tests to get a confidence interval (something which I’d almost never want to do, but let’s go with this, since that’s the common application of these ideas), I’d go with Neyman, as it omits the implausible requirement that the treatment effect be exactly identical on all items.
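The inversion Gelman mentions can be sketched in a few lines. This is a toy illustration of mine (a two-sided z-test with known variance, not anything from the post): the 95% interval is the set of null values that a level-0.05 test fails to reject.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(1.0, 1.0, size=100)
se = 1.0 / np.sqrt(len(x))   # sigma = 1 assumed known, for simplicity

# Invert the family of two-sided z-tests of H0: mu = mu0.  The 95% CI
# is the set of mu0 values the level-0.05 test does NOT reject.
grid = np.linspace(x.mean() - 1.0, x.mean() + 1.0, 4001)
not_rejected = grid[np.abs((x.mean() - grid) / se) <= 1.96]
print(not_rejected.min(), not_rejected.max())  # ~ xbar -/+ 1.96*se
```

The accepted set here reproduces the textbook interval; the grid search just makes the "set of unrejected hypotheses" definition explicit.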

If you look at the original post, you can read the comments, and even see what some people said about the anger management example.

As an aside, I’m surprised Gelman says he’d “almost never want to” invert a family of hypothesis tests to get a confidence interval, since, where possible, that is essentially how one uses data to learn about the magnitudes of discrepancy that are and are not indicated, which I take to be what interests him.

I coined the term blogolog here:

https://errorstatistics.com/2012/03/11/2717/

Just a curious side question: I went back to read the original anger management post, and I’m wondering: why do you say that you “find it hard to believe, however, that Fisher would have thrown some of Neyman’s wooden models onto the floor”?

Eileen: It just makes no sense, unless maybe Neyman had left his wooden models in the seminar room and Fisher was returning them. The idea of Fisher ransacking Neyman’s office is kooky enough; that he’d toss these fragile-looking models around, more so. If I’d known this issue would arise, I would have asked Lehmann when I read Reid. Maybe we’d need to ask a statistician’s shrink.

Fisher is known to have had a quick temper. Is he known for having expressed anger physically? (Hmm… “Banged the door.” …) Is he known to have done similar things to the property of other adversaries?

How plausible is it that persons unknown would have vandalized Neyman’s lab?

(No questions above are rhetorical.)

Corey: Thanks for these amusing questions; my brain has been hammered by a logical-statistical issue for around 9 hours. I am drawing a blank on Fisher and physical anger, assuming I ever heard of such cases.

Should we really assume Neyman’s office was vandalized? Perhaps Reid was pressing Neyman for more details on the big blowup, and he just said they had wondered at one time whether Fisher had messed up his office?

By the way, I notice you didn’t call yourself Coreyshrink.

Mayo: Fisher did knock out a professional prize fighter who was working security at Cambridge, when (in his mind) the man had roughly treated one of his female research assistants. I did read the newspaper article on it, though I was unable to quickly find it on the web.

Throwing wooden models would be less of a big deal.

Keith: Well I’d be glad to hear that Fisher was protecting one of his female research assistants. But even if true (which I doubt without evidence), are we to suppose Neyman’s wooden models were threatening Fisher’s wooden models? Have you seen pictures of these models by the way?

Gelman states that he is interested in an average effect when performing estimation; but a question I have is: how does he know that his inference is immune to Simpson’s paradox, which implies that averages may not always be useful? Also, I’m not sure that null hypotheses could *never* be true (in nontrivial problems), as one supposedly does not know the ‘true’ value of a parameter in advance in most realistic situations.
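The Simpson's paradox worry about averages is easy to exhibit numerically. A sketch using the classic kidney-stone success rates (Charig et al. 1986), borrowed here purely as an illustration of the commenter's point:

```python
# (successes, trials) per treatment arm within each subgroup.
groups = {
    "small stones": {"treated": (81, 87),   "control": (234, 270)},
    "large stones": {"treated": (192, 263), "control": (55, 80)},
}

pooled = {"treated": [0, 0], "control": [0, 0]}
for subgroup, arms in groups.items():
    for arm, (s, n) in arms.items():
        pooled[arm][0] += s
        pooled[arm][1] += n
        print(f"{subgroup:13s} {arm:8s}: {s}/{n} = {s/n:.3f}")

for arm, (s, n) in pooled.items():
    print(f"{'pooled':13s} {arm:8s}: {s}/{n} = {s/n:.3f}")
# Treated beats control within each subgroup, yet loses after pooling:
# the "average effect" changes sign depending on how units are grouped.
```

Whether this undermines Gelman's average-effect focus depends, of course, on whether the grouping variable is a confounder in the design at hand; randomization is supposed to rule that out.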

Nicole: I missed this. There is a fallacy invariably connected to claims that null hypotheses can never be true. With sufficient data, we may argue, they may be found false (i.e., statistically significant results obtained), but that is different from their being false. See what I mean?
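The "found false" vs. "being false" distinction shows up in a quick simulation (my own toy numbers, not Nicole's): a discrepancy far too small to matter practically is still almost surely flagged as significant once n is large enough.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, reps = 0.02, 5_000   # a true discrepancy too tiny to matter practically

for n in (100, 10_000, 1_000_000):
    # Sampling distribution of the sample mean for studies of size n.
    x_bar = rng.normal(mu, 1.0 / np.sqrt(n), size=reps)
    reject = np.abs(x_bar * np.sqrt(n)) > 1.96   # two-sided z-test, sigma = 1
    print(f"n = {n:>9}: H0 rejected in {reject.mean():.0%} of simulated studies")
```

With enough data the null is nearly always "found false", even though the substantive discrepancy is negligible; statistical significance and the falsity that matters come apart.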