Would you agree if your (senior) colleague urged you to use his/her book rather than your own – even if you thought doing so would change for the positive the entire history of your field? My guess is that the answer is no (see “add on”). For that matter, would you ever try to insist that your (junior) colleague use your book in teaching a course rather than his/her own notes or book? Again I guess no. But perhaps you’d be more tactful than were Fisher and Neyman. It wasn’t just Fisher (whose birthday is tomorrow) who seemed to need some anger management training; Erich Lehmann (in conversation and in 2011) points to a number of incidents in which Neyman is the instigator of gratuitous ill-will. Their substantive statistical and philosophical disagreements, I now think, were minuscule in comparison to the huge animosity that developed over many years. Here’s how Neyman describes to Constance Reid (1998, 126) his vivid recollection of the 1935 book episode. [i]

A couple of months “after Neyman criticized Fisher’s concept of the complex experiment” Neyman vividly recollects Fisher stopping by his office at University College on his way to a meeting which was to decide on Neyman’s reappointment[ii]:

“And he said to me that he and I are in the same building… . That, as I know, he has published a book—and that’s Statistical Methods for Research Workers—and he is upstairs from me so he knows something about my lectures—that from time to time I mention his ideas, this and that—and that this would be quite appropriate if I were not here in the College but, say, in California—but if I am going to be at University College, this is not acceptable to him. And then I said, ‘Do you mean that if I am here, I should just lecture using your book?’ And then he gave an affirmative answer. And I said, ‘Sorry, no. I cannot promise that.’ And then he said, ‘Well, if so, then from now on I shall oppose you in all my capacities.’ And then he enumerated—member of the Royal Society and so forth. There were quite a few. Then he left. Banged the door.”

Imagine if Neyman had replied:

“I’d be very pleased to use Statistical Methods for Research Workers in my class, what else?”

Or what if Fisher had said:

“Of course you’ll want to use your own notes in your class, but I hope you will use a portion of my text when mentioning some of its key ideas.”

*Very unlikely [iii].*

How would you have handled it?

Ironically, Neyman did something very similar to Erich Lehmann at Berkeley, and blocked his teaching graduate statistics after one attempt that may have veered slightly off Neyman’s path. But Lehmann always emphasized that, unlike Fisher, Neyman never created professional obstacles for him. [iv]

“add on”: From the earlier discussion, I realized a needed qualification: the answer would have to depend on whether your ideas on the subject were substantially different from the colleague’s. For instance, if Neyman were being asked by Lindley, it would be very different.

[i] At the meeting that followed this exchange, Fisher tried to shoot down Neyman’s reappointment, but did not succeed (Reid, 125).

[ii] This is Neyman’s narrative to Reid. I’m sure Fisher would relate these same episodes differently. Let me know if you have any historical material to add. I met Lehmann for the first time shortly after he had worked with Reid on her book, and he had lots of stories. I should have written them all down at the time.

[iii] I find it hard to believe, however, that Fisher would have thrown some of Neyman’s wooden models onto the floor:

“After the Royal Statistical Society meeting of March 28, relations between workers on the two floors of K.P.’s old preserve became openly hostile. One evening, late that spring, Neyman and Pearson returned to their department after dinner to do some work. Entering they were startled to find strewn on the floor the wooden models which Neyman had used to illustrate his talk on the relative advantages of randomized blocks and Latin squares. They were regularly kept in a cupboard in the laboratory. Both Neyman and Pearson always believed that the models were removed by Fisher in a fit of anger.” (Reid 124, noted in Lehmann 2011, p. 59. K.P. is, of course, Karl Pearson.)

[iv] I didn’t want to relate this anecdote without a citation, and finally found one in Reid (215-16). Actually I would have anyway, since Lehmann separately told it to Spanos and me.

Lehmann, E. (2011). *Fisher, Neyman and the Creation of Classical Statistics*, Springer.

Reid, C. (1998). *Neyman*, Springer.

This is a good, thought-provoking post in a couple of ways.

First, what would I actually have done? I suspect that I would have agreed to use his book–Fisher has a bad reputation for bullying and I’m pretty sure that he would have intimidated me–but I would certainly have made editorial comments during the lectures to make clear to the students that there is more than one way to think about the material. I don’t know about UC back then, but nowadays at my university the probability of a senior colleague being present at any of my lectures is close to zero.

Second, and this may be the main point of the post, it illustrates the contingent aspects of development of ideas and theories. If the main characters had been better behaved then we would almost certainly be in a different world. The K Pearson/Fisher/Neyman/E Pearson story is a ripper, and it deserves to be more often presented in statistics textbooks. God knows that the topic is widely regarded as dry and uninspiring, so an injection of drama would help.

Michael: Thanks for the thoughtful response. Let me start with your second point because actually I was contemplating writing a Saturday night “statistical theater of the absurd” in which Fisher and Neyman run into each other in the Elysian Fields or wherever and begin to admit the absurdity of some of their polemics. Fisher would admit that while Neyman was behavioristic in theory, he (Fisher) was actually more behavioristic in practice, etc. etc. and they’d have a great big laugh over the extreme catcalls and sparring over the years. Surely they must have known (deep down) that they were often exaggerating their differences to a laughable extent. I mean, to write “The Silver Jubilee of my Disagreement with Fisher”—give me a break! (Let me be clear that, unlike my colleague Aris Spanos, I’m no historian of statistics.) Anyway, I think a theatrical production would work better than a textbook write-up. I knew that while it would be fun to write such a thing, it would take me away from my book which has to be completed soon. I will say something on your first point in a separate comment.

Michael: I like your suggestion for how you might handle the situation, and it is the route I’d be tempted to take—in Neyman’s shoes. But this made me realize an equivocal aspect to my post. The 1935 book episode came up in talking to Aris Spanos yesterday, so I had the idea to put it in a post, but I tossed in the queries (what would you do?) only as I was writing. I think the answer would have to depend on whether your ideas on the subject were substantially different from the colleague’s. For instance, if Neyman were being asked by Lindley, it would be very different. Throughout my career, a great many inducements for conversions were thrown my way, but I always felt that I had to develop my own philosophy of science and philosophy of statistics. (When I came to Virginia Tech, I.J. Good urged me to be a “Doogian”, which seemed murky enough, but later he (often) said he was quite let down that I never went that route.) So that kind of case is different, and I take it your answer reflected Neyman’s situation, as I intended. I mean Neyman had been building/formalizing Fisher’s basic ideas at that stage….

Maybe a solution would be to invite the senior colleague to present a lecture or two?

E.B. yes that might have been a good solution, and possibly an improvement over some of Neyman’s classes, according to what I’ve heard.

I think it was inappropriate of Fisher to ask Neyman to lecture from Statistical Methods for Research Workers (if that is what happened) although I also think that Neyman had a lot to learn (at that time and later) from that book. However the date here is important. In March 1935 Neyman read a paper on “Statistical problems in Agricultural Experimentation” to the Royal Statistical Society.

In this paper, Neyman claimed, amongst other things that Fisher’s analysis of Latin Squares was not right. His attack was very formal. His conclusions were wrong. Fisher replied as follows:

‘…it was only about a year since another academic mathematician from abroad had been as much excited about having proved that the Latin Square was mathematically exact, as Dr Neyman seemed to be at having proved it inaccurate.’

The mathematician concerned was almost certainly Samuel Wilks (1906-1964), who provided a proof of the validity of Fisher’s analysis of Latin Squares that he considered Fisher had failed to supply. It was William Cochran, who was at Cambridge for a while with Wilks, who was able to persuade him of the validity of Fisher’s proof.

Fisher was an extremely deep thinker and, as Jimmy Savage later realised, one whose very great mathematical powers were underestimated by many younger mathematicians because he did not employ the fashionable formalism that was spreading from France, Russia and Germany and that (mainly from the latter) was to conquer America.

If (but this is speculation on my part) Neyman was teaching students at UCL that Fisher’s approach to Latin Squares was wrong, (an assertion that was itself wrong) then this was an open provocation and bound to lead to trouble.

See my paper Senn, S. J. (2004). “Added Values: Controversies concerning randomization and additivity in clinical trials.” Statistics in Medicine 23(24): 3729-3753, for a discussion.

Stephen: Thanks, and I will look up your paper. I did mention that this was “a couple of months ‘after Neyman criticized Fisher’s concept of the complex experiment’.” So are you saying Neyman was wrong in that particular criticism of Fisher? I didn’t understand the back and forth on that one. Too dizzying. Anything you can add would be of value.

Deborah: Thanks. Yes, Neyman was wrong. Obviously so, in my opinion, and one interpretation as to why is that he underestimated Fisher and proceeded in a rush to try to demonstrate that he could do better without pausing to understand what Fisher meant. As such Fisher had every right to be annoyed but may well have overreacted. This is my explanation from the paper I cited:

“The dispute between Fisher and Neyman can be seen to have its origins in the choice of model. It has been claimed that this was not obvious to Fisher [15] but I disagree. I think Fisher understood perfectly well what Neyman’s argument was but rejected it as being foolish. In words, Fisher’s null hypothesis can be described as being, ‘all treatments are equal’, whereas Neyman’s is, ‘on average all treatments are equal’. The first hypothesis necessarily implies the second but the converse is not true. Neyman developed a model in which on average over the field, the yields of different treatments could be the same (if the null hypothesis were true) but they could actually differ on given plots. Although it seems that this is more general than Fisher’s null hypothesis it is, in fact, not sensible. Anyone who doubts this should imagine themselves faced with the following task: it is known that Fisher’s null hypothesis is false and the treatments are not identical; find a field for which Neyman’s hypothesis is true.”

It is interesting to note that Wilks, who was also very mathematical in the formal sense, had previously assumed that Fisher was right and provided what he considered was the missing proof. Thus Fisher was in the annoying position (a pity perhaps he did not find it amusing) of having been provided with ‘superior’ attacks on a problem he had solved: one attack showing him wrong and the other showing him to be right. Perhaps he should have invited Neyman and Wilks to slug it out.

Thanks Stephen. Yes, on the same page (59) that the 1935 book incident comes up in Lehmann (2011), he does note this difference in models used. I guess this seemed to me to settle it though. Since my main stat books are now within arm’s length of my “book-writing chair”, I just reached over to the early Neyman papers. From my (utter) outsider’s view*, it appears that the distinction is elaborated in meticulous detail in the paper and numerous comments, and even Neyman says he doubts the possible bias could be more than slight; and then there’s E. Pearson defending Neyman as just trying to extend Fisher’s ideas in the only way possible: by seeing how some assumption could fail. And on the face of it, I’m not sure why someone would “need to find a field for which Neyman’s hypothesis is true”.

But really, I have only the vaguest grasp of any of this (I do appreciate your note for historical correctness here), and am mainly commenting because as I opened up that paper I saw what must be Neyman’s wooden models (174-5) that Fisher is alleged to have tossed on the floor! They’re really fantastic! Now we’d at most have computer images. They are much more intriguing to me (artistically) than this “momentous” dispute, which surely shows radically misplaced priorities on my part. Yet now that I hear you have written about it, I give it more weight.

*of latin square manure studies

Deborah: ‘find a field for which Neyman’s hypothesis is true’ means simply this, ‘can you think of a realistic circumstance in which although the treatments are not identical they are identical on average?’. Let me give you an even simpler example. Suppose that you had two treatments being compared in a clinical trial. Now you did not believe the treatments to be the same (in the sense that you think that it could make a difference to a given patient whether he or she received A or B) but you believed they could be the same on average for these patients (which is to say over all randomisations). Isn’t it extraordinary that although you think that if only you could (but you can’t) test each patient with A and (counterfactually) with B or vice versa you could construct a series of values A-B that were not equal to zero, it nevertheless was the case that for these patients the sum of the negative differences exactly wiped out the sum of the positive differences so that the total was zero? Neyman’s hypothesis is exactly of this sort. Fisher simply takes the point of view that if the treatments are different they are also different on average. Neyman takes the point of view that there is no reason to believe they are the same but we may plausibly believe they are so on average. But if that is the case, then if he had just failed to recruit one of the patients he did recruit in the trial, his average would no longer be zero. Try to construct an example and you will see what I mean.

Here are some counterfactual differences (A-B) that satisfy Neyman’s hypothesis for six patients: -5, 2, 3, -1, 2, -1. So Fisher’s hypothesis is wrong since, in fact, none of the differences are zero but Neyman’s is right since the total (and hence the average) is zero. But suppose that patient 6 had not been recruited, then Neyman’s hypothesis would be wrong or suppose that patient 7 had been recruited, unless this patient had a value of 0 then Neyman’s hypothesis would also be wrong. Neyman is just doing something really weird that makes no sense. (Admittedly hypothesis testing is itself rather strange but that’s another story.)
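The arithmetic in this example can be checked with a short Python sketch. The six differences come from the comment above; the function names and the value assigned to a hypothetical seventh patient are illustrative assumptions and not anything from Fisher, Neyman, or the cited paper.

```python
# Counterfactual differences (A - B) for the six patients in the example above.
differences = [-5, 2, 3, -1, 2, -1]


def fisher_null_holds(diffs):
    """Fisher's null as described above: every individual difference is zero."""
    return all(d == 0 for d in diffs)


def neyman_null_holds(diffs):
    """Neyman's null as described above: the differences total (and so average) zero."""
    return sum(diffs) == 0


# All six patients: Fisher's null fails, Neyman's holds.
print(fisher_null_holds(differences), neyman_null_holds(differences))

# Drop patient 6: the total is no longer zero, so Neyman's null fails.
without_patient_6 = differences[:-1]
print(sum(without_patient_6), neyman_null_holds(without_patient_6))

# Add a hypothetical patient 7 with a non-zero difference (the value 4 is an
# arbitrary assumption): Neyman's null fails again.
with_patient_7 = differences + [4]
print(sum(with_patient_7), neyman_null_holds(with_patient_7))
```

The sketch only illustrates how the "equal on average" condition depends on exactly which patients are in the sample; it does not model randomisation or any test statistic.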

Aris: Senn’s point reminds me of something we spoke about a while ago, though perhaps it is similar only by way of analogy. Anyway, I’d be curious to learn what you thought of his point here, at some stage.

In connection with this example, I am posting, anonymously, a comment I received:

I could imagine a model in which the variance of the treatment effect is proportional to its mean–this would be a bridge between the Neyman and Fisher ideas–but this is not a model that anyone ever fits.

Actually such models are fitted – a standard trick is to log transform the data and an example of how this works is given in my “Added Values” paper previously cited. More complex approaches for simultaneously modelling mean and variance were developed by Youngjo Lee and John Nelder(1).

However, as Fisher was quick to point out, you can’t have variances differing according to the mean if the means are the same, which is what they are supposed to be when the null hypothesis is true. Thus (to use Neyman’s vocabulary, which Fisher would not have) this is a matter for the alternative hypothesis.

1 Lee, Y. and J. A. Nelder (2006). “Double hierarchical generalized linear models.” Journal of the Royal Statistical Society Series C-Applied Statistics 55: 139-167.

This story is amusing to me because when I taught at Berkeley, the department chair told me he wouldn’t trust me to teach applied statistics to the Ph.D. students. He seemed to have a very clear idea of what applied statistics was, and what I did wasn’t part of it!

Andrew: Do you think it was because of an experimental/observational modeling distinction?

Mayo:

No. When I was there, none of the applied statistics people there (including myself) did randomized experiments. We only analyzed observational data. The problem with my colleagues was that they had 2 levels of snob attitude. First, they didn’t respect social science. Second, they didn’t respect Bayesian statistics. Actually, they didn’t even respect probability modeling except in biology.

They also thought everything I did was trivial. They might well have been correct on that last point. Often applied statistics is mathematically trivial. Even when I was fitting a differential equation model, it wasn’t a particularly complicated differential equation.

Hmm…well they must have wanted you if they hired you, and presumably for applied not theoretical statistics. What do I know? Did you interact with Lehmann?

Mayo:

Lehmann was very pleasant to me personally, but by the time I was there, he was already retired (or essentially so) and he had little influence. David Blackwell was also very supportive. The department seemed to have a real herd mentality. Hearing about it all, twenty years later, one might think that if David Blackwell and Erich Lehmann were supportive, that would count for a lot. But the mainstream of theoretical statisticians and probabilists in the department seemed to have made a collective decision from Day 1 that I was not to be taken seriously. I remember trying to talk about my work in Bayesian Data Analysis with Lucien Le Cam and he was very patronizing. Applied multilevel modeling was trivial to him, he’d done it all in the 1940s. I think that for all these people, if they’d collectively decided I was worth listening to, they could’ve liked my work.

And, yes, they hired me, but I suspect the department was divided on my case. I remember that, right after they hired me but before I arrived at Berkeley, they reneged on some of the start-up costs they’d promised me. It was no big deal, I did fine there, but it was a little weird. I was off-balance my whole time there. I think that maybe they hired me because they felt they had to, as I was the strongest candidate. But weird things kept happening, for example when the department chair told me he wouldn’t trust me to teach applied statistics to the Ph.D. students. And he was one of my supporters!

I have some sympathy with Andrew. I have had some excellent interactions with mathematicians and mathematical statisticians over the years. I often maintain that you can divide biostatisticians into two sorts, those who have a chip on their shoulders because they are not physicians and those who have a chip because they are not mathematicians, and I freely admit to belonging to the latter. I also always maintain that every statistician should wish (s)he knew more mathematics. Nevertheless I have had frustrating interactions with some mathematicians because their snobbery has prevented their treating what I do seriously. A piece I wrote on this some years ago gives examples of the problem. See below.

“Senn, S. J. (1998). “Mathematics: governess or handmaiden?” Journal of the Royal Statistical Society Series D-The Statistician 47(2): 251-259.

Mathematics is essential to statistics. Nevertheless, overemphasizing the mathematical aspects of a statistical problem can cause the statistician to lose sight of important issues and distort the process of statistical reasoning. Mathematical manipulation from model to solution is only one link in a chain of statistical reasoning and statistics must be concerned with the strength of the chain as a whole.”

Stephen:

On balance, being treated badly by my Berkeley colleagues was good for me. Setting aside all personal benefits gained from my move to New York, it was good for me to be disrespected for 6 years, as it gave me more respect for pluralism. All my life I’d been treated like a very special flower (with the only exception being my four weeks in the U.S. Mathematical Olympiad training program, where I learned that there were at least 15 or 20 kids in the country who were better at math than I was, and I resolved that difficulty by resolving not to become a mathematician when I grew up). In Berkeley, I was in this weird situation where all over the world I was widely respected, but in the Berkeley campus I was disdained. It’s not even like they went around arguing with me; it’s more that they treated me like a nobody. Their attitude was that I had nothing interesting to say. Of course this bothered me (otherwise I wouldn’t still be writing about it!) but it made me realize I should be careful before dismissing the work of others. If there’s something I don’t respect so much that I can’t bother to learn about it, fine. But I should say just that, rather than feigning an authority I do not possess. I have always tried in my writings to be as direct as possible, saying exactly where I’m coming from even when being negative. Much of this is a reaction to what I perceived others were not doing in my case.

So, in my situation, it wasn’t really math vs. non-math but rather that, as I saw it, a consensus had emerged not to look carefully at what I’d been doing. Even one of my strongest supporters at Berkeley indicated to me at one point that he had no particular respect for what I was doing. Even though he supported my tenure case, he’d absorbed the consensus. (Of course, maybe they were correct about the (lack of) importance of my research. But in that case I believe the appropriate response would have been for them to publish articles explaining why this much-cited work was wrong, or derivative, or unimportant–rather than just expressing their opinions in unpublished internal documents.)

Andrew: Thanks for these thoughts.

A metablog query: I hope that others can tell when a new comment is up (I have to be logged in), else comments on older posts might be missed. Is there something I can do on the blog, or do people have to sign up for a “new comment” alert. I have now included the comments widget on the blog, but don’t always.