In the Barnard letter in reply to you:

“Andrew’s self-description as a card-carrying Bayesian prompts me to ask whether he routinely points out to his clients that none of the posterior probability statements he might suggest they make need be acceptable to anyone else, though they may share the client’s model and his data. Should he do so, I would be interested in the reactions especially of those who may have to deal with committees on the safety of medicines.”

Good for George!

I still have a few stories about George beyond those mentioned, for example, in “Stat on a Hot Tin Roof,” when he questioned my quoting of Fisher’s remark alluding to “Russian 5-year plans” in berating Neyman. I was about 30 years old.

We corresponded for some years (he was quite interested in philosophy of science/statistics), and I regret that he discouraged me (and my husband) from coming out to see him in Colchester when he was too ill to attend my Lakatos talk in London in March 1999.

In 1986 I was fortunate enough, with my Ciba-Geigy colleagues Amy Racine and Hugo Flühler and my future PhD supervisor Adrian Smith, to present a read paper to the Royal Statistical Society entitled “Bayesian methods in practice: experience in the pharmaceutical industry”. At the time it was the practice that the presenters of a read paper were invited by the Statistical Dinner Club to a dinner after the presentation and discussion. The tradition began in 1839. The club still exists but no longer performs the same role.

I was seated opposite George Barnard at the dinner and he asked me why I had become interested in applying Bayesian approaches to pharmaceutical problems. One of the topics covered in our paper was estimation of the median lethal dose, the LD50, in animal toxicology studies. I told him that my motivation had begun because in many practical problems, including the estimation of the LD50, traditional approaches often did not provide sensible solutions. I then quoted RA Fisher.

Fisher had provided an appendix to a 1935 paper by Chester Bliss giving for the first time a method for determining the maximum likelihood estimates of the parameters of a probit model. Bliss told the story of Fisher developing the method to account for those groups with 0 and 100 percent deaths because “When a biologist believes there is information in an observation, it is up to the statistician to get it out”. (The story is reminiscent of Feynman’s conclusion during the Challenger Disaster Commission that one shouldn’t ignore temperature data from shuttle flights in which there were no problems with the O-rings. Those data were relevant, rather like Sherlock Holmes’ “The Dog That Didn’t Bark”.)

George’s response was “Young man don’t quote Fisher at me!!”
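The calculation Fisher supplied in that appendix, maximum-likelihood fitting of a probit dose-response model, is easy to sketch in modern terms. The dose-mortality data below are invented for illustration; the point is that the 0% and 100% mortality groups enter the likelihood like any others rather than being discarded:

```python
# Minimal sketch (not Fisher's original computation) of maximum-likelihood
# probit fitting, keeping the all-or-none groups Fisher insisted still
# carry information. Doses and counts are made up for illustration.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

log_dose = np.log10(np.array([1.0, 2.0, 4.0, 8.0, 16.0]))
n = np.array([20, 20, 20, 20, 20])        # animals per dose group
deaths = np.array([0, 4, 11, 18, 20])     # includes 0% and 100% groups

def neg_log_lik(theta):
    a, b = theta
    p = norm.cdf(a + b * log_dose)        # probit link
    p = np.clip(p, 1e-12, 1 - 1e-12)      # guard against log(0)
    return -np.sum(deaths * np.log(p) + (n - deaths) * np.log(1 - p))

fit = minimize(neg_log_lik, x0=[0.0, 1.0], method="Nelder-Mead")
a_hat, b_hat = fit.x
ld50 = 10 ** (-a_hat / b_hat)             # dose at which P(death) = 0.5
print(f"LD50 estimate: {ld50:.2f}")
```

The LD50 falls out as the dose at which the fitted probit crosses one-half, i.e. where a + b·log10(dose) = 0.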

My second interaction with George was in the correspondence columns of the Royal Statistical Society’s News & Notes during 1992. The exchange began with a self-important letter of mine responding to an Opinion piece by a future colleague, Nigel Smeeton, on Confidence Intervals.

AP Grieve Letter to the Editor of RSS News & Notes. July 1992. Confidence Intervals

Would Nigel Smeeton have us believe that we are likely to have greater success in explaining the concepts behind confidence intervals (CI) to clinicians than we have had in explaining the concepts that lie behind hypothesis tests and p-values? It is 43 years since the recognised beginning of the modern era in clinical statistics; 43 years since Bradford Hill was successful in introducing Fisherian ideas into medical research, yet 43 years in which clinicians have still not grasped that a p-value does not represent the probability that the null hypothesis is true. Is it our fault as a profession for not explaining, is it their fault for not understanding, or is it the fault of the concepts themselves?

The mental gymnastics necessary when the classical definition of a CI is accepted are magnificently illustrated by an editorial in the Annals of Internal Medicine which deals with an estimate of the proportion of patients showing complete response to treatment for ovarian cancer (LE Braitman, Annals of Internal Medicine, 108, 296-298):

“… the proper interpretation of confidence intervals requires that we consider a large number of hypothetical random samples (each of the same size). Then “95% confidence” means that approximately 95% of the 95% confidence intervals from these random samples would include the unknown true value, and about 5% would not. Because the true fraction in the population is unknown, it is impossible to tell if the 95% confidence interval of 28% to 55% that was obtained from the observed sample data actually included the true fraction. Strictly speaking, we cannot even tell how likely the 95% confidence interval of 28% to 55% is to include the unknown true fraction. Nevertheless, the usual interpretation is that we are 95% confident that the unknown true value is between 28% and 55%”.

Magnificent. But surely not logical. The clear conclusions which I draw from this passage, and I would not claim to be the first, are that the probability statements involved in CI’s are statements concerning the procedure of calculating the intervals and that the “usual interpretation”, although not supported by the CI procedure, is precisely the form of statement that users of CI’s would like to make. Herein lies the problem. Classical statistics provides inferential statements which are, in my view, not in the form in which scientists wish to have them, and it therefore seems to me that it is the concepts themselves which are at fault, as are we for not putting forward alternative concepts, which are available, and which do meet their needs. From my perspective, the recent campaign to supplant the p-value and to replace it by CI’s has been conducted in a conceptual vacuum. Whether we can successfully persuade clinicians to change their habits and to use CI’s in preference to p-values, and I believe the campaign has been a success, begs the question as to whether we should. If we are unable to educate clinicians, then merely persuading them to use CI’s rather than p-values is to replace the unthinking use of one technique with that of another. Indeed, it is not at all clear that we will have achieved anything, since one of the items of campaign propaganda was to point out that a CI was an inverted hypothesis test; few recognised that the p-value itself contains information not contained in the CI, so that both are necessary.

Being a card-carrying Bayesian I view these machinations with detached amusement. But should I?

AP Grieve

ICI Pharmaceuticals

Alderley Park
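The point the letter presses, that the 95% attaches to the interval-generating procedure rather than to any single realised interval, can be checked with a small simulation. This is an illustrative sketch of my own; the true proportion, the sample size, and the choice of the simple Wald interval are assumptions, not details from the exchange:

```python
# Simulate repeated samples: close to 95% of the intervals produced by the
# procedure cover the true proportion, yet each realised interval either
# does or does not -- the 95% is a property of the procedure.
import numpy as np

rng = np.random.default_rng(0)
true_p, n, reps = 0.4, 50, 10_000

covered = 0
for _ in range(reps):
    x = rng.binomial(n, true_p)
    p_hat = x / n
    half = 1.96 * np.sqrt(p_hat * (1 - p_hat) / n)  # Wald 95% interval
    if p_hat - half <= true_p <= p_hat + half:
        covered += 1

print(f"coverage: {covered / reps:.3f}")  # near, typically a little under, 0.95
```

Nothing in this output licenses a probability statement about any one of the 10,000 intervals; the long-run frequency is all the procedure delivers.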

In the following edition of News & Notes George teased me and told me off.

GA Barnard Letter to the Editor of RSS News & Notes. August 1992. Confidence Intervals

Andrew Grieve may have missed a sale for the ICI book on Statistical Methods in Research and Production. He could have recommended that his Annals of Internal Medicine editorial writer read section 4.1 where, in the middle of page 59, he would find the terse, clear, and accurate sentence: “The limits x̄ ± 3σ/√n are known as the 99.7 percent confidence limits for µ, and the confidence coefficient of 99.7 percent reflects the fact that, of every thousand such assertions we make, only three, on average, will be incorrect.” Confusing references to repeated random samples of the same size are quite unnecessary.

Andrew’s self-description as a card-carrying Bayesian prompts me to ask whether he routinely points out to his clients that none of the posterior probability statements he might suggest they make need be acceptable to anyone else, though they may share the client’s model and his data. Should he do so, I would be interested in the reactions especially of those who may have to deal with committees on the safety of medicines.

As the very grateful dedicatee of a book by a bevy of Bayesians I suppose I might call myself an honorary Bayesian. There are problems where we cannot do without Bayesian assumptions. In such cases we do well to bear in mind Student’s view, expressed in a letter to Fisher dated 3rd April 1922: “When I was in the lab in 1907 I tried to work out variants of Bayes with a priori probabilities other than G=C [Editorial note: this means a uniform prior] but I soon convinced myself that with ordinary sized samples one’s a priori hypothesis made a fool of the actual sample… and since then have refused to use any other hypothesis than the one that leads to your likelihood… Then each piece of evidence can be considered on its own merits.”

George A Barnard

Brightlingsea

In my reply I was able to use the same source as support for my views.

AP Grieve Letter to the Editor of RSS News & Notes. September 1992. Confidence Intervals

George Barnard mildly admonishes me for not having read page 59 of the ICI publication Statistical Methods in Research and Production and for failing to recommend the confidence interval (CI) definition to be found there. I admit it. Unfortunately, I have to admit that I did not read page 81 either. Had I done so I would have found support for my position in the statement that:

‘… there is an essential incongruity in attempting to apply frequency-ratio concepts of probability to the outcome of unique events; any probability measure in such circumstances can only describe the strength of belief, or the confidence with which we are prepared to make a particular assertion.’

We could of course argue about whether a clinical trial is a unique event. One might, for example, wish to imbed an individual trial in a series of trials with the same treatment and say therefore that it is not a unique event. Such a series of trials could form the basis of a meta-analysis/overview of the treatment. Within which series of meta-analyses would one wish to imbed that particular meta-analysis for the purpose of making probability statements?

Further support for my position is given by this extract from a footnote on page 81, where the authors are commenting on the indistinguishability of the CI for a normal mean from an integrated likelihood approach, that is, Bayes with an improper, uniform prior:

‘In this instance, however, it was also possible to attach a frequency ratio interpretation to the confidence coefficient by considering as the “event” the making of an assertion and not the occurrence of a particular value in connection with any one assertion.’

This again emphasises that the probability statements associated with CI’s concern the calculation procedure and not the particular results.

The second issue which George Barnard raises is crucial. David Spiegelhalter and Laurence Freedman have identified three groups of individuals, each with their own motivations, who interact during the complex development process which culminates in the implementation of a new medical treatment. They term these groups the experimenters, the reviewers and the consumers. The objective of the experimenters, among whom are pharmaceutical companies and research organisations, is to influence the consumers, who are the clinicians. They do this by providing them with information which is “sanitised” to ensure objectivity by the reviewers, who are the journal editors and regulatory authorities, whom Sir David Cox has called the “last holders of absolute power”. The statistician’s job is not over when the last analysis, Bayesian or not, is performed since consideration has to be given to the transmission of information to these different groups of “remote clients.”

The problem is to determine the appropriate method of transmitting information to remote clients. This issue is not new; in fact the term “remote clients” comes from the title of a 1963 Econometrica paper by Clifford Hildreth in which he examines the difficulty of transmitting information to vaguely known clients, whose use of the information may extend long after the statistician’s work has been completed. He considers what parcels of information can be efficiently transmitted to remote clients and lists a number, among which are the data, the likelihood and posterior distributions for a series of representative prior distributions. I personally lean towards the last of these three parcels, but it may be that we need to consider providing more than one of the parcels. Indeed, at the LSE meeting on Ethical and Methodological Issues in Clinical Trials last year, I was surprised by the degree of unanimity among Bayesians and frequentists in support of the suggestion that in journal articles reporting on clinical trials the Results section should contain the data, or the likelihood, and that the Discussion was the proper place for the posterior analysis.

As far as what I would say to my clients goes, I think that it should be pointed out to them that they are not the only clients of the analysis, and that other more remote clients, with different perspectives and motivations, may well interpret the results differently; indeed, it would be surprising if they did not. But I do not believe that this is solely a problem for a Bayesian.

AP Grieve

ICI Pharmaceuticals

Macclesfield
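Hildreth’s third parcel, posteriors under a series of representative priors, is simple to produce in the conjugate binomial case that the ovarian-cancer editorial concerned. The sketch below is mine; the counts and the particular priors are invented for illustration and are not taken from the editorial or the correspondence:

```python
# A sketch of the "representative priors" parcel: report the posterior under
# several Beta priors (conjugate to binomial data) so that remote clients
# with different prior opinions can each find an analysis close to their own.
from scipy.stats import beta

x, n = 13, 32  # hypothetical responders / patients

priors = {
    "uniform  Beta(1, 1)":    (1.0, 1.0),
    "Jeffreys Beta(0.5, 0.5)": (0.5, 0.5),
    "sceptical Beta(2, 8)":   (2.0, 8.0),  # centred on a 20% response rate
}

for name, (a, b) in priors.items():
    post = beta(a + x, b + n - x)          # conjugate posterior update
    lo, hi = post.ppf(0.025), post.ppf(0.975)
    print(f"{name}: 95% credible interval ({lo:.2f}, {hi:.2f})")
```

With a reasonably informative likelihood the three intervals differ only modestly, which is itself useful information for the remote client: the data, not the prior, are doing most of the work.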

My third interaction with George was at the last meeting of the ICI Mathematics and Statistics Panel, shortly after the above exchange, in November 1992. The ICI Panel had produced Statistical Methods in Research and Production, referred to above, and had run for many years until the biological divisions of ICI demerged to form Zeneca in the early 1990s. I sat next to George at lunch and we had an interesting exchange on Bayes, Fisher, confidence, p-values and ….

In none of these exchanges did George talk down to me, nor denigrate my ideas. He was charming, instructive, supportive, everything one could wish from a senior member of the profession and teacher.

The Peircean point would be that we cannot escape the abductive factor in statistical inference or scientific inquiry generally.