Posts Tagged With: meta-analysis

Stephen Senn: Also Smith and Jones

Stephen SennAlso Smith and Jones[1]
by Stephen Senn

Head of Competence Center for Methodology and Statistics (CCMS)


This story is based on a paradox proposed to me by Don Berry. I have my own opinion on this but I find that opinion boring and predictable. The opinion of others is much more interesting and so I am putting this up for others to interpret.

Two scientists working for a pharmaceutical company collaborate in designing and running a clinical trial known as CONFUSE (Clinical Outcomes in Neuropathic Fibromyalgia in US Elderly). One of them, Smith is going to start another programme of drug development in a little while. The other one, Jones, will just be working on the current project. The planned sample size is 6000 patients.

Smith says that he would like to look at the experiment after 3000 patients in order to make an important decision as regards his other project. As far as he is concerned that’s good enough.

Jones is horrified. She considers that for other reasons CONFUSE should continue to recruit all 6000 and that on no account should the trial be stopped early.

Smith say that he is simply going to look at the data to decide whether to initiate a trial in a similar product being studied in the other project he will be working on. The fact that he looks should not affect Jones’s analysis.

Jones is still very unhappy and points out that the integrity of her trial is being compromised.

Smith suggests that all that she needs to do is to state quite clearly in the protocol that the trial will proceed whatever the result of the interim administrative look and she should just write that this is so in the protocol. The fact that she states publicly that on no account will she claim significance based on the first 3000 alone will reassure everybody including the FDA. (In drug development circles, FDA stands for Finally Decisive Argument.)

However, Jones insists. She wants to know what Smith will do if the result after 3000 patients is not significant.

Smith replies that in that case he will not initiate the trial in the parallel project. It will suggest to him that it is not worth going ahead.

Jones wants to know suppose that the results for the first 3000 are not significant what will Smith do once the results of all 6000 are in.

Smith replies that, of course, in that case he will have a look. If (but it seems to him an unlikely situation) the results based on all 6000 will be significant, even though the results based on the first 3000 were not, he may well decide that the treatment works after all and initiate his alternative program, regretting, of course, the time that has been lost.

Jones points out that Smith will not be controlling his type I error rate by this procedure.

‘OK’, Says Smith, ‘to satisfy you I will use adjusted type I error rates. You, of course, don’t have to.’

The trial is run. Smith looks after 3000 patients and concludes the difference is not significant. The trial continues on its planned course. Jones looks after 6000 and concludes it is significant P=0.049. Smith looks after 6000 and concludes it is not significant, P=0.052. (A very similar thing happened in the famous TORCH study(1))

Shortly after the conclusion of the trial, Smith and Jones are head-hunted and leave the company.  The brief is taken over by new recruit Evans.

What does Evans have on her hands: a significant study or not?


1.  Calverley PM, Anderson JA, Celli B, Ferguson GT, Jenkins C, Jones PW, et al. Salmeterol and fluticasone propionate and survival in chronic obstructive pulmonary disease. The New England journal of medicine. 2007;356(8):775-89.

[1] Not to be confused with either Alias Smith and Jones nor even Alas Smith and Jones

Categories: Philosophy of Statistics, Statistics | Tags: , , , | 14 Comments

Stephen Senn: On the (ir)relevance of stopping rules in meta-analysis

Senn in China

Stephen Senn

Competence Centre for Methodology and Statistics
CRP Santé
Strassen, Luxembourg

George Barnard has had an important influence on the way I think about statistics. It was hearing him lecture in Aberdeen (I think) in the early 1980s (I think) on certain problems associated with Neyman confidence intervals that woke me to the problem of conditioning. Later as a result of a lecture he gave to the International Society of Clinical Biostatistics meeting in Innsbruck in 1988 we began a correspondence that carried on at irregular intervals until 2000. I continue to have reasons to be grateful for the patience an important and senior theoretical statistician showed to a junior and obscure applied one.

One of the things Barnard was adamant about was that you had to look at statistical problems with various spectacles. This is what I propose to do here, taking as an example meta-analysis. Suppose that it is the case that a meta-analyst is faced with a number of trials in a given field and that these trials have been carried out sequentially. In fact, to make the problem both simpler and more acute, suppose that no stopping rule adjustments have been made. Suppose, unrealistically, that each trial has identical planned maximum size but that a single interim analysis is carried out after a fraction f of information has been collected. For simplicity we suppose this fraction f to be the same for every trial. The questions is ‘should the meta-analyst ignore the stopping rule employed’? The answer is ‘yes’ or ‘no’ depending on how (s)he combines the information and, interestingly, this is not a question of whether the meta-analyst is Bayesian or not. Continue reading

Categories: Philosophy of Statistics, Statistics | Tags: , , , | 2 Comments

G. Cumming Response: The New Statistics

Prof. Geoff Cumming [i] has taken up my invite to respond to “Do CIs Avoid Fallacies of Tests? Reforming the Reformers” (May 17th), reposted today as well. (I extend the same invite to anyone I comment on, whether it be in the form of a comment or full post).   He reviews some of the complaints against p-values and significance tests, but he has not here responded to the particular challenge I raise: to show how his appeals to CIs avoid the fallacies and weakness of significance tests. The May 17 post focuses on the fallacy of rejection; the one from June 2, on the fallacy of acceptance. In each case, one needs to supplement his CIs with something along the lines of the testing scrutiny offered by SEV. At the same time, a SEV assessment avoids the much-lampooned uses of p-values–or so I have argued. He does allude to a subsequent post, so perhaps he will address these issues there.

The New Statistics

PROFESSOR GEOFF CUMMING [ii] (submitted June 13, 2012)

I’m new to this blog—what a trove of riches! I’m prompted to respond by Deborah Mayo’s typically insightful post of 17 May 2012, in which she discussed one-sided tests and referred to my discussion of one-sided CIs (Cumming, 2012, pp 109-113). A central issue is:

Cumming (quoted by Mayo): as usual, the estimation approach is better

Mayo: Is it?

Lots to discuss there. In this first post I’ll outline the big picture as I see it.

‘The New Statistics’ refers to effect sizes, confidence intervals, and meta-analysis, which, of course, are not themselves new. But using them, and relying on them as the basis for interpretation, would be new for most researchers in a wide range of disciplines—that for decades have relied on null hypothesis significance testing (NHST). My basic argument for the new statistics rather than NHST is summarised in a brief magazine article ( and radio talk ( The website has information about the book (Cumming, 2012) and ESCI software, which is a free download.

Continue reading

Categories: Statistics | Tags: , , , , , , , | 5 Comments

Blog at