S. Senn: Fishing for fakes with Fisher (Guest Post)



Stephen Senn
Head of Competence Center
for Methodology and Statistics (CCMS)
Luxembourg Institute of Health
Twitter @stephensenn

The essential fact governing our analysis is that the errors due to soil heterogeneity will be divided by a good experiment into two portions. The first, which is to be made as large as possible, will be completely eliminated, by the arrangement of the experiment, from the experimental comparisons, and will be as carefully eliminated in the statistical laboratory from the estimate of error. As to the remainder, which cannot be treated in this way, no attempt will be made to eliminate it in the field, but, on the contrary, it will be carefully randomised so as to provide a valid estimate of the errors to which the experiment is in fact liable. R. A. Fisher, The Design of Experiments, (Fisher 1990) section 28.

Fraudian analysis?

John Carlisle must be a man endowed with exceptional energy and determination. A recent paper of his is entitled, ‘Data fabrication and other reasons for non-random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals,’ (Carlisle 2017) and has created quite a stir. The journals examined include the Journal of the American Medical Association and the New England Journal of Medicine. What Carlisle did was examine 29,789 variables using 72,261 means to see if they were ‘consistent with random sampling’ (by which, I suppose, he means ‘randomisation’). The papers chosen had to report either standard deviations or standard errors of the mean. P-values as measures of balance or lack of it were then calculated using each of three methods and the method that gave the value closest to 0.5 was chosen. For a given trial the P-values chosen were then converted back to z-scores, which were summed, and the sum was converted back to a P-value using a method that assumes the z-scores to be independent. As Carlisle writes, ‘All p values were one-sided and inverted, such that dissimilar means generated p values near 1’.
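The combination step described above can be sketched in a few lines. This is a minimal illustration that assumes the combination is Stouffer's method (sum the z-scores and divide by the square root of their number); the excerpt says only that the z-scores were summed and assumed independent, so the scaling and the function name `combine_p_values` are assumptions of this sketch, not Carlisle's published code.

```python
from math import sqrt
from statistics import NormalDist


def combine_p_values(p_values):
    """Combine one-sided P-values via summed z-scores (Stouffer's method).

    Assumption of this sketch: the z-scores are independent, so the sum of
    k of them has variance k under the null. Each P-value must lie strictly
    between 0 and 1.
    """
    std_normal = NormalDist()
    # With the 'inverted' convention, P-values near 1 (dissimilar means)
    # map to large positive z-scores.
    z_scores = [std_normal.inv_cdf(p) for p in p_values]
    z_combined = sum(z_scores) / sqrt(len(z_scores))
    return std_normal.cdf(z_combined)


# Four baseline variables, each only mildly unusual on its own.
print(combine_p_values([0.9, 0.9, 0.9, 0.9]))
```

If the baseline variables within a trial are correlated, the variance of the summed z-scores is not k, and a combined P-value computed this way can look more extreme than it should.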

Categories: Fisher, RCTs, Stephen Senn | 5 Comments

Stephen Senn: Randomization, ratios and rationality: rescuing the randomized clinical trial from its critics


Stephen Senn
Head of Competence Center for Methodology and Statistics (CCMS)
Luxembourg Institute of Health

This post first appeared here. An issue sometimes raised about randomized clinical trials is the problem of indefinitely many confounders. This, for example, is what John Worrall has to say:

Even if there is only a small probability that an individual factor is unbalanced, given that there are indefinitely many possible confounding factors, then it would seem to follow that the probability that there is some factor on which the two groups are unbalanced (when remember randomly constructed) might for all anyone knows be high. (Worrall J. What evidence is evidence-based medicine? Philosophy of Science 2002; 69: S316-S330: see p. S324 )

It seems to me, however, that this overlooks four matters. The first is that it is not indefinitely many variables we are interested in but only one, albeit one we can’t measure perfectly. This variable can be called ‘outcome’. We wish to see to what extent the difference observed in outcome between groups is compatible with the idea that chance alone explains it. The indefinitely many covariates can help us predict outcome but they are only of interest to the extent that they do so. However, although we can’t measure the difference we would have seen in outcome between groups in the absence of treatment, we can measure how much it varies within groups (where the variation cannot be due to differences between treatments). Thus we can say a great deal about random variation to the extent that group membership is indeed random.
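One way to make "compatible with the idea that chance alone explains it" concrete is a randomisation test on the outcome alone. The sketch below is illustrative only: the six outcome values are invented, the function name `randomisation_p_value` is hypothetical, and the post itself does not prescribe this particular test. It enumerates every way the patients could have been allocated and asks how often the difference in means is at least as large as the one observed. No covariate appears anywhere in the calculation.

```python
from itertools import combinations


def randomisation_p_value(treated, control):
    """Two-sided exact randomisation test for a difference in means."""
    outcomes = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(control)
    total = sum(outcomes)

    def mean_difference(treated_sum):
        return treated_sum / n_treated - (total - treated_sum) / n_control

    observed = abs(mean_difference(sum(treated)))
    as_extreme = 0
    n_allocations = 0
    # Every subset of patients of the right size is an equally likely
    # treated group under randomisation.
    for indices in combinations(range(len(outcomes)), n_treated):
        n_allocations += 1
        treated_sum = sum(outcomes[i] for i in indices)
        # Small tolerance guards against floating-point ties.
        if abs(mean_difference(treated_sum)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / n_allocations


# Hypothetical outcomes: only 2 of the 20 possible allocations are this extreme.
print(randomisation_p_value([5, 6, 7], [1, 2, 3]))  # 0.1
```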

Categories: RCTs, S. Senn, Statistics | 6 Comments

Stephen Senn: Indefinite irrelevance

Stephen Senn
Head, Methodology and Statistics Group,
Competence Center for Methodology and Statistics (CCMS),

At a workshop on randomisation I attended recently I was depressed to hear what I regard as hackneyed untruths treated as if they were important objections. One of these is that of indefinitely many confounders. The argument goes that although randomisation may make it probable that some confounders are reasonably balanced between the arms, since there are indefinitely many of these, the chance that at least some are badly confounded is so great as to make the procedure useless.

This argument is wrong for several related reasons. The first is to do with the fact that the total effect of these indefinitely many confounders is bounded. The argument is therefore as false as the claim that the infinite series ½, ¼, ⅛, … does not sum to a limit because it has infinitely many terms. The fact is that the outcome value one wishes to analyse poses a limit on the possible influence of the covariates. Suppose that we were able to measure a number of covariates on a set of patients prior to randomisation (in fact this is usually not possible but that does not matter here). Now construct principal components, C1, C2, …, based on these covariates. We suppose that each of these predicts, to a greater or lesser extent, the outcome, Y (say). In a linear model we could put coefficients on these components, k1, k2, … (say). However, one is not free to postulate anything at all by way of values for these coefficients, since it has to be the case for any set of m such coefficients that $\sum_{i=1}^{m} k_i^2 V(C_i) \le V(Y)$, where V( ) indicates variance of. Thus variation in outcome bounds variation in prediction. This total variation in outcome has to be shared between the predictors, and the more predictors you postulate there are, the less on average the influence per predictor.
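The bound can be checked numerically. The following is a rough sketch under stated assumptions: the data are simulated, the helper names are hypothetical, and Gram-Schmidt orthogonalisation stands in for principal components, since all the argument needs is a set of mutually uncorrelated components. The coefficients k are the least-squares coefficients of the outcome on each component.

```python
import random


def centre(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def variance(values):
    centred = centre(values)
    return dot(centred, centred) / (len(values) - 1)


def uncorrelated_components(covariates):
    """Gram-Schmidt on centred covariates (a stand-in for principal components)."""
    components = []
    for covariate in covariates:
        comp = centre(covariate)
        for previous in components:
            weight = dot(comp, previous) / dot(previous, previous)
            comp = [a - weight * b for a, b in zip(comp, previous)]
        components.append(comp)
    return components


def variance_shares(components, outcome):
    """Return k_i^2 * V(C_i) for each component, with k_i fitted by least squares."""
    y = centre(outcome)
    shares = []
    for comp in components:
        k = dot(comp, y) / dot(comp, comp)
        shares.append(k * k * variance(comp))
    return shares


def check_bound(n_patients, n_covariates, seed):
    """Simulate data and return (sum of k_i^2 V(C_i), V(Y))."""
    rng = random.Random(seed)
    covariates = [[rng.gauss(0, 1) for _ in range(n_patients)]
                  for _ in range(n_covariates)]
    # Hypothetical outcome: every covariate matters a little, plus pure noise.
    outcome = [sum(col[i] for col in covariates) + rng.gauss(0, 1)
               for i in range(n_patients)]
    shares = variance_shares(uncorrelated_components(covariates), outcome)
    return sum(shares), variance(outcome)


for m in (5, 50):
    explained, total = check_bound(n_patients=200, n_covariates=m, seed=1)
    print(m, explained <= total, explained / m)
```

However many components are postulated, the shares can never add up to more than V(Y), so the average share per component is at most V(Y)/m.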

The second error is to ignore the fact that statistical inference does not proceed on the basis of signal alone but also on noise. It is the ratio of these that is important. If there are indefinitely many predictors then there is no reason to suppose that their influence on the variation between treatment groups will be bigger than their influence on the variation within groups, and both of these are used to make the inference.
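A small simulation can illustrate the point about the ratio. This is a sketch under assumptions of my own choosing, not an analysis from the post: outcomes are built from unobserved covariates whose total contribution to the variance is held fixed, there is no treatment effect, patients are randomised 20 per arm, and a pooled two-sample t-statistic is compared with an approximate 5% critical value. The names `false_positive_rate`, `N_PER_GROUP` and `CRITICAL_T` are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance

N_PER_GROUP = 20
CRITICAL_T = 2.024  # approximate two-sided 5% point of t with 38 degrees of freedom


def false_positive_rate(n_covariates, n_trials=1000, seed=2):
    """Share of null trials declared 'significant' by a pooled t-test."""
    rng = random.Random(seed)
    # Scale so that the covariates' total contribution to V(Y) stays at 1,
    # however many of them are postulated.
    scale = 1 / sqrt(n_covariates)
    rejections = 0
    for _ in range(n_trials):
        outcomes = [scale * sum(rng.gauss(0, 1) for _ in range(n_covariates))
                    for _ in range(2 * N_PER_GROUP)]
        rng.shuffle(outcomes)  # random allocation to the two arms
        arm_a, arm_b = outcomes[:N_PER_GROUP], outcomes[N_PER_GROUP:]
        # Signal: difference between arms. Noise: variation within arms.
        pooled_variance = (variance(arm_a) + variance(arm_b)) / 2
        standard_error = sqrt(pooled_variance * 2 / N_PER_GROUP)
        t_statistic = (mean(arm_a) - mean(arm_b)) / standard_error
        if abs(t_statistic) > CRITICAL_T:
            rejections += 1
    return rejections / n_trials


for m in (1, 30):
    print(m, false_positive_rate(m))
```

Under these assumptions the rejection rate should stay near the nominal 5% whether one covariate or thirty drive the outcome, because the same covariates inflate the within-group variation that forms the denominator.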

Categories: RCTs, Statistics, Stephen Senn | 15 Comments

RCTs, skeptics, and evidence-based policy

Senn’s post led me to investigate some links to Ben Goldacre (author of “Bad Science” and “Bad Pharma”) and the “Behavioral Insights Team” in the UK.  The BIT was “set up in July 2010 with a remit to find innovative ways of encouraging, enabling and supporting people to make better choices for themselves. A BIT blog is here”. A promoter of evidence-based public policy, Goldacre is not quite the scientific skeptic one might have imagined. What do readers think?  (The following is a link from Goldacre’s Jan. 6 blog.)

Test, Learn, Adapt: Developing Public Policy with Randomised Controlled Trials

‘Test, Learn, Adapt’ is a paper which the Behavioural Insights Team* is publishing in collaboration with Ben Goldacre, author of Bad Science, and David Torgerson, Director of the University of York Trials Unit. The paper argues that Randomised Controlled Trials (RCTs), which are now widely used in medicine, international development, and internet-based businesses, should be used much more extensively in public policy.
 …The introduction of a randomly assigned control group enables you to compare the effectiveness of new interventions against what would have happened if you had changed nothing. RCTs are the best way of determining whether a policy or intervention is working. We believe that policymakers should begin using them much more systematically.

Categories: RCTs, Statistics | 4 Comments
