**Stephen Senn**

*Head of Competence Center for Methodology and Statistics (CCMS)*

*Luxembourg Institute of Health*

*This post first appeared here.* An issue sometimes raised about randomized clinical trials is the problem of indefinitely many confounders. This, for example is what John Worrall has to say:

Even if there is only a small probability that an individual factor is unbalanced, given that there are indefinitely many possible confounding factors, then it would seem to follow that the probability that there is some factor on which the two groups are unbalanced (when remember randomly constructed) might for all anyone knows be high. (Worrall J. What evidence is evidence-based medicine?

Philosophy of Science2002;69: S316-S330: see p. S324 )

It seems to me, however, that this overlooks four matters. The first is that it is not indefinitely many variables we are interested in but only one, albeit one we can’t measure perfectly. This variable can be called ‘outcome’. We wish to see to what extent the difference observed in outcome between groups is compatible with the idea that chance alone explains it. The indefinitely many covariates can help us predict outcome but they are only of interest to the extent that they do so. However, although we can’t measure the difference we would have seen in outcome between groups in the absence of treatment, we can measure how much it varies within groups (where the variation cannot be due to differences between treatments). Thus we can say a great deal about random variation to the extent that group membership is indeed random.

The second point is that in the absence of a treatment effect, where randomization has taken place, the statistical theory predicts probabilistically how the variation in outcome between groups relates to the variation within.The third point, strongly related to the other two, is that statistical inference in clinical trials proceeds using ratios. The F statistic produced from Fisher’s famous analysis of variance is the *ratio* of the variance between to the variance within and calculated using observed outcomes. (The ratio form is due to Snedecor but Fisher’s approach using semi-differences of natural logarithms is equivalent.) The critics of randomization are talking about the effect of the unmeasured covariates on the *numerator* of this ratio. However, any factor that could be imbalanced *between* groups could vary strongly *within* and thus while the numerator would be affected, so would the denominator. Any Bayesian will soon come to the conclusion that, given randomization, coherence imposes strong constraints on the degree to which one expects an unknown something to inflate the numerator (which implies not only differing between groups but also, coincidentally, having predictive strength) but not the denominator.

The final point is that statistical inferences are probabilistic: either about statistics in the frequentist mode or about parameters in the Bayesian mode. Many strong predictors varying from patient to patient will tend to inflate the variance within groups; this will be reflected in due turn in wider confidence intervals for the estimated treatment effect. It is not enough to attack the estimate. Being a statistician means never having to say you are certain. It is not the estimate that has to be attacked to prove a statistician a liar, it is the certainty with which the estimate has been expressed. We don’t call a man a liar who claims that with probability one half you will get one head in two tosses of a coin just because you might get two tails.