Professor Stephen Senn*
Full Paper: Bad JAMA?
Short version–Opinion Article: Misunderstanding publication bias
The student undertaking a course in statistical inference may be left with the impression that what is important is the fundamental business of the statistical framework employed: should one be Bayesian or frequentist, for example? Where does one stand as regards the likelihood principle, and so forth? Or it may be that these philosophical issues are not covered but that a great deal of time is spent on technical details: for example, depending on framework, various properties of estimators, how to apply the method of maximum likelihood, or how to implement Markov chain Monte Carlo methods and check for chain convergence. However, much of this work will take place in a (mainly) theoretical kingdom one might name simple-random-sample-dom.
In fact, many real data-sets have features that do not accord well with this setup. For example, there is a huge variety of data-generating and data-filtering mechanisms that any applied statistician has to deal with. The standard example chosen for statistical inference courses, a simple random sample, is almost never encountered in practice.
I was struck by the way in which data-filtering can mislead when reading Ben Goldacre’s Bad Pharma recently. I must declare an interest, and since this would otherwise take up most of this blog, I will do so with a URL. Here it is: http://www.senns.demon.co.uk/Declaration_Interest.htm . Among the claims that Goldacre makes are, first, that authors are less likely to submit negative papers for publication and, second, that journals nevertheless have no prejudice against negative papers and in favour of positive ones.
Since authors are frequently reviewers, this combination of bias as regards one activity and not as regards another is implausible and, in my opinion, Goldacre is making the classic error of assuming that the data observed at the point of study have not suffered any filtering before being seen. However, if authors have a bias against negative and in favour of positive papers, then reviewers cannot be getting to see a comparable sample of each. If authors were submitting by probability of acceptance, then we might well expect to see similar probabilities of acceptance for negative and positive studies but higher quality of the former. If a study were negative, authors, anticipating editorial attitudes, would not submit unless the study was of excellent quality. Thus identical probabilities of acceptance would be evidence of a bias, just as, for example, statistics showing that women candidates for promotion were as successful on average as men would be evidence of bias if we also found that the women were on average much better qualified. See http://f1000research.com/2012/12/11/positively-negative-are-editors-to-blame-for-publication-bias/ for a discussion.
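To make the filtering argument concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration, not anything estimated from real journals: study quality is standard normal, editors accept with a logistic probability that gets a bonus (`positive_bonus`) for positive results, and authors submit only when their anticipated chance of acceptance exceeds `submit_threshold`. The function names `simulate` and `summarise` are hypothetical.

```python
import math
import random


def simulate(n_studies=100_000, positive_bonus=1.0, submit_threshold=0.5, seed=1):
    """Simulate author self-selection ahead of a biased editorial decision.

    All parameter values are hypothetical and chosen only for illustration.
    Returns a list of (is_positive, quality, submitted, would_be_accepted).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_studies):
        is_positive = rng.random() < 0.5
        quality = rng.gauss(0.0, 1.0)
        # Editors favour positive results: same quality, higher chance of acceptance.
        score = quality + (positive_bonus if is_positive else 0.0)
        p_accept = 1.0 / (1.0 + math.exp(-score))
        # Authors anticipate this and submit only if acceptance looks likely enough.
        submitted = p_accept > submit_threshold
        # Editorial decision the paper would receive if it reached the journal.
        would_be_accepted = rng.random() < p_accept
        rows.append((is_positive, quality, submitted, would_be_accepted))
    return rows


def summarise(rows, positive, submitted_only):
    """Return (acceptance rate, mean quality) for one group of studies."""
    group = [r for r in rows if r[0] == positive and (r[2] or not submitted_only)]
    acceptance_rate = sum(r[3] for r in group) / len(group)
    mean_quality = sum(r[1] for r in group) / len(group)
    return acceptance_rate, mean_quality


if __name__ == "__main__":
    rows = simulate()
    for submitted_only in (False, True):
        label = "submitted only" if submitted_only else "all studies"
        for positive in (True, False):
            rate, quality = summarise(rows, positive, submitted_only)
            kind = "positive" if positive else "negative"
            print(f"{label:15s} {kind}: acceptance={rate:.3f}, mean quality={quality:.3f}")
```

Under these assumptions, the acceptance rates for submitted positive and negative papers end up much closer together than the built-in editorial preference would suggest, while the submitted negative papers are of higher average quality. Looking only at what reaches the journal, one would understate the bias.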
Such data filters can cause problems for naïve approaches to inference. Here are some notorious cases.
- A study found that Oscar winners lived longer than actors who did not win an Oscar. Does esteem lead to long life?
- Obese infarct survivors have better prognosis than the non-obese. Does obesity protect against further heart damage?
- Women in the village of Whickham were asked if they smoked or not. Twenty years later a higher percentage of non-smokers had died than smokers. Does smoking help you live longer?
- The average age at death of right-handers was found to be much greater than left-handers. Is there a sinister effect on mortality?
- In a trial comparing three non-steroidal anti-inflammatory drugs, patients were stratified by concomitant aspirin use (yes or no). Aspirin takers had a much higher rate of cardiovascular events. Is aspirin bad for your heart?
However, my favourite story regarding this concerns Abraham Wald. Faced with a distribution of bullet-holes in returning planes, Wald stunned his military colleagues by claiming that it was important to put armour where there were no bullet-holes, since the planes hit in those places were precisely the ones that did not get back.
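Wald's point can be illustrated with a small simulation, again only a sketch under made-up assumptions: the section names, the per-hit lethality values in `LETHALITY`, and the function names `fly_missions` and `share` are all hypothetical and not taken from any wartime data. Hits land uniformly across sections, but only planes that survive every hit are available for inspection.

```python
import random
from collections import Counter

# Hypothetical probability that a single hit in each section downs the plane.
LETHALITY = {"engine": 0.6, "cockpit": 0.4, "fuselage": 0.05, "wings": 0.05}


def fly_missions(n_planes=20_000, hits_per_plane=3, seed=7):
    """Return (hits on all planes, hits on returning planes) as Counters."""
    rng = random.Random(seed)
    sections = list(LETHALITY)
    all_hits, survivor_hits = Counter(), Counter()
    for _ in range(n_planes):
        # Hits are spread uniformly at random over the sections.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # A plane returns only if it survives every one of its hits.
        survived = all(rng.random() >= LETHALITY[s] for s in hits)
        all_hits.update(hits)
        if survived:
            survivor_hits.update(hits)
    return all_hits, survivor_hits


def share(counter, section):
    """Fraction of recorded hits that fall in the given section."""
    return counter[section] / sum(counter.values())


if __name__ == "__main__":
    all_hits, survivor_hits = fly_missions()
    for section in LETHALITY:
        print(f"{section:9s} all planes: {share(all_hits, section):.2f}   "
              f"returning planes: {share(survivor_hits, section):.2f}")
```

In this toy setting the sections where hits are most dangerous are exactly the ones that look least damaged on the planes that come home, even though every section was hit equally often.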
Sometimes what you don’t see is more important than what you do, and theoretical statistics courses do not always prepare you for this.
*Head of the Competence Centre for Methodology and Statistics
Love the stories of Wald and the biased editors! They perfectly demonstrate what Taleb calls the narrative fallacy or “ignoring the graveyard”. Thanks much for this post.