I had been posting commentaries daily from January 6, 2022 (on my editorial “The Statistics Wars and Intellectual Conflicts of Interest”, Conservation Biology) until Sir David Cox died on January 18, at which point I switched to some memorial items. These two commentaries from what Daniell calls my ‘birthday festschrift’ were left out, and I am putting them up now. (Links to others are below.)
Department of Philosophy, Logic & Scientific Method
London School of Economics & Political Science
Conflict, Analogy, and Deprojection
In the spirit of a birthday Festschrift for Prof. Deborah G. Mayo
Since I am about to comment upon an editorial on statistical methods in Conservation Biology, I will begin at the natural point of departure: Virgil’s Aeneid. This comment is only half tongue-in-cheek. Ostensibly, the story of the Aeneid is the story of a man who flees Troy to become the ancestor of the earliest Romans. But what it is, really, is a meditation on the nature of conflict and war. Indeed, its most controversial phrase to translate is its first three words:
Arma virumque cano.
Robert Fagles came under fire when he chose to translate those three words
Wars and a man I sing
rather than following the practically canonical English translation, made famous by the 17th-century English poet John Dryden:
Of arms and the man I sing.
The meditation, especially starting in Book 7, concerns whether war is the inevitable consequence of human vanity or whether it is the result of the petty grievances of the gods. Either way, the conclusion appears to be that conflict is just part of human lives.
Of course, today, it is not scientifically respectable to wonder whether the gods are responsible for our conflicts concerning Bayesianism and frequentism (or even, within frequentism, over which p-value constitutes a statistically significant figure). Prof. Mayo’s bold suggestion is that editors avoid this conflict by not taking sides. She argues that by taking sides, editors encourage the misuse of data analysis. When combined with the fact that scientists must publish in order to maintain and advance their careers, what results is experimental design that encourages the cherry-picking of data and other selection malfeasance. Editors should remain islands, neutral Switzerlands landlocked in wartime.
Of course, this appears to conflict with Virgil’s apparent view that hostility and dispute are preordained as is taking sides, whether a Latin like Turnus or a Trojan like Aeneas. Let me provide some counterpoint to Professor Mayo’s contention that editors should not take sides.
The first point is perhaps partly anecdotal. My own experience has mostly been with molecular biologists, when I worked with my father in his lab as a teenager. I have learned over the years that biologists usually do not have deep training in statistics. In part, that is because much research in biology is qualitative. One might, for example, publish the sequence of a plasmid vector or make some qualitative judgments about the birdsong of North American woodpeckers. Certainly, I have never found a molecular biologist with whom I could talk at length about the Central Limit Theorem and how it figures into null hypothesis testing. In the absence of editorial guidance upon statistical standards, some scientists may find themselves at sea.
Second, editors are bound to enforce or encourage some epistemic norms. For example, in a serious epidemiological study, it will surely be insufficient if N=2 and the subjects are from the same nuclear family. Is the enforcement of other epistemic standards, such as the specification of an α-level or the requirement to use likelihood ratios really different in kind?
Third, and probably most importantly, what does avoiding conflict amount to? Even if editors do acquiesce to the request not to specify positions on probabilistic or statistical standard de jure in their official editorial guidelines, will they not inevitably take a position by choosing which articles to publish? That is, will they not inevitably engage in conflict by virtue of their role as editors, who separate the wheat from the chaff?
Admittedly, what these objections attack is something of a caricature of Mayo’s view, and there is no suggestion that any and all statistical guidance will generate scientific malpractice. However, the dividing line between prudent guidance and guidance which causes our baser instincts of ambition to overtake our better judgment is difficult to draw. Even the standard that editors should avoid taking positions on strained philosophical controversies is difficult to interpret, since we all know almost any question is a hot button for some philosopher.
Let me now add a rejoinder to my own counterpoint. I am more sympathetic to Prof. Mayo’s view than the above objections may suggest. Like Hempel and Oppenheim, I contend that scientific explanations consist in deductions of a sort. What sort of deductive standard is involved in statistical arguments? Though statistics itself makes use of the predicate calculus through its use of arithmetic, algebra, and analysis, statistical arguments employ analogical logic. In particular, in reasoning statistically, we analogize from a sample to a population.
The trouble here, as Paul Bartha writes in the Stanford Encyclopedia of Philosophy’s article on Analogical Reasoning, is that though such reasoning abounds, no one has come close to even presenting a plausible sound and complete logic of analogy. It does not merely permeate scientific reasoning
As I argue in a forthcoming work, The Foundations of Microeconomic Analysis (a preview of which can be seen here), the reason is that analogical reasoning is situationally specific. The nature of the similarity between an analogical model and its target defines which sorts of inferences or deprojections are epistemically warranted.
Take, for example, the case of the chassis of a car modeled out of clay to calculate its drag coefficient. Because its shape will be exactly like the chassis of the production model, we can experiment with the clay model in order to make inferences about the airflow. We cannot, however, try to set it on fire, and then conclude the car is impossible to set aflame. Of course this is a contrived example, but it is the burden of authors to convince us that the deprojections (i.e. the statistical and probabilistic standards) they have made are warranted given the similarity between the sample space and the population. Let the authors be the Latins and Trojans and let the editors watch like cool-headed Jupiter from above.
(Ph.D in Ecological Economics
Rensselaer Polytechnic Institute)
Some Economics of No-Threshold View
In her recent editorial in Conservation Biology, Mayo (2021) emphasizes how philosophical presuppositions in statistics can lead to conflicts over journal policies, alongside her consistent defense of the proper use of p-values in error control. The particular focus of the article is the adverse consequences of what she calls the “no-threshold view,” which demands restraint in using the phrase statistical significance. I will discuss some economics of the no-threshold view and Mayo’s argument.
The incentive problem regarding p-values is a complicated one, and a blog article cannot fully describe its whole extent for general readers. The core component of the problem is straightforward, though: information is incomplete for the reader of a research article. (See Dasgupta and David (1994) for a general discussion of the reward system of science and incomplete information.) A p-value provides the reader with information about research behavior, but the research behavior itself is not directly observable. Because of this information value, p-values can be exploited by researchers who engage in fraudulent and questionable research practices.
In my view, the no-threshold view tries to solve the problem by weakening the information value of the p-value. It intends to make the p-value a less attractive tool to manipulate and hence to discourage manipulation. As any researcher keen on policy implications would know, when a policy is evaluated, its unintended effects must be assessed as well, taking multiple factors into account. Mayo’s argument on the consequences of the no-threshold view shows how it weakens the effectiveness of intended, legitimate uses as well.
Mayo writes, “If the reward structure is seducing even researchers who are aware of the pitfalls of capitalizing on selection biases, then one is dealing with a highly susceptible group.” The target population of the no-threshold view is this “highly susceptible group,” and we do not seem to have sufficient empirical knowledge about this particular heterogeneity within the scientific community.
As a policy, the justification of the no-threshold view also depends on the effectiveness of other means to control the susceptible population. Statistical techniques can be developed to solve the problem of incomplete information. If there is a way to statistically detect p-value manipulation it may suffice to discourage the susceptible population. There is a potential that the problem can be fixed with statistical tools rather than institutional interventions. The problem with the no-threshold view is that it can disincentivize the development of such statistical tools. Mayo’s comment, “For a journal or organization to take sides in these long-standing controversies—or even to appear to do so—encourages groupthink and discourages practitioners from arriving at their own reflective conclusions about methods.” seems to express the same concern.
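To make the idea of statistically detecting p-value manipulation concrete, here is a minimal sketch of one approach from the literature, often called a caliper test. It is offered only as an illustration, not as a tool the commentary or Mayo's editorial proposes. It counts reported p-values falling just below and just above a threshold and asks, via an exact binomial test, whether the excess just below is larger than chance would suggest. The function name `caliper_test`, the window width of 0.01, and the example p-values are all hypothetical choices made for this sketch.

```python
from math import comb


def caliper_test(p_values, threshold=0.05, width=0.01):
    """Count p-values just below vs. just above `threshold` (hypothetical helper).

    Returns (below, above, tail), where `tail` is the one-sided exact binomial
    probability of seeing at least `below` of the n = below + above values on
    the lower side if each side were equally likely.
    """
    below = sum(1 for p in p_values if threshold - width < p <= threshold)
    above = sum(1 for p in p_values if threshold < p <= threshold + width)
    n = below + above
    if n == 0:
        # No p-values inside the window: the test is uninformative.
        return below, above, 1.0
    # P(X >= below) for X ~ Binomial(n, 0.5)
    tail = sum(comb(n, k) for k in range(below, n + 1)) / 2 ** n
    return below, above, tail


# Made-up p-values, purely for illustration (not real data).
reported = [0.041, 0.043, 0.044, 0.046, 0.047, 0.048, 0.049, 0.049, 0.052, 0.2, 0.6]
below, above, tail = caliper_test(reported)
print(below, above, tail)
```

A small tail probability would only flag a pile-up just under the threshold. It cannot by itself distinguish manipulation by authors from selective publication, and the assumption that both sides of the window are equally likely is a simplification.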
When incomplete information is the problem, the solution is often more information, not less. When properly used, the p-value and the phrase statistical significance deliver useful information. Rather than giving up on this information, it seems to be a better idea to give it a chance to improve. For the susceptible population, the p-value is not the only tool to maneuver, and it is impossible to ban all methods. It is a defeatist attitude to give up on a valid tool.
Dasgupta, P., & David, P. A. (1994). Toward a new economics of science. Research Policy, 23(5), 487-521.
Mayo, D. G. (2021). The statistics wars and intellectual conflicts of interest. Conservation Biology. Published online December 6, 2021.
3 published commentaries and links to the Phil Stat Forum of January 11 are on my last blog post.
All of the initial blog commentaries on Mayo’s (2021) editorial (up through Jan 18, 2022) are below
Ionides and Ritov
I’m very grateful to all who wrote, and to Yoav Benjamini and David Hand for their presentations at the January 11 Phil Stat Forum on the topic: Statistical Significance Test Anxiety.