Objectivity in statistics, as in science more generally, is a matter of both aims and methods. Objective science, in our view, aims to find out what is the case as regards aspects of the world [that hold] independently of our beliefs, biases and interests; thus objective methods aim for the critical control of inference and hypotheses, constraining them by evidence and checks of error. (Cox and Mayo 2010, p. 276)
I. The myth of objectivity. Whenever you come up against blanket slogans such as “no methods are objective” or “all methods are equally objective and subjective,” it is a good guess that the problem is being trivialized into oblivion. Yes, there are judgments, disagreements, and values in any human activity, which alone makes it too trivial an observation to distinguish among very different ways that threats of bias and unwarranted inferences may be controlled. Is the objectivity-subjectivity distinction really toothless, as many will have you believe? I say no.
Cavalier attitudes toward objectivity are in tension with widely endorsed movements to promote replication and reproducibility, and to come clean on a number of sources behind illicit results: multiple testing, cherry picking, failed assumptions, researcher latitude, publication bias and so on. The moves to take back science, if they are not mere lip-service, are rooted in the supposition that we can more objectively scrutinize results, even if it’s only to point out those that are poorly tested. The fact that the term “objectivity” is used equivocally should not be taken as grounds to oust it, but rather to engage in the difficult work of identifying what there is in “objectivity” that we won’t give up, and shouldn’t.
II. The Key is Getting Pushback. While knowledge gaps leave plenty of room for biases, arbitrariness and wishful thinking, we regularly come up against data that thwart our expectations and disagree with the predictions we try to foist upon the world. We get pushback! This supplies objective constraints on which our critical capacity is built. Our ability to recognize when data fail to match anticipations affords the opportunity to systematically improve our orientation. In an adequate account of statistical inference, explicit attention is paid to communicating results to set the stage for others to check, debate, extend or refute the inferences reached. Don’t let anyone say you can’t hold them to an objective account of statistical inference.
If you really want to find something out, and have had some experience with flaws and foibles, you deliberately arrange inquiries so as to capitalize on pushback, on effects that will not go away, and on strategies to get errors to ramify quickly to force you to pay attention to them. The ability to register alterations in error probabilities due to hunting, optional stopping, and other questionable research practices (QRPs) is a crucial part of objectivity in statistics. In statistical design, day-to-day tricks of the trade to combat bias are amplified and made systematic. It is not because of a “disinterested stance” that such methods are invented. It is that we, competitively and self-interestedly, want to find things out.
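To make the point about optional stopping concrete, here is a minimal simulation sketch. It is an illustration under stated assumptions, not anything from the sources cited: data are drawn from a standard normal so the null hypothesis (mean 0) is true, the test is a two-sided z-test with known variance at the nominal 5% level, and the function names and parameters (`type_one_error_rate`, `peek_every`, and so on) are hypothetical. One analyst tests once at n = 100; the other peeks every 10 observations and stops at the first nominally significant result.

```python
import math
import random


def z_test_rejects(total, n, z_crit=1.96):
    """Two-sided z-test of H0: mean = 0, for n draws with known sigma = 1.

    `total` is the running sum of the n observations, so total / sqrt(n)
    is standard normal when H0 is true.
    """
    return abs(total / math.sqrt(n)) > z_crit


def type_one_error_rate(n_max, peek_every, n_trials, seed):
    """Estimate how often a true H0 is rejected under a stopping rule.

    The analyst tests after every `peek_every` observations and stops at
    the first rejection. Setting peek_every == n_max gives a single
    fixed-sample test. All settings here are illustrative assumptions.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        total = 0.0
        for n in range(1, n_max + 1):
            total += rng.gauss(0.0, 1.0)  # H0 is true: the mean really is 0
            if n % peek_every == 0 and z_test_rejects(total, n):
                rejections += 1
                break  # stop collecting data at the first "significant" look
    return rejections / n_trials


fixed = type_one_error_rate(n_max=100, peek_every=100, n_trials=5000, seed=1)
peeking = type_one_error_rate(n_max=100, peek_every=10, n_trials=5000, seed=1)
print(f"fixed n = 100:      {fixed:.3f}")
print(f"peek every 10 obs:  {peeking:.3f}")
```

In this setup the fixed-sample rate should sit near the nominal 5%, while the try-and-try-again rate should come out well above it, even though each individual look uses the same 1.96 cutoff. An account that can register this difference in error probabilities has something to say about the stopping rule; one that cannot register it has no way to flag the practice.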
Admittedly, that desire won’t suffice to incentivize objective scrutiny if you can do just as well producing junk. Succeeding in scientific learning is very different from success at grants, honors, publications, or engaging in technical activism, replication research and meta-research. That’s why the reward structure of science is so often blamed nowadays. New incentives, gold stars and badges for sharing data, preregistration, and resisting the urge to cherry pick, outcome-switch, or otherwise engage in bad science are proposed. I say that if the allure of carrots has grown stronger than the sticks (which it has), then what we need are stronger sticks.
III. Objective procedures. It is often urged that, however much we may aim at objective constraints, we can never have clean hands, free of the influence of beliefs and interests. The fact that my background knowledge enters in researching a claim H doesn’t mean I combine my beliefs about H into the analysis so as to prejudge any inference. I may instead use background information to give H a hard time. I may use it to question your claim to have grounds to infer H by showing it hasn’t survived a stringent effort to falsify or find flaws in H. The test H survived might be quite lousy, and even if I have independent grounds to believe H, I may deny you’ve done a good job testing it.
Others argue that we invariably sully methods of inquiry by the entry of personal judgments in their specification and interpretation. It’s just human, all too human. The issue is not that a human is doing the measuring; the issue is whether that which is being measured is something we can reliably use to solve some problem of inquiry. That an inference is done by machine, untouched by human hands, wouldn’t make it objective, in the relevant sense. There are three distinct requirements for an objective procedure for solving problems of inquiry:
- Relevance: It should be relevant to learning about the intended topic of inquiry; having an uncontroversial way to measure something doesn’t make it relevant to solving a knowledge-based problem of inquiry.
- Reliably capable: It should not routinely declare the problem solved when it is not solved (or solved incorrectly); it should be capable of controlling the reliability of erroneous reports of purported answers to questions.
- Capacity to learn from error: If the problem is not solved (or poorly solved) at a given stage, the method should set the stage for pinpointing why. (It should be able to at least embark on an inquiry for solving “Duhemian problems” of where to lay blame for anomalies.)
Yes, there are numerous choices in collecting, analyzing, modeling, and drawing inferences from data, and there is often disagreement about how they should be made. Why suppose this means all accounts are in the same boat as regards subjective factors? It need not, and they are not. An account of inference shows itself to be objective precisely in how it steps up to the plate in handling potential threats to objectivity.
IV. Idols of Objectivity. We should reject phony objectivity and false trappings of objectivity. They often grow out of one or another philosophical conception of what objectivity requires, even though you will almost surely not see them described that way. If objectivity is thought to be limited to direct observations (whatever they are) plus mathematics and logic, as the typical logical positivist holds, then it’s no surprise to wind up worshiping “the idols of a universal method” as Gigerenzer and Marewski (2015) call it. Such a method is to supply a formal, ideally mechanical, way to process statements of observations and hypotheses. To recognize that such mechanical rules don’t exist is not to relinquish the view that they’re demanded by objectivity. Instead, objectivity goes by the board, replaced by various stripes of relativism and constructivism, or more extreme forms of post-modernism.
Relativists may augment their rather thin gruel with a pseudo-objectivity arising from social or political negotiation, cost-benefits (“they’re buying it”), or a type of consensus (“it’s in a 5 star journal”), but that’s to give away the goat far too soon. The result is to abandon the core stipulations of scientific objectivity. To be clear: There are authentic problems that threaten objectivity. We shouldn’t allow outdated philosophical accounts to induce us into giving it up.
V. From Discretion to Subjective Probabilities. Some argue that “discretionary choices” in tests, which Neyman himself tended to call “subjective”, lead us to subjective probabilities in claims. A weak version goes: since you can’t avoid subjective (discretionary) choices in getting the data and the model, there can be little ground for complaint about subjective degrees of belief in the resulting inference. This is weaker than arguing you must use subjective probabilities; it argues merely that doing so is no worse than discretion. But it still misses the point.
First, even if discretionary judgments in the journey to a statistical inference/model have the capability to introduce subjectivity, they need not. Second, not all discretionary judgments are in the same boat when it comes to being open to severe testing.
A stronger version of the argument goes on a slippery slope from the premise of discretion in data generation and modeling to the conclusion: statistical inference just is a matter of subjective beliefs (or their updates). How does that work? One variant, which I do not try to pin on anyone in particular, involves a subtle slide from “our models are merely objects of belief”, to “statistical inference is a matter of degrees of belief”. From there it’s a short step to “statistical inference is a matter of subjective probability” (whether my assignments or those of an imaginary omniscient agent).
It is one thing to describe our models as objects of belief and quite another to maintain that our task is to model beliefs.
This is one of those philosophical puzzles of language that might set some people’s eyes rolling. If I believe in the deflection effect (of gravity) then that effect is the object of my belief, but only in the sense that my belief is about said effect. Yet if I’m inquiring into the deflection effect, I’m not inquiring into beliefs about the effect. The philosopher of science Clark Glymour (2010, p. 335) calls this a shift from phenomena (content) to epiphenomena (degrees of belief).
Karl Popper argues that the central confusion all along was sliding from the degree of the rationality (or warrantedness) of a belief, to the degree of rational belief (1959, p. 424). The former is assessed via degrees of corroboration and well-testedness, rooted in the error probing capacities of procedures. (These are supplied by error probabilities of methods, formal or informal.)
VI. Blurring What’s Being Measured vs My Ability to Test It. You will sometimes hear a Bayesian claim that anyone who says their probability assignments to hypotheses are subjective must also call the use of any model subjective because it too is based on a choice of specifications. This is a confusion of two notions of subjective.
- The first concerns what’s being measured, and for the Bayesian, with some exceptions, probability is supposed to represent a subject’s strength of belief (be it actual or rational), betting odds, or the like.
- The second sense of subjective concerns whether the measurement is checkable or testable.
This goes back to my point in section III about what’s required for a feature to be relevant to a method’s objectivity.
(Passages, modified, are from Mayo, Statistical Inference as Severe Testing (forthcoming).)
Neyman, however, never would allow subjective probabilities to enter in statistical inference. Objective, i.e., frequentist, priors in a hypothesis H could enter, but he was very clear that this required H’s truth being the result of some kind of stochastic mechanism. He found that idea plausible in cases; the problem was not knowing the stochastic mechanism sufficiently to assign the priors. Such frequentist (or “empirical”) priors in hypotheses are not given by drawing H randomly from an urn of hypotheses, k% of which are assumed to be true. Yet an “objective” Bayesian like Jim Berger will call these frequentist, resulting in enormous confusion in today’s guidebooks on the probability of type 1 errors.
Cox, D. R. and Mayo, D. G. (2010). “Objectivity and Conditionality in Frequentist Inference,” in Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science (D. Mayo and A. Spanos eds.), Cambridge: Cambridge University Press: 276–304.
Gigerenzer, G. and Marewski, J. (2015). “Surrogate Science: The Idol of a Universal Method for Scientific Inference.” Journal of Management 41(2): 421–40.
Glymour, C. (2010). “Explanation and Truth,” in Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science (D. Mayo and A. Spanos eds.), Cambridge: Cambridge University Press: 331–350.
Mayo, D. (1983). “An Objective Theory of Statistical Testing.” Synthese 57(2): 297–340.
Popper, K. (1959). The Logic of Scientific Discovery. New York: Basic Books.