We constantly hear that procedures of inference are inescapably subjective because of the latitude of human judgment as it bears on the collection, modeling, and interpretation of data. But this is seriously equivocal: Being the product of a human subject is hardly the same as being subjective, at least not in the sense we are speaking of—that is, as a threat to objective knowledge. Are all these arguments about the allegedly inevitable subjectivity of statistical methodology rooted in equivocations? I argue that they are!
Insofar as humans conduct science and draw inferences, it is obvious that human judgments and human measurements are involved. That is true enough, but it is too trivial an observation to help us distinguish among the different ways judgments should enter, or to show how, nevertheless, we can avoid introducing bias and unwarranted inferences. The issue is not that a human is doing the measuring, but whether we can reliably use the thing being measured to find out about the world.
Remember the dirty-hands argument? In the early days of this blog (e.g., October 13, 16), I deliberately took up this argument as it arises in evidence-based policy because it offered a certain clarity that I knew we would need to come back to in considering general “arguments from discretion”. To abbreviate:
- Numerous human judgments go into specifying experiments, tests, and models.
- Because there is latitude and discretion in these specifications, they are “subjective.”
- Whether data are taken as evidence for a statistical hypothesis or model depends on these subjective methodological choices.
- Therefore, statistical inference and modeling are invariably subjective, if only in part.
We can spot the fallacy in the argument much as we did in the dirty-hands argument about evidence-based policy. It is true, for example, that a very insensitive test for detecting a positive discrepancy d’ from a 0 null has low probability of finding statistical significance even if a discrepancy as large as d’ exists. But that doesn’t prevent us from determining, objectively, that a statistically insignificant difference from that test fails to warrant inferring that the discrepancy is less than d’.
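To make the point about insensitive tests concrete, here is a minimal sketch, assuming a one-sided z-test of a 0 null with known standard deviation. The function name, the sample sizes, and the values of d’ and sigma are all hypothetical choices for illustration, not anything taken from the earlier posts. It computes the probability that the test finds statistical significance when a discrepancy of d’ really exists.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(d_prime, sigma, n, alpha=0.05):
    """Probability that a one-sided z-test of H0: mu <= 0 rejects when mu = d_prime.

    Assumes a Normal model with known sigma (an illustrative assumption).
    """
    z = NormalDist()                        # standard Normal
    cutoff = z.inv_cdf(1 - alpha)           # critical value, in standard-error units
    shift = d_prime * sqrt(n) / sigma       # true discrepancy, in standard-error units
    return 1 - z.cdf(cutoff - shift)


# Hypothetical numbers: same discrepancy and sigma, two very different sample sizes.
low_power = power_one_sided_z(d_prime=0.5, sigma=2.0, n=4)
high_power = power_one_sided_z(d_prime=0.5, sigma=2.0, n=400)

print(f"power with n=4:   {low_power:.3f}")
print(f"power with n=400: {high_power:.3f}")
```

Whoever picked n = 4 may have had any motive at all, but the resulting power is a fact about the test, computable by anyone. With these numbers the small-sample test detects d’ only a small fraction of the time, so its failure to reach significance says little about whether a discrepancy that large is absent.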
Test specifications may well be a matter of personal interest and bias, but, given the choices made, whether or not an inference is warranted is not a matter of personal interest and desire. Setting up a test with low power against d’ might be a product of your desire not to find an effect for economic reasons, of insufficient funds to collect a larger sample, or of the inadvertent choice of a bureaucrat. Or ethical concerns may have entered. But none of this precludes our critical evaluation of what the resulting data do and do not indicate (about the question of interest). The critical task need not itself be a matter of economics, ethics, or what have you. Critical scrutiny of evidence reflects an interest all right—an interest in not being misled, an interest in finding out what the case is, and others of an epistemic nature.
Objectivity in statistical inference, and in science more generally, is a matter of being able to critically evaluate the warrant of any claim. This, in turn, is a matter of evaluating the extent to which we have avoided or controlled those specific flaws that could render the claim incorrect. If the inferential account cannot discern any flaws, performs the task poorly, or denies there can ever be errors, then it fails as an objective method of obtaining knowledge.
Consider a parallel with the problem of objectively interpreting observations: observations are always relative to the particular instrument or observation scheme employed. But we are often aware not only of the fact that observation schemes influence what we observe but also of how they influence observations and how much noise they are likely to produce, so that we can subtract that noise out. Hence, objective learning from observation is not a matter of getting free of arbitrary choices of instrument, but a matter of critically evaluating the extent of their influence to get at the underlying phenomenon.
For a similar analogy, the fact that my weight shows up as k pounds reflects the convention (in the United States) of using the pound as a unit of measurement on a particular type of scale. But given the convention of using this scale, whether or not my weight shows up as k pounds is a matter of how much I weigh!*
Likewise, the result of a statistical test is only partly determined by the specification of the tests (e.g., when a result counts as statistically significant); it is also determined by the underlying scientific phenomenon, at least as modeled. What enables objective learning to take place is the possibility of devising means for recognizing and effectively “subtracting out” the influence of test specifications, in order to learn about the underlying phenomenon, as modeled.
Focusing just on statistical inference, we can distinguish between an objective statistical inference and an objective statistical method of inference. A specific statistical inference is objectively warranted if it has passed a severe test; a statistical method is objective by being able to evaluate and control (at least approximately) the error probabilities needed for a severity appraisal. This also requires the method to communicate the information needed to conduct the error statistical evaluation (or report it as problematic).
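As a rough illustration of what a severity appraisal might look like in the simplest case, consider a one-sided Normal test of a 0 null with known sigma and a statistically insignificant sample mean. The sketch below assumes the usual error-statistical formula for that case, namely that the severity for the claim "the discrepancy is at most mu1" is the probability of a sample mean larger than the one observed, were the discrepancy equal to mu1. That formula, the function name, and all of the numbers are assumptions brought in for illustration; this post does not spell them out.

```python
from math import sqrt
from statistics import NormalDist


def severity_upper_bound(xbar, mu1, sigma, n):
    """Severity for the claim mu <= mu1 given an observed sample mean xbar.

    Computed as P(sample mean > xbar; mu = mu1) under a Normal model with
    known sigma (assumed formula for this simple one-sided case).
    """
    standard_error = sigma / sqrt(n)
    return 1 - NormalDist().cdf((xbar - mu1) / standard_error)


# Hypothetical data: the same observed mean from a small and a large sample,
# statistically insignificant at the 0.05 level in both cases.
sev_small_n = severity_upper_bound(xbar=0.1, mu1=0.5, sigma=2.0, n=4)
sev_large_n = severity_upper_bound(xbar=0.1, mu1=0.5, sigma=2.0, n=400)

print(f"severity for mu <= 0.5, n=4:   {sev_small_n:.3f}")
print(f"severity for mu <= 0.5, n=400: {sev_large_n:.3f}")
```

The same observed mean passes the claim "the discrepancy is at most 0.5" with high severity in the large sample and with poor severity in the small one. Reporting xbar, sigma, and n is what lets a reader redo this appraisal rather than take the inference on trust.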
It should be kept in mind that we are after the dual aims of severity and informativeness. Merely stating tautologies is to state objectively true claims, but they are not informative. Still, it is vital to have a notion of objectivity, and we should stop feeling that we have to say: well, there are objective and subjective elements in all methods; we cannot avoid dirty hands in discretionary choices of specification, so all inference methods do about as well when it comes to the criteria of objectivity. They do not.
*Which, in turn, is a matter of my having overeaten in London.