See Part 1 and Part 2.
7. How the story turns out (not well)
This conception of testing, which Lakatos called “sophisticated methodological falsificationism,” takes us quite a distance from the more familiar, if hackneyed, conception of Popper as a simple falsificationist.[i] It called for warranting a host of different methodological rules, one for each step along the way, in order to either falsify or corroborate hypotheses. But it doesn’t end well: its proponents had no clue how to warrant such rules as reliable for their tasks. As for Duhem’s problem, Popperians generally assume it is never really solvable (at least in any interesting scientific tests), but that testing is always done within a large-scale theory or paradigm. The “arrow of modus tollens,” in this view, is always directed at the cluster of theories (or paradigm, or disciplinary matrix, or large-scale theory) containing the primary hypothesis H together with the auxiliary hypotheses and background theories. It is further imagined that the paradigm stipulates which parts of the background theories and auxiliaries to blame if you get into trouble with anomalies, and which portions of the theory must be retained at all costs (the “hard core”). Even though one might reconstruct episodes in the history of science along these lines, the account fails to provide forward-looking tools for finding things out. It is all just a matter of “rational reconstruction” (another of Lakatos’s terms) of intuitively sound scientific episodes.
What about the problem of inferring that you have a genuine anomaly (a falsifying hypothesis)? This too is left at a very unsatisfactory level, e.g., the anomaly is real if it will not go away. However, Popper himself thought that, thanks to hypotheses that stand up to severe tests, “we can be reasonably successful in attributing our refutations to definite portions of the theoretical maze. (For we are reasonably successful in this—a fact which must remain inexplicable for one who adopts Duhem’s and Quine’s view on the matter.)” (C & R, 1963, 242). But that does not mean he supplied an adequate account for warranting this important fact. He did not.
Due to his own “deductivist” language, and the logical empiricist assumptions about theory testing at the time, Popper remained caught up in tangles of language, and was unable to cash out the notions his account required. As an example of the former, Popperians say things like: it is warranted to infer (prefer or believe) H because H has passed a severe test, but there is no justification for H (because “justifying” H would mean showing H is true or highly probable). As an example of the latter, Popperians will say a hypothesis H must be subjected to severe tests, where severe tests are defined as tests that would, with high probability, falsify H if H is false. But then they have no answer when asked: how can you warrant saying that a given test is or would be severe (i.e., that it has a high probability of finding flaws in H, if they exist)? At most Popper could offer intuitive examples of tests thought to have poor probing power: e.g., tests that do not require novel predictions, but can at most account for known effects. That is fine, so far as it goes, but intuitive illustrations do not suffice.
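To see how the undefined notion could be cashed out, here is a minimal numerical sketch of my own (not Popper’s formalism, and not anyone’s official definition): for a simple one-sided test of H: μ ≤ 0 on normal data, the severity question, i.e., how probable it is that the test would flag H as false when there is a genuine discrepancy, can be estimated directly by simulation. The sample size, cutoff, and discrepancy (0.5) are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma, alpha = 100, 1.0, 0.05
# One-sided z-test of H: mu <= 0; reject when the sample mean exceeds the cutoff.
crit = 1.645 * sigma / np.sqrt(n)

def rejection_rate(mu, trials=10_000):
    """Fraction of simulated samples in which the test rejects H when the true mean is mu."""
    sample_means = rng.normal(mu, sigma / np.sqrt(n), size=trials)
    return float(np.mean(sample_means > crit))

size = rejection_rate(0.0)      # error rate when H is true: should be near alpha
severity = rejection_rate(0.5)  # probability the test detects the flaw when mu = 0.5
print(size, severity)
```

Under these (invented) numbers the test is severe against a discrepancy of 0.5: it would detect such a flaw nearly every time, while rejecting a true H only about 5% of the time. The point is that “high probability of finding flaws, if they exist” is a calculable error probability, not merely an intuition.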
8. Roles for Statistics
Yet it is precisely at these points, in solving the problems of sophisticated methodological falsificationism (not that I would use that name), that statistical methods can and do enter. Statistical methods can warrant inferences about genuine, systematic effects, or, alternatively, warrant inferring that all systematic statistical information has been adequately captured by a hypothesized model (Spanos 2007). Statistical methods and models provide roomy niches in between substantive scientific theories, models, and hypotheses on the one hand, and a host of more local and intermediate questions on the other, where the effects of different factors may be modeled, probed, and distinguished. Had Popper made use of the statistical testing ideas being developed around the same time (and around the corner), he might have been able to substantiate his account of methodological falsification and to justify his recognition that we do manage to solve our Duhemian problems in practice.[ii] Statistical models and methods are excellent examples of how we succeed both in inferring genuine anomalies (falsifying hypotheses) and in pinpointing their source. The key, however, is the opposite of holism, testing within a paradigm, or any of the “largisms” that have entered the philosophical scene; it lies in entirely local, piecemeal inquiries, split off from the questions they may later serve to answer. We don’t have to affirm each auxiliary hypothesis; it suffices to distinguish their effects, and/or subtract them out afterwards (see Mayo 1996, EGEK, chapter 1).
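A toy sketch of “distinguishing effects rather than affirming each auxiliary” (my illustration; the factor names and numbers are invented, not drawn from any particular case): simulate data in which an auxiliary factor z contributes alongside the factor of interest x, then estimate both contributions jointly, so that the auxiliary’s effect can be subtracted out without ever certifying a full theory of z.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)  # primary factor of interest
z = rng.normal(size=n)  # auxiliary/background factor
# Data generated with true effects 2.0 (primary) and 0.7 (auxiliary), plus noise.
y = 2.0 * x + 0.7 * z + rng.normal(scale=0.5, size=n)

# Fit both effects jointly by least squares: no need to "affirm" the auxiliary,
# only to model its contribution so it can be distinguished and subtracted out.
X = np.column_stack([x, z])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # estimates close to the true effects [2.0, 0.7]
```

The piecemeal point: the question “what is the effect of x?” is answered locally, with the auxiliary factor’s effect estimated alongside it rather than assumed away or blamed wholesale.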
9. Jettison the traditional way of formulating “Duhem’s problem”
Let me be clear that I do not advocate that statisticians go back to Popper for illumination on their problems of evidence and inference. On the contrary, I am rather disappointed that philosophers have not, by and large, offered more realistic replacements for testing accounts that are mostly stuck in a logical empiricist time warp. I happen to like Popper’s work (Peirce is better), if only because I can “translate” him; but I never would be discussing Popper here were I not rather surprised to see him coming up in the writing of statisticians (of various inclinations). I want to spare them some dead ends, but most importantly, to get us past some straitjacketed ways of talking that have trickled down from logical empiricism. (Ironically, contemporary philosophers of science, especially in the U.S., almost never make use of Popper.)
Furthermore, I argue that we should reject the pattern of argument associated with Duhem’s problem in the first place (that first premise: if H & A1 & … & An, then O). If you think about it, I suspect you’ll agree: scientists, in any interesting test, do not try to form a conjunction of background theories and auxiliary hypotheses in order to derive a particular data set. (Think of something like going from the general theory of relativity (GTR) to predictions of the timing data in a particular pair of binary pulsars.) Duhemian problems are real, but this very way of putting the problem is silly and has caused much mischief. But that’s an issue for another time.
Note: I’ve said all this much more clearly and fully in published works, most of which are available through this page. For a very quick overview on this blog, one source is the post of Nov. 5, 2011 (but skip the first few paragraphs on how I couldn’t get my key to work in London).
Mayo, D. (2006). “Critical Rationalism and Its Failure to Withstand Critical Scrutiny,” in C. Cheyne and J. Worrall (eds.) Rationality and Reality: Conversations with Alan Musgrave, Kluwer Series Studies in the History and Philosophy of Science, Springer: The Netherlands: 63-99.
Popper, K. (1963). Conjectures and Refutations: The Growth of Scientific Knowledge, Routledge: London, New York.
Spanos, A. (2007). “Curve-Fitting, the Reliability of Inductive Inference and the Error-Statistical Approach,” Philosophy of Science 74(5): 1046-1066.
[i] That Lakatos departed still further from Popper does not mean that we need to go there as well.
[ii] I really do have a letter from Popper telling me he regrets never having learned modern statistics. This is when I sent him a 1990 paper on severity, asking whether this was not really what he had meant. I have an earlier letter, less interesting.