I want to shift to the arena of testing the adequacy of statistical models and misspecification testing (leading up to articles by Aris Spanos, Andrew Gelman, and David Hendry). But first, a couple of informal, philosophical mini-posts, if only to clarify terms we will need (each has a mini test at the end).
1. How do we obtain Knowledge, and how can we get more of it?
Few people doubt that science is successful and that it makes progress. This remains true for the philosopher of science, despite her tendency to skepticism. By contrast, most of us think we know a lot of things, and that science is one of our best ways of acquiring knowledge. But how do we justify our lack of skepticism? Any adequate account of the success of science has to square with the fact of limited data, with unobserved and unobservable phenomena, with theories underdetermined by data, and with the slings and arrows of the threat of error. Intent on supplying such an account, I’m drawn not toward some ideal form of knowledge “deep down,” nor toward some perfectly rational agent, but simply toward illuminating how in fact we get the kinds of knowledge we manage to obtain—and how to get more of it!
As such, the centrally relevant question is: How do we learn about the world despite threats of error?
2. Inductive inference as “Evidence Transcending”
The risk of error enters because we want to find things out (make claims or take action) based on limited information. When we move beyond the data to claims that are “evidence transcending,” the argument, strictly speaking, is inductive. The premises can be true while the conclusion inferred may be false, without a logical contradiction. Conceiving of inductive inference, very generally, as “evidence-transcending” or “ampliative” reasoning frees us to talk about induction without presupposing certain special forms it can take. Notably, while mathematical probability arises in inductive inference, there are two rival positions on its role:
· to quantify the degree of confidence, belief, or support to assign to a hypothesis or claim (given data x); and
· to quantify how reliably probed, well-tested, or corroborated a claim is (given data x).
Note: Feb. 2014: I now distinguish a better-known third category under this umbrella: to quantify the long-run error rates of a method (performance). Tools with good performance can be used for probativeness, but tools with good performance do not automatically serve this end. Even when they do, a different justification and interpretation is required. I had often assumed that, in science at least, people understood that tools with good long-run performance can serve this probative goal, but I’d temporarily overlooked how literally some people take crude long-run behavior justifications.
This contrast is at the heart of a philosophical scrutiny of statistical accounts.
Confusion about induction and the threat of the so-called philosophical problem of induction have made some people afraid to use the word, even statisticians, who could in fact be teaching philosophers about it. Such fears, however, are unwarranted once induction is properly understood. More important, even those who claim to restrict themselves to variations on “deductive” falsification must warrant their premises empirically, as that arch falsificationist Karl Popper knew only too well.
3. Popper, Probabilism and Severe Tests
Since Popper keeps popping up in the statistical literature (e.g., in Stephen Senn, Andrew Gelman), let me try without philosophical fanfare to say something about him. In one sense Popper was a skeptic: he didn’t think we could justify hypotheses as either true or probably true—he rejected “probabilism” (also called “inductivism,” which is confusing). Yet he still thought science was successful and that there were rational methods of science.
Consider the two (3) views of probability just given. The first view, that probability arises to assign degrees of belief, truth, or support to hypotheses, goes hand in hand with the conception that claims are warranted by being either true or probable. Popperians also call this probabilism or justificationism, and we can retain those terms. When it is said that Popper was an anti-inductivist, what is really meant is that he rejected probabilism. He did not reject the idea that it was possible to warrant evidence-transcending claims. Instead, he required that evidence-transcending claims be accepted (or preferred, or inferred) only if they have been subjected to, and have passed, stringent tests. Probability, accordingly, arose in the second sense. Here, probability is best seen as characterizing the properties of a method or rule, which we may call a method of testing.
For example, a good testing rule might be one that with high probability would falsify a hypothesis H if false, but not otherwise. For Popper, a hypothesis was well corroborated only to the extent that it passed a severe attempt to falsify it.
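To make the idea of a testing rule concrete, here is a minimal simulation sketch. Everything specific in it is my own assumption for illustration, not something from Popper: the hypothesis H is taken to be "the mean of a Normal population with known spread 1 is at most 0," the rule falsifies H when the standardized sample mean exceeds a one-sided 5% cutoff, and the names `rejects_h` and `falsification_rate` are hypothetical.

```python
import random
from statistics import NormalDist

def rejects_h(sample, mu0=0.0, sigma=1.0, alpha=0.05):
    """Testing rule (illustrative): falsify H (mu <= mu0) if the sample mean is too large."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / n ** 0.5)
    return z > NormalDist().inv_cdf(1 - alpha)

def falsification_rate(true_mu, n=25, trials=5000, seed=0):
    """Estimate how often the rule falsifies H when data come from Normal(true_mu, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, 1.0) for _ in range(n)]
        hits += rejects_h(sample)
    return hits / trials

# H true (mu = 0): the rule should rarely falsify H.
rate_when_h_true = falsification_rate(true_mu=0.0)
# H false (mu = 0.8, an assumed discrepancy): the rule should usually falsify H.
rate_when_h_false = falsification_rate(true_mu=0.8)

print(f"falsifies H when H is true:  {rate_when_h_true:.3f}")
print(f"falsifies H when H is false: {rate_when_h_false:.3f}")
```

Under these assumptions the rule falsifies H only rarely when H is true and very often when H is false by the assumed amount. The probabilities attach to the rule, not to H.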
4. The Wedge between Skepticism and Irrationalism
This focus on using probability to qualify testing methods (rather than claims) is the key that allowed Popper to “drive a wedge between skepticism and irrationalism” (e.g., Musgrave 1999, p. 322). We can be skeptics about inductively inferring that H is probable (we can reject probabilism outright) while still allowing warranted inferences to H. We need only have rational testing methods or rules.
We can allow “It is warranted to infer H” to be open to including any number of epistemological stances: it is warranted to infer (believe, accept, prefer, act in accordance with) H.
Popperians oust probabilism but retain rationality by defining rationality as following a rational method. A rational method is one that infers (or claims to have evidence for) H only to the extent that H has passed a severe test: a test which would have, with reasonably high probability, detected a flaw in H, were it present. [i]
Popper often viewed hypotheses as “solving problems,” where the problem could be construed very generally. So we might say that an irrational method is one that would, with high probability, declare our problem solved, when in fact it was not (or not solved to a degree specified). Such a method would too readily declare the problem solved erroneously.
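One way to picture a method that too readily declares a problem solved is a "keep looking until something works" procedure. The sketch below is my own illustration under stated assumptions, not an example from Popper: the "problem" is taken to be showing that a Normal mean is above 0, each look draws fresh data, and the method declares success if any of up to `max_looks` attempts reaches a nominal 5% cutoff. The function names are hypothetical.

```python
import random
from statistics import NormalDist

CUTOFF = NormalDist().inv_cdf(0.95)  # nominal one-sided 5% threshold

def looks_solved(rng, true_mu, n=25):
    """One look at fresh data: does the standardized mean reach the nominal cutoff?"""
    sample = [rng.gauss(true_mu, 1.0) for _ in range(n)]
    return (sum(sample) / n) * n ** 0.5 > CUTOFF

def declares_solved(rng, true_mu, max_looks):
    """Declare the problem solved if ANY of up to max_looks attempts succeeds."""
    return any(looks_solved(rng, true_mu) for _ in range(max_looks))

def erroneous_solved_rate(max_looks, trials=2000, seed=1):
    """How often the method declares 'solved' when there is in fact no effect (mu = 0)."""
    rng = random.Random(seed)
    wins = sum(declares_solved(rng, 0.0, max_looks) for _ in range(trials))
    return wins / trials

single_look = erroneous_solved_rate(max_looks=1)
twenty_looks = erroneous_solved_rate(max_looks=20)

print(f"declares 'solved' in error, 1 look:   {single_look:.3f}")
print(f"declares 'solved' in error, 20 looks: {twenty_looks:.3f}")
```

With a single look the erroneous "solved" rate should sit near the nominal 0.05. With twenty independent looks it should be near 1 - 0.95^20, roughly 0.64. The second method would declare the problem solved more often than not even though nothing has been solved, which is the sense of irrationality at issue.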
Popperians escape the need to come up with a logic of (or methods for) confirmation—but they still need methods for severe tests, and ways to evaluate tests on their error-probing ability. Can this be achieved? Stay tuned.
_______________________________________________
Mini-Test:
- What are the two uses of probability in inference?
- How does Popper drive a “wedge” between skepticism and irrationality?
- What is probabilism (or justificationism)?
- What would be an irrational method for solving a problem?
See parts 2 and 3.
____________________________________________
Lakatos, I. 1978. The Methodology of Scientific Research Programmes. Edited by J. Worrall and G. Currie. Vol. 1 of Philosophical Papers. Cambridge: Cambridge University Press.
Mayo, D. 1996. Error and the Growth of Experimental Knowledge. Chicago: University of Chicago Press.
Mayo, D. 2011. “Statistical Science and Philosophy of Science: Where Do/Should They Meet in 2011 (and Beyond)?” Rationality, Markets and Morals (RMM) Vol. 2: 79-109.
Musgrave, A. 1999. Essays in Realism and Rationalism. Amsterdam and Atlanta, GA: Rodopi.
Popper, K. 1959. The Logic of Scientific Discovery. New York: Basic Books.
Popper, K. 1983. Realism and the Aim of Science. NJ: Rowman and Littlefield.
“Mere supporting instances are as a rule too cheap to be worth having…any support capable of carrying weight can only rest upon ingenious tests, undertaken with the aim of refuting our hypothesis, if it can be refuted” (Popper 1983, 30).
Among the synonymous terms ‘accept’, ‘infer’, ‘believe’, ‘prefer’, ‘act in accordance with’, the first three have informal connotations that make them vulnerable to misuse. Similarly, replacing ‘induction’ with ‘conjecture’ makes your case with greater clarity, especially for a wider audience. ‘Conjectural preference’ is indeed evidence-transcending, whereas one can expect ‘inductive inference’ to generate confusion.
Thanks Paul. I have said much, much more about these different terms elsewhere, and I realize the great danger in trying to take on something like this in a tiny blogpost, when the issue actually demands a lot of nuance. I’m doing it, nevertheless. Not out of recklessness, but out of a need to communicate, to avoid some common confusions, and to set the stage for a clear use of perfectly ordinary terms (including terms Popperians might want us to banish!). When I make inductive inferences from severe tests, they will not be well captured by calling them conjectural preferences, which in any event is comparative (unlike severity). All I can say just now is to look at my longer books and papers for the nuance and related arguments (at least 3 or 4 are specifically on Popper).