Phil 6334* Day #4: Mayo slides follow the comments below. (Make-up for Feb 13 snow day.) Popper reading is from Conjectures and Refutations.
As is typical in rereading any deep philosopher, I discover (or rediscover) different morsels of clues to understanding—whether fully intended by the philosopher, or a byproduct of their other insights and a more contemporary reading. So it is with Popper. A couple of key ideas emerged from Monday’s (make-up) class and the seminar discussion (my slides are below):
- Unlike the “naïve” empiricists of his day, Popper recognized that observations are not just given unproblematically; they require an interpretation, an interest, a point of view, a problem. Which came first, the hypothesis or the observation? Another hypothesis, if only at a lower level, says Popper. He draws the contrast with Wittgenstein’s “verificationism”. In typical positivist style, the verificationist sees observations as the given “atoms,” with other knowledge built up out of truth-functional operations on those atoms.[1] However, scientific generalizations beyond the given observations cannot be so deduced, hence the traditional philosophical problem of induction isn’t solvable. One is left trying to build a formal “inductive logic” (generally a deductive affair, ironically) thought to capture intuitions about scientific inference (a largely degenerating program). The formal probabilists, as well as philosophical Bayesians, may be seen as descendants of the logical positivists: instrumentalists, verificationists, operationalists (and the corresponding “isms”). So understanding Popper throws a lot of light on current-day philosophy of probability and statistics.
- The fact that observations must be interpreted opens the door to interpretations that prejudge the construal of data. With enough interpretive latitude, anything (or practically anything) that is observed can be interpreted as in sync with a general claim H. (Once your eyes were opened, you saw confirmations everywhere, as with a gestalt conversion, as Popper put it.) For Popper, positive instances of a general claim H, i.e., observations that agree with or “fit” H, do not even count as evidence for H if virtually any result could be interpreted as according with H.
Note a modification of Popper here: Instead of putting the “riskiness” on H itself, it is the method of assessment or testing that bears the burden of showing that something (ideally quite a lot) has been done in order to scrutinize the way the data were interpreted (to avoid “verification bias”). The scrutiny needs to ensure that it would be difficult (rather than easy) to get an accordance between data x and H (as strong as the one obtained) if H were false (or specifiably flawed).
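One way to make this demand explicit is to put it in symbols. The gloss and notation below are mine, not a formula from the slides; it is just the requirement of the previous paragraph restated:

```latex
% A gloss of the severity requirement (my notation, not from the post):
% data x from test T are good evidence for H only if
%   (i)  x accords with H, and
%   (ii) the test would probably not have produced so good an accordance
%        were H false, i.e.,
\[
\Pr\bigl(T \text{ yields an accordance with } H \text{ as strong as } x \mid H \text{ is false}\bigr)\ \text{is low.}
\]
```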
Note the second modification of Popper, which goes along with the first: It isn’t that GTR opened itself to literal “refutation” (as Popper says), because, even if GTR were true, a positive result could scarcely be said to follow, or even to have been expected, in 1919 (or long afterward). (Poor fits, at best, were expected.) So failing to find the “predicted” phenomenon (the Einstein deflection effect) would not have falsified GTR. There were too many background explanations for observed anomalies (Duhem’s problem). This is so even though observing a deflection effect does count! (This is one of my main shifts on Popper; or rather, I think Popperians make a mistake when they say otherwise.)
Of course, even when they observed a “deflection effect”—an apparently positive result—a lot of work was required to rule out any number of other explanations for the “positive” result (if interested, see refs[2]). Nor is there anything “unPopperian” about the fact that no eclipse result would have refuted GTR (certainly not in 1919). (Paul Meehl and other Popperians are wrong about this.) Admittedly, Popper was not clear enough on this issue. Nevertheless, and this is my main point today, he was right to distinguish the GTR testing case from the “testing” of the popular theories he describes, wherein any data could be interpreted in light of the theory. My reading of (or improvement on?) Popper, so far as this point goes, is that he is demarcating those empirical assessments or tests of a claim that are “scientific” (probative) from those that are “pseudoscientific” (insevere or questionable). To claim positive evidence for H from test T requires (minimally) indicating outcomes that would have been construed as evidence against H, or as counterinstances to it. The onus is on testers or interpreters of data to show how the charge of questionable science has been avoided.
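The contrast can be mimicked in a toy simulation. Everything here (the numbers, the “readings,” the function names) is my own illustrative assumption, not anything from the post or the eclipse tests: an insevere assessment gets to pick whichever of many post-hoc readings of the data “fits” H, while a severe one fixes the reading in advance. With H false throughout, the flexible method declares an accordance almost every time:

```python
import random

random.seed(1)

N_TRIALS = 10_000     # simulated repetitions of the whole "study"
N_READINGS = 20       # interpretive latitude: 20 post-hoc ways to read the data
P_CHANCE_FIT = 0.10   # chance any single reading "fits" H by luck (H is false)

def reading_fits_h() -> bool:
    """One reading of the data happens to 'fit' H, though H is false."""
    return random.random() < P_CHANCE_FIT

# Insevere assessment: declare an accordance with H if ANY available
# reading can be construed as fitting H.
insevere = sum(
    any(reading_fits_h() for _ in range(N_READINGS))
    for _ in range(N_TRIALS)
) / N_TRIALS

# Severe assessment: the reading is fixed before seeing the data; declare
# an accordance only if that one reading fits.
severe = sum(reading_fits_h() for _ in range(N_TRIALS)) / N_TRIALS

print(f"P(accordance with H | H false), flexible readings: {insevere:.2f}")  # ~0.88
print(f"P(accordance with H | H false), fixed reading:     {severe:.2f}")    # ~0.10
```

The particular numbers don’t matter; the point is that when practically any outcome can be construed as fitting H, an accordance with H is close to guaranteed even if H is false, so it is poor evidence for H.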
*For the revised Syllabus see: Second Installment. For Day #3 and Day #5 see: Day #3 and Day #5.
[1] The verificationist’s view of meaning: the meaning of a proposition is its method of verification. Popper contrasts his problem of demarcating science from non-science with this question of “meaning”. Were the verificationist’s account of meaning used as a principle of “demarcation”, it would be both too narrow and too wide (see Popper).
[2] For discussion of background theories in the early eclipse tests, see EGEK, chapter 8. For more contemporary experiments, see my discussion in Error and Inference.
Problem of Induction & some Notes on Popper
Thanks for continuing to post these slides.
I think your (essentially complete) shift, relative to Popper, from the riskiness of H to the severity of the tests of H is an important move.
My (amateur) reading of Popper is that he generally equated the “content” of a theory (which he argued is greater the more risky or less probable it is) with its “testability”, and generally gave more details on the former than the latter. This is despite the fact that in many cases he seemed to emphasise the latter as the fundamental principle and the former more as an equivalent way of elucidating his ideas.
Assuming this equivalence and focusing on content rather than testability seems to get in the way of moving from the degree of testability (i.e. content) of a hypothesis in principle, to the degree to which a hypothesis has actually been tested by particular tests, i.e. measures of the severity of the tests of the hypothesis. This latter concept, I take you to be arguing, is crucial for justifying an inductive step that Popper refused.
The value of moving from properties of hypotheses to properties of tests of hypotheses is a point that also doesn’t seem to be acknowledged by many Bayesians – a popular argument being that what scientists are “really” interested in is very probable hypotheses.
Personally, I prefer the goal of hypotheses which are justified by reliable methods (though I’m open-minded about the possible use of Bayesian-like methods/algorithms for e.g. model construction vs model testing, and maybe even alternative formulations of severity). I remember being at first surprised, then convinced, by the move from properties of hypotheses to properties of tests of hypotheses when reading Peirce – before I actually (tried to) read him I had thought he was a probabilist/Bayesian!
Omaclaren: Thanks for your interesting comment; I concur with nearly all of it, and I’m grateful to hear of people (outside the seminar) looking at the slides.
I think it’s true, as I’ve been hearing lately, that people have tended to tweet rather than comment on blogs. Even I have started that bad habit.
I’d like to get some seminar participants to comment on the blog at least a couple of times a month–oh, OK, extra credit given!
Twitter can be good, in small doses, & lots of interesting people use it, but it’s not well suited to long rambling comments like mine above – and that’s my 140 characters used up.
P.S. it would be great to read comments from seminar participants, if they’re willing (or sufficiently incentivised) to post them here!
I agree, and I hope they do post!
I find it interesting that statisticians often cite Popper and the importance of falsificationist reasoning to them. Oddly, some of these same statisticians purport not to see the relevance, to them, of the issues surrounding tests of theories such as GTR (which Popper contrasts with particular examples from political and social science).
In addition to revealing the nature of the “statistical falsification” they use, I think the points raised in my Popper blogpost would give them a clearer idea of how to articulate the vague notions of “questionable science”, “unreplicable results”, and “pseudoscience”–all returning as hot topics today.
Mayo:
As a statistician who cites Popper, I agree with you that these issues are relevant. I think you’re just underestimating the difficulty we have in reading philosophy papers. It’s a difference in language. Similarly, it seems to me that many philosophers have a horribly naive view of statistics, probably because reading chapter 1 of Bayesian Data Analysis is as difficult for them as reading philosophy papers is for me!
Indeed, when writing my papers on the philosophy of Bayesian statistics, I enlisted Shalizi as a coauthor precisely because he could communicate in both languages, statistics and philosophy. I’d read nothing newer than Lakatos and I thought it important to connect to the more recent philosophy literature.
Andrew: Thanks much for your comment. That was precisely the point of my post(s) on Popper (including some undergrad material recently)—what’s not clear? I’m serious. I do read both literatures on methodology, and your papers in particular, and I had you and other statistical practitioners in mind. So why the gap? I teach students from econ*, stat, CS, forestry, and many other fields, as well as, of course, philo. The non-philo students wrote excellent essays on Popper last week. So maybe I should post my mini-essay (3-page) assignment on Popper, if it’s not already up.
How does one raise the level of (philstat) discussion across disciplines?
*When I was 50% in econ for 4 (recent) years and taught “Philo of Sci and Econ Methodology,” the econ students felt especially empowered at having actually read Popper, Lakatos, and others, rather than just hearing popular econ takes on these guys.