Revisiting Popper’s Demarcation of Science 2017

28 July 1902 – 17 September 1994

Karl Popper died on September 17, 1994. One thing that gets revived in my new book (Statistical Inference as Severe Testing, 2018, CUP) is a Popperian demarcation of science vs. pseudoscience. Here’s a snippet from what I call a “live exhibit” (where the reader experiments with a subject) toward the end of a chapter on Popper:

Live Exhibit. Revisiting Popper’s Demarcation of Science: Here’s an experiment: Try shifting what Popper says about theories to a related claim about inquiries to find something out. To see what I have in mind, join me in watching a skit over the lunch break:

Physicist: “If mere logical falsifiability suffices for a theory to be scientific, then we can’t properly oust astrology from the scientific pantheon. Plenty of nutty theories have been falsified, so by definition they’re scientific. Moreover, scientists aren’t always looking to subject well-corroborated theories to ‘grave risk’ of falsification.”

Fellow traveler: “I’ve been thinking about this. On your first point, Popper confuses things by making it sound as if he’s asking: When is a theory unscientific? What he is actually asking, or should be asking, is: When is an inquiry into a theory, or an appraisal of claim H, unscientific? We want to distinguish meritorious modes of inquiry from those that are BENT. If the test methods enable ad hoc maneuvering and sneaky face-saving devices, then the inquiry–the handling and use of data–is unscientific. Despite being logically falsifiable, theories can be rendered immune from falsification by means of cavalier methods for their testing. Adhering to a falsified theory no matter what is poor science. On the other hand, some areas have so much noise that you can’t pinpoint what’s to blame for failed predictions. This is another way that inquiries become bad science.”

She continues:

“On your second point, it’s true that Popper talked of wanting to subject theories to grave risk of falsification. I suggest that it’s really our inquiries into, or tests of, the theories that we want to subject to grave risk. The onus is on interpreters of data to show how they are countering the charge of a poorly run test. I admit this is a modification of Popper. One could reframe the entire problem as one of the character of the inquiry or test.

In the context of trying to find something out, in addition to blocking inferences that fail the minimal requirement for severity[1]:

A scientific inquiry or test must be able to embark on a reliable inquiry to pinpoint blame for anomalies (and use the results to replace falsified claims and build a repertoire of errors).

The parenthetical remark isn’t absolutely required, but is a feature that greatly strengthens scientific credentials. Without solving, not merely embarking on, some Duhemian problems, there are no interesting falsifications. The ability or inability to pin down the source of failed replications–a familiar occupation these days–speaks to the scientific credentials of an inquiry. At any given time, there are anomalies whose sources haven’t been traced–unsolved Duhemian problems–generally at “higher” levels of the theory-data array. Embarking on solving these is the impetus for new conjectures. Checking test assumptions is part of working through the Duhemian maze. The reliability requirement is given by inferring claims just to the extent that they pass severe tests. There’s no sharp line for demarcation, but when these requirements are absent, an inquiry veers into the realm of questionable science or pseudoscience. Some physicists worry that highly theoretical realms can’t be expected to be constrained by empirical data. Theoretical constraints are also important.

[1] Before claiming to have evidence for claim C, something must have been done that would have found flaws in C, were C false. If a method is incapable of finding flaws in C, then finding none is poor grounds for inferring they are absent.
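To make the footnote’s minimal severity requirement concrete, here is a small simulation sketch in Python (my own illustration, not from the book or the post; the effect size, sample sizes, and the two-sample t-test are hypothetical stand-ins). It shows that a severely underpowered probe almost never finds the flaw even when the flaw is real, so a clean bill of health from such a method is poor grounds for inferring the flaw is absent:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_shift = 0.5    # claim C ("no difference") is false: the groups really differ by 0.5 SD
trials = 10_000

def no_flaw_rate(n):
    # Fraction of trials in which the test finds no flaw in C (p > 0.05).
    misses = 0
    for _ in range(trials):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_shift, 1.0, n)
        _, p = stats.ttest_ind(a, b)
        misses += p > 0.05
    return misses / trials

print(f"n=4:   no flaw found in {no_flaw_rate(4):.0%} of trials")
print(f"n=400: no flaw found in {no_flaw_rate(400):.0%} of trials")
# With n=4 the probe almost never detects the real discrepancy, so "finding no flaw"
# says little about C; with n=400 a clean result would rarely occur were C false.

The point mirrors the footnote: only the larger-sample probe would probably have found flaws in C were C false, so only its clean result carries evidential weight.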

Mayo, D. (2018). Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars. Cambridge: Cambridge University Press.



10 thoughts on “Revisiting Popper’s Demarcation of Science 2017”

  1. john byrd

    I think Polya had a better way with words than Popper, but expressed the same general thoughts, as in this 1954 quote:
    “Both the common man and the scientist are led to conjectures by a few observations and they are both paying attention to later cases which could be in agreement or not with the conjecture. A case in agreement makes the conjecture more likely, a conflicting one disproves it, and here the difference begins: Ordinary people are usually more apt to look for the first kind of cases, but the scientist looks for the second kind… Mr. Anybody does not like to confess, even to himself, that he was mistaken and so he does not like conflicting cases, he avoids them, he is even inclined to explain them away when they present themselves… The scientist, seeking a definitive decision, looks for cases which have a chance to upset the conjecture, and the more chance they have, the more they are welcome. There is an important point to observe. If a case which threatens to upset the conjecture turns out, after all, to be in agreement with it, the conjecture emerges greatly strengthened from the test. The more danger, the more honor; passing the most threatening examination grants the highest recognition, the strongest experimental evidence to the conjecture. There are instances and instances, verifications and verifications. An instance which is more likely to be conflicting brings the conjecture in any case nearer to decision than an instance which is less so, and this explains the preference of the scientist.”

    We don’t discuss Polya (that I recall), but he expressed the severity idea more clearly than Popper, in my opinion. Polya was a very clear writer.

    • Hi John: As you know, I don’t think Popper ever did express the severity idea adequately. For him it was theoretical novelty: x passes H severely if H entails x and no rival theory already explained x. Now the Polya passage makes him sound like a verificationist. In any event, the focus here was on characterizing demarcation. Maybe your point is that the scientist welcomes refutations, particularly in cases where agreement would be expected iff the claims were true, and that’s what makes the difference. But that alone won’t get you the constrained pinpointing of blame, the avoidance of flexible assignments. And Popper never could solve Duhem, so my proposal goes beyond Popper, but utilizes my interpretation of Kuhn. It’s in chapter 2 of EGEK, by the way. It’s a bit too late to be writing on the blog, but I noticed the comments.

      • john byrd

        Looking at Polya’s examples, I would not label him a verificationist. He does not use the word with a thick meaning. He is saying it is the preference of the scientist that marks the demarcation. He spent some effort to characterize the nature of that preference.

  2. Because it’s hard to respond to everything in such a dense post, I am going to stick with one bit.

    I see this as problematic from the perspective of a former astronomer: “On the other hand, some areas have so much noise that you can’t pinpoint what’s to blame for failed predictions. This is another way that inquiries become bad science.” How do you determine that a project is ‘bad science’ when it is hunting for rare events deep in systematic and statistical noise? Some projects have spent decades searching with no success while others just hit the jackpot. Can we learn anything about how to distinguish a floundering project from one that will eventually succeed based on the history of long-term projects?

    And it’s important to minimize the impact of survivor bias in the comparative analysis.

  3. west: Notice I said it only needed to be able to embark on a probe of anomalies. In so doing, in cases like those you describe, you need to admit you haven’t succeeded. That is, the fact that you know you haven’t got there, and have strictures that wouldn’t allow you to pretend otherwise, is enough to satisfy my requirement.
    What are your failed predictions? Merely having unexplained phenomena certainly doesn’t make an inquiry lose its scientific credentials.
    This comes from my interpretation of Tom Kuhn’s perceived opposition to Popper. His example was astrology: there are too many features (e.g., exact time of birth, star positions) that might explain failed predictions. It’s not that astrology is falsified, according to this reading of Kuhn; it’s that it’s not capable of correctly attributing the source of failure. Each of the different astrologers will pinpoint something to explain away any failure, based on their preferred astrological theory, but it’s not constrained. He contrasts this to astronomy.

  4. In relation to westbynoreaster, maybe it’s clearer to say, rather than:
    “On the other hand, some areas have so much noise that you can’t pinpoint what’s to blame for failed predictions. This is another way that inquiries become bad science.”

    “On the other hand, some areas have so much noise and/or flexibility that they can’t or won’t distinguish warranted from unwarranted explanations of failed predictions. This is another way that inquiries become bad science.”

    Of course that already fails the severity requirement.

    • Yeah, the statement still doesn’t sit right with me.

      25 years ago, some of the first papers on detection algorithms for LIGO data were published. Radiation from these kinds of sources had been predicted back in the 1950s. The entire community had to suffer through a serious false-detection controversy in the early 1970s and then endless papers of ‘upper limits’ rather than actually finding an event. That is, until two years ago.

      But from an outside perspective, could one say five years ago that the search for gravitational waves was any different from a WIMP search promising greatness once the next series of detector upgrades was complete? After so many years of “these signals are very rare and very quiet,” why was the NSF sticking with the project beyond sunk cost, and how is that different from other unsuccessful projects like proton decay?

      • westbynoreaster: First of all, answers to specific cases turn on having the specific background knowledge. My thought is, yes, you can tell. I’m not a physicist, but in the case of gravitational waves there were indirect indications, and my understanding (from Clifford Will) is that all viable relativistic theories of gravity predicted them (and some types were ruled out). So there was a strong theoretical basis and, presumably, the unclean background was cleaned up in the latest interferometers. I don’t know about WIMPs, but null results, I have to emphasize, count as successful inquiries into pinpointing sources. Not only do they learn a lot from them (in good sciences), but it’s quite different from pursuits whose inquiry couldn’t even come up with a clear, informative null result. Again, I’m not up on the physics here.

  5. I think Peirce would say that any struggle to pass from the irritation of doubt toward the settlement of belief is a form of inquiry — it’s just that some forms are more successful than others over the long haul.
    https://inquiryintoinquiry.com/2013/01/15/tenacity-authority-plausibility-inquiry/
