Excursion 2 Tour II (3rd stop): Falsification, Pseudoscience, Induction (2.3)

Posted on October 10, 2018 by Mayo

StatSci/PhilSci Museum

Where you are in the Journey* We’ll move from the philosophical ground ﬂoor to connecting themes from other levels, from Popperian falsiﬁcation to signiﬁcance tests, and from Popper’s demarcation to current-day problems of pseudoscience and irreplication. An excerpt from our Museum Guide gives a broad-brush sketch of the ﬁrst few sections of Tour II:

Karl Popper had a brilliant way to “solve” the problem of induction: Hume was right that enumerative induction is unjustiﬁed, but science is a matter of deductive falsiﬁcation. Science was to be demarcated from pseudoscience according to whether its theories were testable and falsiﬁable. A hypothesis is deemed severely tested if it survives a stringent attempt to falsify it. Popper’s critics denied he could sustain this and still be a deductivist …

Popperian falsiﬁcation is often seen as akin to Fisher’s view that “every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis” (1935a, p. 16). Though scientists often appeal to Popper, some critics of signiﬁcance tests argue that they are used in decidedly non-Popperian ways. Tour II explores this controversy.

While Popper didn’t make good on his most winning slogans, he gives us many seminal launching-oﬀ points for improved accounts of falsiﬁcation, corroboration, science versus pseudoscience, and the role of novel evidence and predesignation. These will let you revisit some thorny issues in today’s statistical crisis in science.

2.3 Popper, Severity, and Methodological Probability

Here’s Popper’s summary (drawing from Popper, Conjectures and Refutations, 1962, p. 53):

[Enumerative] induction … is a myth. It is neither a psychological fact … nor one of scientific procedure.

The actual procedure of science is to operate with conjectures…

Repeated observation and experiments function in science as tests of our conjectures or hypotheses, i.e., as attempted refutations.

[It is wrongly believed that using the inductive method can] serve as a criterion of demarcation between science and pseudoscience. … None of this is altered in the least if we say that induction makes theories only probable.

There are four key, interrelated themes:

(1) Science and Pseudoscience. Redeﬁning scientiﬁc method gave Popper a new basis for demarcating genuine science from questionable science or pseudoscience. Flexible theories that are easy to conﬁrm – theories of Marx, Freud, and Adler were his exemplars – where you open your eyes and ﬁnd conﬁrmations everywhere, are low on the scientiﬁc totem pole (ibid., p. 35). For a theory to be scientiﬁc it must be testable and falsiﬁable.

(2) Conjecture and Refutation. The problem of induction is a problem only if it depends on an unjustiﬁable procedure such as enumerative induction. Popper shocked everyone by denying scientists were in the habit of inductively enumerating. It doesn’t even hold up on logical grounds. To talk of “another instance of an A that is a B” assumes a conceptual classiﬁcation scheme. How else do we recognize it as another item under the umbrellas A and B? (ibid., p. 44). You can’t just observe, you need an interest, a point of view, a problem.

The actual procedure for learning in science is to operate with conjectures in which we then try to ﬁnd weak spots and ﬂaws. Deductive logic is needed to draw out the remote logical consequences that we actually have a shot at testing (ibid., p. 51). From the scientist down to the amoeba, says Popper, we learn by trial and error: conjecture and refutation (ibid., p. 52). The crucial diﬀerence is the extent to which we constructively learn how to reorient ourselves after clashes.

Without waiting, passively, for repetitions to impress or impose regularities upon us, we actively try to impose regularities upon the world… These may have to be discarded later, should observation show that they are wrong. (ibid., p. 46)

(3) Observations Are Not Given. Popper rejected the time-honored empiricist assumption that observations are known relatively unproblematically. If they are at the “foundation,” it is only because there are apt methods for testing their validity. We dub claims observable because or to the extent that they are open to stringent checks. (Popper: “anyone who has learned the relevant technique can test it” (1959, p. 99).) Accounts of hypothesis appraisal that start with “evidence x,” as in conﬁrmation logics, vastly oversimplify the role of data in learning.

(4) Corroboration Not Conﬁrmation, Severity Not Probabilism. Last, there is his radical view on the role of probability in scientiﬁc inference. Rejecting probabilism, Popper not only rejects Carnap-style logics of conﬁrmation, he denies scientists are interested in highly probable hypotheses (in any sense). They seek bold, informative, interesting conjectures and ingenious and severe attempts to refute them. If one uses a logical notion of probability, as philosophers (including Popper) did at the time, the high content theories are highly improbable; in fact, Popper said universal theories have 0 probability. (Popper also talked of statistical probabilities as propensities.)

These themes are in the spirit of the error statistician. Considerable spade-work is required to see what to keep and what to revise, so bring along your archeological shovels.

Demarcation and Investigating Bad Science

There is a reason that statisticians and scientists often refer back to Popper; his basic philosophy – at least his most winning slogans – is in sync with ordinary intuitions about good scientific practice. Even people divorced from Popper’s full philosophy wind up going back to him when they need to demarcate science from pseudoscience. Popper’s right that if using enumerative induction makes you scientific then anyone from an astrologer to one who blithely moves from observed associations to full-blown theories is scientific. Yet the criterion of testability and falsifiability – as typically understood – is nearly as bad. It is both too strong and too weak. Any crazy theory found false would be scientific, and our most impressive theories aren’t deductively falsifiable. Larry Laudan’s famous (1983) “The Demise of the Demarcation Problem” declared the problem taboo. This is a highly unsatisfactory situation for philosophers of science. Now Laudan and I generally see eye to eye; perhaps our disagreement here is just semantics. I share his view that what really matters is determining if a hypothesis is warranted or not, rather than whether the theory is “scientific,” but surely Popper didn’t mean logical falsifiability sufficed. Popper is clear that many unscientific theories (e.g., Marxism, astrology) are falsifiable. It’s clinging to falsified theories that leads to unscientific practices. (Note: The use of a strictly falsified theory for prediction, or because nothing better is available, isn’t unscientific.) I say that, with a bit of fine-tuning, we can retain the essence of Popper to capture what makes an inquiry, if not an entire domain, scientific.

Following Laudan, philosophers tend to shy away from saying anything general about science versus pseudoscience – the predominant view is that there is no such thing. Some say that there’s at most a kind of “family resemblance” amongst domains people tend to consider scientific (Dupré 1993, Pigliucci 2010, 2013). One gets the impression that the demarcation task is being left to committees investigating allegations of poor science or fraud. They are forced to articulate what to count as fraud, as bad statistics, or as mere questionable research practices (QRPs). People’s careers depend on their ruling: they have “skin in the game,” as Nassim Nicholas Taleb might say (2018). The best one I know – the committee investigating fraudster Diederik Stapel – advises making philosophy of science a requirement for researchers (Levelt Committee, Noort Committee, and Drenth Committee 2012). So let’s not tell them philosophers have given up on it.

Diederik Stapel. A prominent social psychologist “was found to have committed a serious infringement of scientiﬁc integrity by using ﬁctitious data in his publications” (Levelt Committee 2012, p. 7). He was required to retract 58 papers, relinquish his university degree and much else. The authors of the report describe walking into a culture of conﬁrmation and veriﬁcation bias. They could scarcely believe their ears when people they interviewed “defended the serious and less serious violations of proper scientiﬁc method with the words: that is what I have learned in practice; everyone in my research environment does the same, and so does everyone we talk to at international conferences” (ibid., p. 48). Free of the qualms that give philosophers of science cold feet, they advance some obvious yet crucially important rules with Popperian echoes:

One of the most fundamental rules of scientiﬁc research is that an investigation must be designed in such a way that facts that might refute the research hypotheses are given at least an equal chance of emerging as do facts that conﬁrm the research hypotheses. Violations of this fundamental rule, such as continuing to repeat an experiment until it works as desired, or excluding unwelcome experimental subjects or results, inevitably tend to conﬁrm the researcher’s research hypotheses, and essentially render the hypotheses immune to the facts. (ibid., p. 48)

Exactly! This is our minimal requirement for evidence: If it’s so easy to ﬁnd agreement with a pet theory or claim, such agreement is bad evidence, no test, BENT. To scrutinize the scientiﬁc credentials of an inquiry is to determine if there was a serious attempt to detect and report errors and biasing selection eﬀects. We’ll meet Stapel again when we reach the temporary installation on the upper level: The Replication Revolution in Psychology.
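The violation the committee names, repeating an experiment until it works as desired, can be put in numbers. What follows is only a minimal sketch under assumptions that are not in the report: each attempt is an independent test of a true "no effect" hypothesis, "works" means reaching a nominal 0.05 significance level, and the function names are hypothetical.

```python
import random


def chance_of_some_success(alpha, max_tries):
    """Analytic chance that at least one of max_tries independent
    attempts reaches significance when there is no real effect."""
    return 1.0 - (1.0 - alpha) ** max_tries


def simulate_repeat_until_it_works(alpha, max_tries, n_sims=20000, seed=0):
    """Monte Carlo version: rerun the 'experiment' up to max_tries times
    and stop at the first nominally significant result."""
    rng = random.Random(seed)
    successes = 0
    for _sim in range(n_sims):
        for _attempt in range(max_tries):
            # Assumption: under a true null the p-value is uniform on [0, 1),
            # so each attempt is "significant" with probability alpha.
            if rng.random() < alpha:
                successes += 1
                break
    return successes / n_sims


if __name__ == "__main__":
    for tries in (1, 5, 20):
        print(tries,
              round(chance_of_some_success(0.05, tries), 3),
              simulate_repeat_until_it_works(0.05, tries))
```

With a single attempt the chance of a spurious "success" is the nominal 0.05; allowing up to 20 attempts raises it to about 0.64. Agreement that is this easy to obtain when the claim is false is exactly what the minimal requirement counts as bad evidence, no test.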

The issue of demarcation (point (1)) is closely related to Popper’s conjecture and refutation (point (2)). While he regards a degree of dogmatism to be necessary before giving theories up too readily, the trial and error methodology “gives us a chance to survive the elimination of an inadequate hypothesis – when a more dogmatic attitude would eliminate it by eliminating us” (Popper 1962, p. 52). Despite giving lip service to testing and falsiﬁcation, many popular accounts of statistical inference do not embody falsiﬁcation – even of a statistical sort.

Nearly everyone, however, now accepts point (3), that observations are not just “given” – knocking out a crucial pillar on which naïve empiricism stood. To the question: What came first, hypothesis or observation? Popper answers, another hypothesis, only lower down or more local. Do we get an infinite regress? No, because we may go back to increasingly primitive theories and even, Popper thinks, to an inborn propensity to search for and find regularities (ibid., p. 47). I’ve read about studies appearing to show that babies are aware of what is statistically unusual. In one, babies were shown a box with a large majority of red versus white balls (Xu and Garcia 2008, Gopnik 2009). When a succession of white balls is drawn, one after another, with the contents of the box covered with a screen, the babies looked longer than when the more common red balls were drawn. I don’t find this far-fetched. Anyone familiar with preschool computer games knows how far toddlers can get in solving problems without a single word, just by trial and error.

Greater Content, Greater Severity. The position people are most likely to take a pass on is (4), his view of the role of probability. Yet Popper’s central intuition is correct: if we wanted highly probable claims, scientists would stick to low-level observables and not seek generalizations, much less theories with high explanatory content. In this day of fascination with Big Data’s ability to predict what book I’ll buy next, a healthy Popperian reminder is due: humans also want to understand and to explain. We want bold “improbable” theories. I’m a little puzzled when I hear leading machine learners praise Popper, a realist, while proclaiming themselves fervid instrumentalists. That is, they hold the view that theories, rather than aiming at truth, are just instruments for organizing and predicting observable facts. It follows from the success of machine learning, Vladimir Cherkassky avers, that “realism is not possible.” This is very quick philosophy! “… [I]n machine learning we are given a set of [random] data samples, and the goal is to select the best model (function, hypothesis) from a given set of possible models” (Cherkassky 2012). Fine, but is the background knowledge required for this setup itself reducible to a prediction–classification problem? I say no, as would Popper. Even if Cherkassky’s problem is relatively theory free, it wouldn’t follow this is true for all of science. Some of the most impressive “deep learning” results in AI have been criticized for lacking the ability to generalize beyond observed “training” samples, or to solve open-ended problems (Gary Marcus 2018).

A valuable idea to take from Popper is that probability in learning attaches to a method of conjecture and refutation, that is to testing: it is methodological probability. An error probability is a special case of a methodological probability. We want methods with a high probability of teaching us (and machines) how to distinguish approximately correct and incorrect interpretations of data, even leaving murky cases in the middle, and how to advance knowledge of detectable, while strictly unobservable, eﬀects.

The choices for probability that we are commonly offered are stark: “in here” (beliefs ascertained by introspection) or “out there” (frequencies in long runs, or chance mechanisms). This is the “epistemology” versus “variability” shoehorn we reject (Souvenir D). To qualify the method by which H was tested, frequentist performance is necessary, but it’s not sufficient. The assessment must be relevant to ensuring that claims have been put to severe tests. You can talk of a test having a type of propensity or capability to have discerned flaws, as Popper did at times. A highly explanatory, high-content theory, with interconnected tentacles, has a higher probability of having flaws discerned than low-content theories that do not rule out as much. Thus, when the bolder, higher content, theory stands up to testing, it may earn higher overall severity than the one with measly content. That a theory is plausible is of little interest, in and of itself; what matters is that it is implausible for it to have passed these tests were it false or incapable of adequately solving its set of problems. It is the fuller, unifying, theory developed in the course of solving interconnected problems that enables severe tests.

Methodological probability is not to quantify my beliefs, but neither is it about a world I came across without considerable eﬀort to beat nature into line. Let alone is it about a world-in-itself which, by deﬁnition, can’t be accessed by us. Deliberate eﬀort and ingenuity are what allow me to ensure I shall come up against a brick wall, and be forced to reorient myself, at least with reasonable probability, when I test a ﬂawed conjecture. The capabilities of my tools to uncover mistaken claims (its error probabilities) are real properties of the tools. Still, they are my tools, specially and imaginatively designed. If people say they’ve made so many judgment calls in building the inferential apparatus that what’s learned cannot be objective, I suggest they go back and work some more at their experimental design, or develop better theories.

Falsiﬁcation Is Rarely Deductive. It is rare for any interesting scientiﬁc hypotheses to be logically falsiﬁable. This might seem surprising given all the applause heaped on falsiﬁability. For a scientiﬁc hypothesis H to be deductively falsiﬁed, it would be required that some observable result taken together with H yields a logical contradiction (A & ~A). But the only theories that deductively prohibit observations are of the sort one mainly ﬁnds in philosophy books: All swans are white is falsiﬁed by a single non-white swan. There are some statistical claims and contexts, I will argue, where it’s possible to achieve or come close to deductive falsiﬁcation: claims such as, these data are independent and identically distributed (IID). Going beyond a mere denial to replacing them requires more work.

However, interesting claims about mechanisms and causal generalizations require numerous assumptions (substantive and statistical) and are rarely open to deductive falsiﬁcation. How then can good science be all about falsiﬁability? The answer is that we can erect reliable rules for falsifying claims with severity. We corroborate their denials. If your statistical account denies we can reliably falsify interesting theories, it is irrelevant to real-world knowledge. Let me draw your attention to an exhibit on a strange disease, kuru, and how it falsiﬁed a fundamental dogma of biology.

Exhibit (v): Kuru. Kuru (which means “shaking”) was widespread among the Fore people of New Guinea in the 1960s. In around 3–6 months, Kuru victims go from having difficulty walking, to outbursts of laughter, to inability to swallow and death. Kuru, and (what we now know to be) related diseases, e.g., mad cow, Creutzfeldt–Jakob, and scrapie, are “spongiform” diseases, causing brains to appear spongy. Kuru clustered in families, in particular among Fore women and their children, or elderly parents. Researchers began to suspect transmission was through mortuary cannibalism. Consuming the brains of loved ones, a way of honoring the dead, was also a main source of meat permitted to women. Some say men got first dibs on the muscle; others deny men partook in these funerary practices. What we know is that ending these cannibalistic practices all but eradicated the disease. No one expected at the time that understanding kuru’s cause would falsify an established theory that only viruses and bacteria could be infectious. This “central dogma of biology” says:

H: All infectious agents have nucleic acid.

Any infectious agent free of nucleic acid would be anomalous for H – meaning it goes against what H claims. A separate step is required to decide when H’s anomalies should count as falsifying H. There needn’t be a cut-oﬀ so much as a standpoint as to when continuing to defend H becomes bad science. Prion researchers weren’t looking to test the central dogma of biology, but to understand kuru and related diseases. The anomaly erupted only because kuru appeared to be transmitted by a protein alone, by changing a normal protein shape into an abnormal fold. Stanley Prusiner called the infectious protein a prion – for which he received much grief. He thought, at ﬁrst, he’d made a mistake “and was puzzled when the data kept telling me that our preparations contained protein but not nucleic acid” (Prusiner 1997). The anomalous results would not go away and, eventually, were demonstrated via experimental transmission to animals. The discovery of prions led to a “revolution” in molecular biology, and Prusiner received a Nobel prize in 1997. It is logically possible that nucleic acid is somehow involved. But continuing to block the falsiﬁcation of H (i.e., block the “protein only” hypothesis) precludes learning more about prion diseases, which now include Alzheimer’s. (See Mayo 2014a.)

Insofar as we falsify general scientiﬁc claims, we are all methodological falsiﬁcationists. Some people say, “I know my models are false, so I’m done with the job of falsifying before I even begin.” Really? That’s not falsifying. Let’s look at your method: always infer that H is false, fails to solve its intended problem. Then you’re bound to infer this even when this is erroneous. Your method fails the minimal severity requirement.
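To see why "always infer that H is false" fails the minimal severity requirement, it helps to check how often such a method declares H false when H is in fact true. The sketch below is illustrative only; the specific H (a Normal mean of 0 with known standard deviation 1), the sample size, the 2-standard-error cutoff, and all function names are assumptions, not part of the text.

```python
import random


def always_falsify(data):
    """The method criticized above: declare H false whatever the data."""
    return True


def cutoff_rule(data, mu0=0.0, sigma=1.0, z_cut=2.0):
    """A simple methodological rule: declare H false only if the sample
    mean lies more than z_cut standard errors from mu0."""
    n = len(data)
    sample_mean = sum(data) / n
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    return abs(z) > z_cut


def erroneous_falsification_rate(rule, n_sims=5000, n=25, seed=1):
    """Relative frequency with which `rule` declares H false when the
    data really are generated under H (mean 0, sd 1)."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_sims):
        data = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if rule(data):
            errors += 1
    return errors / n_sims


if __name__ == "__main__":
    print("always falsify:", erroneous_falsification_rate(always_falsify))
    print("cutoff rule:   ", erroneous_falsification_rate(cutoff_rule))
```

The first method errs every single time H is true, so its verdict "H is false" tells us nothing about H. The cutoff rule errs only in a small fraction of cases, which is what gives its falsifications some bite.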

Do Probabilists Falsify? It isn’t obvious a probabilist desires to falsify, rather than supply a probability measure indicating disconfirmation, the opposite of a B-boost (a B-bust?), or a low posterior. Members of some probabilist tribes propose that Popper is subsumed under a Bayesian account by taking a low value of Pr(x|H) to falsify H. That could not work. Individual outcomes described in detail will easily have very small probabilities under H without being genuine anomalies for H. To the severe tester, this is an attempt to distract from the inability of probabilists to falsify, insofar as they remain probabilists. What about comparative accounts (Likelihoodists or Bayes factor accounts), which I also place under probabilism? Reporting that one hypothesis is more likely than the other is not to falsify anything. Royall is clear that it’s wrong to even take the comparative report as evidence against one of the two hypotheses: they are not exhaustive. (Nothing turns on whether you prefer to put Likelihoodism under its own category.) Must all such accounts abandon the ability to falsify? No, they can indirectly falsify hypotheses by adding a methodological falsification rule. A natural candidate is to falsify H if its posterior probability is sufficiently low (or, perhaps, sufficiently disconfirmed). Of course, they’d need to justify the rule, ensuring it wasn’t often mistaken.
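The remark that outcomes described in detail have tiny probabilities under H without being anomalies can be made concrete with a toy case. Assume, purely for illustration, that H says a coin is fair and x is the exact sequence of 100 tosses; the helper names below are hypothetical. The sketch compares Pr(x|H) for a thoroughly unremarkable sequence with a tail-area probability for the same data.

```python
from fractions import Fraction
from math import comb


def sequence_probability(sequence, p_heads=Fraction(1, 2)):
    """Pr(x|H) for one fully specified sequence of 'H'/'T' tosses."""
    prob = Fraction(1)
    for toss in sequence:
        prob *= p_heads if toss == "H" else 1 - p_heads
    return prob


def two_sided_tail_probability(n, heads):
    """Probability, under a fair coin, of a head count at least as far
    from n/2 as the observed count (in either direction)."""
    deviation = abs(2 * heads - n)
    favorable = sum(comb(n, k) for k in range(n + 1)
                    if abs(2 * k - n) >= deviation)
    return Fraction(favorable, 2 ** n)


if __name__ == "__main__":
    typical = "HT" * 50  # 50 heads in 100 tosses: as unsurprising as it gets
    print(float(sequence_probability(typical)))        # 2**-100, astronomically small
    print(float(two_sided_tail_probability(100, 50)))  # 1.0: no anomaly at all
```

Every one of the possible sequences has probability 2^-100 under H, so a rule that falsifies H whenever Pr(x|H) is low would falsify it no matter what was observed. A tail-area assessment, by contrast, singles out only outcomes that are genuinely discordant with H.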

The Popperian (Methodological) Falsiﬁcationist Is an Error Statistician

When is a statistical hypothesis to count as falsiﬁed? Although extremely rare events may occur, Popper notes:

such occurrences would not be physical effects, because, on account of their immense improbability, they are not reproducible at will … If, however, we find reproducible deviations from a macro effect … deduced from a probability estimate … then we must assume that the probability estimate is falsified. (Popper 1959, p. 203)

In the same vein, we heard Fisher deny that an “isolated record” of statistically signiﬁcant results suﬃces to warrant a reproducible or genuine eﬀect (Fisher 1935a, p. 14). Early on, Popper (1959) bases his statistical falsifying rules on Fisher, though citations are rare. Even where a scientiﬁc hypothesis is thought to be deterministic, inaccuracies and knowledge gaps involve error-laden predictions; so our methodological rules typically involve inferring a statistical hypothesis. Popper calls it a falsifying hypothesis. It’s a hypothesis inferred in order to falsify some other claim. A ﬁrst step is often to infer an anomaly is real, by falsifying a “due to chance” hypothesis.

The recognition that we need methodological rules to warrant falsiﬁcation led Popperian Imre Lakatos to dub Popper’s philosophy “methodological falsiﬁcationism” (Lakatos 1970, p. 106). If you look at this footnote, where Lakatos often buried gems, you read about “the philosophical basis of some of the most interesting developments in modern statistics. The Neyman–Pearson approach rests completely on methodological falsiﬁcationism” (ibid., p. 109, note 6). Still, neither he nor Popper made explicit use of N-P tests. Statistical hypotheses are the perfect tool for “falsifying hypotheses.” However, this means you can’t be a falsiﬁcationist and remain a strict deductivist. When statisticians (e.g., Gelman 2011) claim they are deductivists like Popper, I take it they mean they favor a testing account like Popper, rather than inductively building up probabilities. The falsifying hypotheses that are integral for Popper also necessitate an evidence-transcending (inductive) statistical inference.

This is hugely problematic for Popper because being a strict Popperian means never having to justify a claim as true or a method as reliable. After all, this was part of Popper’s escape from induction. The problem is this: Popper’s account rests on severe tests, tests that would probably falsify claims if false, but he cannot warrant saying a method is probative or severe, because that would mean it was reliable, which makes Popperians squeamish. It would appear to concede to his critics that Popper has a “whiﬀ of induction” after all. But it’s not inductive enumeration. Error statistical methods (whether from statistics or informal) can supply the severe tests Popper sought. This leads us to Pierre Duhem, physicist and philosopher of science.

To read ‘Duhemian Problems of Falsiﬁcation’, and souvenirs E and F, see all of section 2.3.

….

Live Exhibit (vi): Revisiting Popper’s Demarcation of Science. Here’s an experiment: try shifting what Popper says about theories to a related claim about inquiries to ﬁnd something out. To see what I have in mind, let’s listen to an exchange between two fellow travelers over coﬀee at Statbucks.

TRAVELER 1: If mere logical falsiﬁability suﬃces for a theory to be scientiﬁc, then, we can’t properly oust astrology from the scientiﬁc pantheon. Plenty of nutty theories have been falsiﬁed, so by deﬁnition they’re scientiﬁc. Moreover, scientists aren’t always looking to subject well-corroborated theories to “grave risk” of falsiﬁcation.

TRAVELER 2: I’ve been thinking about this. On your first point, Popper confuses things by making it sound as if he’s asking: When is a theory unscientific? What he is actually asking or should be asking is: When is an inquiry into a theory, or an appraisal of claim H, unscientific? We want to distinguish meritorious modes of inquiry from those that are BENT. If the test methods enable ad hoc maneuvering, sneaky face-saving devices, then the inquiry – the handling and use of data – is unscientific. Despite being logically falsifiable, theories can be rendered immune from falsification by means of cavalier methods for their testing. Adhering to a falsified theory no matter what is poor science. Some areas have so much noise and/or flexibility that they can’t or won’t distinguish warranted from unwarranted explanations of failed predictions. Rivals may find flaws in one another’s inquiry or model, but the criticism is not constrained by what’s actually responsible. This is another way inquiries can become unscientific.¹

She continues:

On your second point, it’s true that Popper talked of wanting to subject theories to grave risk of falsification. I suggest that it’s really our inquiries into, or tests of, the theories that we want to subject to grave risk. The onus is on interpreters of data to show how they are countering the charge of a poorly run test. I admit this is a modification of Popper. One could reframe the entire demarcation problem as one of the character of an inquiry or test.

She makes a good point. In addition to blocking inferences that fail the minimal requirement for severity:

A scientiﬁc inquiry or test: must be able to embark on a reliable probe to pinpoint blame for anomalies (and use the results to replace falsiﬁed claims and build a repertoire of errors).

The parenthetical remark isn’t absolutely required, but is a feature that greatly strengthens scientiﬁc credentials. Without solving, not merely embarking on, some Duhemian problems there are no interesting falsiﬁcations. The ability or inability to pin down the source of failed replications – a familiar occupation these days – speaks to the scientiﬁc credentials of an inquiry. At any given time, even in good sciences there are anomalies whose sources haven’t been traced – unsolved Duhemian problems – generally at “higher” levels of the theory-data array. Embarking on solving these is the impetus for new conjectures. Checking test assumptions is part of working through the Duhemian maze. The reliability requirement is: infer claims just to the extent that they pass severe tests. There’s no sharp line for demarcation, but when these requirements are absent, an inquiry veers into the realm of questionable science or pseudoscience. Some physicists worry that highly theoretical realms can’t be expected to be constrained by empirical data. Theoretical constraints are also important. We’ll ﬂesh out these ideas in future tours.

¹ For example, astronomy, but not astrology, can reliably solve its Duhemian puzzles. Chapter 2, Mayo (1996), following my reading of Kuhn (1970) on “normal science.”

*Where you are in the Journey: I posted all of Excursion 1 Tour I, here, here, and here, and omitted Tour II (but blogposts on the Law of Likelihood, Royall, optional stopping, and Barnard may be found by searching this blog). Update 6/19: I have since posted all of Excursion 1 Tour II in proofs here. You are now in Excursion 2, the first stop of Tour I (2.1) is here. The main material from 2.2 can be found in this blogpost. You can read the rest of Excursion 2 Tour II section 2.3, in proof form, here. For the full Itinerary of Statistical Inference as Severe Testing: How to Get Beyond the Stat Wars (2018, CUP), see the SIST Itinerary.

11 thoughts on “Excursion 2 Tour II (3rd stop): Falsiﬁcation, Pseudoscience, Induction (2.3)”

October 11, 2018

Christian Hennig

What I’ve read up to now is excellent but maybe you’re spoiling us with too much material… I find it hard to keep up reading and then maybe on top even finding something at least mildly interesting to say.

The issue I always come back to when thinking about what is discussed here is what is meant by “truth”. I know that later in your book you write about the idea that “all models are wrong”. Surely probability models are not literally true (in the sense that reality works exactly as a formal data generation mechanism specified by the model) but you state that using probability models you want to get at a truth beyond mere prediction success. For me, the fact that models cannot be literally true was always a major issue with Bayesian approaches that assign probabilities to models being in fact true, whereas in a frequentist test what is tested is not the truth of the (null) model but rather just whether the data are compatible with it, without ruling out potential competitors that may explain the data as well (although using severity arguments, specific competitors can be ruled out).
If I ask myself what can be said about truth, I can’t think of anything better and more convincing than to appeal to observables, as problematic as this may be (I don’t disagree with the issues that you have with that). Of course, it is not enough to look at successful prediction of observables in a restricted setup (like the specific problems, training and test samples for which some machine learning algorithms work, which may generalise badly) but ultimately there is no way beyond observing, testing, falsifying, measuring prediction quality, but potentially in more general situations than the ones used by your machine learner… well, you’re saying nothing else really, but then is there much to realism beyond a somewhat broader understood instrumentalism? (I actually also already saw that you want to be agnostic about realism but you seem to be quite critical of instrumentalism… or is it just a too narrowly interpreted form of instrumentalism?)

Personally I’d also value theories and models for their potential to inspire creativity and enable new forms of communication, observation and analysis, but this seems to be rather unrelated to “truth”.

Ultimately I’m not sure whether what I write here reflects in the first place my lack of time to read everything properly, but anyway, just to give something back, here it is.

October 11, 2018

Mayo

Don’t worry, I won’t be posting more for a while, and of course I won’t go beyond the teaser for the book. Do you think we should try to resume the old “U-Phils”? You might do one on the objectivity Tour.

October 16, 2018

Christian Hennig

I’d be happy to contribute.

October 22, 2018

Mayo

So send the remarks or queries you suggested you had when you said I was giving too much too fast.

October 12, 2018

Mayo

Yes, I find instrumentalism easily refuted actually. But there are sophisticated anti-realisms that I can barely distinguish from realisms. My idea is to consider it on a case-by-case basis: in some sciences at some stages theories are accepted as “real”, others not.


October 12, 2018

phaneron0

OK – my two cents.

This seems to be the real force of induction: “Deliberate effort and ingenuity are what allow me to ensure I shall come up against a brick wall, and be forced to reorient myself, at least with reasonable probability, when I test a flawed conjecture.” Or, as CS Peirce once argued, the real justification of induction is that even though it will mislead us [give us a false sense of reality that is beyond direct access], if inquiry is adequately persisted in, the false sense will be rescinded.

Christian: That also seems to me to be the force of your comment “value theories and models for their potential to inspire creativity and enable new forms of communication, observation and analysis” in that they increase our chances to become less wrong about reality or at least how we are currently being frustrated when we act.

Mayo: Again, my two cents, but beyond the teaser for the book, perhaps discussions and clarifications of your book that you might encounter from your future readers?

Keith O’Rourke


October 12, 2018

Mayo

Yes, I plan to have discussions, and I’d be glad for your input on the form they might take. I will teach a seminar on the book in spring, and that will supply resources for parallel discussions online, but I’m also interested in doing something sooner, along the lines of the “U Phils” in the past: asking people who might want to write a few paragraphs on a particular tour, including just a set of questions. Gelman said some kind of discussion on his blog is to happen. The book was available just 3 weeks ago.
So what do you think?


October 16, 2018

Christian Hennig

Keith: I don’t like the “less wrong” wording that much for various reasons, one of which is the obvious connotation that there is a truth to be had at least in principle. What I like more is what Hasok Chang calls “Active Scientific Realism”. In his words (from Chang, H. Is Water H2O? Evidence, Realism and Pluralism. Springer 2012): “I take reality as whatever is not subject to one’s will, and knowledge as an ability to act without being frustrated by resistance from reality. This perspective allows an optimistic rendition of the pessimistic induction, which celebrates the fact that we can be successful in science without even knowing the truth. The standard realist argument from success to truth is shown to be ill-defined and flawed.”


October 17, 2018

phaneron0

Christian:

> I don’t like the “less wrong”

OK, how would you express the increasing of an ability to act without being frustrated by resistance from reality? If we can’t do that – then why undertake inquiry at all?

Now Andrew and I used that Chang quote (which seems very much like some of CS Peirce’s writings) in our amalgamation draft but also added: “We add here that we strive for more than just not being frustrated by resistance from reality; rather, we want our findings and claims that aim at truth to be ‘beliefs which succeed for reasons connected to the way things are’ (Misak, 2016).”

Now you can’t always get what you want, but if you try (which is all we can do) you might get what you need.

Keith O’Rourke


October 22, 2018

Christian Hennig

Keith: I think Hasok Chang nailed it already; I don’t think I need a better wording than that.
What shines through in this conversation is my constructivist background; when it’s about “truth” and “how things really are”, I become skeptical and wonder how we ever can know anything better than how to “not be frustrated” in many situations, some of which we bring up systematically in order to test our ideas severely. My suspicion is that “getting at the truth” means nothing more than just that but sounds greater to some people who wouldn’t be happy without some big truths behind what we can observe and experience.

So how can we know we got closer to such a truth other than being more successful in avoiding frustration and adding something to the world that we then experience to “work”?


© Deborah G. Mayo, Error Statistics Philosophy, 2011-2018 All Rights Reserved.
