
Severe testing of deep learning models of cognition (ii)


From time to time I hear of intriguing applications of the severe testing philosophy in fields I know very little about. An example is a recent article by cognitive psychologist Jeffrey Bowers and colleagues (2023): “On the importance of severely testing deep learning models of cognition” (abstract below). Because deep neural networks (DNNs), advanced machine learning models, seem to recognize images of objects as well as or even better than humans, many researchers suppose that DNNs learn to recognize objects in a way similar to humans. However, Bowers and colleagues argue that, on closer inspection, the evidence is remarkably weak, and that “in order to address this problem, we argue that the philosophy of severe testing is needed”.

The problem is this. Deep learning models consist of millions of (largely uninterpretable) parameters. Without understanding how such a black box moves from inputs to outputs, it is easy to see how the observed correlations can occur even when the DNN's output is due to a variety of factors other than a mechanism similar to the human visual system. From the standpoint of severe testing, this is a familiar mistake. For data to provide evidence for a claim, it does not suffice that the claim agrees with the data; the method must have been capable of revealing the claim to be false, just if it is false. Here the type of claim of interest is that a given algorithmic model uses similar features or mechanisms as humans to categorize images.[1] The problem isn't the engineering one of getting more accurate algorithmic models; the problem is inferring claim C: DNNs mimic human cognition in some sense (the authors focus on vision), even though C has not been well probed.
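To make the severity point concrete, here is a toy sketch (my own illustration, not from the Bowers et al. article; all names and numbers are invented). Two classifiers match each other, at human-level accuracy, on a "standard" test set where a shape cue and a texture cue are confounded, yet a diagnostic probe that decouples the cues reveals that they rely on entirely different features. Agreement on the standard set was never capable of revealing that difference, so it is not a severe test of the claim that the models share a mechanism.

```python
# Toy illustration (invented, not from Bowers et al. 2023): agreement on a
# confounded test set is not a severe test of shared mechanism.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "images": one shape cue and one texture cue per item.
# In the standard set the two cues are confounded: both track the label.
n = 1000
labels = rng.integers(0, 2, n)
shape = labels + rng.normal(0, 0.3, n)     # shape cue tracks label
texture = labels + rng.normal(0, 0.3, n)   # texture cue also tracks label

def shape_model(s, t):    # "human-like" classifier: uses shape only
    return (s > 0.5).astype(int)

def texture_model(s, t):  # shortcut classifier: uses texture only
    return (t > 0.5).astype(int)

# Both models hit roughly the same high accuracy on the confounded set,
# so their outputs "agree with the data" equally well.
acc_shape = np.mean(shape_model(shape, texture) == labels)
acc_texture = np.mean(texture_model(shape, texture) == labels)

# A severe (diagnostic) probe decouples the cues: shape indicates one
# class while texture indicates the other.
probe_shape = np.ones(200)      # shape cue says class 1
probe_texture = np.zeros(200)   # texture cue says class 0
agree = np.mean(shape_model(probe_shape, probe_texture)
                == texture_model(probe_shape, probe_texture))
# agree == 0.0: on the probe the two models disagree on every item.
```

On such a probe, the shape-based and texture-based models disagree on every item, even though nothing in the standard-set accuracy figures distinguished them; decoupling stimuli of this kind are the sort of test that could actually have revealed claim C to be false.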

Categories: severity and deep learning models
