Formaldehyde Hearing: How to Tell the Truth With Statistically Insignificant Results

One of the first examples I came across of problems in construing statistically insignificant (or “negative”) results was a House Science and Technology investigation of an EPA ruling on formaldehyde in the 1980s. Those investigating the EPA (led by Al Gore, then a member of the House!) used rather straightforward, day-to-day reasoning: no evidence of risk is not evidence of no risk. Given the growing interest in science and values both in philosophy and in science and technology studies, I made the “principle” explicit. I thought it was pretty obvious, aside from my Popperian leanings. I’m surprised it’s still an issue.

The case involved the Occupational Safety and Health Administration (OSHA), and possible risks of formaldehyde in the workplace. In 1982, the new EPA assistant administrator, who had come in with Ronald Reagan, “reassessed” the data from the previous administration and, reversing an earlier ruling, announced: “There does not appear to be any relationship, based on the existing data base on humans, between exposure [to formaldehyde] and cancer” (Hearing p. 260).

The trouble was that this assertion was based on epidemiological studies that had little ability to produce a statistically significant result even if there were risks worth worrying about (according to OSHA’s standards of risks of concern, which were not in dispute).[i] The EPA’s assertion that the risks ranged from “0 to no concern” had not passed a very stringent or severe test.

In the spirit of keeping the discussion non-technical, I formulated a rather clunky “metastatistical rule” M (in Mayo 1991):

Rule (M): A statistically insignificant difference is a poor indication that an actual increased risk is less than (some amount) d* if it is very improbable that the test would have resulted in a more statistically significant difference, even if the actual increased risk is as large as d*.

Note: this is akin to saying: a statistically insignificant difference is a poor indication that an actual increased risk is less than (some amount) d*, if the power of the test to detect d* is low. The only difference is that M takes account of the actual insignificant p-value, and so is more informative. [ii]
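To make the contrast between Rule (M) and the power version concrete, here is a minimal sketch in Python. It rests on assumptions that are mine and not from the hearing: a one-sided test of a normal mean with known standard error, standing in for the epidemiological comparison, and purely hypothetical numbers. The function names (power, prob_more_significant) are illustrative only.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power(d_star, se, alpha=0.05):
    """Probability the test rejects at level alpha when the true increase is d_star.

    Assumes a one-sided z-test: reject when estimate / se exceeds the cutoff.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STD_NORMAL.cdf(z_alpha - d_star / se)


def prob_more_significant(d_obs, d_star, se):
    """Probability of an estimate larger than d_obs when the true increase is d_star.

    This is the quantity Rule (M) asks about: it conditions on the
    observed (insignificant) difference rather than on the fixed cutoff.
    """
    return 1 - STD_NORMAL.cdf((d_obs - d_star) / se)


if __name__ == "__main__":
    se = 2.0      # hypothetical standard error of the estimated increase
    d_obs = 3.0   # hypothetical observed increase (z = 1.5, not significant at 0.05)

    for d_star in (1.0, 8.0):  # hypothetical increases of concern
        print(
            f"d* = {d_star}: power = {power(d_star, se):.3f}, "
            f"P(more significant result; d*) = "
            f"{prob_more_significant(d_obs, d_star, se):.3f}"
        )
```

With these made-up numbers, the probability of a more significant result is roughly 0.16 when d* = 1 and about 0.99 when d* = 8. So, by Rule (M), the insignificant difference is a poor indication that the increase is less than 1, but a good indication that it is less than 8. When the observed difference sits exactly at the significance cutoff, the two functions return the same value, which is the sense in which M reduces to the power statement.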

Little did I know at the time, however (not until I found some papers in my attic a decade later), that Jerzy Neyman had made an analogous point in terms of power when he warned us about this common misinterpretation of non-rejections. To be continued at a later time (“Neyman’s Nursery”).

U.S. Congress. House of Representatives. Committee on Science and Technology. May 20, 1982. Formaldehyde: Review of Scientific Basis of EPA’s Carcinogenic Risk Assessment. Hearing before the Subcommittee on Investigations and Oversight. 97th Cong., 2d sess.

Mayo, D. 1991. Sociological Versus Metascientific Views of Risk Assessment. In Acceptable Evidence, Science and Values in Risk Assessment, edited by D. Mayo and R. Hollander. Oxford: Oxford University Press.

(For a rather scruffy copy: Sociological Versus Metascientific Views of Risk Assessment)

[i] Animal studies had resulted in statistically significant risks; but the studies on humans did not.

[ii] By contrast, when a result is statistically significant, the worry is that the test may be so sensitive as to be picking up on discrepancies from the null that are not substantively important.

© Deborah G. Mayo, Error Statistics Philosophy, 2011-2018 All Rights Reserved.
