B. Haig on questionable editorial directives from Psychological Science (Guest Post)


Brian Haig, Professor Emeritus
Department of Psychology
University of Canterbury
Christchurch, New Zealand

What do editors of psychology journals think about tests of statistical significance? Questionable editorial directives from Psychological Science

Deborah Mayo’s (2021) recent editorial in Conservation Biology addresses the important issue of how journal editors should deal with strong disagreements about tests of statistical significance (ToSS). Her commentary speaks to applied fields, such as conservation science, but it is equally relevant to basic research and to other sciences, such as psychology. In this short guest commentary, I briefly remark on the role played by the prominent journal Psychological Science (PS) in the debate over whether researchers should employ ToSS. PS is the flagship journal of the Association for Psychological Science, and two of its editors-in-chief have offered explicit, but questionable, advice on this matter.

In the May 2005 issue of PS, the experimental psychologist Peter Killeen (2005) published an article on a new statistic that, he maintained, overcame some important deficiencies of null hypothesis significance testing. He understood the alternative statistic, ‘prep’, as the probability of replicating an experimental effect. In the same issue of PS, the editor-in-chief, James Cutting, opined that Killeen’s article “may change how all psychologists report their statistics”, and he promptly instructed prospective contributors to use prep rather than p values when analysing their data. Within a few years, a majority of empirical articles published in PS employed prep, along with effect sizes. This quick rise to local prominence was immediately followed by the publication of a number of articles that were highly critical of the statistic. Among other things, Killeen’s article was criticized for containing mathematical errors, and prep itself for not actually being a replication probability.
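For readers unfamiliar with the statistic, the following minimal sketch shows how prep is commonly computed from a one-tailed p value. The formulas are not given in this commentary; they are the forms usually attributed to Killeen (2005), and the function names are illustrative assumptions. As noted above, critics dispute whether the resulting number is a genuine replication probability, so the sketch shows only what the statistic computes, not that it is sound.

```python
from statistics import NormalDist

# Standard normal distribution, used for both the quantile and the CDF.
_STD_NORMAL = NormalDist()


def p_rep(p_one_tailed: float) -> float:
    """p_rep in the form usually attributed to Killeen (2005).

    Assumption (not stated in this commentary): the one-tailed p value is
    converted to a z score, the z score is divided by sqrt(2), and the
    result is mapped back through the standard normal CDF.
    """
    if not 0.0 < p_one_tailed < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    z = _STD_NORMAL.inv_cdf(1.0 - p_one_tailed)
    return _STD_NORMAL.cdf(z / 2 ** 0.5)


def p_rep_approx(p_one_tailed: float) -> float:
    """Closed-form approximation often quoted alongside the formula above."""
    if not 0.0 < p_one_tailed < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    odds = p_one_tailed / (1.0 - p_one_tailed)
    return 1.0 / (1.0 + odds ** (2.0 / 3.0))


if __name__ == "__main__":
    for p in (0.05, 0.01, 0.001):
        print(f"p = {p}: p_rep = {p_rep(p):.3f}, approx = {p_rep_approx(p):.3f}")
```

Under these assumptions prep is a monotone transformation of the p value, so it carries no information beyond the p value from which it is derived.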

Significantly, none of the articles critical of prep were published in PS, despite the fact that the journal decided at the time to devote more space to commentaries. One might reasonably fault Cutting’s editorial decision to accord prep favored status before statisticians and research methodologists had time to evaluate its soundness. He might instead have used PS as a forum for scrutinizing Killeen’s article. After a few years, and in the face of strong criticism, PS quietly dropped its recommendation that researchers use prep.

In 2014, the first issue of PS contained a tutorial article by Geoff Cumming (2014) on the “new statistics” that was commissioned by the incoming editor-in-chief, Erich Eich. In his accompanying editorial, Eich (2014) explicitly discouraged prospective authors from using null hypothesis significance testing, and invited them to consider using the new statistics of effect sizes, estimation, and meta-analysis. Cumming, now with Bob Calin-Jageman, continues to assiduously promote the new statistics in the form of textbooks, articles, workshops, symposia, tutorials, and a dedicated website. It is fair to say that the new statistics has become the quasi-official position of the Association for Psychological Science, and that PS continues to play a role in the uptake of the new statistics (Giofrè et al., 2017).

To my knowledge, PS has published no major critical evaluations of the new statistics, nor presented alternatives to them for consideration. In keeping with this uncritical, one-sided attitude, the major proponents of the new statistics have been reluctant to engage with published criticisms of their position. However, a strong methodological pluralism is required for the advancement of knowledge. In particular, the regular critical interplay of alternative perspectives on ToSS is crucial for their ongoing development and understanding. By promoting two questionable alternatives to ToSS (prep and the new statistics), and shunning well-founded alternatives to them (notably, the error-statistical and Bayesian perspectives; see Mayo, 2018; Haig, 2020), the attitudes to ToSS highlighted here can fairly be interpreted as forms of editorial negligence. Although journal editors cannot be expected to solve major statistical controversies, the directives they issue to prospective authors about statistical practice should be properly informed by relevant debates in the statistics wars.

See the previous commentary by Daniel Lakens.

References

Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25, 7-29.

Eich, E. (2014). Business not as usual. Psychological Science, 25, 3-6.

Giofrè, D., et al. (2017). The influence of journal submission guidelines on authors’ reporting of statistics and use of open research practices. PLoS ONE, 12(4), e0175583. https://doi.org/10.1371/journal.pone.0175583

Haig, B. D. (2020). What can psychology’s statistics reformers learn from the error-statistical perspective? Methods in Psychology. https://doi.org/10.1016/j.metip.2020.100020

Killeen, P. R. (2005). An alternative to null-hypothesis significance tests. Psychological Science, 16, 345-352.

Mayo, D. G. (2018). Statistical inference as severe testing: How to get beyond the statistics wars. Cambridge University Press.

Mayo, D. G. (2021). The statistics wars and intellectual conflicts of interest. Conservation Biology. https://doi.org/10.1111/cobi.13861

All commentaries on Mayo (2021) editorial until Jan 31, 2022 (more to come*)

Schachtman
Park
Dennis
Stark
Staley
Pawitan
Hennig
Ionides and Ritov
Haig
Lakens

*Let me know if you wish to write one

Categories: ASA Task Force on Significance and Replicability, Brian Haig, editors, significance tests

2 thoughts on “B. Haig on questionable editorial directives from Psychological Science (Guest Post)”

  1. Brian:
    I’m very grateful to you for your illuminating commentary and honest assessment of the past and present tendency to foist a favored statistical account on a field. I had heard little about the p-rep fiasco, but it’s good that, as you say, critical appraisal was brought in. I had wanted to add to my editorial that another sign of going beyond merely recommending good practice is if there is selective criticism. One would think these alternatives somehow escape falling into misuse and fallacy. Do any journals give the kind of detailed, and often harsh, rules for avoiding fallacies in using other methods? Do any of the “new statistics” journals even warn against interpreting a confidence level as a probability assignment to a particular estimate? I don’t know, but have never seen it.

    I totally agree with your concern:
    “To my knowledge, PS has published no major critical evaluations of the new statistics, nor presented alternatives to them for consideration. In keeping with this uncritical, one-sided attitude, the major proponents of the new statistics have been reluctant to engage with published criticisms of their position. However, a strong methodological pluralism is required for the advancement of knowledge.”

  2. Pingback: Paul Daniell & Yu-li Ko commentaries on Mayo’s ConBio Editorial | Error Statistics Philosophy
