Statistical fraudbusting

How to hire a fraudster chauffeur

Would you buy a used car from this man? Probably not, but he thinks you might like to hire him as your chauffeur and brilliant conversationalist. I’m not kidding: fraudster Diederik Stapel is now offering what he calls ‘mind rides’ (see the ad translated below). He is prepared “to listen to what you have to say or talk to you about what fascinates, surprises or angers you”. He is already giving pedagogical talks on a train. This from Retraction Watch:

Diederik Stapel, the social psychologist who has now retracted 54 papers, recently spoke as part of the TEDx Braintrain, which took place on a trip from Maastricht to Amsterdam. Among other things, he says he lost his moral compass, but that it’s back.

Here’s a rough translation of the chauffeur ad from Stapel’s website (source is this blog):

Always on the move, from A to B, hurried, no time for reflection, for distance, for perspective. […] Diederik offers himself as your driver and conversation partner who won’t just get you from A to B, but who would also like to add meaning and disruption to your travel time. He will […] listen to what you have to say or talk to you about what fascinates, surprises or angers you. [Slightly paraphrased for brevity—Branko]

I don’t think I’d pay to have a Stapel “disruption” added to my travel time, would you? He sounds much as he does in “Ontsporing”.[i]

[i]The following is from a review of his Ontsporing [“derailed”].

“Ontsporing provides the first glimpses of how, why, and where Stapel began. It details the first small steps that led to Stapel’s deception and highlights the fine line between research fact and fraud:

‘I was alone in my fancy office at University of Groningen.… I opened the file that contained research data I had entered and changed an unexpected 2 into a 4.… I looked at the door. It was closed.… I looked at the matrix with data and clicked my mouse to execute the relevant statistical analyses. When I saw the new results, the world had returned to being logical’. (p. 145)

Categories: Statistical fraudbusting | 12 Comments

P-values can’t be trusted except when used to argue that P-values can’t be trusted!

Have you noticed that some of the harshest criticisms of frequentist error-statistical methods these days rest on methods and grounds that the critics themselves purport to reject? Is there a whiff of inconsistency in proclaiming an “anti-hypothesis-testing stance” while in the same breath extolling the uses of statistical significance tests and p-values in mounting criticisms of significance tests and p-values? I was reminded of this in the last two posts (comments) on this blog (here and here) and one from Gelman from a few weeks ago (“Interrogating p-values”).

Gelman quotes from a note he is publishing:

“…there has been a growing sense that psychology, biomedicine, and other fields are being overwhelmed with errors…. In two recent series of papers, Gregory Francis and Uri Simonsohn and collaborators have demonstrated too-good-to-be-true patterns of p-values in published papers, indicating that these results should not be taken at face value.”

But this fraudbusting is based on finding statistically significant differences from null hypotheses (e.g., nulls asserting random assignments of treatments)! If we are to hold small p-values untrustworthy, we would be hard pressed to take them as legitimating these criticisms, especially those of a career-ending sort.

…in addition to the well-known difficulties of interpretation of p-values…, and to the problem that, even when all comparisons have been openly reported and thus p-values are mathematically correct, the ‘statistical significance filter’ ensures that estimated effects will be in general larger than true effects, with this discrepancy being well over an order of magnitude in settings where the true effects are small… (Gelman 2013)

But surely anyone who believed this would be up in arms about using small p-values as evidence of statistical impropriety. Am I the only one wondering about this?*
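
To see what the “significance filter” amounts to, here is a minimal simulation sketch in Python (NumPy/SciPy). The true effect, noise level, sample size, and the 0.05 cutoff are illustrative assumptions of mine, not numbers from Gelman’s note:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Illustrative assumptions (not from Gelman's note): a small true effect
# estimated from noisy samples, tested against H0: effect = 0.
true_effect, sigma, n = 0.5, 5.0, 25
se = sigma / np.sqrt(n)          # standard error of the sample mean
trials = 100_000

estimates = rng.normal(true_effect, se, size=trials)
pvals = 2 * stats.norm.sf(np.abs(estimates) / se)  # two-sided z-test
signif = estimates[pvals < 0.05]

print(f"true effect:                  {true_effect}")
print(f"mean estimate (all trials):   {estimates.mean():+.2f}")
print(f"mean estimate (p < .05 only): {signif.mean():+.2f}")
print(f"share reaching p < .05:       {signif.size / trials:.1%}")
```

With these numbers, the estimates that survive the filter average roughly four times the true effect; shrink the true effect further and the inflation grows toward the order-of-magnitude discrepancy Gelman describes.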

CLARIFICATION (6/15/13): Corey’s comment today leads me to a clarification, lest anyone misunderstand my point. I am sure that Francis, Simonsohn, and others would never be using p-values and associated methods in the service of criticism if they did not regard the tests as legitimate scientific tools. I wasn’t talking about them. I was alluding to critics of tests who point to their work as evidence that the statistical tools are not legitimate. Now maybe Gelman only intends to say, what we know and agree with, that tests can be misused and misinterpreted. But in these comments, our exchanges, and elsewhere, it is clear he is saying something much stronger. In my view, the use of significance tests by debunkers should have been taken as strong support for the value of the tools, correctly used. In short, I thought it was a success story! And I was rather perplexed to see somewhat the reverse.

______________________

*This just in: if one wants to see a genuine quack extremist** who was outed long ago***, see Ziliak’s article declaring the Higgs physicists pseudoscientists for relying on significance levels! (Financial Post, 6/12/13).

**I am not placing the critics referred to above under this umbrella in the least.

***For some reviews of Ziliak and McCloskey, see widgets on left. For their flawed testimony in the Matrixx case, please search this blog.

Categories: reforming the reformers, Statistical fraudbusting, Statistics | 43 Comments

Mayo: comment on the repressed memory research

Here are some reflections on the repressed memory articles from Richard Gill’s post, focusing on Geraerts et al. (2008).

1. Richard Gill reported that “Everyone does it this way, in fact, if you don’t, you’d never get anything published: …People are not deliberately cheating: they honestly believe in their theories and believe the data is supporting them and are just doing their best to make this as clear as possible to everyone.”

This remark is very telling. I recommend we just regard those cases as illustrating a theory one believes, rather than as providing evidence for that theory. If we could mark them as such, we could stop blaming significance tests for playing a role in what are actually only illustrative attempts, or attempts to strengthen someone’s beliefs about a theory.

2. I was surprised the examples had to do with recovered memories. Wasn’t that entire area dubbed a pseudoscience way back (at least 15-25 years ago?) when “therapy induced” memories of childhood sexual abuse (CSA) were discovered to be just that—therapy induced and manufactured? After the witch hunts that ensued (the very accusation sufficing for evidence), I thought the field of “research” had been put out of its and our misery. So, aside from having used the example in a course on critical thinking, I’m not up on this current work at all. But, as these are just blog comments, let me venture some off-the-cuff skeptical thoughts. They will have almost nothing to do with the statistical data analysis, by the way…

3. Geraerts et al. (2008, 22) admit at the start of the article that therapy-recovered CSA memories are unreliable, and that the idea of automatically repressing a traumatic event like CSA is implausible. Then mightn’t it seem the entire research program should be dropped? Not to its adherents! As with all theories that enjoy the capacity of being sufficiently flexible to survive anomaly (Popper’s pseudosciences), there’s some life left here too. Maybe, its adherents reason, it’s not necessary for those who report “spontaneously recovered” CSA memories to be repressors; instead they may merely be “suppressors” who are good at blocking out negative events. If so, they didn’t automatically repress but rather deliberately suppressed: “Our findings may partly explain why people with spontaneous CSA memories have the subjective impression that they have ‘repressed’ their CSA memories for many years.” (ibid., 22)

4. Shouldn’t we stop there? I would. We have a research program growing out of an exemplar of pseudoscience being kept alive by ever-new “monster-barring” strategies (as Lakatos called them). (I realize they’re not planning to go out to the McMartin school, but still…) If a theory T is flexible enough so that any observations can be interpreted through it, and thereby regarded as confirming T, then it is no surprise that this is still true when the instances are dressed up with statistics. It isn’t that theories of repressed memories are implausible or improbable (in whatever sense one takes those terms). It is the ever-flexibility of these theories that renders the research program pseudoscience (along with, in this case, a history of self-sealing data interpretations).

Categories: junk science, Statistical fraudbusting, Statistics | 7 Comments

Richard Gill: “Integrity or fraud… or just questionable research practices?”


Professor Richard Gill
Statistics Group
Mathematical Institute
Leiden University
http://www.math.leidenuniv.nl/~gill/

I am very grateful to Richard Gill for permission to post an e-mail from him (after my “dirty laundry” post), along with slides from his talk, “Integrity or fraud… or just questionable research practices?”, and associated papers. I record my own reflections on the pseudoscientific nature of the program in one of the Geraerts et al. papers in a later post.

I certainly have been thinking about these issues a lot in recent months. I got entangled in intensive scientific and media discussions – mainly confined to the Netherlands – concerning the cases of social psychologist Dirk Smeesters and of psychologist Elke Geraerts. See: http://www.math.leidenuniv.nl/~gill/Integrity.pdf

And I recently got asked to look at the statistics in some papers of another … [researcher] … but this one is still confidential …

The verdict on Smeesters was that he, like Stapel, actually faked data (though he still denies this).

The Geraerts case is very much open, very much unclear. The senior co-authors of the attached paper, Merckelbach and McNally, have asked the editors of the journal “Memory”, where it was published, to withdraw it because they suspect the lead author, Elke Geraerts, of improper conduct. She denies any impropriety. It turns out that none of the co-authors have the data. Legally speaking it belongs to the University of Maastricht, where the research was carried out and where Geraerts was a promising postdoc in Merckelbach’s group. She later got a chair at Erasmus University Rotterdam and presumably has the data herself, but refuses to share it with her old co-authors or any other interested scientists.

Just looking at the summary statistics in the paper, one sees evidence of “too good to be true”: average scores in groups supposed in theory to be similar are much closer to one another than one would expect on the basis of the within-group variation. (The paper reports averages and standard deviations for each group, so it is easy to compute the F statistic for equality of the three similar groups and use its left-tail probability as a test statistic.)
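
As a rough sketch of the computation Gill describes, here it is in Python (NumPy/SciPy). The group sizes, means, and standard deviations below are invented placeholders to show the arithmetic, not figures from the Geraerts et al. paper:

```python
import numpy as np
from scipy import stats

def too_good_left_tail(n, m, s):
    """One-way ANOVA F statistic computed from summary statistics, plus
    its LEFT-tail probability: a small value flags group means that agree
    more closely than the within-group variation makes plausible."""
    n, m, s = map(np.asarray, (n, m, s))
    N, k = n.sum(), len(n)
    grand = (n * m).sum() / N
    ss_between = (n * (m - grand) ** 2).sum()   # between-group sum of squares
    ss_within = ((n - 1) * s ** 2).sum()        # within-group sum of squares
    F = (ss_between / (k - 1)) / (ss_within / (N - k))
    return F, stats.f.cdf(F, k - 1, N - k)

# Placeholder numbers (NOT from the paper): three groups whose means sit
# suspiciously close together given the reported spread.
F, p_left = too_good_left_tail(n=[20, 20, 20],
                               m=[10.1, 10.0, 10.2],
                               s=[4.0, 4.2, 3.9])
print(f"F = {F:.4f}, left-tail p = {p_left:.4f}")  # ~0.012: "too similar"
```

A very small left-tail probability is exactly the “too good to be true” signal Gill mentions: under honest sampling, group means as close to one another as these placeholder values (relative to their reported standard deviations) would arise only about one time in eighty.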

Categories: junk science, Statistical fraudbusting, Statistics | 5 Comments
