An update on the Diederik Stapel case: July 2, 2013, The Scientist, “Dutch Fraudster Scientist Avoids Jail”.
Two years after being exposed by colleagues for making up data in at least 30 published journal articles, former Tilburg University professor Diederik Stapel will avoid a trial for fraud. Once one of the Netherlands’ leading social psychologists, Stapel has agreed to a pre-trial settlement with Dutch prosecutors to perform 120 hours of community service.
According to Dutch newspaper NRC Handelsblad, the Dutch Organization for Scientific Research awarded Stapel $2.8 million in grants for research that was ultimately tarnished by misconduct. However, the Dutch Public Prosecution Service and the Fiscal Information and Investigation Service said on Friday (June 28) that because Stapel used the grant money for student and staff salaries to perform research, he had not misused public funds. …
In addition to the community service he will perform, Stapel has agreed not to make a claim on 18 months’ worth of illness and disability compensation that he was due under his terms of employment with Tilburg University. Stapel also voluntarily returned his doctorate from the University of Amsterdam and, according to Retraction Watch, retracted 53 of the more than 150 papers he has co-authored.
“I very much regret the mistakes I have made,” Stapel told ScienceInsider. “I am happy for my colleagues as well as for my family that with this settlement, a court case has been avoided.”
No surprise he’s not doing jail time, but 120 hours of community service? After over a decade of fraud, and after tainting 14 of the 21 PhD theses he supervised? Perhaps the “community service” should be to actually run the experiments he designed. And what about the finding that he is innocent of misusing public funds?
From Retraction Watch:
The FIOD (Dutch organisation that pursues tax, benefit and other frauds) states that Stapel has not misused public funds, because they were given for doing research that was performed. Most of it was spent on salaries and the personnel paid performed research. Although this was with made-up data, the implicated personnel did not know this.
True, the personnel did not know, but Stapel did. See the Tilburg Report in my “statistical dirty laundry” post. You might say, well, there is no hope of recouping the money anyway, but that scarcely prevents prosecution in many fraud cases. What do you think?
In pondering this, consider another blog post where I asked: “even if reporting spurious statistical results is considered ‘bad statistics,’ is it criminal behavior?” Many comments to that (12/13/13) post were tending to No. (One comment raised the fear that doing statistics might require having a lawyer!) That case concerned a biotech company, InterMune, and its previous CEO, Dr. Harkonen. I gave an excerpt from lawyer Nathan Schachtman’s blog:
In August 2002, Dr. Harkonen approved a press release, which carried a headline, “phase III data demonstrating survival benefit of Actimmune in IPF.” A subtitle announced the 70% relative reduction in patients with mild to moderate disease. …

… The prosecution asserted that Dr. Harkonen engaged in data dredging, grasping for the right non-prespecified end point that had a low p-value attached. Such data dredging implicates the problem of multiple comparisons or tests, with the result of increasing the risk of a false-positive finding, notwithstanding the p-value below 0.05.
Supported by the testimony of Professor Thomas Fleming, who chaired the Data Safety Monitoring Board for the clinical trial in question, the government claimed that the trial results were “negative” because the p-values for all the pre-specified endpoints exceeded 0.05. Shortly after the press release, Fleming sent InterMune a letter that strongly dissented from the language of the press release, which he characterized as misleading. Because the primary and secondary end points were not statistically significant, and because the reported mortality benefit was found in a non-prespecified subgroup, the interpretation of the trial data required “greater caution,” and the press release was a “serious misrepresentation of results obtained from exploratory data subgroup analyses.”
The district court sentenced Harkonen to six months of home confinement, three years of probation, 200 hours of community service, and a fine of $20,000. Dr. Harkonen appealed on grounds that the federal fraud statutes do not permit the government to prosecute persons for expressing scientific opinions about which reasonable minds can differ. Unless no reasonable expert could find the defendant’s statement to be true, the trial court should dismiss the prosecution. … The government cross-appealed to complain about the leniency of the sentence…. (my emphasis)
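The multiple-comparisons problem the prosecution invoked can be illustrated with a quick simulation (my own sketch, not anything from the case record): even when a treatment has no effect at all, scanning several post-hoc subgroups will often turn up at least one nominally “significant” p-value below 0.05.

```python
import math
import random


def two_sample_p(x, y):
    """Approximate two-sided p-value from a z-test on the difference in means."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    z = (mx - my) / math.sqrt(vx / nx + vy / ny)
    # normal approximation to the two-sided tail probability
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))


def min_p_over_subgroups(n=200, k=10, rng=random):
    """One simulated null trial: no true effect, k post-hoc subgroup splits."""
    treat = [rng.gauss(0, 1) for _ in range(n)]
    ctrl = [rng.gauss(0, 1) for _ in range(n)]
    ps = []
    for _ in range(k):
        # an arbitrary, non-prespecified subgroup label (e.g. "mild/moderate")
        label = [rng.random() < 0.5 for _ in range(n)]
        t_sub = [v for v, s in zip(treat, label) if s]
        c_sub = [v for v, s in zip(ctrl, label) if s]
        ps.append(two_sample_p(t_sub, c_sub))
    return min(ps)


random.seed(1)
trials = 500
hits = sum(min_p_over_subgroups() < 0.05 for _ in range(trials))
print(f"P(some subgroup 'significant' at 0.05 under the null): {hits / trials:.2f}")
```

Each individual subgroup test is perfectly valid on its own; it is reporting only the best of the ten that inflates the false-positive rate well above the nominal 5%.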
Read the rest of Schachtman’s post. [For the infamous press release to investors, and a link to a time line of this case, see: https://errorstatistics.com/2012/12/14/philstatlaw-bad-statistics-cont/ .]
Harkonen is awaiting his last appeal, to the Supreme Court, in August. Stay tuned. [Update: he lost his final appeal!]
Can the Stapel and Harkonen cases be compared? (Not obviously.) I’m no lawyer, and have only read some published reports.[i] Some “two handed” reflections: On the one hand, it might be said that scouring subgroups for nominally significant endpoints is done all the time, and there’s nothing like data fabrication going on here. True, but on the other hand, unlike Stapel, Harkonen was in a position to (and did) influence doctors, not to mention move the company’s share price.[ii] The company itself agreed to pay some millions (its own scientists did not approve Harkonen’s report). Also unlike Stapel, Harkonen is now ensconced at another drug company. Still, Harkonen’s punishment might make the Stapel case appear too lenient. Or not?
Any comparison will hinge on one’s philosophy of punishment. It might be argued that little good would result from further punishing Stapel, but what about from making an example out of Harkonen? Using the latter measure, I might be inclined to agree with the government’s complaint about leniency…maybe substitute “jail” for “home confinement” in the sentence? (I know what Schachtman thinks and has legalistically argued.) What say you?
[i] In a discussion of his most recent, failed, appeal:
“At trial,” the appellate court observed, “nearly everybody actually involved in the … clinical trial testified that the Press Release misrepresented [the trial’s] results,” noting that the evidence showed that “even Harkonen himself was ‘very apologetic’ about the Press Release’s misleading nature.” The appellate court then determined, as did the trial court, that the evidence at trial supported the jury’s findings that Harkonen was aware that the release was misleading and that he acted with the specific intent to defraud. http://www.jdsupra.com/legalnews/ninth-circuit-affirms-conviction-in-hark-69966/
[ii] I never held shares of InterMune.
Me neither (never held shares of Actimmune). I think it is important to understand that the gov’t was mostly concerned about prosecuting off-label promotion by Dr. Harkonen. I won’t comment on that count of the indictment other than to say that the jury acquitted on that count, and thus removed the issue from the case. The jury convicted on the wire fraud count, which is why the case is still with us.
There are important things to keep in mind in comparing Harkonen’s case with Stapel’s. First, Harkonen’s clinical trial was, by everyone’s account, including the government and its witnesses, a well-conducted study. There was no “funny business” with loss to follow-up on the intent-to-treat analysis. Second, Harkonen did not fabricate or falsify any data. Third, the statistical analyses were accurately done and reported. Everything in the gov’t’s case turns on the use of the word “demonstrate” to describe the inference from a subgroup (p = 0.004) not pre-specified. Not best practice, but also not a crime. This was, after all, a press release, not a government submission for a new drug application. The press release promised that there would be a full presentation of data within a couple of weeks, at a pulmonary medicine conference in Europe, and that was indeed done.
Why has the gov’t pursued this case? The U.S. has taken the extreme position that there is no first amendment right to speak of off-label uses (a position it lost in U.S. v. Caronia, before the Second Circuit). Having lost that issue with the jury, the prosecutor is trying to do with the wire fraud statute what it failed to do with the FDCA criminalization of misbranding.
Nathan: The legal beagle is sure quick! Just on the last para of your comment, I thought the distinction was (is becoming?) that there are first amendment rights for true off-label speech, and that it was the determination of “intent to defraud” that made the difference. I’ll have to look up what I read some weeks ago on this. Actually, perhaps this issue is in flux at the moment. I agree it would be extreme to bar communicating valid results about an off-label use of a treatment. Some things are never “on label”.
Tax inspections are very useful not so much for the money the I.R.S. recovers from them, but because everybody else is scared of being the next one… and that’s a lot of money.
It just might work as well in the scientific community. How about mandatory audits where the researchers must show to a jury the step by step process in the making of their papers?
And we do not need many audits for this to work actually… Paranoia works great.
Fran: I’m not sure I get your point. Stapel faced no jury, but of course already admitted to fabricating the data. Harkonen faced a jury but already admitted to data-dependent subgroup analysis. His position is that the latter is not fraud but free speech, to put it very coarsely.
A subgroup analysis that wasn’t part of the experimental design certainly shouldn’t be interpreted as confirmatory evidence.
However, doing such an analysis is no worse than any analysis of available data outside the context of a study design. If subgroup analyses are going to be disallowed, then one might as well disallow analyses of observational data as well.
I think a better approach is not to avert our eyes from available information, but to use methods which don’t result in underpowered comparisons (e.g. shrinkage-based approaches) and to report them for the convenience sample analyses that they are – don’t pretend like one has a confirmatory result just because p<.05.
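As a concrete sketch of the kind of shrinkage-based approach this comment has in mind (an illustrative method-of-moments partial-pooling estimator of my own, not anything from the Harkonen trial), subgroup effect estimates can be pulled toward the overall effect in proportion to how noisy each estimate is:

```python
def shrink(estimates, ses):
    """Empirical-Bayes (method-of-moments) shrinkage of subgroup effects."""
    k = len(estimates)
    grand = sum(estimates) / k
    # crude estimate of true between-subgroup variance, floored at zero
    obs_var = sum((e - grand) ** 2 for e in estimates) / (k - 1)
    mean_se2 = sum(s ** 2 for s in ses) / k
    tau2 = max(obs_var - mean_se2, 0.0)
    # each subgroup is pulled toward the grand mean in proportion
    # to how noisy its own estimate is (small tau2 / large se => more pull)
    return [grand + tau2 / (tau2 + s ** 2) * (e - grand)
            for e, s in zip(estimates, ses)]


# one extreme but noisy subgroup (hypothetical numbers) among modest ones
raw = [0.05, 0.10, 0.02, 0.70]
ses = [0.10, 0.10, 0.10, 0.40]
shrunk = shrink(raw, ses)
print(shrunk)
```

The headline-grabbing 0.70 estimate, being the noisiest, is pulled hardest toward the overall mean, which is exactly the discipline a raw subgroup-by-subgroup report lacks.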