You still have a few days to respond to the call of your country to solve problems of scientific reproducibility!
The following passages come from Retraction Watch, with my own recommendations at the end.
“White House takes notice of reproducibility in science, and wants your opinion”
The White House’s Office of Science and Technology Policy (OSTP) is taking a look at innovation and scientific research, and issues of reproducibility have made it onto its radar.
Here’s the description of the project from the Federal Register:
The Office of Science and Technology Policy and the National Economic Council request public comments to provide input into an upcoming update of the Strategy for American Innovation, which helps to guide the Administration’s efforts to promote lasting economic growth and competitiveness through policies that support transformative American innovation in products, processes, and services and spur new fundamental discoveries that in the long run lead to growing economic prosperity and rising living standards.
I wonder what Steven Pinker would say about some of the above verbiage?
And here’s what’s catching the eye of people interested in scientific reproducibility:
(11) Given recent evidence of the irreproducibility of a surprising number of published scientific findings, how can the Federal Government leverage its role as a significant funder of scientific research to most effectively address the problem?
The OSTP is the same office that, in 2013, took what Nature called "a long-awaited leap forward for open access" when it said "that publications from taxpayer-funded research should be made free to read after a year's delay." That OSTP memo came after more than 65,000 people "signed a We the People petition asking for expanded public access to the results of taxpayer-funded research."
Have ideas on improving reproducibility? Emails to innovationstrategy@ostp.gov are preferred, according to the notice, which also explains how to fax or mail comments. The deadline is September 23.
Off the top of my head, how about:
Promote the use of methodologies that:
- control and assess the capabilities of methods to avoid mistaken inferences from data;
- require demonstrated self-criticism at every stage, from data collection through modelling to interpretation (statistical and substantive);
- describe what is especially shaky or poorly probed thus far (and spell out how subsequent studies are most likely to locate those flaws).[i]
Institute penalties for QRPs and fraud?
Please offer your suggestions in the comments, or directly to Uncle Sam.
[i] It may require a certain courage on the part of researchers, journalists, and referees.
Audit research the same as income tax returns 😉
(That was actually in the title of the first talk I ever gave in a statistics department – Meta-Analysis: Auditing Scientific Projects and Method. University of Toronto, Department of Statistics Colloquium, 1989.)
Your 3 suggestions are very good, but they get increasingly difficult for people to do (voluntarily) as you go down the list.
And as JG Gardin used to say – researchers have no business replicating their own work – a third party is needed.
Keith is exactly right that we cannot rely on researchers' self-examinations as a means to improve output quality. None of my suggestions below are novel, but they seem to require regular repetition.
1. open-access analysis code
2. publishing of intermediate data products, such as those used to make figures and tables
3. make raw data available upon request, while giving due deference to priority issues
My best guess on implementation is to make a plan for the digital publishing, storage, and distribution of the above resources a requirement for a successful grant application; a minimal sketch of the kind of artifacts such a plan would cover follows below. Naturally this can be done in collaboration with a larger institution, university, or institute so as to minimize infrastructure redundancies.
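By way of illustration only, here is a minimal sketch of what #1 and #2 could look like in practice. The file names, the pandas-based workflow, and the figure itself are hypothetical assumptions, not taken from any actual grant or paper.

```python
# Hypothetical sketch: an open-access analysis script that saves the
# intermediate data product behind a figure alongside the figure itself,
# so others can audit or re-plot it without re-running the raw-data pipeline.
import pandas as pd
import matplotlib.pyplot as plt

# Raw data (assumed distributable; file name is hypothetical)
raw = pd.read_csv("raw_measurements.csv")

# Intermediate product actually used to make the figure, published with the paper
summary = raw.groupby("condition")["response"].agg(["mean", "sem", "count"])
summary.to_csv("figure1_summary.csv")

# The figure is regenerated from the published intermediate product
summary["mean"].plot.bar(yerr=summary["sem"])
plt.ylabel("Mean response")
plt.savefig("figure1.png", dpi=300)
```

Anyone with the intermediate CSV can re-examine or re-plot the figure; anyone with the raw file can rerun the whole script.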
West:
Thanks for these excellent suggestions. I know that after the Potti fiasco, this type of thing was/is being called for. I wonder, though, about whether some very sagacious group should try to differentiate between different research programs. Do they all need the same things?
The self-examination I had in mind would not be invisible; it would have to be reported how the stringent criticism had been conducted and where we should still expect to find weak spots.
A somewhat separate issue: I read about research labs where underlings are cowed and intimidated and afraid to speak out about questionable practices. Can’t there be some kind of central safe place for whistleblowers to go?
@Mayo: I personally feel #1 & #2 should be done as baseline requirements. It's analogous to "showing your work" on a math exam. And while these first two suggestions can be problematic, it's #3 where the issues of data propriety and publication priority can really get tricky. But institutes of all types have been struggling with this last one for years already, so it's not as if one needs to start tabula rasa. Funding agencies will of course have to work with PIs and their institutions to set up the framework for open science.
I don’t know whether whistleblower protection is afforded to students & staff of academic institutions, but I’d hope so, as lab staff are *employees*.
Too many dangers, I hear. Computer glitches prevent writing more.
Keith: Thanks for your comment. I use the word "audit" in my book. I think it's reasonable to expect a demonstrated show of stringent self-criticism, something more than Simonsohn's "clap your hands," as much as I kind of like that. Fields where QRPs are common should be deemed questionable sciences; then, if it continues, pseudosciences; then "for entertainment only".
Interesting science builds on itself, so that errors ramify in other related investigations. If they don't have ramifications, we can't learn from them.
I'm inclined to agree with Senn that the replication "crisis" is mostly an isolated phenomenon enhanced by stupid use of big data.
I was reading a recent book by Prusiner on the prion research for which he finally got a Nobel. At least 90% of the attempts failed; so what? I frankly know of no breakthroughs where that was not the case. And this is still true with the use of up-to-date genomics in prion research. There's no fooling around here; they check and recheck each other; the protocols are painstaking. It's just so damn hard to find a substance that will work to control prions. And even when they do, they think very, very, very few could pass the blood-brain barrier. They're not giving up. It's probably the only way we're going to combat things like Alzheimer's. But Ioannidis fans want the goal to be positive prediction rates: start with obvious things, a high prevalence of true but trivial claims. So much for challenging and difficult science.
The only sad part of his story is how much in-fighting and vituperative, unprofessional criticism he still faces. Maybe his own impressions exaggerate. Some insist prions still have hidden nucleic acids somewhere, never mind that they are not killed by things that kill nucleic acids. People tried to prevent him from speaking and to trick him out of being invited to conferences. It's crazy.
This is a perfectly disjointed set of remarks, which suggests that I think the concerns are misplaced, but I thought it would be fun to post the Uncle Sam call anyway.
While I agree fully that we need others to replicate a study to demonstrate its objective validity, I reject the notion that original researchers can do nothing to test their own findings (which could be taken from the above comments). Validation studies designed to test the efficacy of the methods used can be done by the original researchers and lend greater confidence to the original paper as well as to the authors themselves. This is common in forensics and other fields. Techniques include the use of known individual specimens (e.g., the correct answer is known in advance) and blinding. Validation studies properly done increase the likelihood of others successfully replicating the study. Of course, another issue is whether or not other researchers can find the opportunity to replicate a study (are the samples hard to obtain? do the measurements require special knowledge and equipment?). It would be a mistake not to encourage researchers to validate their methods themselves and include the validation results in the publication. It may be the most reliable validation in special cases.
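To illustrate the kind of validation study being described, here is a minimal simulation sketch. The two classes, the 200 specimens, and the 10% error rate are hypothetical assumptions chosen only to show the bookkeeping of a blinded, known-answer design.

```python
# Hypothetical sketch of a blinded, known-specimen validation study:
# specimens whose true class is known in advance are scored "blind"
# (simulated here), and the method's error rate is estimated from the calls.
import random

random.seed(1)

ASSUMED_ERROR_RATE = 0.10   # hypothetical error rate of the method under test
N_SPECIMENS = 200           # hypothetical number of known, blinded specimens

# True class of each specimen is recorded by a third party before blinding.
truths = [random.choice(["A", "B"]) for _ in range(N_SPECIMENS)]

# Simulated blind calls: the method answers correctly except for random errors.
calls = [t if random.random() > ASSUMED_ERROR_RATE else ("B" if t == "A" else "A")
         for t in truths]

errors = sum(c != t for c, t in zip(calls, truths))
print(f"Observed error rate: {errors / N_SPECIMENS:.3f} "
      f"on {N_SPECIMENS} blinded known specimens")
```

In a real validation study the calls would of course come from the actual method applied to physical specimens; the point is that the error rate is estimated against answers known in advance and withheld from the analyst.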
@JohnB: I was not arguing that researchers themselves “can do nothing to test their own findings.” Merely that internal validation studies are not sufficient, which does not preclude them being attempted. NOTE: I heartily endorse Mayo’s recommendations as S.O.P. for researchers.
To clarify, my suggestions were about how to verify that the analysis of the collected data was valid. That is, if someone else runs the same code on the same data, they get the same results. This also allows one to test alternative algorithms on the same data set and see if one gets a similar answer. Validating the published results in this manner should be the first step, before anyone starts running off to redo the experiment/observations from scratch.
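A minimal sketch of what that first verification step could look like; the script name, data file, and output file names are hypothetical placeholders for whatever a paper actually publishes.

```python
# Hypothetical sketch: rerun a published analysis and check that its outputs
# are bit-for-bit identical to the archived copies (all file names assumed).
import hashlib
import subprocess

def sha256(path):
    """Checksum of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

# Rerun the archived analysis code on the archived data
subprocess.run(["python", "analysis.py", "--data", "raw_measurements.csv"],
               check=True)

# Compare freshly produced outputs against the published copies
for produced, published in [("figure1_summary.csv", "published/figure1_summary.csv")]:
    status = "reproduced exactly" if sha256(produced) == sha256(published) else "MISMATCH"
    print(f"{produced}: {status}")
```

Exact equality is the easy case; where results involve floating-point nondeterminism, a tolerance-based comparison of the numbers themselves would replace the checksum.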
The question was what federal funding agencies can do to leverage their power. Making PIs create an open-access plan for their code & data as part of new grant proposals is something that is already done in certain circles and could be made a broad requirement. I cannot see how a funding agency could stipulate any methodological (not just Mayo’s) criterion without the attempt blowing up in its face.
West: All the funding agencies I've been involved with have methodological criteria.
Mayo: Looking through the NSF's grant proposal guidelines, http://www.nsf.gov/pubs/policydocs/pappguide/nsf14001/gpg_2.jsp, there are no explicit methodology requirements like what you describe at the end of your original post. Naturally the analysis methods will have to be justified in some way to give the reviewer confidence the work is worth funding, but that is on a proposal-by-proposal basis.
What funding agencies have explicit criteria on how data is to be analyzed, summarized and reported?
People in data sciences would know better, particularly in connection with recent issues.
Mayo: You state that "All the funding agencies I've been involved with have methodological criteria." Which agencies are those, and what criteria are you required to use in your analyses? Can you provide documentation to that effect? I ask because I have never heard this before in my conversations with those writing NSF grants (physics division).
here’s a new link on the topic
http://www.nytimes.com/2014/09/19/upshot/to-get-more-out-of-science-show-the-rejected-research.html?_r=0&abt=0002&abg=1
Mayo: There is a lot of variation in how people do science, both between and within fields. If you follow this link, http://www.ncbi.nlm.nih.gov/pubmed/15161896, which appears in the article you linked, you get some empirical evidence on one aspect in one field.
I was a student with the first author at Oxford, and an interesting point about his initial attempts to get access to the REB study proposals, so he could track adherence to them, was that almost all of the REBs said it would be unethical to grant him access. Except for a couple of 'less ethical' REBs, we would not have any empirical evidence of this type of research behaviour.
West: From my own experience there is no whistleblower protection in academia; those who complained about their senior colleagues not being careful enough or over-interpreting results were failed, fired, or not re-hired. My advice to people in such situations is not to try to do anything before they have support from someone senior in their research institute/university who can deal with it on their behalf.
Keith: "almost all of the REBs said it would be unethical to grant him access" – out of curiosity, what was the argument for why this was unethical? (Was there one you're aware of?)
As best I can remember, it was not having informed consent from the study authors…
The first REB to agree thought it would be less ethical to decline an opportunity to find out how often authors actually did what they said they would.
I think the problem mostly comes from poor scientific training and mentorship, particularly when it comes to hypothesis testing (both scientific & statistical). The culture is so incredibly results driven. The stakes have always been high for both the mentor and the trainee, and this has worsened with the austere funding climate over the last decade or so. Taken together, the environment has promoted the publication of scientific minutiae and trivia… a LOT of nonsense results.
I'm both a researcher and a teacher of biostatistics. I teach the latter as a framework for experiments – a scaffold of decision-making rules for what you can/should and cannot/shouldn't do with your data, beginning with conceptualization of the experiment on through inference and presentation of results and conclusions. I spend an inordinate amount of time on bias. Don't even get me started on outlier tests. Ha!
If I had one suggestion for a fix it would be to require every federal grant recipient to give a modest percent effort to an independent statistical consultant who has considerable expertise in experimental design and statistical analysis.
But it is a tale as old as time: statisticians are rarely consulted until it's too late. Too often, those so advised walk out of the office with a pile of useless data accumulated over months or even years of effort, facing a rather profound ethical conundrum. Guess what happens with that data?