Fact-checking program uncovers errors in biomedical research
Researchers have created a fact-checking program that tackles the problem of incorrect results in published biomedical research, whether the errors are intentional or not.
‘Seek & Blastn’ was developed by Professor Jennifer Byrne from the University of Sydney and Dr Cyril Labbé from France’s University of Grenoble Alpes. The program verifies the identities of published nucleotide sequence reagents (DNA and RNA constructs used to target genes) by extracting sequences from papers and checking them against a database holding the wealth of knowledge on genes to date.
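The "seeking" step can be pictured with a short sketch. This is not the Seek & Blastn code, which the article does not describe in detail; it is an illustration that assumes a reagent sequence appears in the text as an unbroken run of at least 15 nucleotide letters. The function name `find_sequences` and the length threshold are hypothetical.

```python
import re

# Assumption for illustration: a reagent sequence is an unbroken run of
# 15 or more nucleotide letters (A, C, G, T or U), in either case.
SEQUENCE_PATTERN = re.compile(r"\b[ACGTU]{15,}\b", re.IGNORECASE)


def find_sequences(text):
    """Return candidate nucleotide sequences found in text, in uppercase."""
    return [match.group(0).upper() for match in SEQUENCE_PATTERN.finditer(text)]


# Usage: only the long run is reported; the short "ACGT" is ignored.
sample = "The siRNA 5'-GCTAGCTAGCTAGGATCCA-3' was used with primer ACGT."
candidates = find_sequences(sample)
```

Each candidate would then be compared against a gene database. A real tool would also have to handle sequences broken up by spaces, line breaks or hyphens, which this sketch ignores.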
“Biomedical reagents are like ingredients in cooking — you use them to discover your experimental results,” said Prof Byrne. “Doing an experiment with wrong reagents either means that you cook something different from what you thought you were cooking, or what you cook is a failure.
“Unfortunately with experiments, failures are not always as obvious as they are in the kitchen. And here we are dealing with fundamental genetic research, and other researchers are using these failures as building blocks for their own work.”
In a cohort of 155 research papers, the new fact-checker combined with manual analysis identified 25% of papers as having sequence errors. Errors uncovered included:
- Sequence reagents that are supposed to target a particular gene but are in fact predicted to target a different one, so the data acquired have nothing to do with the system under study.
- Sequence reagents that are not supposed to target any gene (as a negative control) but are instead predicted to target a human gene, meaning researchers aren’t comparing experimental data to a proper negative control.
- Sequence reagents that are supposed to target a human gene but in fact don’t seem to target any gene, which could result in experiments failing without researchers being aware.
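The three error types above can be sketched as a simple comparison between the target a paper claims for a reagent and the target predicted by a database search. This is a minimal illustration, not the researchers' implementation; the function name `classify_reagent`, the category labels and the use of `None` to mean "no gene" are all assumptions made for this example.

```python
from typing import Optional


def classify_reagent(claimed_target: Optional[str],
                     predicted_target: Optional[str]) -> str:
    """Compare the claimed target of a reagent with the predicted one.

    claimed_target is None when the paper presents the reagent as a
    non-targeting negative control. predicted_target is None when the
    database search finds no matching gene. Both conventions are
    assumptions made for this sketch.
    """
    if claimed_target is None:
        # A negative control should not match any gene.
        return "ok" if predicted_target is None else "control_targets_gene"
    if predicted_target is None:
        # Claimed to target a gene, but no gene matches.
        return "no_target_found"
    if claimed_target.upper() != predicted_target.upper():
        # Matches a gene other than the one stated in the paper.
        return "wrong_target"
    return "ok"


# Usage with made-up gene names:
results = [
    classify_reagent("GENE_A", "GENE_B"),
    classify_reagent(None, "GENE_A"),
    classify_reagent("GENE_A", None),
    classify_reagent("GENE_A", "gene_a"),
]
```

As the article notes, the program's findings were combined with manual analysis, so labels like these would be flags for a human reviewer to check, not final verdicts.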
The researchers noted that they were testing a group of papers already under suspicion, so while the figure doesn’t reflect a baseline error rate, the numbers are still startling.
“That’s quite a lot of wrong sequences in a small group of papers and there will be many more out there, unfortunately, given that nucleotide sequence reagents have been described in literally hundreds of thousands of biomedical publications,” said Prof Byrne.
Errors included both identity errors (sequences that were completely incorrect) and typographic errors (sequences that contained the equivalent of spelling mistakes). The researchers propose that sequence identity errors could be a particular hallmark of research fraud, and could be used to identify fraudulent papers and manuscripts.
“Our hope is that tools like Seek & Blastn will prospectively deter publications that describe incorrect nucleotide sequence reagents and may flag existing publications so that their conclusions can be re-evaluated,” said Prof Byrne.