The road to reproducible research

By Mansi Gandhi
Tuesday, 17 April, 2018

The reproducibility problem has plagued biomedical science for decades, but it came into the limelight in 2012 when scientists, led by Dr Glenn Begley (former Vice President, Hematology and Oncology Research, Amgen), reported that they couldn’t reproduce 47 out of 53 ‘landmark’ publications.

When the paper was published, the scientists behind the project received threats and hate mail, and were told they were “stupid and incompetent”, said Begley. “When I would give seminars there used to be a lot of hostility in the audience.”

Six years later, things have changed. While the reproducibility problem still exists, people are now asking what they need to do to address the problem, said Begley, who is a strong advocate of rigour and reproducibility.

Journals have offered guidelines to authors with an emphasis on reproducibility. The National Institutes of Health (NIH) has rewritten its instructions to people applying for grants. Money is being invested in training students. Grants have been offered to people investigating reproducibility.

A lot has changed, but we haven’t yet seen an impact of those changes, he said, quickly adding that it’s probably too early to expect that to happen.

After spending 15 years in the biopharma industry in the US, Begley returned to Australia last year. Born and raised in Melbourne, Begley took up the role of CEO of BioCurate, a joint venture between Monash University and the University of Melbourne, in May 2017. Data quality and integrity are two important criteria in BioCurate’s efforts to translate research from Melbourne and Monash Universities into commercial outcomes. The company operates independently and has John Brumby as an independent board chair.

The combined research strengths of Monash and Melbourne Universities place them in the top 10 globally in a number of areas, said Begley. The universities are in the same league as Boston University in terms of impact factor, but the difference in venture capital funding is 200-fold. So research judged to be of the same quality isn’t receiving similar funding, said Begley.

Reducing irreproducibility

In order to enhance the quality of biomedical research, the Global Biological Standards Institute (GBSI) was launched in 2013, under the leadership of Dr Ray Cypess. In 2016 the organisation launched the Reproducibility2020 initiative, challenging industry stakeholders to improve the quality of research by 2020.

Leonard Freedman is the founding president and chief scientific officer of GBSI. The issues and drivers for irreproducibility are very complex, he said. There are issues around transparency, manipulating statistics, poor experimental design, poor methodologies and lack of validated reagents, said Freedman. It’s a multifaceted issue and there isn’t going to be any one quick fix, he added. The areas where action can help reproducibility include: study design and analysis; reagents and reference materials; laboratory protocols; and reporting and review, according to GBSI.

Commenting on improving reproducibility, Begley said if he could do one thing, he would insist all the experiments are blinded.

Perverse incentives

The problem is one of perverse incentives, he said. For example, to receive a research grant, a researcher will need a paper in a top-tier journal. There is “no metric for quality. It’s only about flashy science. It’s only about something that’s exciting”, said Begley. We falsely believe that a publication in a top-tier journal is a measure of quality, he said.

Changes are required at multiple levels. A number of things have already begun to happen and they’ll hopefully have an impact, said Begley. The development of ‘mega labs’, ie, labs with 50 students and postdoctoral researchers, should be discouraged in order to improve quality. It’s difficult to appropriately mentor 50 people, he said.

“We need to develop some sort of metric for quality. We could have a more rigorous review system for grant application. One thing I’d like to see is people cannot submit their CV with their grant application. They can only submit the last two papers; then people would be more inclined to look at those last two papers and see if they were representing work of quality rather than something in top-tier journals.”

Another thing that could help is if the names of the authors on grant applications aren’t known to the people reviewing papers and grants, including editors. There is very telling data from the National Health and Medical Research Council (NHMRC) indicating that women scientists do much more poorly on grant applications than male scientists. This reflects the extent of prejudice in our system, he said.

Antibody validation

Poorly validated antibodies are a major contributor to the reproducibility problem, said Freedman.

“Antibodies are key reagents in preclinical research for activities as diverse as protein visualisation, protein quantification, and biochemical signal disruption. Antibody performance is variable, with differences in specificity, reliability, and functionality for different types of experiments (eg, Western blotting and immunofluorescence), manufacturers, and lots, harming reproducibility,” wrote Freedman and his co-authors, Gautham Venugopalan and Rosann Wisman, in the report ‘Reproducibility2020: Progress and priorities [version 1; referees: 2 approved]’ in 2017.

“The Antibody Validation Initiative, involving stakeholders throughout the research community and led by GBSI, is an example that could be replicated in other scientific areas (eg, both stem cells and synthetic biology are areas where a greater emphasis on development of standards and best practices are needed to ensure quality and advance discovery),” said the authors.

Stakeholder solutions include antibody databases, such as the CiteAb database, and repositories, such as the proposed universal library of recombinant antibodies for all human gene products, the authors said. “In all cases, validation is a key component of the solution.”

Research antibodies must demonstrate specificity, selectivity and reproducibility in the application or assay for which they are used, according to GBSI. A range of issues, including production variations, storage conditions, and improper validation techniques, can put even the best designed experiments at risk.

It typically takes Begley around 8–10 hours to read a scientific paper. Part of the time is spent checking reagents. He routinely goes back to see that the antibody being used in the experiment is recommended by the manufacturer. Begley recently found a paper where the investigators made a claim that manufacturers didn’t make; ie, the investigators said that the antibody was selected for a particular protein but when he went to the manufacturer’s site, he discovered that wasn’t the case. Sometimes manufacturers make claims that are not substantiated. Such issues need to be addressed, insisted Begley.

Best practices and guidelines

GBSI believes that in order to accelerate the successful translation of benchside research breakthroughs into approved diagnostics and therapies, best practices and standards must be established around the development, commercial availability and widespread use of assay-specific validated antibodies in biomedical research.

In 2016, GBSI and the Antibody Society brought together industry stakeholders to share insights and weigh in on potential solutions for validating antibodies. Following the meeting, the stakeholders agreed that there needs to be greater validation, said Freedman. But antibodies are complicated: there are multiple applications and there is no simple solution, he said. As a result, it’s difficult to set standards, but there have been efforts to create guidelines.

At the meeting, the stakeholders agreed on creating a tool for consumers: a set of criteria, or scorecards. Each scorecard will differ depending on the application, so the scorecard for western blots will differ from the one for immunoprecipitation.

GBSI has been testing the pilot scorecard system for six months to evaluate and rank research antibody performance. The scorecard is a measuring system with data that would allow researchers to select antibodies for a given application based on their intrinsic on-target, off-target and other technical characteristics, ultimately improving accuracy and resulting in more reproducible research.

After alpha testing the initial scorecards for three of the most used applications (western blot, ELISA and immunohistochemistry), participating manufacturers and academics will meet to review the outcomes and determine what, if any, changes are needed, said GBSI.

“My personal wish is that there will be general adoption of the scoring system and that it becomes the standard when validating an antibody,” said Freedman. “Consumers often rely on scoring systems to make more informed purchasing decisions, as long as the scoring is straightforward and transparent. Why should the purchase of essential reagents such as antibodies be any different?”

Training and automation

Training can also have a significant impact in improving reproducibility, said Freedman. GBSI has already received $2.34 million over five years from NIH for an experimental design training project. The project, ‘Producing Reproducible Experiments by Promoting Reverse Experimental Design’ (PREPaRED), is a partnership between GBSI, Harvard Medical School, Vanderbilt University, Purdue University and Massachusetts Institute of Technology (MIT). It will take the concepts of ‘reverse engineering’ and apply them to training for experimental design.

“Sound experimental design is a core prerequisite for rigorous and reproducible research, which forms the necessary foundation for scientific breakthroughs, and yet it is not frequently taught as a formal part of undergraduate and graduate training,” said Freedman.

One of the other focus areas for GBSI is lab automation. Current lab protocols contribute around $3 billion a year to the preclinical irreproducibility issue, according to the not-for-profit body. The emergence of affordable automation tools and technology can also have a positive impact on reproducibility, said Freedman.
