Supercomputer speeds up genome analysis

Monday, 03 March, 2014

Researchers from the University of Chicago Medicine have found that the process of genome analysis, which typically takes many months, can be greatly accelerated.

Because the genome is so vast, those involved in clinical genetics have turned to exome sequencing, which focuses on the 2% or less of the genome that codes for proteins. An estimated 85% of disease-causing mutations are located in coding regions - but the other 15% come from non-coding regions, once referred to as ‘junk DNA’ but now known to serve important functions. If not for the data-processing challenges of analysis, whole genome sequencing would be the method of choice.

“Whole genome analysis requires the alignment and comparison of raw sequence data and results in a computational bottleneck because of limited ability to analyse multiple genomes simultaneously,” the researchers explained in the journal Bioinformatics.

Study author Dr Elizabeth McNally and her team took on this challenge by working with Beagle, a Cray XE6 supercomputer based at Argonne National Laboratory and said to be one of the fastest computers devoted to life sciences. Supporting computation, simulation and data analysis for the biomedical research community, it is available for use by University of Chicago researchers, their collaborators and ‘other meritorious investigators’.

The Beagle computer at Argonne National Laboratory. Image courtesy of University of Chicago Medicine.

The team used raw sequencing data from 61 human genomes and analysed it on Beagle, using one quarter of the computer’s total capacity. The supercomputer was able to achieve the parallelisation required for concurrent multiple genome analysis, so it could process many genomes simultaneously rather than one at a time.

“This approach not only markedly speeds computational time but also results in increased usable sequence per genome,” the researchers stated. “Relying on publicly available software, the Cray XE6 has the capacity to align and call variants on 240 whole genomes in ~50 h.”

The finding has immediate medical applications. Dr McNally’s Cardiovascular Genetics Clinic, for example, relies on rigorous interrogation of the genes from an initial patient as well as multiple family members to understand, treat and prevent cardiovascular problems.

“In this setting, each patient is a big-data problem,” Dr McNally said, as the number of genes that can be tested for mutations has expanded radically, from five in 2007 to 50-70 now. She explained that at this point, it can be less expensive to sequence the whole genome - a practice which will become even cheaper with the new method.

“With this approach, the price for analysing an entire genome is less than the cost of looking at just a fraction of the genome,” Dr McNally said. “New technology promises to bring the costs of sequencing down to around $1000 per genome. Our goal is to get the cost of analysis down into that range.”

