Genome analysis on a smartphone
The ability to read the genome has vast potential to improve our understanding of human health and disease. Now, researchers at the Garvan Institute of Medical Research and UNSW Sydney have created a method to take genome analysis ‘offline’ by adapting a computer algorithm so that it can perform accurate analysis with far less computer memory than current programs.
The scientists’ algorithm may make it possible to identify infectious diseases in remote locations, or at the hospital bedside, using the computational memory of devices as small as a smartphone. It has been described in the journal Scientific Reports.
Devices that can sequence entire genomes, such as the Oxford Nanopore Technologies MinION sequencers, are small enough today to clip onto a smartphone, and have already been used to track the Ebola virus in Guinea and the Zika virus in Brazil. Such devices can generate more than a terabyte of data in 48 hours, but their use has been limited because comparing or ‘aligning’ the DNA from an unknown sample to a reference database of known genomes is computationally intensive. Until now, this step was only possible with either high-performance computer workstations or an internet connection.
Now, Dr Martin Smith from the Garvan Institute’s Kinghorn Centre for Clinical Genomics and his team have published a computational method that reduces the memory needed to align genomic sequences from 16 GB to 2 GB, making it possible to perform the analysis on the spot using the memory available in a typical smartphone.
“We’re focused on making genomic technologies more accessible to improve human health,” Dr Smith said. “They’re becoming smaller, but still need to function in remote areas, so we created a method that can analyse genomic data, in real time, on just a mobile device.”
The team adapted the Minimap2 program, which aligns DNA sequencing ‘reads’ to a reference library of known genomes. The reference library is usually sorted or indexed, which helps quickly map the sequencing reads to their corresponding positions in a reference genome.
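To illustrate why an index speeds up mapping, here is a minimal Python sketch of a lookup table from short subsequences (k-mers) to their positions in a reference. It is a toy under stated assumptions and not Minimap2's actual index: the function names, the value of `k` and the example sequences are all hypothetical.

```python
from collections import defaultdict

def build_index(reference, k=4):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index

def candidate_positions(read, index, k=4):
    """Count, for each reference offset, how many read k-mers agree with it."""
    votes = defaultdict(int)
    for offset in range(len(read) - k + 1):
        # .get avoids inserting empty entries into the defaultdict
        for pos in index.get(read[offset:offset + k], []):
            votes[pos - offset] += 1
    # Most-supported offsets first
    return sorted(votes.items(), key=lambda item: -item[1])

# Hypothetical toy data: the read is an exact copy of reference[9:17]
reference = "ACGTTGCATGCCGATAGGCTA"
index = build_index(reference)
hits = candidate_positions("GCCGATAG", index)
print(hits[0])  # (offset in reference, number of supporting k-mers)
```

The table holds an entry for every k-mer position in the reference, which suggests why an index covering a large reference library can outgrow the memory of a small device.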
“The challenge, so far, has been that the reference index requires too much computer memory,” Dr Smith said. “We took the approach of splitting the reference library up into smaller segments, against which we mapped the DNA reads. Once we finished mapping to the smaller segments, we pool results together and tease out the noise, much like creating a panorama by stitching together smaller photos.
“Other algorithms, which take a similar approach of splitting up the reference data, produce a lot of spurious and duplicate mappings — just like overlapping photos in the panorama. What we did in this study was fine-tune parameters and select the best mappings across several small indexes. This approach gave us similar accuracy as current standard genomic analyses, which previously required the memory available in high-performance computers.”
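The split-and-merge idea described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the team's implementation: the reference is cut into overlapping chunks, each read is scored against every chunk by a simple ungapped match count, and the merge keeps the single best hit. The function names, chunk sizes and scoring scheme are all hypothetical.

```python
def split_reference(reference, chunk_size, overlap):
    """Yield (start, chunk) pieces that overlap so boundary reads still fit."""
    step = chunk_size - overlap
    for start in range(0, len(reference), step):
        yield start, reference[start:start + chunk_size]
        if start + chunk_size >= len(reference):
            break

def best_hit_in_chunk(read, start, chunk):
    """Return (score, global_position) of the best ungapped placement, or None."""
    best = None
    for pos in range(len(chunk) - len(read) + 1):
        score = sum(a == b for a, b in zip(read, chunk[pos:pos + len(read)]))
        if best is None or score > best[0]:
            best = (score, start + pos)
    return best

def map_read(read, reference, chunk_size=12, overlap=8):
    """Map against each chunk separately, then keep the single best hit."""
    hits = []
    for start, chunk in split_reference(reference, chunk_size, overlap):
        hit = best_hit_in_chunk(read, start, chunk)
        if hit is not None:
            hits.append(hit)
    if not hits:
        return None
    # Merge step: highest score wins; ties go to the leftmost position,
    # which also collapses duplicate hits from overlapping chunks.
    return max(hits, key=lambda h: (h[0], -h[1]))

# Hypothetical toy data: the read is an exact copy of reference[9:17]
reference = "ACGTTGCATGCCGATAGGCTA"
read = "GCCGATAG"
split_result = map_read(read, reference)
whole_result = best_hit_in_chunk(read, 0, reference)
print(split_result, whole_result)
```

Only one chunk needs to be held in memory at a time, and in this toy case the merged result matches the one obtained by scanning the whole reference at once. The overlap must be at least one base shorter than the read length or more for a read that straddles a chunk boundary to remain mappable.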
Dr Smith’s team compared the accuracy of their algorithm with that of standard genomics workflows. Their method reproduced 99.98% of the standard alignments and, by using the smaller index segments, mapped an additional 1% of sequencing reads.
“The potential of lightweight, portable genomic analysis is vast,” Dr Smith said. “We hope that this technology will one day be applied in the context of point-of-care microbial infections in remote regions, or in doctors’ hands at the hospital bedside.”