DNA sequencing programs vulnerable to cyber attack

Monday, 14 August, 2017

Everyone knows the importance of practising good cybersecurity, but we bet you never thought your computer could be hacked using DNA.

Researchers from the University of Washington (UW) recently analysed the security hygiene of common, open-source DNA processing programs, which can be used to reveal everything from a person’s ancestry to their fitness levels to the microorganisms that live in their gut. Unfortunately, what the researchers found was a proliferation of poor computer security practices.

The team discovered known security gaps in many open-source software programs used to analyse DNA sequencing data. Some of these programs were written in unsafe languages known to be vulnerable to attacks, in part because they were first written by small research groups that likely weren’t expecting much, if any, adversarial pressure. But as the cost of DNA sequencing has plummeted over the last decade, open-source programs have been adopted more widely in medical- and consumer-focused applications.

The researchers even went so far as to demonstrate a hacking technique that is scientifically fascinating though arguably not the first thing an adversary might attempt — using biological molecules to infect a computer through normal DNA processing.

DNA is, at its heart, a system that encodes information in sequences of nucleotides. Through trial and error, the team found a way to include executable code — similar to computer worms that occasionally wreak havoc on the internet — in synthetic DNA strands.
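To make the idea of storing arbitrary bytes in nucleotides concrete, here is a minimal sketch in Python. It assumes a simple two-bits-per-base mapping (A=00, C=01, G=10, T=11); the article does not state which mapping the team used, and the function names are hypothetical. It encodes harmless text only and ignores the practical constraints of synthesising real DNA.

```python
# Assumed mapping for illustration only: two bits per nucleotide.
BASE_FOR_BITS = {0b00: "A", 0b01: "C", 0b10: "G", 0b11: "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bit pair first."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASE_FOR_BITS[(byte >> shift) & 0b11])
    return "".join(bases)


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand produced by bytes_to_dna back into bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            if base not in BITS_FOR_BASE:
                raise ValueError(f"unexpected base: {base!r}")
            byte = (byte << 2) | BITS_FOR_BASE[base]
        out.append(byte)
    return bytes(out)
```

Under this assumed mapping, every byte becomes exactly four bases, so any file, whether text or program code, can in principle be written out as a sequence of A, C, G and T and recovered by a program that reads the strand back.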

This test tube holds hundreds of billions of copies of the exploit code stored in synthetic DNA molecules, which has the potential to compromise a computer system when it is sequenced and processed. Image credit: Dennis Wise/University of Washington.

To create optimal conditions for an adversary, they introduced a known security vulnerability into a software program that’s used to analyse and search for patterns in the raw files that emerge from DNA sequencing. When that particular DNA strand is processed, the malicious exploit can gain control of the computer that’s running the program — potentially allowing the adversary to look at personal information, alter test results or even peer into a company’s intellectual property.
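The article does not say which vulnerability was introduced. As an illustration of the kind of input checking a program that reads raw sequencing files could perform, the sketch below parses FASTQ-style text (four lines per read: header, sequence, separator, quality) and rejects records that are malformed or unexpectedly long. The length limit, the allowed alphabet and the function name are assumptions chosen for this example, not details from the study.

```python
# Hypothetical limits chosen for this example only.
MAX_READ_LENGTH = 1000
ALLOWED_BASES = set("ACGTN")


def parse_fastq(text: str, max_read_length: int = MAX_READ_LENGTH):
    """Return (read_id, sequence, quality) tuples, rejecting suspicious records."""
    lines = text.splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain four lines per record")

    records = []
    for i in range(0, len(lines), 4):
        header, sequence, separator, quality = lines[i:i + 4]
        if not header.startswith("@"):
            raise ValueError(f"record {i // 4}: header must start with '@'")
        if not separator.startswith("+"):
            raise ValueError(f"record {i // 4}: separator must start with '+'")
        # Check the length before anything downstream relies on it.
        if len(sequence) > max_read_length:
            raise ValueError(f"record {i // 4}: read exceeds {max_read_length} bases")
        if not set(sequence) <= ALLOWED_BASES:
            raise ValueError(f"record {i // 4}: unexpected characters in sequence")
        if len(quality) != len(sequence):
            raise ValueError(f"record {i // 4}: quality and sequence lengths differ")
        records.append((header[1:], sequence, quality))
    return records
```

Python itself checks array bounds, so this sketch cannot overflow a buffer. The point is the habit it shows: treat every sequenced read as untrusted input and validate its size and content before handing it to lower-level code.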

“To be clear, there are lots of challenges involved,” said Lee Organick, a co-author on the study. “Even if someone wanted to do this maliciously, it might not work. But we found it is possible.”

So far, the researchers stress, there’s no evidence of malicious attacks on DNA synthesising, sequencing and processing services. But their analysis of software used throughout that pipeline found known security gaps that could allow unauthorised parties to gain control of computer systems — potentially giving them access to personal information or even the ability to manipulate DNA results.

“We don’t want to alarm people or make patients worry about genetic testing, which can yield incredibly valuable information,” said Associate Professor Luis Ceze, a co-author on the study. “We do want to give people a heads-up that as these molecular and electronic worlds get closer together, there are potential interactions that we haven’t really had to contemplate before.”

This output from a sequencing machine includes the UW team’s exploit, which is being sequenced with a number of unrelated strands. Each dot represents one strand of DNA in a given sample. Image credit: Dennis Wise/University of Washington.

So what’s the solution? Researchers at the UW Molecular Information Systems Lab (MISL) are currently working to create next-generation archival storage systems by encoding digital data in strands of synthetic DNA. Although their system relies on DNA sequencing, it does not suffer from the security vulnerabilities identified in the present research, in part because the MISL team has anticipated those issues and because their system doesn’t rely on typical bioinformatics tools.

Recommendations to address vulnerabilities elsewhere in the DNA sequencing pipeline include: following best practices for secure software; incorporating adversarial thinking when setting up processes; monitoring who has control of the physical DNA samples; verifying sources of DNA samples before they are processed; and developing ways to detect malicious executable code in DNA.

“There is some really low-hanging fruit out there that people could address just by running standard software analysis tools that will point out security problems and recommend fixes,” said study co-author Karl Koscher. “There are certain functions that are known to be risky to use, and there are ways to rewrite your programs to avoid using them. That would be a good initial step.”
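As a rough illustration of the "low-hanging fruit" Koscher describes, the Python sketch below scans C source text for calls to a few library functions that are widely regarded as risky because they do not check buffer sizes. The list of functions and the name `find_risky_calls` are assumptions made for this example; the article does not name the functions or tools the researchers used, and a real static analysis tool does far more than match text.

```python
import re

# Assumed list for illustration: C functions that copy or format data
# without checking the size of the destination buffer.
RISKY_FUNCTIONS = ("strcpy", "strcat", "sprintf", "gets")

# \b keeps safer relatives such as fgets or snprintf from matching.
_CALL_PATTERN = re.compile(r"\b(" + "|".join(RISKY_FUNCTIONS) + r")\s*\(")


def find_risky_calls(source: str):
    """Return (line_number, function_name) for each apparent risky call."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in _CALL_PATTERN.finditer(line):
            findings.append((line_number, match.group(1)))
    return findings
```

This is a plain text match, so it will also flag names that appear inside comments or string literals. It is meant only to show how cheaply a first pass over a codebase can be made.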

The study results will be presented at the 26th USENIX Security Symposium in Vancouver this week.

Top image caption: This data file tells researchers what sequence their DNA had (GGGCGT, for example), as well as the quality of the read (with E being higher quality than A). The team demonstrated that it is technically feasible to place malicious code in a strand of DNA that, when sequenced in this manner, could attack the software used for analysis. Image credit: Dennis Wise/University of Washington.
