‘You could see why someone might want to hack DNA’

Deoxyribonucleic acid, otherwise known as DNA, carries the genetic instructions for all living things, human beings included. You can think of it as biological data storage. Genetic researchers have actually used DNA to store data, such as Amazon gift cards, GIFs, and books. Now University of Washington scientists Peter Ney, Karl Koscher, Lee Organick, Luis Ceze, and Tadayoshi Kohno have been able to use DNA to store and transmit malware in a move that sounds like a plot from a William Gibson novel.

We needn’t worry about the DNA we leave behind in our saliva or hair strands infecting computers with malware. Not yet, anyway. The University of Washington researchers got their DNA-encoded malware to execute only by introducing a vulnerability into the FASTQ compression utility, an application that processes DNA data files. That may sound like cheating, but cybersecurity professionals know that security vulnerabilities are possible in all types of software.

DNA is made up of units called nucleotides. Each nucleotide carries one of four nucleobases: thymine (T), adenine (A), guanine (G), or cytosine (C). With four possible bases, each position in a strand can in principle represent two binary digits. The researchers mapped the 1s and 0s of their malware onto the nucleotides of the DNA they synthesized, so that the strand encoded computer-executable code.
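To make the idea of assigning 1s and 0s to nucleotides concrete, here is a minimal Python sketch. The two-bits-per-base table (A=00, C=01, G=10, T=11) and the function name bytes_to_dna are assumptions chosen for illustration; the article does not say which mapping the researchers used.

```python
# Hypothetical two-bit mapping, chosen for illustration only.
# The article does not specify the mapping the researchers used.
BASE_FOR_BITS = {0b00: "A", 0b01: "C", 0b10: "G", 0b11: "T"}


def bytes_to_dna(data: bytes) -> str:
    """Encode bytes as a string of bases, four bases per byte."""
    bases = []
    for byte in data:
        # Walk each byte from its most significant bit pair to its least.
        for shift in (6, 4, 2, 0):
            bases.append(BASE_FOR_BITS[(byte >> shift) & 0b11])
    return "".join(bases)


if __name__ == "__main__":
    # "H" is 01001000 and "i" is 01101001 in binary.
    print(bytes_to_dna(b"Hi"))  # prints CAGACGGC
```

Under this assumed scheme, every byte becomes exactly four bases, so a payload of n bytes needs a strand of 4n nucleotides.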

Part of the motivation for the University of Washington scientists’ malware DNA research is to prepare the cybersecurity field for the new malware methods of the future. Tadayoshi Kohno explained:

We want to understand and anticipate what the hot new technologies will be over the next 10 to 15 years, to stay one step ahead of the bad guys. It’s an emerging field that other security researchers haven’t looked at, so the intrigue was there. Could we compromise a computer system with DNA biomolecules?

The process described in the researchers’ paper begins with DNA strands in a test tube. From there, they experimentally evaluated whether DNA can be used to carry malicious software, by synthesizing DNA strands that contained computer security exploits. Then they observed a side channel resulting from fundamental properties of DNA sequencing technologies, and considered how that side channel could be exploited. Next, they evaluated the security of DNA processing applications. Finally, they derived a threat model for the DNA sequencing pipeline.

In a nutshell, the malware is treated as binary data, and those 1s and 0s are mapped onto the C, G, A, and T nucleobases of physical DNA. Sequencing that physical DNA produces a DNA data file. The researchers then processed that file with the FASTQ compression utility, which contained the vulnerability they had introduced. Through the modified utility, they were able to execute malware on a computer.
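The reverse direction, from a DNA data file back to bytes, can be sketched in the same spirit. A FASTQ record is four lines of text: an identifier starting with "@", the base sequence, a separator starting with "+", and one quality character per base. The read name, the quality string, the two-bit mapping and both function names below are hypothetical; this only shows how a sequence in such a file maps back to bytes, not how the researchers' exploit worked.

```python
# Hypothetical two-bit mapping, chosen for illustration only.
BITS_FOR_BASE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def read_fastq_sequence(record: str) -> str:
    """Return the base sequence from a single four-line FASTQ record."""
    lines = record.strip().splitlines()
    if len(lines) != 4 or not lines[0].startswith("@") or not lines[2].startswith("+"):
        raise ValueError("not a four-line FASTQ record")
    if len(lines[1]) != len(lines[3]):
        raise ValueError("sequence and quality lengths differ")
    return lines[1]


def dna_to_bytes(sequence: str) -> bytes:
    """Decode a base string back into bytes, four bases per byte."""
    if len(sequence) % 4 != 0:
        raise ValueError("need four bases per byte")
    out = bytearray()
    for start in range(0, len(sequence), 4):
        value = 0
        for base in sequence[start:start + 4]:
            # Shift in two bits per base, most significant pair first.
            value = (value << 2) | BITS_FOR_BASE[base]
        out.append(value)
    return bytes(out)


# A made-up record: the read name and quality characters are placeholders.
EXAMPLE_RECORD = "@hypothetical_read_1\nCAGACGGC\n+\nIIIIIIII\n"

if __name__ == "__main__":
    print(dna_to_bytes(read_fastq_sequence(EXAMPLE_RECORD)))  # prints b'Hi'
```

Note that this sketch checks lengths before using the data. A program that copies a sequence into a fixed-size area without such checks is the kind of place where a vulnerability could be introduced.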

I spoke to Dr Kat Arney, science writer, occasional Naked Security contributor and broadcaster, for her perspective. She’s the author of the genetics books Herding Hemingway’s Cats and How to Code a Human, and presenter of the monthly Naked Genetics podcast.

Back in the very early days of DNA sequencing, researchers would use huge slabs of gel and radioactively labelled chemicals to obtain the sequence of a piece of DNA, writing down each ‘letter’ with a pen and paper.

Then, once the process became automated from the 1980s onwards, DNA sequencing machines and analyzing programmes started to store DNA sequences as computer files. So although it’s not new to convert DNA sequences into computer files that can be read and analysed, this is the first time that someone has tried to deliberately generate a piece of malicious code that messes with those files and turns them into executable malware.

In this case, it looks like the ‘hackers’ have made a lot of workarounds to get this to work, introducing a vulnerability into the analyzing program and doing a lot of optimization of the DNA sequence encoding the malware to make it work. So it’s not something that any random bad person could go and do next week.

But it does expose important security vulnerabilities. Nobody expected someone to do this, but given the sensitive personal nature of DNA information, you could see why someone might want to hack a DNA sequencing facility. It should be a wake-up call to DNA sequencing companies and facilities to make sure this can’t become a practical reality.

Working with DNA has become a lot more affordable recently. The cost of sequencing a human genome dropped from about $100,000 in 2009 to a mere $1,000 in 2014. Maybe at some point the price will fall below $100, and if that happens, using synthesized DNA to do harm may become very attractive to cyberattackers.