Here’s an intriguing cryptocoin story – one that was three years in the hatching.
At the World Economic Forum Annual Meeting in 2015 – you probably know it as “Davos”, because that’s where it’s held – a biomedical researcher called Nick Goldman gave what you might call an improbable paper.
Goldman’s academic papers tend to have names such as More on the Best Evolutionary Rate for Phylogenetic Analysis and Maximum likelihood inference of small trees in the presence of long branches.
(We needed a research sabbatical just to figure out the titles. Turns out that second paper is about statistics, not arboriculture.)
But Goldman’s Davos 2015 paper was rather different, in a cool sort of way: he was at Davos to promote the idea that DNA – the funky spiral molecule that encodes life as we know it – can also be used as a reliable medium for long-term data storage.
If that sounds like a weird idea, it is – and you may be wondering how it could ever amount to anything more than an overpriced intellectual pretension, good for little beyond the prestige of a speaking gig at Davos.
Where did all that data go?
There’s more to it than just showing off, however, namely that we don’t yet have any truly reliable long-term data storage system – not even for tiny amounts of data.
Sure, we have rock carvings, cave paintings, even books, that have already survived hundreds, thousands, tens of thousands of years.
Perhaps that’s an existence proof that we already know how to keep things safe for millennia?
But – feebly tautological though it sounds, it’s important – we only have the ones we still have, and they survived by accident rather than by design.
Most of the data from yesteryear has vanished for ever, worn away by wind and water, leached by corrosive minerals, burned by censors, lost through carelessness, or simply rotted away.
We have a lot more data we want to store these days – all those Bitcoin transactions in the ultra-redundant copies of the blockchain, for a start – and yet we still don’t have a long-term storage medium we know we can rely on.
Who can be sure how long CDs will really last, for example?
Will those backups you burned in 1999 actually last for 100 years, assuming someone still has the software to read the files on them? (CDs only came out in 1982, so no one has done this yet.)
How long will hard drives keep their magnetism?
How long until your flash drives lose their electrons and your data turns into digital detritus?
We can store truly huge amounts of data for quite a long time, but we may be unable to store even quite modest amounts for a truly long time.
This issue is entirely relevant to computer security, which operates under the so-called holy trinity of confidentiality, integrity and availability – or, more alliteratively, secret, safe and sound.
What about DNA?
DNA molecules don’t last for ever, to be sure, but we can make reliable estimates of how long they can last if stored in modestly controlled conditions – estimates that are helped by the fact that we have natural DNA samples that we can still sequence, and that are very ancient indeed.
DNA is also compact, and can be fairly reliably copied, making multiple redundant backups rather easy: once you’ve converted your precious data into a beaker of DNA dust, you will have zillions of copies of your data, all mixed up together.
Give 1000 different people a scoop of your data dust to take away, and you’ve figuratively, if not quite literally, scattered your data around the globe for safe keeping.
As long as it’s encoded consistently, so it can be strung back together reliably, and as long as it is encrypted if you want to keep the contents private (or not, if you want to distribute it as an anti-censorship measure)…
…data encoded as nucleotide sequences might be just the sort of archival system that the world has been wanting for years.
Anyway, back to Davos 2015.
By way of adding a memorable keepsake to his 2015 presentation, Goldman took BTC 1 (one bitcoin, then worth about $200), cleverly encoded its private key into DNA dust, and gave each delegate their very own copy of the bitcoin, sealed in a sample tube.
In the sort of challenge that techies love to do “because they can”, Goldman said that if anyone could read out the bitcoin within three years, they could keep it.
(Actually, he couldn’t have stopped anyone taking said Bitcoin after reading the DNA molecules – once you get the private key, you have the cryptographic secret needed to spend it, and that’s that.)
As for the aforementioned bitcoin, someone spent it!
And that’s because Sander Wuyts, who describes himself as a computational microbiologist and “a real DNA-junkie”, completed Nick Goldman’s proof-of-concept challenge.
Nearly three years after Goldman wrote his data out into the DNA sample, Wuyts managed to read it back – just in time to spend the hidden bitcoin himself.
Given the meteoric rise in BTC’s value in the last year, Wuyts ended up with much more than just a $200 keepsake.
Note that this wasn’t a cryptographic challenge. Goldman himself scrambled the input data with what he called a “random keystream”, and some reports have taken this to mean that Wuyts had to crack a cipher to beat the challenge, and thus that this was a cryptographic puzzle. In fact, there was no encryption involved – we’re assuming the randomisation phase of the data processing was simply to ensure that repetitive chunks of input data didn’t cause overly repetitive molecular structure in the DNA. We are guessing that having “more random” nucleotide sequences in the DNA increases its longevity and improves the reliability of sequencing it to read out its contents. The “random key” part was a detail of encoding, not of encipherment.
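To make the difference between encoding and encipherment concrete, here is a minimal Python sketch of the general idea. It is emphatically not Goldman’s actual scheme, which this article doesn’t describe: the two-bits-per-nucleotide mapping, the seed value and all the function names are assumptions made up for illustration. The point is that the keystream is derived from a value everyone knows, so it hides nothing; it merely breaks up long runs of the same nucleotide.

```python
import random

# Hypothetical mapping, chosen for illustration: 00->A, 01->C, 10->G, 11->T.
BASES = "ACGT"

def keystream(seed: int, length: int) -> bytes:
    """Derive a repeatable pseudorandom byte stream from a PUBLIC seed."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(length))

def whiten(data: bytes, seed: int) -> bytes:
    """XOR data with the public keystream; applying it twice undoes it."""
    return bytes(d ^ k for d, k in zip(data, keystream(seed, len(data))))

def to_bases(data: bytes) -> str:
    """Write each byte as four nucleotides, two bits at a time."""
    return "".join(BASES[(byte >> shift) & 3]
                   for byte in data for shift in (6, 4, 2, 0))

def from_bases(seq: str) -> bytes:
    """Read nucleotides back into bytes, four letters per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for ch in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(ch)
        out.append(byte)
    return bytes(out)

def longest_run(seq: str) -> int:
    """Length of the longest stretch of one repeated nucleotide."""
    best = run = 0
    prev = ""
    for ch in seq:
        run = run + 1 if ch == prev else 1
        prev = ch
        best = max(best, run)
    return best

SEED = 2015                      # assumed public value, not a secret
data = bytes(32)                 # worst case: 32 zero bytes
plain_dna = to_bases(data)       # one long run of "A"
mixed_dna = to_bases(whiten(data, SEED))
recovered = whiten(from_bases(mixed_dna), SEED)
```

Anyone holding the tube can regenerate the same keystream from the seed and undo the XOR, which is why this step counts as encoding rather than encryption: without it, repetitive input turns into one long run of the same letter.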
DNA disk drives any time soon?
Don’t get too excited just yet.
Wuyts just happened to have access to state-of-the-art genome sequencing gear, a coterie of DNA experts, and the financial sponsorship of the company that made the sequencer he used, presumably because of the neat publicity they could expect if he were to succeed. (Sequencers are expensive to run as well as to buy.)
Nevertheless, as a proof-of-concept, it’s a fascinating outcome.
Was it really the sort of issue that you might expect the World Economic Forum to consider?
Not really: we’re not going to see falling sea levels, soaring economies and clean water for everyone as an outcome of this.
But no matter how sceptical you might be of solving “developed world problems” of this sort…
…we really don’t have a known-good way of storing our precious data for future centuries, even though we’re talking of going to Mars.
We apologise if you were expecting a plain-English explanation of the computer science parts of the DNA encoding used here. In this article, we wanted to focus on the why of the challenge, rather than the how. If you’d like to see a follow-up that looks at the algorithmic aspects of this story, please let us know in the comments below and we’ll see what our Editor-in-Chief thinks of the idea…