Naked Security

Yahoo breach: I’ve closed my account because it used MD5 to hash my password

This morning I received an email from Yahoo entitled “Important Security Information for Yahoo Users”. Five minutes later I’d closed my account.

The email was Yahoo’s admission that I was one of 1bn victims in a data breach of staggering proportions.

It wasn’t that this could be the biggest breach ever that pushed me over the edge and made me close my account. Nor was it that this was the second mega-breach that Yahoo has fessed up to just this year.

It wasn’t even that Yahoo apparently had no idea that this three-year-old breach had even occurred until law enforcement told them about it (although that certainly helped to convince me).

The straw that broke this camel’s back was this section of Yahoo’s email:

The stolen user account information may have included names, email addresses, telephone numbers, dates of birth, hashed passwords (using MD5) and, in some cases, encrypted or unencrypted security questions and answers.

There’s plenty to get your teeth into there, and plenty to be mad about (date of birth in the wild for 3 years…) but I’m a password nerd and I was drawn to hashed passwords (using MD5).

MD5 is a hashing function. The idea is that no matter what you feed it, whether it’s an eight-character password or the complete works of Shakespeare, you’ll get a pseudorandom 128-bit hash value out of it.
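A quick way to see the fixed-size output for yourself is the sketch below, which uses Python's standard hashlib module. The two inputs are my own stand-ins: an eight-character password and a long run of repeated text in place of the complete works of Shakespeare.

```python
import hashlib

# Illustrative inputs (assumptions, not taken from the breach):
short_input = b"password"                        # an eight-character password
long_input = b"To be, or not to be. " * 50_000   # stand-in for a very long text

short_digest = hashlib.md5(short_input).digest()
long_digest = hashlib.md5(long_input).digest()

# Both digests are 16 bytes (128 bits), whatever the size of the input.
print(len(short_input), "bytes in ->", len(short_digest) * 8, "bits out")
print(len(long_input), "bytes in ->", len(long_digest) * 8, "bits out")
print(short_digest.hex())
print(long_digest.hex())
```

However much data goes in, the same 128 bits come out.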

What makes hashes useful for password storage is that those outputs aren’t reversible: if you know the password you can calculate the hash, but if you know the hash there’s no way to unravel it back into the original password.

Hashing allows websites to check that your password is correct without actually having to store it. So long as the hash of the password you use when you log in matches the hash on record, you must have entered the correct password.
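As a minimal sketch of that check, assuming a site that keeps nothing but a bare MD5 hash on record (the function names and the example password are hypothetical, and this is an illustration of the idea rather than a recommended design):

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    """Return the hex MD5 hash of a password (bare hash, for illustration only)."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

# What the site keeps on record: the hash, never the password itself.
stored_hash = hash_password("correct horse")

def check_login(attempt: str) -> bool:
    """Hash the login attempt and compare it with the hash on record."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(hash_password(attempt), stored_hash)

print(check_login("correct horse"))  # the right password matches
print(check_login("wrong guess"))    # anything else does not
```

The site can confirm a login even though the password itself appears nowhere in its records.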

MD5 isn’t a good choice for this kind of hashing because in reality it doesn’t produce truly random hashes, and it’s possible to create MD5 “collisions” where two different inputs produce the same hash.

Its use has been discouraged in favour of better hashing functions for two decades.

But that isn’t why I closed my account.

I didn’t close my account because Yahoo used MD5 rather than a more collision-resistant hashing function. I’d still have closed it if Yahoo had said it was using SHA-3.

I closed it because a plain old hash by itself isn’t really enough to keep my password a secret.

A crook who steals a database of password hashes has to guess at the passwords it might contain. The process is something like this: guess a password, pass it through a hashing function and see if the resulting hash matches anything in the database.
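That guess, hash and compare loop is short enough to sketch in a few lines. The user names, passwords and guess list below are all invented for illustration; a real attacker would work from far larger word lists and specialist hardware.

```python
import hashlib

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# A made-up "stolen" table of user -> unsalted MD5 hash.
stolen = {
    "alice": md5_hex("letmein"),
    "bob": md5_hex("qwerty"),
    "carol": md5_hex("x7#Lq!92vPz"),  # not in the guess list below
}

# A tiny, hypothetical list of guesses.
guesses = ["123456", "password", "qwerty", "letmein"]

def crack(stolen_hashes: dict, candidate_passwords: list) -> dict:
    """Hash each guess once and look it up against every stolen hash."""
    users_by_hash = {}
    for user, digest in stolen_hashes.items():
        users_by_hash.setdefault(digest, []).append(user)

    cracked = {}
    for guess in candidate_passwords:
        for user in users_by_hash.get(md5_hex(guess), []):
            cracked[user] = guess
    return cracked

print(crack(stolen, guesses))
```

Note that with unsalted hashes a single hashed guess is tested against every account in the database at once.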

The guesses the criminal makes are important but speed is king. The more guesses they can make, the more passwords they’ll uncover.

In an offline attack against a victim who has no idea their password has been stolen, the criminals hold most of the cards. They can use whatever specialist password cracking hardware they can get their hands on, and they have all the time in the world.

Effective password storage is about making your password too difficult, too time-consuming or too expensive to be worth bothering with even if your adversary can generate hundreds of billions of hashes per second.
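Some back-of-the-envelope arithmetic shows what that guessing speed means in practice. Every figure below is my own assumption for illustration: an eight-character password drawn from the 95 printable ASCII characters, an attacker managing 100 billion guesses per second, and a storage scheme that makes each guess 10,000 times more expensive.

```python
# All figures are assumptions chosen for illustration.
ALPHABET_SIZE = 95                    # printable ASCII characters
PASSWORD_LENGTH = 8
GUESSES_PER_SECOND = 100_000_000_000  # 100 billion plain hashes per second
COST_MULTIPLIER = 10_000              # hypothetical slowdown per guess

keyspace = ALPHABET_SIZE ** PASSWORD_LENGTH

# Time to try every possible password against a plain, fast hash.
seconds_plain = keyspace / GUESSES_PER_SECOND
hours_plain = seconds_plain / 3600

# The same search when each guess costs COST_MULTIPLIER times as much.
seconds_slowed = seconds_plain * COST_MULTIPLIER
years_slowed = seconds_slowed / (3600 * 24 * 365)

print(f"possible passwords: {keyspace:,}")
print(f"plain hash:  {hours_plain:.1f} hours to try them all")
print(f"slowed hash: {years_slowed:.1f} years to try them all")
```

Under those assumptions the whole keyspace falls in roughly 18 hours against a plain hash, and takes about 21 years once each guess is 10,000 times slower.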

By themselves hashes just don’t pose enough of a barrier. Instead they should be used as one component in a more complex “salt, hash and stretch” password storage routine like PBKDF2, bcrypt or scrypt.

Salting adds a unique random value to your password before it is hashed, so that even if somebody else is using the same password you’ll still have different hashes. Crooks will have to make two successful guesses to crack two identical passwords, not one. It also stops attackers from using lists of pre-computed hashes, because they would need a hash lookup list for every possible salt.
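Here is a small sketch of salting on its own, assuming a 16-byte random salt stored alongside each hash and SHA-256 as the underlying hash function (both are my choices for the example, and the function name is hypothetical). It shows two users with the same password ending up with different hashes.

```python
import hashlib
import os

def salted_hash(password: str, salt: bytes = None):
    """Hash a password with a per-user salt; returns (salt, hex digest)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each new password
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest

# Two users who happen to pick the same password.
salt_a, hash_a = salted_hash("letmein")
salt_b, hash_b = salted_hash("letmein")
print(hash_a == hash_b)  # different salts, so different hashes

# At login the site reuses the stored salt to recompute the hash.
_, recomputed = salted_hash("letmein", salt_a)
print(recomputed == hash_a)
```

Salting alone does nothing to slow down each individual guess, which is why it is only one part of the routine.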

Stretching means repeating the hashing process over and over and over again, usually many thousands of times for each password.
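Python's standard library includes PBKDF2, so salting and stretching together can be sketched as follows. The iteration count, salt length and function names are assumptions for the example, not a tuned recommendation and not a description of anything Yahoo did.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; real systems tune this to their hardware

def store_password(password: str):
    """Return the (salt, derived key) pair a site would keep on record."""
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, derived

def verify_password(password: str, salt: bytes, derived: bytes) -> bool:
    """Repeat the same salted, stretched calculation and compare the results."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, derived)

salt, derived = store_password("correct horse")
print(verify_password("correct horse", salt, derived))
print(verify_password("wrong guess", salt, derived))
```

A legitimate login pays the cost of the repeated hashing once, but an attacker has to pay it again for every single guess.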

To see the difference that password storage choices make, just look at Ashley Madison. When the adultery website was breached, its password database was picked up by security researchers. Some passwords were stored as MD5 hashes and others were salted, hashed and stretched using bcrypt.

A lot of security researchers simply didn’t try to crack the bcrypt hashes, but one blogger who did managed to recover 4,000 passwords after seven days of 24-hour password cracking. A different blogger spent 10 days cracking the MD5 hashes and successfully guessed 11 million of them.

So, sure, Yahoo might have been using MD5 as the hashing algorithm at the heart of a salt, hash and stretch routine, but if it was, why not say so? Would you use a phrase that’s already used to describe a popular but ineffective form of password storage to describe one that isn’t?

The company left me having to interpret its words and my filter for that was all that has gone before. There was no doubt. From upgrading RSA keys at the last minute to delivering TLS years after its competitors, Yahoo has made a habit of being late to the party.

In the context of that tardiness, the idea that in 2013 Yahoo was using password storage from a bygone era does not seem far-fetched. Plenty of others were doing the same.

To close your Yahoo account, as I have done, read Yahoo’s guide to closing your account. If you do close your account, be warned that the closure takes 90 days to complete, so be sure to change your password as well.

For the definitive Naked Security guide to password storage, take a look at Paul Ducklin’s How to store your users’ passwords safely.