Multi-word passphrases not all that secure, says Cambridge University


Think that a passphrase of multiple, random dictionary words is as unguessable as a long string of gibberish, but easier to remember?

Research from the Computer Laboratory at the University of Cambridge suggests that this might not be so.

While passphrases using dictionary words may not be as vulnerable as individual passwords, they may still be cracked by dictionary attacks, the research found.

Security researcher Joseph Bonneau reports, in a recent paper written with Ekaterina Shutova, that his team studied the problem by looking not at the theoretical space of choices but at the real-life passphrases that people actually string together.

To find such a selection of passphrases, his team used data crawled from the now-defunct Amazon PayPhrase system, introduced last year for US users only.

The goal wasn’t to evaluate the security of the scheme as deployed by Amazon, Bonneau says, but rather to learn more about how people choose passphrases in general.

Amazon’s was “a relatively limited data source”, he writes, but the research results do “suggest some caution on this approach”.

In the original version of the Amazon site, passphrases had to be at least two words long. Error messages indicated when a passphrase was already in use.


The first experiment was a dictionary attack using lists of movie titles, sports team names, and dozens of other types of proper nouns crawled from Wikipedia, along with idiomatic phrases crawled from sources including Urban Dictionary.


Here’s what the researchers said:

We found about 8,000 phrases using a 20,000 phrase dictionary. Using a very rough estimate for the total number of phrases and some probability calculations, this produced an estimate that passphrase distribution provides only about 20 bits of security against an attacker trying to compromise 1% of available accounts. This is far better than passwords, which are usually under 10 bits by this same metric, but not high enough to make online guessing impractical without proper rate-limiting.
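To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes that an n-bit estimate can be read as roughly equivalent to searching 2^n equally likely candidates, which simplifies the metric used in the paper. The rate of 1,000 guesses per hour is a made-up number for illustration, not anything Amazon is known to have enforced.

```python
def guesses_for_bits(bits):
    """Approximate size of a uniform guessing space worth `bits` bits."""
    return 2 ** bits


def hours_to_exhaust(bits, guesses_per_hour):
    """Hours needed to try every candidate at a fixed guessing rate."""
    return guesses_for_bits(bits) / guesses_per_hour


# Hypothetical guessing rate (an assumption, not a measured value).
HYPOTHETICAL_RATE = 1_000  # guesses per hour

for label, bits in [("password", 10), ("passphrase", 20)]:
    total = guesses_for_bits(bits)
    hours = hours_to_exhaust(bits, HYPOTHETICAL_RATE)
    print(f"{label}: {bits} bits ~ {total:,} guesses, "
          f"about {hours:,.1f} hours at the assumed rate")
```

Under these assumptions, 20 bits works out to about a million guesses against roughly a thousand for 10 bits: a big improvement, but still within reach of an online attacker who isn't being throttled.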

The debate about how easily dictionary attacks can break passphrases is interesting. I am not adept at the mathematics involved, but random-word passphrases certainly do have their proponents.

Take, for example, the Slashdot discussion on this issue.

A random selection of commenters’ thoughts on the entropy (i.e., the password strength/resistance to brute-force searching) of common-word passphrases:

  • "IMHO, you CANNOT use straight dictionary words (regardless of language, and yes, I do mean Klingon and Sindarin!) in your passwords without some sort of numeric or symbolic character replacement pattern."
  • "Of course you can. If they're selected randomly, an attacker has to use the complete source space for the random selection in a brute force attack."
  • "diceware.com gives you 12.9 bits of entropy per word. Brute forcing that is already more trouble than it's worth at three words, and five would require nation-state resources to crack."
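The 12.9-bit figure in the last comment can be checked with a little arithmetic. The sketch below assumes a word list of 7,776 entries (five rolls of a six-sided die, the size of the standard Diceware list) and words chosen uniformly at random. The function names are made up for illustration.

```python
import math

# Assumption: a Diceware-style list indexed by five six-sided dice rolls.
DICEWARE_LIST_SIZE = 6 ** 5  # 7,776 words


def bits_per_word(list_size=DICEWARE_LIST_SIZE):
    """Entropy of one word picked uniformly at random from the list."""
    return math.log2(list_size)


def passphrase_bits(num_words, list_size=DICEWARE_LIST_SIZE):
    """Entropy of a passphrase of independently chosen random words."""
    return num_words * bits_per_word(list_size)


for n in (1, 3, 5):
    print(f"{n} word(s): {passphrase_bits(n):.1f} bits")
```

That gives about 12.9 bits per word, 38.8 bits for three words and 64.6 bits for five. The catch is the "selected randomly" condition in the second comment: the Cambridge estimate of about 20 bits applies to phrases that people picked for themselves, not to words drawn by dice.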

These issues are delightful and productive to ponder for those with a love for password generation nuance, but most laypeople just want to know how to choose a safe password.

We don’t want to have to remember crazy combinations of uppercase and lowercase letters and random words with characters swapped out Leetspeak-style, plus, of course, a special character or two (&$!!) and some digits glued onto the end. (See xkcd for the graphic representation of the insanity this causes.)

Password security discussed on XKCD

The research takeaway is that while passphrases are safer than passwords, they’re not all that safe, depending, of course, on length.

Length is another matter entirely. Paul Ducklin and Chester Wisniewski discuss passwords and complexity in detail in a recent Sophos Techknow podcast:

(11 March 2012, duration 14’35”, size 10.5MBytes)

“[The password myth] that annoys me the most [concerns] Leetspeak,” Chester said in the password podcast. “They pick a nice word, and they say, ‘Well, it’s not a dictionary word. I added 0 instead of o.’ But most password-cracking apps try that right off the bat, because they know how much people rely on this false sense of security from complicating their password.”
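To see why the swap buys so little, consider how cheaply a program can enumerate the Leetspeak variants of a dictionary word. The sketch below is a toy illustration, not the rule engine of any particular cracking tool, and the substitution table is a small assumed sample.

```python
from itertools import product

# Assumed sample table: each letter maps to itself plus common look-alikes.
LEET = {
    "a": "a@4",
    "e": "e3",
    "i": "i1!",
    "o": "o0",
    "s": "s$5",
}


def leet_variants(word):
    """Yield every spelling of `word` under the substitution table."""
    # Letters not in the table have exactly one option: themselves.
    choices = [LEET.get(ch, ch) for ch in word.lower()]
    for combo in product(*choices):
        yield "".join(combo)


variants = list(leet_variants("password"))
print(len(variants), "variants, including:", "p@ssw0rd" in variants)
```

With this table, "password" expands to only 54 candidates (3 x 3 x 3 x 2), so checking every Leetspeak spelling costs an attacker next to nothing on top of trying the plain word.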

But pairing passphrase abbreviation with Leetspeak offers the best of both worlds: the apparent randomness of a character string and the implicit, coherent meaningfulness of a phrase.

The debate over whether passphrases are guessable seems moot in the face of this user-friendly approach.

I’m not saying that because I write for Naked Security; I’m saying it because I’ve found it actually works.

Using this hybrid approach, I can call to mind seemingly random strings of a dozen or more characters which, when I decipher them, form phrases that are simple for me to associate with important sites: for example, that of my neighborhood bank.
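One possible reading of this hybrid approach, sketched in Python. The exact recipe isn't spelled out above, so the first-letter abbreviation rule, the substitution table, and the example phrase are all illustrative assumptions.

```python
# Assumed substitution table (hypothetical; any personal scheme would differ).
SUBS = {"a": "@", "e": "3", "i": "1", "o": "0", "s": "$"}


def abbreviate(phrase, subs=SUBS):
    """Take the first letter of each word, then apply Leetspeak swaps."""
    initials = [word[0] for word in phrase.split()]
    # Letters without a substitution keep their original case.
    return "".join(subs.get(ch.lower(), ch) for ch in initials)


phrase = "My bank is on the corner of Elm Street and Oak Avenue"
print(abbreviate(phrase))
```

For the example phrase this yields the 12-character string Mb10tc03$@0@, which looks like gibberish but can be rebuilt from a sentence that is easy to recall.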

If you’re not convinced that this is the best approach, whether for you or, if you set organizational password policy, for your end users, I’m curious to hear how you approach password generation. So please, comment away.