Multi-word passphrases not all that secure, says Cambridge University

Think that a passphrase of multiple, random dictionary words is as unguessable as long strings of gibberish, but easier to remember?

Research from the Computer Laboratory at the University of Cambridge suggests that this might not be so.

While passphrases using dictionary words may not be as vulnerable as individual passwords, they may still be cracked by dictionary attacks, the research found.

Security researcher Joseph Bonneau reports, in a recent paper written with Ekaterina Shutova, that his team studied the problem by turning not to the theoretical space of choices but rather to the real-life passphrases that people actually string together.

To find such a selection of passphrases, his team used data crawled from the now-defunct Amazon PayPhrase system, introduced last year for US users only.

The goal wasn’t to evaluate the security of the scheme as deployed by Amazon, Bonneau says, but rather to learn more about how people choose passphrases in general.

Amazon's was "a relatively limited data source", he writes, but the research results do "suggest some caution on this approach".

In the original version of the Amazon site, passphrases had to be at least two words long. Error messages indicated when a passphrase was already in use.

The first experiment was a dictionary attack using lists of movie titles, sports team names, and dozens of other types of proper nouns crawled from Wikipedia, along with idiomatic phrases crawled from sources including Urban Dictionary.

Here's what the researchers said:

We found about 8,000 phrases using a 20,000 phrase dictionary. Using a very rough estimate for the total number of phrases and some probability calculations, this produced an estimate that passphrase distribution provides only about 20 bits of security against an attacker trying to compromise 1% of available accounts. This is far better than passwords, which are usually under 10 bits by this same metric, but not high enough to make online guessing impractical without proper rate-limiting.
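
To put those figures in perspective, here is a rough Python sketch that reads "bits of security" simply as the base-2 logarithm of the number of guesses an attacker needs. That reading is a simplification of the paper's metric, and the function names and the one-guess-per-second rate are assumptions chosen purely for illustration.

```python
SECONDS_PER_DAY = 86_400


def guesses_for_bits(bits: float) -> int:
    """Number of guesses implied by a given number of bits (2 ** bits)."""
    return round(2 ** bits)


def days_to_exhaust(bits: float, guesses_per_second: float) -> float:
    """Days needed to try every guess at a fixed, assumed guessing rate."""
    return guesses_for_bits(bits) / guesses_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    for bits in (10, 20):
        print(f"{bits} bits: {guesses_for_bits(bits):,} guesses, "
              f"{days_to_exhaust(bits, 1):.2f} days at 1 guess/second")
```

On this reading, 20 bits is about a million guesses, against roughly a thousand for 10 bits. At one guess per second with no lockout, that is around 12 days of trying, which is why rate-limiting matters.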

The debate about how easily dictionary attacks can break passphrases is interesting. I am not adept at the mathematics involved, but random word passphrases certainly do have their proponents.

Take, for example, the Slashdot discussion on this issue.

A random selection of commenters' thoughts on the entropy (i.e., the password strength/resistance to brute-force searching) of common-word passphrases:

  • »IMHO, you CANNOT use straight dictionary words (regardless of language, and yes, I do mean Klingon and Sindarin!) in your passwords without some sort of numeric or symbolic character replacement pattern.
  • »Of course you can. If they're selected randomly, an attacker has to use the complete source space for the random selection in a brute force attack.
  • »diceware.com gives you 12.9 bits of entropy per word. Brute forcing that is already more trouble than it's worth at three words, and five would require nation-state resources to crack.
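
The 12.9-bit figure in that last comment can be checked with a few lines of Python. This is only a sketch: it assumes the standard Diceware list of 7,776 words (five dice rolls per word) and that every word is picked uniformly at random, which is exactly what the Cambridge research suggests people do not do when left to choose for themselves. The function name is made up for illustration.

```python
import math

# Assumption: the standard Diceware list, where five dice rolls select
# one of 6 ** 5 = 7776 words.
DICEWARE_WORDS = 6 ** 5


def passphrase_bits(num_words: int, wordlist_size: int = DICEWARE_WORDS) -> float:
    """Entropy in bits of num_words drawn uniformly at random from the list."""
    return num_words * math.log2(wordlist_size)


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(f"{n} word(s): {passphrase_bits(n):.1f} bits")
```

Each extra word adds the same number of bits, so the total grows linearly with length, but only as long as the words really are chosen at random.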

These issues are delightful and productive to ponder for those with a love for password generation nuance, but most laypeople just want to know how to choose a safe password.

We don't want to have to remember crazy combinations of uppercase and lowercase and random words with letters swapped out Leetspeak-ishly, plus of course the added special character &$!! or two and some digits glued to the bottom. (See xkcd for the graphic representation of the insanity this causes.)

Password security discussed on XKCD

The research takeaway is that while passphrases are safer than passwords, they're not all that safe, depending, of course, on length.

Length is another matter entirely. Paul Ducklin and Chester Wisniewski discuss passwords and complexity in detail in a recent Sophos Techknow podcast:

(11 March 2012, duration 14'35", size 10.5MBytes)

Personally, I was long ago converted to the passcode generation scheme put forth by Graham Cluley, depicted in this video:

Graham's approach is a user-friendly method that combines not random words but rather the first letters of a personally significant passphrase, peppered with Leet swappage: i.e., 4 for a, 0 for o, 3 for e, etc.

And thus is the word Leet itself rendered by Leetspeak as 1337.
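
As a rough illustration of the scheme, the Python sketch below takes the first letter of each word in a sentence and applies the three swaps named above. The function name and the example sentence are made up for this sketch and are not taken from Graham's video; never use a published example as a real password.

```python
# Only the three swaps mentioned in the article; a real user would pick
# their own substitutions.
LEET = {"a": "4", "o": "0", "e": "3"}


def abbreviate(phrase: str) -> str:
    """First letter of each word, with the LEET swaps applied."""
    initials = [word[0] for word in phrase.split()]
    return "".join(LEET.get(ch.lower(), ch) for ch in initials)


if __name__ == "__main__":
    # Hypothetical sentence, for illustration only.
    print(abbreviate("Every autumn I open a savings account near home"))
```

The sentence supplies the memorability and the abbreviation hides the dictionary words; the swaps themselves are well known, so they should not be counted on for much extra strength.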

As many have pointed out, Leet is too predictable to use on simple dictionary words. Everybody already knows the common character swaps, and there are Leet dictionaries out there that can be used for attacks.
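
To see why the swaps add so little, consider how cheaply an attacker can expand a dictionary word into all of its Leet spellings. The sketch below is a minimal illustration, not a description of any particular cracking tool: the swap table is an assumption covering only the swaps mentioned in this article, and the function name is hypothetical.

```python
from itertools import product

# Assumed swap table: each letter maps to the spellings an attacker would try.
SWAPS = {"a": "a4", "e": "e3", "o": "o0", "l": "l1", "t": "t7"}


def leet_variants(word: str) -> set[str]:
    """Every spelling of word with each swappable letter swapped or not."""
    choices = [SWAPS.get(ch, ch) for ch in word.lower()]
    return {"".join(combo) for combo in product(*choices)}


if __name__ == "__main__":
    print(sorted(leet_variants("password")))
    print(len(leet_variants("leet")))
```

A word with two swappable letters yields only four candidates, so trying them all costs an attacker about two extra bits of work per word.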

"[The password myth] that annoys me the most [concerns] Leetspeak," Chester said in the password podcast. "They pick a nice word, and they say, 'Well, it's not a dictionary word. I added 0 instead of o.' But most password-cracking apps try that right off the bat, because they know how much people rely on this false sense of security from complicating their password."

But pairing passphrase abbreviation with Leetspeak gives you the best of both: the look of random characters and the implicit, coherent meaningfulness of a phrase.

The debate over whether passphrases are guessable seems moot in the face of this user-friendly approach.

I'm not saying that because I write for Naked Security; I'm saying it because I've found it actually works.

Using this hybrid approach, I can call to mind seemingly random strings of a dozen or more characters which, when I decipher them, form phrases that are simple for me to associate with important sites: for example, that of my neighborhood bank.

And, of course, as Graham's video points out, you can use password management software to store your passphrases securely if you can't remember them.

If you're not convinced that this is the best approach, either for you or your end users if you set organizational password policy, I'm curious to hear your thoughts on how you approach password generation. So please, comment away.

23 Responses to Multi-word passphrases not all that secure, says Cambridge University

  1. Ted Lilley · 947 days ago

    Thanks for the comprehensive look at password security. While the passphrase method has been praised for adding length, the fact is that it is rarely criticized for shrinking the domain from which its elements are taken. A passphrase is much less secure compared to a non-dictionary password of the same length for a dictionary attacker.

    Fortunately, you only need to make a trivial change to the words in order to take them out of the dictionary. 133t-speak is, as you note, too widely known to be useful. I like using other, non-substitution symbols and punctuation to simply *break* the words on non-word borders, thus taking the fragments out of the dictionary. The extra characters do the double-duty of adding entropy (chosen randomly versus 133t) as well as adding characters which are required by security-conscious sites like banks.

    However, I think when we talk about passwords, we make two mistakes: we expect far too much of the average user, and at the same time we make far too little requirement of the password scheme. Not only should it be easier to remember than any scheme I've seen so far, but it should also allow the user to *remember* different passwords for every site they visit. And be relatively secure, at least, more secure than their old password method.

    I describe the system I use and how I came to it after trying all of the other methods here: http://transfermodeawesome.posterous.com/pretty-g...

    Not saying it's a panacea, but it works for me. Thanks!

    • The approach that works for me - and what I think is the closest we can get to the 'expect less from users and more from the password scheme' - is to use a password manager loaded with unique, randomly generated > 16 character passwords for each application. It has an obvious and fairly critical weakness in that everything is protected by one umbrella password but that's the weakness I feel I can most easily protect.

      M.

  2. Ethan · 947 days ago

    That's how I've been doing it for years.

  3. Guy C · 947 days ago

    Hah! As soon as I started reading this, I was thinking of XKCD's "Correct Horse Battery Staple", even before I saw it was referenced in the article!

    More on topic though, what are the concerns then in trying to choose a secure word or phrase in a company that also requires regular changes? Generally I have found when talking to people in one of my last jobs that they ended up choosing incremental passwords, using the same word or phrase and just substituting the current month or incrementing a number, essentially using most of the same password as previously.

  4. Sizzle · 947 days ago

    Sow, Peepel hoo spel badlee hav gud parsswerdz?

  5. Curious in LA · 947 days ago

    Interesting and helpful video. A couple of questions: Are you suggesting that we use a different sentence "Fred and Wilma...." with the attendant character substitutions for EACH password, or is it sufficient to vary the substitutions and re-use the sentence at least for a couple of passwords? Second, how does one manage all these passwords with multiple machines - say, a home computer, a laptop, an iPhone, a Nook?

  6. David Pottage · 947 days ago

    All the linked article tells us is that users often pick bad passphrases, just as they often pick bad passwords.

    On the other hand, if a sysadmin enforces a policy of completely random passwords then most users will struggle to remember them, and will write them down, or use a lot of help desk time on password resets etc, but if the sysadmin has a policy of random passphrases (eg diceware), then it is much more likely that users will remember them.

    In other words, the solution to weak passwords is user education (as always), and to allow and encourage users to use long passphrases.

  7. I like the hybrid approach, but there is a big detail not taken into account: restrictions placed by the site itself. Most places will require a slew of different requisites for their passwords, from a minimum to a maximum length (or both), whether you can have caps, symbols and how many and which symbols are permitted.

    It is impossible to conceive of a sane, logical approach to password management while limited by these arbitrary parameters.

    So it's "pA$$w0rdoo1" for all my sites for me.

  8. Lisa Vaas · 947 days ago

    Ted, thanks so much for the input. Guy, thank you for bringing up the regular-password-changing requirement. My advice on how to deal with that is to suggest you have the security people listen to the Techknow podcast by Chester and Paul (mentioned in my article). The idea, Chet tells us, that regular password changes introduce more security is a myth that dates back to the days when passwords were stored in plain text files on Unix systems. Regular password changes actually decrease security, for a few reasons: 1) the poor users are going to start using sucky passwords because they're easy to remember and to increment (password12, password13, etc.--is it any wonder people opt for these easily predictable passwords?), and 2) doing something security-related on a regular, predictable schedule (quarterly? monthly?) is a gift to hackers. Plus it distracts the IT department for a predictable chunk of time on a predictable schedule.

    I highly suggest you listen to the podcast, as P&C have other great password myth debunking tips, and I am just feebly rehashing what I remember (no pun intended, heh. heh.).

  9. Jason · 946 days ago

    Has anyone studied the effect of inserting numbers and special characters in the middle of pass phrases? See spot run see spot jump = ssR7$@ssJ8@$

    I'd like to know how long it would take to crack that with a dictionary attack.

  10. Mike · 946 days ago

    Suggesting to the BOFH that he has configured the system wrong isn't always going to end well ;-)

    Unfortunately most of the problem is not the people who worry about the level of entropy in their passwords. The problem is everyone else. Most of the responsibility here is with the systems designers that permit their users to use bad passwords. Hopefully the outcome of research like this is that trending phrases are added to filters for password selection so that when a user puts "AngelinaJoliesleg" in as a password thinking that they are brilliant and random and unique the system will let them know it is a bad password and they should try a different method of deciding a password.

    I think the first step is to change the terminology from password to pass phrase, that way people are more likely to use longer passwords. As pass phrases get longer it seems reasonable that they would become more divergent.

  11. Helen · 946 days ago

    No-one seems to be looking at this from the perspective of the average user. I'm an ordinary soul who uses the Internet for shopping, email and banking. I recently worked out that I have over 50 online and telephone banking accounts - all of which require a password.

    Honestly, in the real world, what is an ordinary person supposed to do to remember 50 passwords? They will do what everyone does - have two or three passwords and swap them round. That's all you can honestly expect someone to do.

    What you folks do works brilliantly in one user application. But no-one ever seems to discuss what this means in a multi-account world. And we have engineered ourselves into a world driven by passwords that no ordinary human has any hope of remembering.

  12. Bob · 946 days ago

    This discussion seems pointless to me. Password management software is free and secure. I use KeePass2, and I used AnyPassword before that, going back a decade. I have over 100 passwords, they are all different, I didn't have to think of them (pseudo-random password creation is provided as well), they all have as high a level of security as the application permits, and I don't remember any of them. Just the one to open the password store, which is quite tough - and unless my PC is stolen there's no opportunity for that to be attacked.

    So I think it's irresponsible to recommend any other approach.

    • Goatama · 946 days ago

      Or, you know, your hard drive crashes. Or are you storing that password store unencrypted in the cloud somewhere?

      I think it's irresponsible to suggest that someone keep their passwords in one and only one place. If that is compromised or lost, then someone is up the creek without a paddle.

      What we need is a solution that is cross platform (Windows, Mac, Linux, iOS, and Android), web accessible (or at least syncable across devices), and easily maintained.

      • Bob · 945 days ago

        Er ... no ... why would I do that?

        I have backups of the file, still encrypted, in more than one location, none of them in the cloud. Of course. I have had several hard disks crash and survived all of them. But this is computing 101. I would get a robust backup process in place on the same day I get a new PC out of the box if I were you.

  13. Here's the problem I've had with that video, and the similar instructions that were on the now-defunct Microsoft page on how to make a strong password: That approach works fine in, say, a situation where you only have to log in to something once a week, or once a month. If you have a need to log in to, say, 30 - 40 servers a day, or you have to unlock your workstation every time the screensaver goes on, trying to *remember* something that esoteric, let alone type it quickly and efficiently, is a daunting prospect.

    Better to have a sufficiently strong, easily remembered passphrase with some entropy in it. "Getthelittlegirl aUnicornPapoy!" (courtesy of Despicable Me) is going to be much stronger, but more importantly, easier to remember and type than trying to remember the first letter of each word in a paragraph, and one that has also been pseudo-1337ed.

  14. ascension2020 · 946 days ago

    The debates over how to choose a good password will go on and on for years. At the end of the day there's more than one way to do it.

    One thing I will say though is that the research in the paper seemed to be built around how secure passphrases are as part of a system when that system is being attacked, and when the attacker only needs to break 1% of the passphrases in order to compromise it (I'm basing that off this article, I haven't personally read the paper yet). If that is indeed the case then that's very different from how likely someone is to crack any one individual's password or passphrase.

    In other words, think of it like this. If I know an organization promotes the use of passphrases, and they have 10,000 users, and I've been hired to do a penetration test against it, then all I need to do is crack 1 password to penetrate the system. But if I have been hired to do a penetration test against a company and I have no idea what type of passwords they promote (i.e., what the minimum length is, whether they promote passphrases, whether they provide password vaults like KeePass and encourage people to use randomly generated passwords, etc) then my job just got exponentially harder.

    By the same token, if I'm attacking an individual user (hypothetically of course) and have no idea what type of password or passphrase they've chosen then life just got really, really hard.

    With that said, my method of choosing passphrases is similar to Ted's comment. I like choosing 4 or 5 words then throwing some random punctuation into it. The phrase "80thvinyl" will be cracked in a very short time by someone with the right tools and with the knowledge that they're attacking a passphrase; the phrase "80'th=vin"yl" will not.

  15. TrudgingCitizen · 946 days ago

    Are longer passwords still better? Is having a long password of say 20 characters, comprised of plain dictionary words, better than an eight character password comprised of variety?

    Which is safer: H9*g4aw or moopapertractorcandy

    • ascension2020 · 946 days ago

      It's not necessarily about which one is better. I could work the math behind the choices you just gave but right now I don't have the time. Suffice it to say that both options you gave are going to take a very long time to crack.

      The other question that you have to consider, though, is which option is easier to remember? The discussions about passphrases aren't usually about what's right for you or me but about what's easier for a system as a whole. Most people find it much easier to remember passphrases. Personally I love randomly generated passwords that I store in a password manager, but I don't always use those. Sometimes I use passphrases (usually if it's a password I'll be typing a lot). For example, I love passphrases for laptop encryption. If I'm having to type it a few times a day then I'd much rather type father44todaynow!will than GHAJweroui&!*#%$0asd

  16. Freddo · 945 days ago

    One thing a lot of people forget about passphrases, is that they are susceptible to shoulder attacks. That is, someone looking over your shoulder can very easily crack the password if they can identify half (or maybe less) of your keystrokes.
    For example, below is a passphrase generated at random, with half of the characters hidden. I'll leave it to you to crack:
    a-s-n-e-h-e-r-s-a-

  17. Scott · 918 days ago

    I know, I know - a day late and a dollar short, but I have to put my two cents in. The reason why random dictionary words are not secure is because of the lack of entropy. For a passphrase to be secure it must have maximum entropy. In security that means that there should be as much randomness as possible. Very little or no order - as much disorder as you can put in that sucker is best. The more entropy you put into the password the longer it will take to crack using brute force. A 12 character random password using case sensitive alpha numeric symbols gives you around 80 bits of entropy. That means it would take, on average, .5 * 2^80 tries. That's a lot of tries. A 32 character password will give you about 256 bits of entropy which is uncrackable by today's computers.
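
The figures in that last comment can be sanity-checked with a short Python sketch. It assumes each character is drawn uniformly at random from the 94 printable, non-space ASCII characters (letters, digits and punctuation); the alphabet choice and the function names are assumptions made for illustration.

```python
import math
import string

# Assumed alphabet: 52 letters + 10 digits + 32 punctuation marks = 94 symbols.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password_bits(length: int, alphabet_size: int = len(ALPHABET)) -> float:
    """Entropy in bits of a uniformly random password of the given length."""
    return length * math.log2(alphabet_size)


def length_for_bits(bits: float, alphabet_size: int = len(ALPHABET)) -> int:
    """Shortest random password length that reaches at least the given bits."""
    return math.ceil(bits / math.log2(alphabet_size))


if __name__ == "__main__":
    for n in (12, 32):
        print(f"{n} characters: {random_password_bits(n):.0f} bits")
    print(f"256 bits needs {length_for_bits(256)} characters")
```

Under these assumptions, 12 characters come out at about 79 bits, in line with the comment, while 32 characters give roughly 210 bits; reaching 256 bits would take 40 characters.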

About the author

I've been writing about technology, careers, science and health since 1995. I rose to the lofty heights of Executive Editor for eWEEK, popped out with the 2008 crash, joined the freelancer economy, and am still writing for my beloved peeps at places like Sophos's Naked Security, CIO Mag, ComputerWorld, PC Mag, IT Expert Voice, Software Quality Connection, Time, and the US and British editions of HP's Input/Output. I respond to cash and spicy sites, so don't be shy.