I’m worried about password strength meters.
In March 2015 I tested five popular password strength meters in a simple experiment that was designed to show if they could actually spot weak passwords. They all failed.
It’s been almost eighteen months since my original test and during that time password cracking has moved on, authentication standards have moved on and password best practice has moved on.
I wondered if password strength meters had too.
There is a gap between what password strength meters tell us and what we need to know.
On the face of it, password strength meters seem like a great idea – when a user needs to create a password for a website, the meter can tell the user how strong their choice of password is and, most crucially of all, help them steer clear of really bad passwords.
The trouble is that most password strength meters don’t actually measure password strength at all.
A strong password is one that is highly resistant to attempts to crack it with online or offline dictionary attacks. The only good way to measure the strength of a password is to try and crack it – a serious and seriously time consuming business that requires specialist software and expensive hardware.
So instead of measuring the thing we really care about, password strength, most meters actually measure something that’s easy to figure out: password entropy.
A password with a lot of entropy should be hard to crack by brute force (guessing) but that’s a password cracker’s technique of last resort. Their first line of attack is likely to be based on dictionary words and rules that mimic the common tricks we use to di5gu!se th3m. Measuring entropy doesn’t tell us anything about that.
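To make the gap concrete, here is a minimal sketch of the kind of entropy estimate many simple meters rely on. It is not taken from any of the meters tested; the function name `naive_entropy_bits` and the character-pool sizes are assumptions chosen for illustration.

```python
import math
import string

def naive_entropy_bits(password: str) -> float:
    """Estimate entropy as length * log2(size of the character pool in use).

    Illustrative only: this is the style of calculation that rewards
    length and character variety while ignoring dictionary words.
    """
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += 26
    if any(c in string.ascii_uppercase for c in password):
        pool += 26
    if any(c in string.digits for c in password):
        pool += 10
    if any(c in string.punctuation for c in password):
        pool += len(string.punctuation)  # the 32 ASCII punctuation characters
    return len(password) * math.log2(pool) if pool else 0.0

# The five test passwords from this article.
for pw in ["abc123", "trustno1", "ncc1701", "iloveyou!", "primetime21"]:
    print(f"{pw:12} {naive_entropy_bits(pw):5.1f} bits")
```

A calculation like this scores primetime21 highest of the five simply because it is the longest, even though it sits on a list of common passwords.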
In both tests I used the same five terrible passwords, passwords that would fail a genuine cracking attempt instantly, and then ran them through five popular password strength meters.
The premise of the test is simple: password strength meters should dismiss all of the passwords out of hand, so a failure to dismiss any password is a failure of the whole test.
Rejecting all of the passwords doesn’t prove that a password strength meter is good, but accepting one of the unsafe passwords shows that it’s not up to the job.
Five terrible passwords
The passwords I used in the test are all, deliberately, absolutely dreadful. They’re chosen from a list of the 10,000 most common passwords and have characteristics I thought the password strength meters might overrate:
- abc123 – number 14 on the list, first to mix letters and numbers
- trustno1 – number 29, second to mix letters and numbers
- ncc1701 – number 158, registration number of the USS Enterprise
- iloveyou! – number 8778, first with non-alphanumeric character
- primetime21 – number 8280, longest with letters and numbers
Being on the list of the 10,000 most common passwords is broadly synonymous with being one of the 10,000 worst passwords.
Even if a hard-to-crack password got on the list by accident it would instantly become a weak, easy-to-crack password because it’s on the list. Password crackers seed their dictionary attacks with lists of common words and passwords they think people are likely to use. If your password is on that list, it’s toast.
To check my assumptions, I ran the five passwords through John the Ripper and cracked them on my laptop using its out-of-the-box settings. They all fell in well under a second.
The meters were chosen by googling ‘jQuery strength meter’ and picking the first five that came up. This is the kind of thing a web developer would do if you asked them to add a password strength meter to your website.
Two of the five meters under test, the jQuery Password Strength Meter for Twitter Bootstrap and Strength.js, were also in the first five results in 2015.
- jQuery Password Strength Meter for Twitter Bootstrap
- Mato Ilic’s PWStrength
- FormGet’s jQuery Password Strength Checker
- Paulund’s jQuery password strength demo
This year I added a ringer to my tests: zxcvbn. It’s a sophisticated, open-source password strength meter used by Dropbox and WordPress that’s been rigorously tested.
I added it to the test so that it’s clear what a website password strength meter of proven quality does when faced with this test.
My table of results below uses the same colours and words (sometimes abbreviated but with misspellings faithfully reproduced) that the password strength meters use:
| abc123 | Weak | Week | Very weak | Weak | Weak… | Very weak |
| trustno1 | Normal | Week | Very weak | Good | Make it… | Very weak |
| ncc1701 | Medium | Week | Very weak | Weak | Make it… | Very weak |
| primetime21 | Medium | Medium | Weak | Good | Make it… | Very weak |
The result, sadly, is exactly the same as 2015. They all failed.
The ringer, zxcvbn, identified the five passwords as very weak but none of the first five password strength meters I plucked out of Google did.
Just as they did in 2015 the meters also muddy the waters with misleading or ambiguous terminology and colours – what is a medium or mediocre password?
If you’re a website user
- You can’t trust password strength meters on websites
- Watch our video on how to pick a proper password
If you’re a website operator
- If you want a password strength meter for your website don’t guess, use zxcvbn
- Use two-factor authentication so that hackers can’t get into your site with just a cracked password
- Reduce the danger of bad passwords by locking users out after a few failed login attempts
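The last recommendation, locking users out after a few failed login attempts, can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not a production design: the class name `LoginThrottle`, the threshold of five failures and the fifteen-minute window are all hypothetical choices.

```python
import time

MAX_FAILURES = 5            # hypothetical threshold
LOCKOUT_SECONDS = 15 * 60   # hypothetical lockout window

class LoginThrottle:
    """Track failed logins per username and lock out after too many."""

    def __init__(self, max_failures=MAX_FAILURES,
                 lockout_seconds=LOCKOUT_SECONDS, clock=time.monotonic):
        self._max = max_failures
        self._lockout = lockout_seconds
        self._clock = clock      # injectable so the behaviour can be tested
        self._failures = {}      # username -> (failure count, time of last failure)

    def is_locked(self, username: str) -> bool:
        count, last = self._failures.get(username, (0, 0.0))
        if count < self._max:
            return False
        if self._clock() - last >= self._lockout:
            # The lockout window has passed: forget the old failures.
            del self._failures[username]
            return False
        return True

    def record_failure(self, username: str) -> None:
        count, _ = self._failures.get(username, (0, 0.0))
        self._failures[username] = (count + 1, self._clock())

    def record_success(self, username: str) -> None:
        self._failures.pop(username, None)
```

A login handler would call `is_locked` before checking the password, then `record_failure` or `record_success` depending on the outcome. A real site would also need to keep this state somewhere shared between servers.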
18 comments on “Why you STILL can’t trust password strength meters”
We all know about short and common passwords, but how important is the way they are stored? At this point should salted and hashed be the standard? And is that secure enough? Also, most people don’t know how websites store their passwords. And finally, no matter how good the password is and how well it’s stored, clever phishing, keylogger malware, or social engineering can still get the password.
Storage is immaterial for online attacks and critical for resistance to offline attacks. As a user you should choose passwords to resist offline attack and hope the website’s got your back with correct storage. See https://nakedsecurity.sophos.com/2014/10/24/do-we-really-need-strong-passwords/ for more on this.
Salting and hashing passwords isn’t secure enough by itself but it’s part of the solution – see https://nakedsecurity.sophos.com/2013/11/20/serious-security-how-to-store-your-users-passwords-safely/ for more on this.
2FA can make phishing, keylogging and social engineering much harder.
Anymore, something like the bcrypt standard is the best way to go, especially with its configurable stretching. If computing power gets to the point where your stored passwords aren’t as safe, you can easily increase the number of stretching rounds. Set your site so that passwords get re-hashed on a successful login and the new hashes get propagated.
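The re-hash-on-login idea can be sketched as below. Note the swap: this uses PBKDF2 from Python's standard library as a stand-in for bcrypt, purely so the example runs without third-party packages. The record format, the function names and the iteration count are assumptions made for illustration.

```python
import hashlib
import hmac
import os

# Hypothetical work factor; raise it as hardware gets faster.
CURRENT_ITERATIONS = 200_000

def hash_password(password: str, iterations: int = CURRENT_ITERATIONS) -> str:
    """Return a record of the form 'iterations$salt_hex$digest_hex'."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"

def verify_and_upgrade(password: str, stored: str):
    """Check a login attempt against a stored record.

    Returns (ok, new_record). new_record is a fresh hash at the current
    work factor when the stored one used fewer iterations, else None.
    """
    iterations_s, salt_hex, digest_hex = stored.split("$")
    iterations = int(iterations_s)
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), iterations
    )
    # Constant-time comparison of the computed and stored digests.
    if not hmac.compare_digest(candidate, bytes.fromhex(digest_hex)):
        return False, None
    if iterations < CURRENT_ITERATIONS:
        # Only possible here because the plaintext is available at login.
        return True, hash_password(password)
    return True, None
```

The upgrade can only happen at login because that is the one moment the site sees the plaintext password. Accounts that never log in again keep their old, weaker hashes.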
Even at that, though, it’s not a bad idea to take a final step and store your password database separately from your application database. Different database, different server. Preferably it is accessed via a web service that is also not hosted on your application server, so that if you’re using a platform that forces you to store your connection string in a text file on the server (looking at you there, Microsoft), that config file is elsewhere. That way if/when your application database gets compromised, your passwords aren’t available.
Bottom line is that there’s no way of truly securing a website: the HTTP protocol is working against you. HTTP was built for open and unrestricted communication, and every measure taken to change that is an imperfect bolt-on. The trick isn’t to make your website unbreakable. You can’t. The trick is to make your site more hassle than it’s worth.
Mark, Did you mean to spell Mediocre correctly in column 3, line 4 of the table, or to reproduce a typographic error there?
Spelling errors are reproduced faithfully!
I see “mediocre” spelled correctly in the table, but misspelled in the text below it, just above the section heading “Recommendations”.
And if someone makes a password strength meter just to harvest passwords for their library... I’ll just trust my creativity… 5er!0us1y (no I don’t use anything that simple)
I don’t understand your use of the term “ringer” with respect to the zxcvbn strength meter; I’ve always understood a ringer to be a bogus entity, which zxcvbn certainly is not. And I’m further confused by your contention that you have proven the fallibility of password strength meters, when you show that zxcvbn, unlike the others, rejected all of your poorly chosen passwords. You even go on to include it in your recommendations!
Eh, what’s up, doc?
zxcvbn is a ‘ringer’, a bogus entry as you say, because it did not meet the entry requirements for the test – it did not come up in the top 5 results when I Googled ‘jQuery strength meter’. Also I knew it would pass.
The scenario in which people commonly encounter password strength meters, and in which I contend you can’t trust them is as follows:
You visit a website for the first time and create an account. The account sign-up page asks for a password and has an embedded password strength meter. You try out a few passwords until you find one that’s ‘strong’. Can you trust that assertion?
There is a slim chance that the site is using a well designed and rigorously tested password strength meter like zxcvbn but you won’t know if it is and there is a very good chance it’s not.
The web is awash with password strength meters, and tutorials on how to code password strength meters, that use entropy calculations (and policies that force you to use capitals, special characters etc that actually *reduce* the number of guesses an attacker has to make.)
If you’re a user, these are the strength meters you’ll encounter and if you’re a developer these are the strength meters that are easiest for you to find and integrate.
The existence of zxcvbn doesn’t make password strength meters trustworthy any more than the existence of vegetarian lions makes it safe to put your head in a randomly chosen lion’s mouth.
I understood “ringer” to mean a substitute that no one would suspect, but that was deliberately chosen to be better than the rest of the field. Like when you’re playing cricket in the Eighth League, comfortable but competitive in a fair matchup with other duffers like yourself, and one Saturday morning you find yourself opening the batting against a bowler who’s metres quicker than everyone else, and unrelentingly accurate, and can hit the seam every time, and your head whenever he wants. Later, after the match, when you’re counting your bruises and looking at the fixture list to see who you’re playing next week, you suddenly realise that the opposition’s Second League team suspiciously had the week off. That’s a ringer.
Ahh, gotcha. So it is only the title that is misleading: it implies that we can’t trust *any* password meters.
Thanks for the explanation!
“The only good way to measure the strength of a password is to try and crack it – a serious and seriously time consuming business that requires specialist software and expensive hardware.” I’m not sure this is true. What makes password cracking so difficult is all of the hashing, but a password strength meter would have access to the plaintext password. So it should be a lot easier to, say, look for a dictionary word in the password and consider it weaker; same with common substitutions (e>3, a>@, etc.).
Not having to crack hashes makes guessing passwords easier but there are other environmental factors that make it a challenge in this case.
Efficient cracking of password hashes is hard but it’s done with specialist software that knows how to massively parallelise calculations across multiple GPUs, running on dedicated hardware and in an environment where waiting for minutes might be considered fast.
That said, what you seem to be describing is something else – looking for giveaways within the password, such as the use of dictionary words, rather than actual cracking. Your approach is a good description of what zxcvbn does.
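A toy version of that giveaway-spotting approach might look like the sketch below. It is not zxcvbn's actual algorithm, which is far more thorough. The word list, the substitution table and the function name `looks_guessable` are tiny hypothetical stand-ins for illustration.

```python
# Hypothetical, tiny stand-in for a real dictionary and common-password list.
KNOWN_WORDS = {
    "password", "disguise", "iloveyou", "primetime",
    "abc123", "trustno1", "ncc1701",
}

# A few common character substitutions, mapped back to letters.
LEET = str.maketrans({
    "0": "o", "1": "l", "3": "e", "4": "a",
    "5": "s", "@": "a", "!": "i", "$": "s",
})

def looks_guessable(password: str) -> bool:
    """Return True if the password reduces to a known word.

    Tries the password as typed (lower-cased), with substitutions undone,
    and with trailing digits and symbols stripped.
    """
    lowered = password.lower()
    candidates = {lowered, lowered.translate(LEET)}
    for candidate in list(candidates):
        candidates.add(candidate.rstrip("0123456789!@#$%^&*?."))
    return any(candidate in KNOWN_WORDS for candidate in candidates)

print(looks_guessable("di5gu!se"))     # substitutions undone -> "disguise"
print(looks_guessable("primetime21"))  # trailing digits stripped -> "primetime"
```

Because the meter sees the plaintext, each check is a cheap set lookup rather than a hash computation. The hard part is the quality and size of the word lists and rules.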
I think better advice for a website operator is “don’t store passwords”. Outsource your authentication to one of the big people through OAuth or the “Login with Facebook” protocol. (And let people choose multiple ones of those. What StackOverflow does should be your goal.)
Before creating support for user login, ask yourself “Is my website important enough in a user’s life for me to demand that the user create a username and password?”, and if you answer “yes”, stop, take a break, and come back after getting some coffee and take a hard look at the question again. If you’re creating user accounts to win some online raffle, you aren’t important enough. If you’re running a silly online casual game, you aren’t important enough. If you’re maintaining accounts so that people can comment on news articles or ask each other questions, you aren’t important enough. Your default assumption should be “I’m not important enough to store my users’ passwords”.
You can still identify users by handles they create – just because you’re using Facebook or Google or Twitter or whatever for authentication doesn’t mean that you can’t also have a user profile and certainly doesn’t obligate you to look at users’ “real names” or mine whatever marketing data comes along with a Facebook login. Just don’t store passwords.
If you object because part of your target market is too young (under 13) to have accounts at the big players, go consult an attorney because COPPA compliance is no joke.
If you’re a bank handling actual money, maybe. Get smart about 2FA though, and read about the risks of SMS-based 2FA.
If you don’t store passwords, your password database can’t get stolen.
Use the “Login with Facebook” protocol? Perfect if you want Facebook collecting even more data on what you do on every site you visit by logging in through them.
I think a more relevant question to ask is: Do password meters lead to more secure passwords? If not – then do away with them – if yes: they are justified even if they have (obvious) flaws.
We have a closely related paper on this topic, presented at the European Symposium on Research in Computer Security (ESORICS 2015): http://wangdingg.weebly.com/uploads/2/0/3/6/20366987/esorics15conf0827.pdf