LinkedIn has confirmed that some of the password hashes that were posted online do match users of its service. They have also stated that passwords that are reset will now be stored in salted hashed format.
What is a salt? It is a string that is added to your password before it is cryptographically hashed. What does this accomplish? It means that password lists cannot be pre-computed based on dictionary attacks or similar techniques.
This is an important factor in slowing down people trying to brute force passwords. It buys time. Unfortunately, the hashes published from LinkedIn did not contain a salt.
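As a rough illustration of the difference a salt makes, here is a minimal Python sketch. The function names, the 16-byte random salt, and the use of SHA-1 are assumptions chosen for the example, not a description of LinkedIn's actual system.

```python
import hashlib
import os

def unsalted_hash(password: str) -> str:
    # Same password always produces the same digest.
    return hashlib.sha1(password.encode("utf-8")).hexdigest()

def salted_hash(password: str, salt: bytes) -> str:
    # The salt is added to the password before hashing.
    return hashlib.sha1(salt + password.encode("utf-8")).hexdigest()

# Two hypothetical users who picked the same password.
alice_salt = os.urandom(16)
bob_salt = os.urandom(16)

same_unsalted = unsalted_hash("linkedin") == unsalted_hash("linkedin")
same_salted = salted_hash("linkedin", alice_salt) == salted_hash("linkedin", bob_salt)

print(same_unsalted)  # True: one precomputed entry matches both users
print(same_salted)    # False, barring an astronomically unlikely salt collision
```

Without a salt, a single precomputed hash of "linkedin" matches every user who chose it. With per-user salts, each stored value has to be attacked separately.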
After removing duplicate hashes, SophosLabs has determined there are 5.8 million unique password hashes in the dump, of which 3.5 million have already been brute forced. That means over 60% of the stolen hashes are now publicly known.
We also did some additional testing of commonly used passwords that should never be used. We started with the list of passwords that the Conficker worm used to spread through Windows networks.
All but two of the Conficker passwords were used by someone in the 6.5 million user password dump. The two passwords that weren’t found were ‘mypc123’ and ‘ihavenopass’.
Other passwords that we found in the dump include ‘linkedin’, ‘linkedinpassword’, ‘p455w0rd’ and ‘redsox’. We even found passwords that suggest people should know better like ‘sophos’, ‘mcafee’, ‘symantec’, ‘kaspersky’, ‘microsoft’ and ‘f-secure’.
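Checking a list of known-bad passwords against a dump of unsalted hashes takes very little code. The sketch below is hypothetical: `leaked_hashes` is a tiny stand-in built inside the example rather than the real dump, and the candidate list is only a handful of the passwords mentioned above.

```python
import hashlib

def sha1_hex(password: str) -> str:
    return hashlib.sha1(password.encode("utf-8")).hexdigest()

# Hypothetical stand-in for the leaked list of unsalted hashes.
leaked_hashes = {sha1_hex(p) for p in ["linkedin", "redsox", "p455w0rd"]}

# Candidate passwords to test, e.g. a worm's built-in password list.
candidates = ["linkedin", "mypc123", "redsox", "ihavenopass"]

found = [p for p in candidates if sha1_hex(p) in leaked_hashes]
missing = [p for p in candidates if sha1_hex(p) not in leaked_hashes]

print(found)    # ['linkedin', 'redsox']
print(missing)  # ['mypc123', 'ihavenopass']
```

Each candidate costs one hash and one set lookup, which is why unsalted dumps fall to dictionary lists so quickly.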
We will continue to keep Naked Security readers up to date with what is known as we learn more.
It is critical that LinkedIn investigate this to determine whether email addresses and other information were also taken by the thieves, which could put the victims at additional risk from this attack.
Special thanks to Beth Jones and Richard Wang from SophosLabs for their hard work and assistance with this post.
21 comments on “LinkedIn confirms hack, over 60% of stolen passwords already cracked”
Could you explain how the passwords are retrieved if the salt string is random?
If the web site (admins) could retrieve the password, that would be an absurdly bad security design in the first place. They only need to be able to check a specific password – and a salted, one-way encrypted password serves just fine for that. (It has been the absolute minimal standard for storing passwords since, oh, at least the early ’70s, by the way.)
You must store the salts or use some other piece of information that is static. Some sites use an account number string or other value, and there are various pros and cons to each approach. Even if the salt is stolen, it still prevents precomputation (e.g. rainbow tables) and forces the attacker to figure out what salt is used and then individually brute force each hash.
So the salt must be known in order to make the login procedure work.
Is the salt also encrypted? Can you salt the salt? lol
You need to be able to gain a cleartext version of the salt, to re-check the password.
On registration you store (using md5 as an example)
When the client logs in you are checking the password they have submitted and the salt you have in cleartext.
You could use a hashed salt, which would add an extra 32 (in md5’s case) characters to the password.
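The registration and login steps described above might look something like this in Python. This is only a sketch: the `users` dictionary stands in for a database table, the function names are made up for the example, and md5 appears only because it is the example used in this comment.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory "database": username -> (cleartext salt, stored hash)
users = {}

def hash_password(salt: bytes, password: str) -> str:
    return hashlib.md5(salt + password.encode("utf-8")).hexdigest()

def register(username: str, password: str) -> None:
    salt = os.urandom(16)  # random salt, kept in cleartext next to the hash
    users[username] = (salt, hash_password(salt, password))

def login(username: str, password: str) -> bool:
    record = users.get(username)
    if record is None:
        return False
    salt, stored = record
    # Re-hash the submitted password with the stored salt and compare.
    return hmac.compare_digest(stored, hash_password(salt, password))

register("alice", "correct horse")
print(login("alice", "correct horse"))  # True
print(login("alice", "wrong guess"))    # False
```

The site never needs to recover the password. It only repeats the same calculation with the stored salt and checks that the result matches.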
There is no need to encrypt the salt. You can store it as is in the database without having to worry about the hash being "cracked."
Think of it this way: The only way to "crack a hash" is to figure out which combination of characters are hashed to the same value as that of the extracted hashes. With hashing algorithms intentionally being computationally expensive, it would take too much effort for an attacker to try and generate the hash of each possible password every time they receive a new hash to crack.
With this being the case, the attackers create precomputed lists of hashes, or "Rainbow Tables." This considerably drops the amount of time it takes to compare hashes, since they don't have to be generated.
However, these tables get very large very fast. Therefore attackers make Rainbow Tables based on the hashes of normal words, or dictionaries. Now consider this scenario: The attacker only knows the hash and the salt. There is no way to remove the changes the salt made from the hash (keep in mind, hashes are not reversible in any way, shape, or form). This is why keeping the salt unencrypted is fine – because there is no way to take its changes "out of the hash." Since the salt is used to make the password not included in any standard dictionary, this essentially renders the Rainbow Tables useless. I hope this clears that part up for you.
Now, it's also important to use random salts for each account. This is because if the salt is the same every time, the attacker only has to generate one new list of hashes to try and crack every extracted hash, which is fairly time-consuming, but doable with persistence.
Instead, if you use a random salt each time, the attacker would have to create a new list of hashes using each salt, which would take an incredibly long time.
Hope this helps.
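To put rough numbers on the shared-salt versus per-account-salt difference, here is a small Python sketch that counts how many hash computations a dictionary attack needs in each case. The four-word dictionary, the three sample passwords, and the `crack` helper are all invented for the illustration.

```python
import hashlib
import os

dictionary = ["password", "123456", "linkedin", "redsox"]

def h(salt: bytes, password: str) -> str:
    return hashlib.sha1(salt + password.encode("utf-8")).hexdigest()

def crack(records, dictionary):
    """records is a list of (salt, hash); returns (guesses, hashes computed)."""
    tables = {}  # one guess table per distinct salt
    work = 0
    cracked = []
    for salt, digest in records:
        if salt not in tables:
            # A new salt forces a fresh pass over the whole dictionary.
            tables[salt] = {h(salt, word): word for word in dictionary}
            work += len(dictionary)
        cracked.append(tables[salt].get(digest))
    return cracked, work

passwords = ["linkedin", "redsox", "password"]

# Case 1: every account shares one salt.
shared = b"same-salt"
shared_records = [(shared, h(shared, p)) for p in passwords]

# Case 2: every account gets its own random salt.
unique_records = []
for p in passwords:
    s = os.urandom(16)
    unique_records.append((s, h(s, p)))

print(crack(shared_records, dictionary))  # (['linkedin', 'redsox', 'password'], 4)
print(crack(unique_records, dictionary))  # (['linkedin', 'redsox', 'password'], 12)
```

With a shared salt the work is one pass over the dictionary regardless of how many accounts there are. With per-account salts the work grows with the number of accounts.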
Good explanation on the use of salt.
One correction. Hash functions are not meant to be computationally expensive, they are designed to be as computationally cheap as possible while remaining secure.
For example, check out the finalists in the NIST hash function competition, which is to find the hash function that will become SHA-3. Each entrant has a web page describing the function they have invented, and the advantages they claim for it. They all claim both security and speed.
The ballpark figure appears to be around 6 clock cycles per byte on a modern CPU, so a normal password can be hashed in around 100 nanoseconds. They also claim that their designs can be implemented in hardware using comparatively few gates. The current Intel Sandybridge CPUs have hardware acceleration for SHA-1 and it is likely that whichever hash wins the competition will be hardware accelerated in future mainstream CPUs.
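If you want a feel for how cheap a single hash is on your own machine, a quick and unscientific measurement could look like the sketch below. The sample password and the round count are arbitrary choices, and the result includes Python call overhead, so expect it to come out higher than the raw per-hash cost quoted above.

```python
import hashlib
import timeit

password = b"correct horse battery staple"
rounds = 100_000

# Time many SHA-1 computations and average them.
seconds = timeit.timeit(lambda: hashlib.sha1(password).digest(), number=rounds)
per_hash_ns = seconds / rounds * 1e9

print(f"about {per_hash_ns:.0f} ns per SHA-1 hash, including Python overhead")
```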
It doesn't actually matter, as an attacker would have to generate an entirely different rainbow-table for each individual salt. If you assume that LinkedIn did this, 60 million users and it took a hacker say 3 months to attack an individual user, times that by 60 million is nearly 15 million years to attack them all!
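For anyone who wants to check that arithmetic, here it is in Python. The 60 million users and 3 months per user are just the hypothetical figures from this comment, not measurements.

```python
users = 60_000_000
years_per_user = 3 / 12  # 3 months per account, expressed in years

total_years = users * years_per_user
print(total_years)  # 15000000.0
```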
What's your view on such attacks and how can they be controlled/avoided?
If you read this site, or any infosec material, you probably already know the answer:
Harden your services – nothing should be open that isn’t used
Validate all user input in your applications, and then sanitise it before it goes to back-end interpreters
Only ever store credentials in a salted, hashed format
Encrypt all sensitive data, and don’t be tempted to use symmetric encryption for performance.
After all, if you’re storing card numbers, it would not be hard to generate rainbow tables for all possible card numbers of a given length, and without letters I imagine quite small tables and a rather fast search (note I have not tested this).
One fortunate thing is that LinkedIn is one of the few companies that is an early adopter of the DMARC (dmarc.org) specification. The aim of that initiative is to allow email receivers (AOL, Yahoo, Gmail etc.) to confidently drop spoofed email that was not authenticated and sent by the real company (LinkedIn).
It's an awesome mechanism for truly combatting phishing but of course it only gets stronger as more email receivers and more companies register and adopt the system.
Obviously none of that helps the LinkedIn users that used their same passwords with other services…
Encrypting the salt is of little value. In order for hashes to work, you need to have the same value every time. Without salt, it is your password. When you type in your password it hashes it and sees if it gets the same value. With salt it hashes your password along with a random value that is specific to that machine. The problem with salt is that the value must be stored somewhere. If the attacker has access to your hashes then they can probably access the salt as well. Even if you try to "encrypt" the salt, then you need that key stored somewhere so endlessly encrypting it does you no good. What salting does do is prevent someone from easily pre-calculating all the hashes and storing them in a database. Without salt they can easily have a database that contains all the common passwords such as "password", "password1234", etc. and from that they simply need to look up the hashes in the database and match them to the values. Salt makes pre-computing these values impossible. When the attacker breaks into the system the hash is now comprised of some long salt value and your password, therefore building out a database with this information is impractical. Salting just buys you time and nothing more; it doesn't prevent anyone from getting your password from the hash, especially if the password is something of decent significance.
If a global salt is stored in code and not in the database, there are many scenarios where an attacker might have access to the database but no access to the salt.
Judging from several of the passwords (root, admin, administrator, etc…), there are many people who likely over-estimate their vocational qualifications. Imagine an individual on Linkedin whose vocation is IT Security Manager and his account is hacked because the password he used was ‘rootroot’.
The salt isn't encrypted. Even if the salt and the hash are stolen, it's not a real security risk. The purpose of the salt is to prevent the usage of Rainbow Tables (basically a dictionary/brute force attack that is pre-calculated). If a salt is used, any pre-calculated hashes are useless, and each password needs to be brute forced with its associated salt value (this can take weeks, months, years).
I've not seen anyone yet offer any insight into why only (!) 6.5 million hashes have been published – given that LinkedIn has over 120 million users, is there any evidence that this is all that was exfiltrated, or that the entire database was compromised but we're only seeing a released sample?
LinkedIn themselves seem to be saying that only SOME passwords were compromised, referring to "members whose passwords have not been compromised", but it's not clear whether they're inferring from what's published, or corroborating by internal audit.
There are a few tools (notably http://leakedin.org/) which purport to check if a password/hash is in the released list; if these do what they say on the tin, and there's any confidence that the leak is limited, then only about 5% of LinkedIn users actually need to change their passwords.
Most people will of course say "better safe than sorry". Honeypot account, anyone?
If I were a criminal sitting on 120 million passwords and I wanted to monetize them I would not start by releasing all of them publicly.
But I might release just enough to get the attention of the media and of the compromised site itself. That way, I can only assume that when the publicly released sample is proven to be valid hashes (as LinkedIn has now said is the case), the underground market value of the remaining 114 million is probably assured.
One could argue that once the thing goes public, users will change passwords and the undisclosed hashes will lose value fast. But we all know that the typical user doesn't think about changing passwords anything like as much as those of us in the security game. And with over 100 million the bad guys definitely have numbers on their side.
My guess is that this will turn into a big phishing expedition with other social engineering spin-offs, attacks on other providers where duplicate passwords were found and so on, hence my DMARC comment a few posts up.
Obviously I have no clue how many other hashes the hackers are in possession of, but it would make no sense to me if they disclosed their entire haul.
Oh, absolutely, a total compromise has to be assumed – but then we see LinkedIn suggesting that not all accounts are affected. Whether they actually know the extent of the breach themselves is unclear. Further, as Chris Shiflett points out (http://shiflett.org/blog/2012/jun/leakedin#comment-27) they appear not yet to know how the attack was perpetrated, so "we should assume that our new LinkedIn passwords are also compromised."
Interestingly, LinkedIn state in their blog that they had transitioned to hash salting shortly before this news broke, and that users whose passwords have not been compromised will benefit from that. I'm slightly skeptical of this – they can't create a new hash by adding salt to the original password because they don't have that in plaintext (one would hope!). They could perhaps have implemented a somewhat redundant double hash as SHA1(salt + SHA1(pw)). Whatever the change is, it might improve the situation if you change your password now – or there could conceivably be some trojan code intercepting new passwords as they are changed…
They're not selling themselves well here – http://blog.linkedin.com/2012/06/09/an-update-on-…
"We have built a world-class security team here at LinkedIn including experts such as Ganesh Krishnan, formerly vice president and chief information security officer at Yahoo!, who joined us in 2010. […] Under this team’s leadership, one of our major initiatives was the transition […] to a system that both hashed and salted the passwords"
Are they really saying it took over 18 months to retrofit salted password storage?
What I am missing here is two things. The first is that on the user side what protects you most is length. If your password is short it’ll be pwned in no time. Literally. I can’t write the equations from my mobile right now but the dictionary is not as important as length is.
Second, I am missing (or have overlooked) demands that services disclose their password storage methods as part of their agreements with the user, so I can see how secure or not a service is.
This site is talking about this intelligently, but at computerworld.com you were quoted saying:
>'Chester Wisniewski, senior security advisor at Sophos called Silveira's comments about salting somewhat confusing. "They are saying that their current production database is now salted, which seems to be technically impossible. They either lost the database some time ago and have been adding salts as users log in, which means not all of them are salted, or they have plaintext copies of the passwords, which defeats the purpose of hashing them to begin with.
It's actually quite trivial to add a salt to a list of unsalted hashes after the fact by just modifying your hashing procedure. That is, you have an old column in your database `oldhash = sha1(pw)`, where sha1 is your hashing function and pw is each user's password. Then, you generate a random salt for each user and start storing newhash = sha1(salt++oldhash) and verify it against sha1(salt++sha1(pw)). Sure, you can't get sha1(salt++pw) as your hash function (as you don't know pw), but key-strengthening is a good thing anyway (even though adding just one round of hashing does effectively nothing).
See discussion http://security.stackexchange.com/a/15835/2568
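Here is a minimal Python sketch of that retrofit, under some assumptions of my own: the old hash is stored as a hex string, "++" means plain byte concatenation, and the sample password and the `verify` helper are made up for the example.

```python
import hashlib
import hmac
import os

def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()

# Legacy row: only the unsalted hash is stored, never the plaintext.
oldhash = sha1_hex(b"hunter2")

# Migration step: needs only oldhash, not the user's password.
salt = os.urandom(16)
newhash = sha1_hex(salt + oldhash.encode("ascii"))  # sha1(salt ++ oldhash)

def verify(password: str, salt: bytes, newhash: str) -> bool:
    # At login, recompute sha1(salt ++ sha1(pw)) from the submitted password.
    inner = sha1_hex(password.encode("utf-8"))
    candidate = sha1_hex(salt + inner.encode("ascii"))
    return hmac.compare_digest(newhash, candidate)

print(verify("hunter2", salt, newhash))  # True
print(verify("letmein", salt, newhash))  # False
```

The whole table can be converted in one batch job without waiting for users to log in, after which the old unsalted column can be dropped.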