Email list-cleaning site may have leaked up to 2 billion records

The number of records exposed online by an email list-cleaning service in February may be far higher than originally reported. One cybersecurity company now says that the number of records available for anyone to download in plaintext may have been closer to two billion.

Security researcher Bob Diachenko, who found the exposed data and worked on the breach investigation with research partner Vinny Troia, originally explained that on 25 February 2019 he discovered a 150GB MongoDB instance online that was not password protected.

There were four separate collections in the database. The largest one contained 150GB of data and 808.5 million records, he said in his blog post on the discovery. This included 798 million records that contained users’ email addresses, dates of birth, gender, phone numbers, addresses and Zip codes, along with their IP addresses.

He then did some due diligence:

As part of the verification process I cross-checked a random selection of records with Troy Hunt’s HaveIBeenPwned database. Based on the results, I came to conclusion that this is not just another ‘Collection’ of previously leaked sources but a completely unique set of data.
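
The kind of spot check Diachenko describes can be sketched roughly as follows. This is a minimal illustration only, not his actual method: the function name `estimate_overlap`, the sample size, and the in-memory `known_breached` set standing in for a real breach-database lookup are all assumptions.

```python
import random


def estimate_overlap(records, known_breached, sample_size=1000, seed=None):
    """Estimate the share of records already present in a known breach set.

    `records` is a sequence of email addresses from the exposed data.
    `known_breached` is a set of lowercase addresses already seen elsewhere
    (a hypothetical stand-in for a real breach-database lookup).
    """
    if not records:
        raise ValueError("records must not be empty")
    rng = random.Random(seed)
    # Sample without replacement, capped at the number of records available.
    sample = rng.sample(list(records), min(sample_size, len(records)))
    # Normalise case and whitespace so trivial differences do not hide a match.
    hits = sum(1 for email in sample if email.strip().lower() in known_breached)
    return hits / len(sample)
```

A low overlap in such a sample would be consistent with the data being largely new, although a random sample can only suggest this, not prove it.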

Exposed MongoDB instances don’t always clearly indicate who uploaded them, but Diachenko’s research turned up a likely suspect: a company that offered what it called enterprise email validation services, along with free phone number lookup. It has since taken down its website.

The service enabled mass emailers to clean their email lists by removing what it called ‘hard bounces’, so that those with large lists could verify which addresses were real. It also included services that removed:

Spamtraps or possible threats in your email list such as role accounts, botclickers, honeypots, and litigators.

Diachenko emailed the company and received a response which said:

We appreciate you reaching out and informing us. We were able to quickly secure the database. Goes to show, even with 12 years of experience you can’t let your guard down.

After closer inspection, it appears that the database used for appends was briefly exposed. This is our company database built with public information, not client data.

This week, cybersecurity company Dynarisk said that it had analysed the other three data collections and found far more records than Diachenko reported. It puts the total data volume at 196GB and claims that it holds two billion records.

The company told The Register that the other collections were named Verified Emails, PyEmail, and EmailScrub. The last of these contained the most extra data, at 6.3GB. However, it wasn’t clear what specific information was in these collections.

Various press outlets are carrying both the 800 million and two billion record figures, but Troia has gone public on Twitter disputing Dynarisk’s claim, arguing that the original figure is the accurate one.

Whether 800 million or two billion, the risk to the users involved is significant, Dynarisk said:

The lists can be used to target the people on it with phishing emails and scams, telephone push payment fraud, and the data contains enough information to enable tailored scams aimed at key staff who could be targeted for CEO fraud or Business Email Compromise.

Australian security researcher Troy Hunt has uploaded the records that we know about for sure to HaveIBeenPwned, his site that documents email addresses compromised in security breaches. Roughly a third of the email addresses were new to his database, the service said on Twitter.

The latest upload also appears to have earned the site a depressing new record.

Have you been pwned?

What can you do if your email address shows up among the compromised addresses (or indeed any others) on HaveIBeenPwned?

The usual measures apply:

  • Immediately change any passwords common to multiple services, ensuring that each password is both unique and strong, and therefore very difficult to guess. How to pick a strong password.
  • Change any other passwords you’re using that would be easy to guess (that includes dictionary words, obvious combinations of numbers and deliberate misspellings).
  • Use a password manager to keep track of these unique passwords. Why you should use a password manager. 
  • Turn on two-factor authentication (2FA or MFA) for your most sensitive accounts. What is 2FA and why you should care.
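
As an illustration of the first point, a unique and hard-to-guess password can be generated rather than invented. The following is a minimal Python sketch using the standard library's `secrets` module. The function name `generate_password`, the default length of 20, and the 12-character minimum are assumptions made for illustration; a password manager will typically do this for you.

```python
import secrets
import string

# Characters the password is drawn from: letters, digits and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=20):
    """Return a random password of the given length."""
    if length < 12:
        raise ValueError("use at least 12 characters")
    # secrets.choice draws from a cryptographically secure random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

Calling `generate_password()` once per service gives each account its own password, so a leak at one site does not expose the others.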