Data on 123 million US households exposed


What surprising things might a keen data hunter find sitting in an unsecured state on a cloud service these days?

For a researcher at UpGuard, on 6 October the answer turned out to be an intriguing 36GB database file that analytics company Alteryx had uploaded to an Amazon Simple Storage Service (S3) bucket and left in plain view.

Leaky bucket might be a better description because, when opened, the database revealed the personal financial data of 123m American households – in effect every household with an address in the US around the time of the file’s creation in 2013.

Let’s digest this: regardless of whether you’ve heard of Alteryx or not (and few will have), if you’re a US householder, a humungous trove of your personal data was inside this easily accessible file.

And quite a cache it was too, comprising 123m rows, each with 248 columns, culled from US Census Bureau data and bulked out with a “massive” amount from credit-reporting company Experian.

What data? In fact, it’d be easier to say what wasn’t in the database. UpGuard quotes Experian’s marketing blurb used to sell the data to third parties such as Alteryx:

With thousands of attributes on more than 300 million consumers and 126 million households, ConsumerView data provides a deeper understanding of your customers, resulting in more actionable insights across channels…

No wonder Alteryx wanted it. In case anyone assumes the data was anonymised, UpGuard reckons:

While the spreadsheet uses anonymized record IDs to identify households, the other information in the fields – as well as another spreadsheet in the bucket – are sufficiently detailed as to be not merely often identifying, but with a high degree of specificity.

In addition to trifles such as address, telephone number and estimated income, the data included home valuations, when householders last bought a car, what magazines they subscribe to, how much they like to travel, whether they own a cat – you name it.

Experian clearly knows an awful lot about Americans and has been selling that knowledge to partners, one of which didn’t secure it well, or at all.

All UpGuard needed to access the data was a free Amazon Web Services (AWS) account of the kind anyone can open, which marks this incident as the sort of screw-up security people will be quoting as a cautionary tale in conference presentations for years to come.
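For readers who manage S3 buckets, here is a minimal sketch of how this kind of exposure might be spotted. It assumes, since the exact setting isn’t spelled out here, that the bucket’s access control list (ACL) granted read access to S3’s predefined “any authenticated AWS user” group. The function name `risky_grants` and the sample ACL are hypothetical; the dictionary simply mirrors the shape of an S3 bucket ACL as returned by tools such as boto3’s `get_bucket_acl`, and no call to AWS is made.

```python
# URIs S3 uses for its predefined grantee groups.
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTHENTICATED_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"


def risky_grants(acl):
    """Return (group, permission) pairs that open a bucket to broad groups.

    `acl` is a dict shaped like an S3 bucket ACL: a "Grants" list whose
    entries each hold a "Grantee" dict and a "Permission" string.
    """
    findings = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        uri = grantee.get("URI")
        if grantee.get("Type") == "Group" and uri in (ALL_USERS, AUTHENTICATED_USERS):
            # Keep only the trailing group name for readability.
            findings.append((uri.rsplit("/", 1)[-1], grant.get("Permission")))
    return findings


# Hypothetical ACL for illustration only, not the real Alteryx configuration.
example_acl = {
    "Grants": [
        {
            "Grantee": {"Type": "CanonicalUser", "ID": "owner-id-placeholder"},
            "Permission": "FULL_CONTROL",
        },
        {
            "Grantee": {"Type": "Group", "URI": AUTHENTICATED_USERS},
            "Permission": "READ",
        },
    ]
}

print(risky_grants(example_acl))
```

Any non-empty result deserves a second look: the “AuthenticatedUsers” group covers every AWS account holder, not just people inside the bucket owner’s organisation.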

Had the data been noticed by criminals rather than a researcher, the latest incident could easily have ranked as a major breach similar to the one that affected Experian’s rival Equifax in September.

Experian’s odd reaction has been to pass the buck, telling Forbes:

This is an Alteryx issue, and does not involve any Experian systems.

Technically correct but disingenuous. Surely any company handing over large amounts of sensitive data on every household in the US knows it is a loaded weapon in the wrong hands and has a duty to set some standards as to how it will be secured.

As with previous incidents, the leak is another reminder of the mysterious lack of data protection rules in the US. In my opinion, the system leans too lazily on bad publicity to curb weak security when what is needed is independent intervention.