Facebook bug could have exposed your phone number to marketers

You know that Facebook data-use policy, the one that promises it’s not going to spread our personal information to outfits that want to slice and dice and analyze us into chop suey and market us into tomato paste?

We do not share information that personally identifies you (personally identifiable information is information like name or email address that can by itself be used to contact you or identifies who you are) with advertising, measurement or analytics partners unless you give us permission.

Yeah, well… funny thing about that…

Turns out that up until a few weeks ago, against its own policy, Facebook’s self-service ad-targeting tools could have squeezed users’ cellphone numbers from their email addresses… albeit very, verrrrry sloooowly. The same bug could have also been used to collect phone numbers for Facebook users who visited a particular webpage.

Finding the bug earned a group of researchers from the US, France and Germany a bug bounty of $5000. They reported the problem at the end of May, and Facebook sewed up the hole on 22 December.

That means that phone numbers could have been accessed for at least the nearly seven months between the report and the fix, although Facebook says there’s no evidence that it happened.

The researchers described in a paper how they used one of Facebook’s self-serve ad-targeting tools called Custom Audiences to ascertain people’s phone numbers.

That tool lets advertisers upload lists of customer data, such as email addresses and phone numbers. It takes about 30 minutes for the tool to compare an advertiser’s uploaded customer list to Facebook’s user data, and then presto: the advertisers can target-market Facebook users whose personal data they already have.

Custom Audiences also throws in other useful information: it tells advertisers how many of its users will see an ad targeted to a given list, and in the case of multiple targeted-ad lists, it tells advertisers how much the lists overlap.

And that’s where the bug lies. Until Facebook fixed it last month, the data on audience size and overlap could be exploited to reveal data about Facebook users that was never meant to be offered up. The hole has to do with how Facebook rounded the figures to obscure exactly how many users were in various audiences.
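To see why rounded figures can still leak, consider a toy model. The sketch below assumes, purely for illustration, that displayed audience sizes are rounded down to a multiple of 10. The function names and the rounding rule are hypothetical, not Facebook's actual behavior. If an attacker pads an audience so that its true size sits just under a rounding boundary, a single extra matching user tips the displayed figure over.

```python
def displayed_size(true_size: int, step: int = 10) -> int:
    # Hypothetical obfuscation rule: round down to a multiple of `step`.
    return (true_size // step) * step


def reveals_membership(padding: int, victim_matches: bool, step: int = 10) -> bool:
    # Compare the displayed size with and without the victim in the audience.
    before = displayed_size(padding, step)
    after = displayed_size(padding + int(victim_matches), step)
    return after != before


# 19 padding users sit just below the boundary at 20, so one match is visible.
print(reveals_membership(19, True))   # True
# 15 padding users are far from any boundary, so the same match stays hidden.
print(reveals_membership(15, True))   # False
```

The point of the sketch is that rounding only hides individuals when the attacker cannot control how close the true count is to a boundary.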

As far as resources go, the initial setup is the most “expensive” part of the exploit, the researchers said. In one evaluation of the attack, they recruited 22 volunteers with Facebook accounts who lived either in Boston or in France.

It took 30 minutes to upload lists for two Boston area codes (617 and 857), where each phone number had seven digits to infer. Each list had one million phone numbers, all with a single digit in common. France was even tougher to chew through: it took a week to generate 200 million possible phone numbers starting with 6 or 7 and to upload each list.
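The list sizes quoted above follow from simple counting. Here is a minimal sketch, assuming each list fixes one digit in one position and lets the remaining digits vary freely, and assuming French mobile numbers have eight free digits after the leading 6 or 7. The helper names are hypothetical.

```python
from itertools import product

DIGITS = "0123456789"


def list_size(length: int) -> int:
    # One digit is fixed, so the other (length - 1) digits vary freely.
    return 10 ** (length - 1)


def numbers_with_digit(position: int, digit: str, length: int):
    # Yield every `length`-digit string whose digit at `position` is `digit`.
    for rest in product(DIGITS, repeat=length - 1):
        yield "".join(rest[:position]) + digit + "".join(rest[position:])


boston_list = list_size(7)           # 1,000,000 numbers per list
french_mobile_space = 2 * 10 ** 8    # 200,000,000 numbers starting with 6 or 7

# Small demonstration with 3-digit numbers: 100 entries, all with "7" in the middle.
sample = list(numbers_with_digit(1, "7", 3))
print(boston_list, french_mobile_space, len(sample), sample[:3])
```

Seven unknown digits with one fixed leaves six free digits, which gives the one million entries per Boston list.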

But after that, it was fairly smooth sailing.

The resulting audiences can be re-used to infer the phone number of any user.

The researchers went on to use Facebook’s tools to repeatedly compare those audience lists against others generated using the targets’ emails. They kept an eye out for changes to the estimated audience figures that occurred when an email address matched a phone number, revealing users’ numbers drip by drip, one digit at a time.
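The digit-by-digit recovery can be sketched as a loop over positions. In this simplified simulation, `overlaps(position, digit)` is a hypothetical stand-in for the attacker noticing that the estimated audience figure changed for the list holding that digit in that position. The real attack read this signal from Facebook's rounded estimates rather than from a clean yes/no answer.

```python
from typing import Callable


def make_oracle(secret: str) -> Callable[[int, int], bool]:
    # Simulated signal: True when the victim's number has `digit` at `position`.
    def overlaps(position: int, digit: int) -> bool:
        return secret[position] == str(digit)
    return overlaps


def recover_number(overlaps: Callable[[int, int], bool], length: int = 7) -> str:
    digits = []
    for position in range(length):
        # Test the victim's audience against the ten lists for this position.
        matches = [d for d in range(10) if overlaps(position, d)]
        if len(matches) != 1:
            # Zero or several matches means the digit cannot be pinned down.
            raise ValueError(f"ambiguous result at position {position}")
        digits.append(str(matches[0]))
    return "".join(digits)


print(recover_number(make_oracle("5550123")))  # 5550123
```

With ten candidate lists per position, a seven-digit number needs at most 70 comparisons in this model.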

The attack apparently worked with all Facebook users who had a phone number associated with their account. The exploit stumbled when people provided multiple, or no, phone numbers for their Facebook accounts. It took under 20 minutes per user to get phone numbers.

The researchers used the same technique to collect phone numbers en masse for volunteers who visited a website with the “tracking pixel” Facebook provides to help site operators target ads to visitors. As they explain, Facebook gives advertisers some code – referred to as a tracking pixel, since it was historically implemented as a one-pixel image – to include on their websites. When users visit the advertiser’s website, the code makes requests to Facebook, thereby adding the user to an audience.

The audiences aren’t defined by “attributes,” such as visitors’ gender or their location. Rather, these are “PII-based audiences.” Advertisers select the specific users they want to target, either by uploading known email addresses, names, or other personally identifying information (PII), or by selecting users who visited an external website that’s under the advertiser’s control.

The tracking-pixel version of the exploit succeeded in getting the researchers the phone numbers they were after. It appeared to work for all accounts Facebook defines as daily active users.

Facebook fixed the bug by weakening its ad-targeting tools: it no longer shows audience sizes when customer data is used to make new ad-targeting lists.

Facebook Vice President for Ads Rob Goldman put out a thank-you statement for the researchers’ find:

We’re grateful to the researcher who brought this to our attention through our bug bounty program. While we haven’t seen any abuse of this complex technique, we’ve made product changes to prevent this from occurring.