Microsoft scans email for child abuse images, leads to arrest

It’s not just Google.

Microsoft is also scanning for child-abuse images.

A recent tip-off from Microsoft to the National Center for Missing & Exploited Children (NCMEC) hotline led to the arrest on 31 July 2014 of a 20-year-old Pennsylvanian man in the US.

According to the affidavit of probable cause, posted on The Smoking Gun, Tyler James Hoffman has been charged with receiving and sharing child-abuse images.

Hoffman admitted to police that he had acquired the abusive images through Kik Messenger, a chat app, in addition to “trading and receiving images of child pornography” on his mobile phone, according to the affidavit.

Microsoft detected two illicit images of a girl aged between 10 and 13 when they were uploaded to the man’s OneDrive cloud storage account.

An officer involved in the case, Trooper Christopher Hill from the Pennsylvania State Police, confirmed to the BBC that the affidavit was genuine and that Microsoft had instigated the investigation.

Hill said he couldn’t give specifics about the case, which he described as an “open investigation”.

But the trooper did tell the BBC that he was aware of other instances of “internet carriers” passing on similar details in other inquiries.

Google, of course, comes to mind.

Last week, a tip from Google similarly led to the arrest of a 41-year-old Texas man after Google discovered abusive images in his emails.

Like Microsoft, Google didn’t directly report the matter to police.

Rather, the company contacted the NCMEC’s CyberTipline, which serves as the US’s centralised reporting system for suspected child sexual exploitation.

The issue of privacy will of course arise here, as it does in any case involving the scanning of email or files stored in the cloud.

The idea that a human looks at images in users’ email accounts, or that the government has fed Microsoft (or Google) a stack of child porn images and told the companies to run a search, gives rise to worries about privacy invasion.

But the reality is that the NCMEC maintains a list of hash values that correspond to known child porn images, available to ISPs that choose to use it (not all do, an NCMEC spokeswoman told me).

This is nothing new; it’s been going on since 2008, when NCMEC first made the hash list available.

The list currently comprises about 20,000 hash values derived from images previously reported to NCMEC by ISPs.

By sharing the hashes, which act as digital fingerprints for what NCMEC calls “the worst of the worst” child pornography images, NCMEC enables ISPs and other participating companies to check large volumes of files for matches without those companies themselves having to keep copies of offending images.

For its part, Verizon found child-abuse images in its cloud in this manner in March 2013.

John Shehan, executive director of NCMEC’s exploited children division, told Ars Technica at the time of the Verizon case that the hash function originally used to create unique file identifiers was MD5.
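To make the mechanics concrete, here is a minimal Python sketch of that kind of exact-match screening. The digests in the list are placeholders (the real NCMEC list is not public), and a production system would hash content server-side at upload time rather than reading files from disk.

```python
import hashlib

# Hypothetical list of MD5 digests of known images, standing in for
# the hash list NCMEC shares; these values are placeholders only.
KNOWN_HASHES = {
    "5d41402abc4b2a76b9719d911017c592",  # placeholder digest
    "7d793037a0760186574b0282f2f435e7",  # placeholder digest
}

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def is_known_image(path):
    """True if the file's digest appears in the shared hash list."""
    return md5_of_file(path) in KNOWN_HASHES
```

The sketch also makes the limitation of exact digests obvious: alter a single pixel and the MD5 changes, so the match is lost.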

A few years ago, however, Microsoft donated its own PhotoDNA technology to the effort.

PhotoDNA creates a unique signature for an image by converting it to black and white, resizing it, and breaking it into a grid.

In each grid cell, the technology computes a histogram of intensity gradients, or edges, from which it derives the image’s so-called DNA.

Images whose DNA signatures are similar are treated as matches, according to Microsoft’s marketing materials.

Given that the amount of data in the DNA is small, large data sets can be scanned quickly, which allows Microsoft to “find the needle in the haystack,” it says.

Crucially, this allows images to be recognised even if they’ve been resized or cropped.
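Microsoft hasn’t published PhotoDNA’s internals, so the sketch below is only a rough analogue of the description above, not the real algorithm: greyscale conversion, a fixed resize, a grid of cells, a gradient histogram per cell, and a distance comparison between the resulting signatures. The grid size, bin count, and any matching threshold are invented for illustration.

```python
import numpy as np
from PIL import Image

def perceptual_signature(path, size=96, grid=6, bins=8):
    """Toy PhotoDNA-like signature: greyscale, fixed resize,
    grid of cells, gradient-magnitude histogram per cell."""
    img = Image.open(path).convert("L").resize((size, size))
    pixels = np.asarray(img, dtype=np.float32)
    # Vertical and horizontal intensity gradients, combined into
    # an edge-magnitude map (clipped to the histogram range).
    gy, gx = np.gradient(pixels)
    magnitude = np.clip(np.hypot(gx, gy), 0, 255)
    cell = size // grid
    sig = []
    for i in range(grid):
        for j in range(grid):
            block = magnitude[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            hist, _ = np.histogram(block, bins=bins, range=(0, 255))
            sig.extend(hist / block.size)  # normalise per cell
    return np.array(sig)

def distance(sig_a, sig_b):
    """Small distances suggest the two images are near-duplicates."""
    return float(np.linalg.norm(sig_a - sig_b))
```

Because the signature reflects the image’s overall structure rather than its exact bytes, a resized or lightly cropped copy yields a nearby signature, where an MD5 comparison would simply fail.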

Microsoft’s Digital Crimes Unit put out a statement in which it said that this application of PhotoDNA – detecting child abuse – is exactly why the company created the technology.

A spokesperson also said that Microsoft’s terms of service clearly state that automated technology will be used in this manner:

Child pornography violates the law as well as our terms of service, which makes clear that we use automated technologies to detect abusive behavior that may harm our customers or others. In 2009, we helped develop PhotoDNA, a technology to disrupt the spread of exploitative images of children, which we report to the National Center for Missing and Exploited Children as required by law.

Google also uses PhotoDNA to detect abusive images, the BBC reports.

At any rate, privacy advocates have no beef with automated scanning when it comes to uncovering child predation, as long as users are informed of it.

Emma Carr, acting director of the campaign group Big Brother Watch, had this to say to the BBC:

Microsoft must do all that it can to inform users about what proactive action it takes to monitor and analyse messages for illegal content, including details of what sorts of illegal activity may be targeted.

It is also important that all companies who monitor messages in this way are very clear about what procedures and safeguards are in place to ensure that people are not wrongly criminalised, for instance, when potentially illegal content is shared but has been done so legitimately in the context of reporting or research.

The recent successes of PhotoDNA in leading both Microsoft and Google to ferret out child predators are a tribute to Microsoft’s efforts in developing a good tool for the fight against child abuse.

In this instance, given this use of hash identifiers, it sounds as though those innocent of this type of crime have nothing to fear from automated email scanning.