It’s not just Google.
Microsoft is also scanning for child-abuse images.
A recent tip-off from Microsoft to the National Center for Missing & Exploited Children (NCMEC) hotline led to the arrest, on 31 July 2014, of a 20-year-old Pennsylvania man in the US.
According to the affidavit of probable cause, posted on The Smoking Gun, Tyler James Hoffman has been charged with receiving and sharing child-abuse images.
Hoffman admitted to police that he had acquired the abusive images through Kik Messenger, a chat app, in addition to “trading and receiving images of child pornography” on his mobile phone, according to the affidavit.
Microsoft detected two illicit images of a girl aged between 10 and 13 when they were uploaded to the man’s OneDrive cloud storage account.
An officer involved in the case, Trooper Christopher Hill from the Pennsylvania State Police, confirmed to the BBC that the affidavit was genuine and that Microsoft had instigated the investigation.
Hill said he couldn’t give specifics about the case, which he described as an “open investigation”.
But the trooper did tell the BBC that he was aware of other instances of “internet carriers” passing on similar details in other inquiries.
Google, of course, comes to mind.
Last week, a tip from Google similarly led to the arrest of a 41-year-old Texas man after Google discovered abusive images in his emails.
Like Microsoft, Google didn’t directly report the matter to police.
Rather, the company contacted the NCMEC’s CyberTipline, which serves as the US’s centralised reporting system for suspected child sexual exploitation.
The issue of privacy will of course arise in this case, as it does with any case involving scanning of email stored in the cloud.
The idea that a human looks at images in users’ email accounts, or that the government has fed Microsoft (or Google) a stack of child porn images and told the companies to run a search, gives rise to worries about privacy invasion.
But the reality is that the NCMEC maintains a list of hash values corresponding to known child-abuse images, available to ISPs that choose to use it (not all do, an NCMEC spokeswoman told me).
This is nothing new; it’s been going on since 2008, when NCMEC first made the hash list available.
The list currently comprises about 20,000 hash values derived from images previously reported to NCMEC by ISPs.
By sharing the hashes, which act as digital fingerprints for what NCMEC calls “the worst of the worst” child pornography images, NCMEC enables ISPs and other participating companies to check large volumes of files for matches without those companies themselves having to keep copies of offending images.
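In cryptographic-hash terms, that matching step is simple set membership: hash each uploaded file and check the digest against the shared list. A minimal sketch in Python (the hash value below is a placeholder for illustration, not a real NCMEC entry):

```python
import hashlib

# Placeholder hash list for illustration only -- the real list is
# distributed by NCMEC to participating providers, not published.
KNOWN_HASHES = {
    "9e107d9d372bb6826bd81d3542a419d6",  # stand-in value
}

def md5_of(data: bytes) -> str:
    """Return the MD5 hex digest of a file's raw bytes."""
    return hashlib.md5(data).hexdigest()

def matches_known_image(data: bytes) -> bool:
    """True if this file's digest appears on the shared hash list."""
    return md5_of(data) in KNOWN_HASHES
```

Note that an exact-hash scheme like this only catches byte-identical copies: changing a single pixel yields a completely different digest, which is the gap PhotoDNA was designed to close.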
For its part, Verizon found child-abuse images in its cloud in this manner in March 2013.
John Shehan, executive director of NCMEC’s exploited children division, told Ars Technica at the time of the Verizon case that the hash function originally used to create the unique file identifiers was MD5.
A few years ago, however, Microsoft donated its own PhotoDNA technology to the effort.
PhotoDNA creates a unique signature for an image by converting it to black and white, resizing it, and breaking it into a grid.
In each grid cell, the technology computes a histogram of intensity gradients, or edges, from which it derives the image’s so-called DNA.
Images with sufficiently similar DNA are treated as matches, according to Microsoft’s marketing materials.
Given that the amount of data in the DNA is small, large data sets can be scanned quickly, which allows Microsoft to “find the needle in the haystack,” it says.
Optimally, this allows images to be recognised even if they’ve been resized or cropped.
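PhotoDNA itself is proprietary, but the grid-and-gradient idea resembles published perceptual hashes such as the “difference hash”. A much-simplified sketch, assuming the image has already been converted to grayscale and downscaled to a small grid (the real algorithm’s details differ):

```python
def dhash_bits(grid):
    """Difference hash over a grayscale grid (a list of rows of ints):
    one bit per horizontal neighbour pair, set when the left pixel is
    brighter than the right. Uniform brightness shifts don't change it."""
    return [1 if left > right else 0
            for row in grid
            for left, right in zip(row, row[1:])]

def hamming_distance(a, b):
    """Count differing bits -- a low distance suggests a match."""
    return sum(x != y for x, y in zip(a, b))

# A toy 2x3 grid and a slightly brightened copy hash identically,
# because only relative brightness between neighbours matters.
original = [[10, 50, 20], [90, 30, 60]]
brighter = [[15, 55, 25], [95, 35, 65]]
```

This is why a perceptual hash survives the resizing and brightness tweaks that would defeat an exact MD5 match, at the cost of needing a distance threshold rather than an exact lookup.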
Microsoft’s Digital Crimes Unit put out a statement in which it said that this application of PhotoDNA – detecting child abuse – is exactly why the company created the technology.
A spokesperson also said that Microsoft’s terms of service clearly state that automated technology will be used in this manner:
Child pornography violates the law as well as our terms of service, which makes clear that we use automated technologies to detect abusive behavior that may harm our customers or others. In 2009, we helped develop PhotoDNA, a technology to disrupt the spread of exploitative images of children, which we report to the National Center for Missing and Exploited Children as required by law.
Google also uses PhotoDNA to detect abusive images, the BBC reports.
At any rate, privacy advocates have no beef with automated scanning when it comes to uncovering child predation, as long as users are informed of it.
Emma Carr, acting director of the campaign group Big Brother Watch, had this to say to the BBC:
Microsoft must do all that it can to inform users about what proactive action it takes to monitor and analyse messages for illegal content, including details of what sorts of illegal activity may be targeted.
It is also important that all companies who monitor messages in this way are very clear about what procedures and safeguards are in place to ensure that people are not wrongly criminalised, for instance, when potentially illegal content is shared but has been done so legitimately in the context of reporting or research.
The recent successes of PhotoDNA in leading both Microsoft and Google to ferret out child predators are a tribute to Microsoft’s development efforts in producing a good tool for the fight against child abuse.
In this particular instance, given this particular use of hash identifiers, it sounds as though those innocent of this particular type of crime have nothing to fear from automated email scanning.
So, not that I am defending the porn, but it is the principle of the privacy. It may not have been “human eyes” that looked at the email, or the pictures, but Microsoft DID “look” at this email. So what else are they looking at, or for, for that matter? And what will they report to the government “tip line” (think “see something, say something” from the DHS) to send the brown shirts calling at your door? The whole spying thing with the NSA, and now these junior G-men wannabes, is really getting ridiculous.
Any argument against this technology on privacy grounds is completely hypocritical when you consider that the scanning technology isn’t all that different from anti-virus. You allow Microsoft & Google to scan your email for viruses, so what’s wrong with scanning for child porn? The only real technological difference is the back end database.
What does a computer virus look like… have you ever seen one up close with your own eyes?
It was reported (here) last week that Google is doing the same thing… and people are screaming about the NSA and what they know about us…
Creating and comparing signatures of something is entirely different to examining the contents of a mail. Without further investigation, the only thing Microsoft can say about the email is that the algorithm detected a known signature.
It’s equivalent to the Royal Mail/Customs using sniffer dogs to detect drugs in packages. They’re not actually examining the contents of the package, but the dog has indicated there are drugs and this warrants further investigation.
So… you’re really trying to make us believe that a human was arrested because “an algorithm said…”, and that no real live person confirmed that? At least with a dog if that package is pulled out a person is required to inspect it and the dog is a living breathing “thing” as opposed to some piece of code written by someone who could have made a mistake because they were pressed for time or some other reason… isn’t that the top reason an application is “patched”?
As I pointed out in the original post:
“Without further investigation, the only thing Microsoft can say about the email is that the algorithm detected a known signature.”
Of course a person (probably the police) eventually looked at the email in question before serving a warrant on the person in question.
With PhotoDNA, you can’t see what you are searching for, and if no humans look at the findings (and why should they), it would be fairly easy for a government to add other images to look for. E.g. if they know that someone has been leaking secret documents that contain images and would like to know who the sender is, plus the recipients.
Who knows; maybe the next Snowden will be caught by something like this.
This list is maintained by NCMEC, not by any government. Also, it seems very likely that someone would look at the image, as I’m fairly certain a PhotoDNA signature detection wouldn’t hold up in court by itself.
Seems like the NSA could have (if not now) been doing this themselves anyway, what with inside access to both Google and Microsoft (via various exploits).
A person MUST look at the image to ‘be sure’ of the allegations. I find it hard to believe that an algorithm is precise enough to identify a single photo, even if it’s been manipulated. So are they doing it frame by frame if it’s a movie? I would insist on seeing the photo before I’d even make a comment, and by this time you should have your own attorney.
If you look at child abuse and how we handle it, you can see many problems. For instance, from what I’ve been advised, it is a learned process, usually from a relative. If you check out how the law operates, if someone does it, they end up in prison and stamped a ‘molester’ for life. In some instances, just saying they have the urges has landed them in prison. Sounds kind of like drug users. If they wish to stop this, they must reach the people who are likely to abuse someone and provide some mechanism for them to admit they have this problem and correct it without being put in prison. Prison doesn’t teach anything except to make better criminals.
Too bad I’m not a wizard and can’t come up with an answer. I know the way they are doing it now isn’t working, the same as the drug war, and this impacts our real children all the time.
Arrested August 31? Will the perp have time to escape?
Er, 31 July 2014.
Fixed…thanks.
oops! Thanks for the catch.