Are Google and Facebook to block extremist content with automatic hashing?

Since 2008, the National Center for Missing & Exploited Children (NCMEC) has offered to share with ISPs a list of hash values that correspond to known child abuse images.

That list, which was eventually coupled with Microsoft’s own PhotoDNA technology, has enabled companies such as Google and Microsoft, along with ISPs and others, to check large volumes of files for matches without having to keep copies of the offending images themselves, and without human eyes having to invade users’ privacy by scanning their email accounts for known child abuse images.

Earlier this month, the Counter Extremism Project (CEP) unveiled a software tool that works in a similar fashion and urged the big internet companies to adopt it.

Instead of child abuse imagery, the version the group unveiled tags gruesome, violent content that radical jihadists spread as propaganda or use to recruit followers for attacks.

And instead of just focusing on images, the new, so-called “robust hashing” technology encompasses video and audio, as well.

It comes from Dartmouth College computer scientist Hany Farid, who also worked on the PhotoDNA system.

The algorithm works to identify extremist content on internet and social media platforms, including images, videos, and audio clips, with the aim of stopping the viral spread of content illustrating beheadings and killings.
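The details of the CEP tool and of PhotoDNA aren't described here, so the following is only a rough illustration of what "robust" can mean, not either system's actual method. It sketches a generic "average hash" in Python: an image (assumed here to be a small grid of grayscale values) is reduced to a string of bits, and two images are compared by counting how many bits differ. The function names and the threshold are hypothetical.

```python
def average_hash(pixels):
    """Turn a 2D grid of grayscale values (0-255) into a list of bits.

    Each bit records whether a pixel is brighter than the grid's mean,
    so uniform changes such as brightening leave the bits unchanged.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Count the positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def looks_like_match(a, b, threshold=2):
    """Hypothetical rule: treat hashes within `threshold` bits as a match."""
    return hamming(a, b) <= threshold


original = [[10, 200, 30, 220],
            [15, 210, 25, 230],
            [12, 205, 35, 225],
            [18, 215, 28, 235]]

# A slightly altered copy: every pixel brightened by 20.
brightened = [[min(255, p + 20) for p in row] for row in original]

# An unrelated image with the light and dark columns swapped.
different = [[200, 10, 220, 30],
             [210, 15, 230, 25],
             [205, 12, 225, 35],
             [215, 18, 235, 28]]

h_orig = average_hash(original)
print(hamming(h_orig, average_hash(brightened)))  # 0
print(hamming(h_orig, average_hash(different)))   # 16
```

The point of the sketch is that a re-encoded or lightly edited copy still lands close to the original's fingerprint, whereas an ordinary cryptographic hash would change completely.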

Now, sources familiar with the process have told Reuters that YouTube and Facebook – two of the world’s largest destinations for watching videos online – have quietly started to adopt the technology to identify and remove extremist content.

While it’s been adopted for use in targeting child abuse imagery, the technology actually got its start in copyright takedown demands.

But whichever content it’s used to identify, the software works in a similar fashion: it looks for “hashes,” which are unique digital fingerprints assigned to content by online platforms. Once a media file has been flagged as extremist, copies of it can be quickly spotted and removed wherever they’re posted.
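The lookup step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any of the companies actually does it: SHA-256 stands in for the fingerprint (it only matches byte-for-byte identical files), and `known_hashes` and `is_known` are hypothetical names.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a hex SHA-256 digest, used here as a stand-in fingerprint."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical shared list: only fingerprints are stored, never the media itself.
known_hashes = {fingerprint(b"bytes of a previously flagged video")}


def is_known(upload: bytes) -> bool:
    """True if the upload's fingerprint already appears in the shared list."""
    return fingerprint(upload) in known_hashes


print(is_known(b"bytes of a previously flagged video"))  # True
print(is_known(b"bytes of some new, unseen video"))      # False
```

Because the list holds only fingerprints, a platform can check uploads against it without holding copies of the flagged files, and a file nobody has flagged yet simply won't match.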

This won’t stop new extremist media from being posted. The hashes can’t automatically detect that a video contains footage of a beheading, for example.

But once such a horrific video has been identified as extremist, it can be spotted and removed automatically, instead of having to go through the process of being reported, having humans vet and identify the material, and thereby having the time to spread virally.

Neither YouTube nor Facebook would confirm or deny to Reuters that they’re using hashes to remove known extremist media.

But why would they? Reuters quoted Matthew Prince, chief executive of content distribution company CloudFlare:

“There’s no upside in these companies talking about it. Why would they brag about censorship?”

As it is, President Obama, along with other US and European leaders, has increasingly voiced concern about online radicalization.

Two weeks ago, the president said that the Orlando mass shooting was “inspired” by violent extremist propaganda, and that “one of the biggest challenges” is to combat ISIL’s propaganda “and the perversions of Islam” generated on the internet.

According to Reuters’ sources, in late April, representatives from YouTube, Twitter, Facebook and CloudFlare held a call to discuss options including the CEP’s content-blocking system.

The sources said that the companies were wary of letting an outside group decide what defined unacceptable content.

Seamus Hughes, deputy director of George Washington University’s Program on Extremism, noted to Reuters that extremist content differs from child abuse imagery in that it exists on a spectrum, and different web companies draw the line in different places.

Besides not wanting to publicize that they’re automatically censoring content, whoever is already using hashing to block extremist content has other good reasons not to talk about it. According to Reuters’ sources, they don’t want to tip off terrorists, who could then learn to manipulate the system.

Nor do such companies want to be thrust into the position of having repressive regimes demand that the algorithms be used to censor their opponents.

The companies reportedly raised alternatives including establishing a new industry-controlled nonprofit or expanding an existing industry-controlled nonprofit, but all of the options involved hashing technology.