Deep neural networks (DNNs): meant to mimic the brain’s layers of interconnected neurons, they’re complex machine learning systems that can learn tasks on their own by analyzing vast amounts of unstructured data such as digital images, sound, or text in order to make predictions.
Think DNNs that are better than humans at lip-reading: they not only interpret “spatiotemporal” changes in the mouth’s shape as a human speaks, but also make predictions based on the entire sentence being spoken, using sentence context to improve their guesses.
Another development relying on DNNs is Google’s ability to turn heavily pixellated images back into recognizable faces: just one thing Google’s doing with AI and its enormous ocean of images.
DNNs are being developed to diagnose lung cancer, to detect early signs of diabetic blindness in eye scans, to beat your doctor at predicting whether you’ll have a heart attack, and to replicate voices, including that of current or former presidents, with the only training needed being a snippet of real, live, recorded voice.
And now, the latest: researchers are claiming that DNNs have beaten humans at figuring out whether somebody’s gay or straight.
They fed 35,326 facial images to a DNN and say that it beat humans hands-down when it comes to sussing out sexual orientation. Given one photo each of a straight man and a gay man, the model they used distinguished between them correctly 81% of the time, the researchers claim. When shown five photos of each man, they say it correctly nailed their sexuality 91% of the time.
The model wasn’t so hot with women: it accurately differentiated between gay and straight with 71% purported accuracy after looking at one photo, and 83% accuracy after five.
Either way, with one or five images, it purportedly did far better than humans. Using the same images, people could tell gay from straight only 61% of the time for men and 54% of the time for women. In fact, the finding supports research that suggests humans can determine sexuality from faces only slightly better than pure chance.
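For readers wondering what a figure like "81% of the time" measures: as described, the test is a pairwise one. The model is handed one photo from each group and has to say which is which. The sketch below shows how such a score could be computed. It assumes the model produces a single numeric score per photo (the article doesn't say how the output is represented), the helper name `pairwise_accuracy` is hypothetical, and the scores are made up for illustration rather than taken from the study.

```python
from itertools import product


def pairwise_accuracy(scores_a, scores_b):
    """Fraction of (a, b) pairs in which the group-A score is higher.

    Ties count as half a win, i.e. a coin flip between the two photos.
    """
    wins = 0.0
    for a, b in product(scores_a, scores_b):
        if a > b:
            wins += 1.0
        elif a == b:
            wins += 0.5
    return wins / (len(scores_a) * len(scores_b))


# Hypothetical classifier scores, invented for this example only.
group_a = [0.9, 0.7, 0.6, 0.4]  # photos that truly belong to group A
group_b = [0.8, 0.5, 0.3, 0.2]  # photos that truly belong to group B

print(pairwise_accuracy(group_a, group_b))  # 12 of 16 pairs ranked correctly: 0.75
```

A score computed this way is equivalent to what statisticians call the area under the ROC curve. It is not the same thing as how often a single photo, picked from the general population, would be labeled correctly.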
As the researchers tell it – they’re Yilun Wang and Michal Kosinski, of Stanford University’s Graduate School of Business – the features they extracted from the facial images include both fixed (such as nose shape) and transient characteristics (such as grooming style: beards, moustaches, sideburns etc). They described their findings in a paper due to be published soon in the Journal of Personality and Social Psychology.
Wang and Kosinski say that they know whether the faces belonged to gay or straight people because they pulled the images off a dating site: a place where sexual orientation is accessible as part of everybody’s profile. Nota bene, however: their work is controversial, and it’s been questioned.
The researchers point to what’s known as the prenatal hormone theory (PHT) of sexual orientation, which predicts the existence of links between facial appearance and sexual orientation. The theory holds that same-gender sexual orientation stems from the underexposure of male fetuses or the overexposure of female fetuses to androgens that are responsible for sexual differentiation – such as the differences between men’s and women’s faces. The PHT predicts that gay people will have faces that aren’t typical for their gender: gay men will have smaller jaws and chins, slimmer eyebrows, longer noses and larger foreheads, the theory goes, while gay women will have the opposite facial characteristics.
The same hormones that influence facial structure are suspected to also influence sexuality, the theory goes. The researchers suggest that, like other DNNs that pick up on subtle clues, their system is picking up on subtle facial clues that point to a supposedly correlating sexuality. They found that their program focused most of its attention on the nose, eyes, eyebrows, cheeks, hairline and chin to determine male sexuality, while it zeroed in on the nose, mouth corners, hair and neckline for women. (Note, however, that there’s also been research that finds no such correlation between facial structure and sexuality.)
The researchers admit that the study has its limitations. For one thing, images coming from a dating site might be more revealing of sexuality than most photos. They tried to address that issue by training their model to focus on non-transient facial features, such as nose shape. Of course, there are also possible issues around self-reported sexuality: maybe some users call themselves straight but are actually bisexual or gay, and vice versa. The possibility that users haven’t self-identified their sexuality accurately apparently hasn’t been incorporated into the study: Kosinski writes in the report that the researchers didn’t see much incentive for people to advertise themselves as something they’re not on a dating site.
What’s the point of all this? They’re not trying to out anybody, the researchers say. Rather, with all of our images being amassed as Facebook, LinkedIn, and Google Plus profile pictures, to name a few, it’s good to know, from a privacy standpoint, what can be gleaned from them.
Such images are public by default, accessible to one and all. Given all this easily accessible public data, and given the progress of machine learning tools, accurate classifiers could – in theory – be built that spot our sexuality.
It’s feasible that it could be done without subjects’ consent or knowledge, they say. It’s feasible that it could be used as a weapon in cultures that stigmatize LGBT people.
They didn’t build a tool to invade people’s privacy, the researchers said. Rather, they used what they say are widely available, off-the-shelf tools, publicly available data, and methods known to those well-versed in computer vision.
“Given that companies and governments are increasingly using computer vision algorithms to detect people’s intimate traits, our findings expose a threat to the privacy and safety of gay men and women.
“We did not create a privacy-invading tool, but rather showed that basic and widely used methods pose serious privacy threats. We hope that our findings will inform the public and policymakers, and inspire them to design technologies and write policies that reduce the risks faced by homosexual communities across the world.”
Some still see major ethical and science-based issues with the project. Sarah Jamie Lewis, a cybersecurity researcher who studies privacy, called the paper simplistic and naïve:
1) The paper presents a simplistic, naive narrative of queerness. Bisexual people exist, queerness isn’t binary, gender isn’t binary.
— Sarah Jamie Lewis (@SarahJamieLewis) September 8, 2017
That, among other problems regarding invasion of privacy…
If you are a researcher and you are scraping profiles from the internet with the intent of using that data to build intrusive tech…stop.
— Sarah Jamie Lewis (@SarahJamieLewis) September 8, 2017
Unfortunately, researchers all too often feel emboldened to scrape public profile data to do with as they like. We saw that in January, when the people behind Pornstar.ID, a reverse-image lookup for identifying porn actors, scraped 650,000 adult film actors’ images in order to tune their neural network.
I wrote that up in January, and I still haven’t heard back about whether those performers consented to being identified and listed on the Pornstar.ID site, or whether they agreed to having their biometrics scanned to train a neural network.
What’s the difference between Pornstar.ID and the grab for images made by Kosinski and Wang?
Is there any law that says people’s published images — be they porn stars or those on a dating site, both of which are presumably published online for all to see (or purchase) — aren’t up for grabs for the purpose of training facial recognition deep learning algorithms?
As a matter of fact, there are such laws concerning face recognition. The Electronic Privacy Information Center (EPIC) considers the strongest of them to be the Illinois Biometric Information Privacy Act, which prohibits the use of biometric recognition technologies without consent.
Indeed, much of the world has banned face recognition software, EPIC points out. In one instance, under pressure from Ireland’s data protection commissioner, Facebook disabled facial recognition in Europe: recognition it was doing without user consent.
So yes, depending on where you live, there are laws against facial recognition without consent. It’s not clear whether Pornstar.ID’s use of facial scanning falls foul of these laws; ditto for the work done on DNNs and the images used to see if they can detect sexual identity.
Note that, as pointed out by the Economist, this isn’t the first controversial research we’ve seen from Kosinski.
He also developed a method to analyze people in minute detail based on their Facebook activity: called psychometric profiling, it’s generated its share of criticism. It’s also drawn both the threat of a lawsuit and a job offer from Facebook, on the same day.
Remember the firm Cambridge Analytica? The big-data crunching, ad-buying firm that gained notoriety for what its execs call psychological warfare in both the Trump and Brexit campaigns?
Kosinski claimed he had nothing to do with the firm. But try explaining that to all the friends and acquaintances who wrote to him, telling him to “look at what you’ve done”.
Big data: is it best to keep it to yourself when you find new ways it can be used to manipulate people or invade their privacy? Is it wise, or safe, or altruistic, to point out what this stuff can be used to do, if nobody else has (to our knowledge) figured it out on their own?
Maybe the genie’s out of the bottle now. Maybe repressive, anti-LGBT regimes can use facial recognition to slap labels on people. Maybe those labels will be utterly wrong, given the questionable nature of the controversial prenatal hormone theory.
All we know for sure, right now, is that neural network/machine learning/facial recognition/computer vision researchers are like kids in a candy store, given the plethora of freely accessible facial images online. We might all want to bear that in mind when we post ours.