Researchers have created an automated system to solve Google’s reCAPTCHA auditory challenges.
Poor, poor prove-you’re-a-human reCAPTCHA tests – CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart – they get no respect!
The point of reCAPTCHA challenges is to act as a gate that lets humans through but stops or slows down bots (software robots), so a bot that can solve a CAPTCHA automatically defeats the whole purpose of reCAPTCHA. And yet, that’s precisely what keeps happening. There are three kinds of challenge, and researchers have automatically kicked over all of them:
- Image Challenge: when Google makes you select related images from a set.
- Audio Challenge: when you need to enter numbers that are read out loud.
- Text Challenge: when you need to pick all the phrases that match a given category.
No. 1, the image challenge, was gamed last year when researchers used Google’s own massive image search database in reverse, finding words to match an image, rather than images to match a word, to help them find images in a reCAPTCHA set that shared a particular characteristic.
Then, the audio challenge purportedly fell for the first time in March, stumbling on one of Google’s own services: this time, it was Google’s speech recognition API.
A security researcher identifying themselves only as East-Ee Security claimed to have discovered what they called a “logic vulnerability” that allowed for easy bypass of Google’s reCAPTCHA v2 anywhere on the web.
Now, we have another auditory CAPTCHA smackdown: University of Maryland researchers say they’ve created what they’re calling unCaptcha: an automated system to solve auditory challenges with a success rate of about 85%.
It isn’t the first defeat for audio reCAPTCHA. Instead, unCaptcha is designed to prove that beating Google’s bot challenge is practical and cheap:
unCaptcha combines free, public, online speech-to-text engines with a novel phonetic mapping technique, demonstrating that it requires minimal resources to mount a large-scale successful attack on the reCaptcha system.
The system starts by slicing up an audio challenge and sending each piece to multiple online speech-to-text services. The answers from each service are treated as votes for the right answer, with votes weighted according to the phonetic similarity of different words, and each service’s typical accuracy. The final answer is assembled from the winning slices.
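The weighted voting step described above can be sketched roughly as follows. This is an illustration only, not the researchers' code: the phonetic map, the similarity scores, the service names and the accuracy weights are all hypothetical values chosen for the example, and the real system's mapping and weighting may differ.

```python
from collections import defaultdict

# Hypothetical phonetic map: transcribed word -> (digit, similarity in [0, 1]).
# Exact digit words score 1.0; plausible mishearings score lower.
PHONETIC_MAP = {
    "zero": ("0", 1.0), "one": ("1", 1.0), "won": ("1", 0.8),
    "two": ("2", 1.0), "to": ("2", 0.8), "too": ("2", 0.8),
    "three": ("3", 1.0), "tree": ("3", 0.7),
    "four": ("4", 1.0), "for": ("4", 0.8),
    "five": ("5", 1.0), "six": ("6", 1.0),
    "seven": ("7", 1.0), "eight": ("8", 1.0), "ate": ("8", 0.8),
    "nine": ("9", 1.0),
}


def vote_on_slice(transcripts, service_accuracy):
    """Pick the digit with the highest weighted vote for one audio slice.

    transcripts: {service_name: word returned by that service}
    service_accuracy: {service_name: assumed typical accuracy in [0, 1]}
    Returns the winning digit as a string, or None if nothing was recognised.
    """
    scores = defaultdict(float)
    for service, word in transcripts.items():
        match = PHONETIC_MAP.get(word.strip().lower())
        if match is None:
            continue  # this service's answer maps to no digit, so it casts no vote
        digit, similarity = match
        # Each vote is weighted by phonetic similarity and by service accuracy.
        scores[digit] += similarity * service_accuracy.get(service, 0.5)
    if not scores:
        return None
    return max(scores, key=scores.get)


def solve(slices, service_accuracy):
    """Assemble the final answer from the winning digit of each slice."""
    return "".join(vote_on_slice(s, service_accuracy) or "?" for s in slices)


# Hypothetical services and transcripts for a three-digit challenge.
accuracy = {"service_a": 0.9, "service_b": 0.7, "service_c": 0.6}
slices = [
    {"service_a": "to", "service_b": "two", "service_c": "tree"},
    {"service_a": "for", "service_b": "four", "service_c": "five"},
    {"service_a": "hello", "service_b": "ate", "service_c": "eight"},
]
answer = solve(slices, accuracy)
```

In this sketch a slice that no service recognises is marked with `?` rather than guessed, which makes failed slices easy to spot.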
For more details, check out the code on GitHub and the researchers’ paper (PDF), titled unCaptcha: A Low-Resource Defeat of reCaptcha’s Audio Challenge.
As the University of Maryland researchers note, reCAPTCHA doesn’t just challenge our kitten-image, spoken-word or garbly text recognition. It also observes our subtle signs of humanity: say, how we type, how we move a mouse, and so on. Sometimes, those subtle clues aren’t enough for reCAPTCHA to discern whether we’re human, and that’s when the picture grids pop up.
Likewise, users who are visually impaired can keep hitting reload until they’re presented with a microphone to get the audio challenge instead. Mobile users are presented with a grid of images and instructed to select those that match a given challenge word.
The tracking of subtle clues such as mouse movements has been around since 2013, when Google revealed what it called its Advanced Risk Analysis backend for reCAPTCHA.
CAPTCHA challenges aren’t the be-all and end-all: rather, they’re meant as a stumbling block, to slow down bots as much as possible.
Whether or not they can be automatically defeated hopefully won’t matter too much, if Google has its way. It’s been working on Invisible reCAPTCHA: a free service that uses its advanced risk analysis technology, combined with machine learning, to separate humans from bots. That means no more need for us to click on anything at all.
Google made Invisible reCAPTCHA available for website developers in March 2017. It has given no details on how it works, but Google said in June 2017 that “millions” of users were passing through with zero clicks every day.
It didn’t say how many bots had passed through.