Google revamped its reCAPTCHA system, used to block automated scripts from abusing its online services, just hours before a trio of hackers unveiled a free system that defeats the widely used challenge-response tests with more than 99 percent accuracy.
Stiltwalker, as the trio dubbed its proof-of-concept attack, exploits weaknesses in the audio version of reCAPTCHA, which is used by Google, Facebook, Craigslist and some 200,000 other websites to confirm that humans and not scam-bots are creating online accounts. While previous hacks have also used computers to crack the Google-owned CAPTCHA (short for Completely Automated Public Turing test to tell Computers and Humans Apart) system, none have achieved Stiltwalker’s impressive success rate.
“The primary thing which makes Stiltwalker stand apart is the accuracy,” wrote Adam, one of the three hackers who devised the attack, in an e-mail. “According to the lead researcher from the Carnegie Mellon study, the system we attacked was believed to be ‘secure against automatic attack,’” he added, referring to a résumé from a Carnegie Mellon University computer scientist credited with designing the audio CAPTCHA.
Stiltwalker’s success stems from oversights made by the designers of reCAPTCHA’s audio version, combined with clever engineering by the hackers who set out to capitalize on those mistakes. The audio test, which is aimed at visually impaired people who have trouble recognizing obfuscated text, plays six words over a user’s computer speaker. To thwart word-recognition systems, reCAPTCHA masks the words with recordings of static-laden radio broadcasts, played backwards, so that the background noise distracts computers but not humans.
What the hackers—identified only as C-P, Adam, and Jeffball—learned from analyzing the sound prints of each test was that the background noise, in sharp contrast to the six words, didn’t include sounds that registered at higher frequencies. By plotting the frequencies of each audio test on a spectrogram, the hackers could easily isolate each word by locating the regions where high pitches were mapped. reCAPTCHA was also undermined by its use of just 58 unique words. Although the inflections, pronunciations, and sequences of spoken words varied significantly from test to test, the small corpus of words greatly reduced the work it took a computer to recognize each utterance.
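The segmentation idea can be illustrated with a short sketch. This is not Stiltwalker's code. It is a minimal Python example that assumes a synthetic clip (a constant 300 Hz "background" with two 3 kHz bursts standing in for words), a 2 kHz cutoff, and a threshold set at a quarter of the peak energy. All function names and parameters are hypothetical. To stay within the standard library, it computes only the high-frequency rows of a spectrogram with a direct DFT, then marks the runs of frames where that energy is high.

```python
import cmath
import math

RATE = 8000   # samples per second (assumed for this sketch)
FRAME = 128   # samples per analysis frame (16 ms at 8 kHz)

def high_band_energy(samples, rate, frame=FRAME, cutoff_hz=2000.0):
    """Per-frame energy above cutoff_hz: the upper rows of a spectrogram."""
    first_bin = math.ceil(cutoff_hz * frame / rate)
    bins = range(first_bin, frame // 2)
    # Precompute the DFT twiddle factors for the high-frequency bins only.
    twiddles = {
        k: [cmath.exp(-2j * math.pi * k * n / frame) for n in range(frame)]
        for k in bins
    }
    energies = []
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        energy = 0.0
        for k in bins:
            coeff = sum(x * w for x, w in zip(chunk, twiddles[k]))
            energy += abs(coeff) ** 2
        energies.append(energy)
    return energies

def find_segments(energies, threshold):
    """Return (first_frame, end_frame_exclusive) runs above the threshold."""
    segments, start = [], None
    for i, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = i
        elif energy <= threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(energies)))
    return segments

def synthetic_clip():
    """One second of low-pitched 'background' with two high-pitched 'words'."""
    clip = [0.5 * math.sin(2 * math.pi * 300 * t / RATE) for t in range(RATE)]
    # Hypothetical word positions, given in frames so they align with the grid.
    for first, last in ((12, 20), (36, 48)):
        for t in range(first * FRAME, last * FRAME):
            clip[t] += 0.5 * math.sin(2 * math.pi * 3000 * t / RATE)
    return clip

if __name__ == "__main__":
    energies = high_band_energy(synthetic_clip(), RATE)
    for first, last in find_segments(energies, 0.25 * max(energies)):
        print(f"word from {first * FRAME / RATE:.3f}s to {last * FRAME / RATE:.3f}s")
```

Real recordings would need overlapping windowed frames and a threshold tuned to the actual noise floor. The point of the sketch is only that once the background carries little high-frequency energy, finding word boundaries reduces to thresholding a single curve. With only 58 possible words, each isolated segment then has to be matched against a small, closed vocabulary rather than open-ended speech.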