Bypassing CAPTCHA with Visually-Impaired Robots

As many of you have probably noticed, we rely heavily on bot automation for a lot of the testing that we do at Sociosploit. And occasionally, we run into sites that leverage CAPTCHA ("Completely Automated Public Turing Test To Tell Computers and Humans Apart") controls to prevent bot automation. Even if you aren't familiar with the name, you've likely encountered these before.

While there are some other vendors who develop CAPTCHAs, Google is currently the leader in CAPTCHA technology. They currently support 2 products (reCAPTCHA v2 and v3). As v3 natively only functions as a detective control, I focused my efforts more on identifying ways to possibly bypass reCAPTCHA v2 (which functions more as a preventative control).

How reCAPTCHA v2 Works

reCAPTCHA v2 starts with a simple checkbox, and evaluates the behavior of the user when clicking it. While I haven't dissected the underlying operations, I assume this part of the test likely makes determinations about the user's "humanness" based on variable haptics measured in mouse-over and click behaviors.

After clicking, if the CAPTCHA function still suspects that you might be a robot, it will give you an additional challenge. This requires you to evaluate multiple different images and classify them. And while I have seen many decent proof-of-concepts for bypassing image CAPTCHA challenges using toolkits like scikit-learn or Tesseract, I personally did not have any interest in going down this road. If you have worked with Machine Learning, you likely know that in order to make a good ML classifier, you need very large data sets to "train" it, and you also need to validate the quality of that input data. To avoid this significant amount of up-front effort, I decided the better option would be to avoid this visual challenge altogether. My first thought was to attempt to bypass the first (checkbox) test by introducing variable haptics into automated mouse-over and click behaviors (something that I will still likely revisit in the near future).
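To make the "variable haptics" idea a little more concrete, here is a rough sketch of what randomized pointer movement could look like. This is not something I built for this project. The function name, the curve shape, and the timing ranges are all assumptions picked for illustration, and nothing here is known to satisfy the checkbox test.

```python
import random
from typing import List, Optional, Tuple


def humanlike_path(
    start: Tuple[int, int],
    end: Tuple[int, int],
    steps: int = 25,
    wobble: float = 40.0,
    rng: Optional[random.Random] = None,
) -> List[Tuple[int, int, float]]:
    """Return (x, y, delay_seconds) points along a slightly curved path.

    Hypothetical helper: a quadratic Bezier curve whose control point is
    nudged randomly away from the midpoint, with a random pause per step.
    """
    rng = rng or random.Random()
    (x0, y0), (x1, y1) = start, end

    # A random control point near the midpoint bends the path so that no
    # two runs trace exactly the same line.
    cx = (x0 + x1) / 2 + rng.uniform(-wobble, wobble)
    cy = (y0 + y1) / 2 + rng.uniform(-wobble, wobble)

    path = []
    for i in range(steps + 1):
        t = i / steps
        x = (1 - t) ** 2 * x0 + 2 * (1 - t) * t * cx + t ** 2 * x1
        y = (1 - t) ** 2 * y0 + 2 * (1 - t) * t * cy + t ** 2 * y1
        delay = rng.uniform(0.005, 0.03)  # assumed per-step pause, in seconds
        path.append((round(x), round(y), delay))
    return path
```

Each point would then be handed to whatever pointer-movement call the automation layer provides, sleeping for the listed delay between moves.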

However, while considering this prospect, I noticed that reCAPTCHA has an audio challenge option for the visually impaired. By selecting the "headphones" icon at the bottom of the reCAPTCHA challenge, you can opt to do the audio challenge instead. Having recently worked with Google's Speech Recognition API for a robotics project with my son (and having found that it works extremely well), I thought this might be an even easier option to bypass the CAPTCHA...and still avoid the visual challenge.

How Should the Vision-Impaired Robot Operate?

So that I didn't have to roll my own implementation, I decided to test against Google's own publicly available example of the reCAPTCHA v2 control. This can be found at the link below:

https://www.google.com/recaptcha/api2/demo

So once I had my target in mind...I started to outline how the bot should operate. The plan was to automate a bot to do the following:

1. Click the "I'm not a robot" checkbox
2. Upon being prompted to solve the challenge...click the Audio Challenge button at the bottom
3. Click the "Play" button to start the audio
4. Begin a short (5 second) local recording, to capture the audio in a WAV file
5. Immediately send that WAV file up to the Google speech recognition platform for analysis
6. Once returned, supply the interpreted text to the input field and click "Verify"
7. Hope that Google's speech recognition software is stronger than their reCAPTCHA software :)
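The plan above can be sketched as a small driver that only fixes the order of operations. This is a minimal outline and not the final proof-of-concept code: `BotHooks`, the target names, and the 16 kHz mono recording format are all assumptions made for illustration. The clicking, recording, and transcription are passed in as plain callables so the sequence can be exercised without a browser.

```python
import wave
from dataclasses import dataclass
from typing import Callable


@dataclass
class BotHooks:
    """Hypothetical bundle of the actions the bot needs."""
    click: Callable[[str], None]      # click a named UI target
    record: Callable[[float], bytes]  # capture N seconds of 16-bit mono PCM
    transcribe: Callable[[str], str]  # WAV file path -> recognized text
    type_text: Callable[[str], None]  # type into the focused input field


def save_wav(path: str, pcm: bytes, rate: int = 16000) -> None:
    """Write raw 16-bit mono PCM bytes to a WAV file (assumed format)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(pcm)


def run_audio_challenge(
    hooks: BotHooks, wav_path: str = "challenge.wav", seconds: float = 5.0
) -> str:
    """Walk through the planned steps in order and return the typed answer."""
    hooks.click("checkbox")       # 1. "I'm not a robot"
    hooks.click("audio_button")   # 2. switch to the audio challenge
    hooks.click("play")           # 3. start playback
    pcm = hooks.record(seconds)   # 4. short local recording
    save_wav(wav_path, pcm)
    text = hooks.transcribe(wav_path).strip()  # 5. speech recognition
    hooks.click("answer_field")   # 6. enter the interpreted text and verify
    hooks.type_text(text)
    hooks.click("verify")
    return text
```

In a real run, `transcribe` would wrap a call to the speech recognition service and `record` would capture the system audio. Both are left as stand-ins here.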

Some Initial Difficulties Encountered

I did run into some obstacles when originally testing. I pretty quickly learned that the odds are not in your favor if using browser automation. Specifically, for my initial testing, I used Selenium -- a browser automation library that connects directly to browser drivers. Apparently, the reCAPTCHA service can tell when it is running within a browser that has driver hooks. When going this route, it was not uncommon for the service to make me solve 5-10 challenges before it would let me through (other times...it wouldn't let me through at all).

I then decided to attempt to decouple my bot operations from the browser altogether and use OS-level automation. And this was the final key to success. Specifically, I ended up using the PyAutoIT library with pre-defined coordinates for relevant objects in the browser with which we would need to interact. Upon doing this, I found that our bot only had to solve a single CAPTCHA challenge each time to prove its "humanness".
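As a rough illustration of the pre-defined coordinates approach, the sketch below maps named targets to screen positions and pauses a random interval after each click. The coordinates, target names, and pause range are invented for this example and would depend entirely on screen resolution and window placement. `click_at` is a stand-in for whatever click call the OS-level automation library exposes.

```python
import random
import time
from typing import Callable, Dict, Tuple

# Hypothetical screen coordinates for the objects the bot interacts with.
TARGETS: Dict[str, Tuple[int, int]] = {
    "checkbox": (105, 540),
    "audio_button": (190, 780),
    "play": (260, 470),
    "answer_field": (260, 530),
    "verify": (400, 640),
}


def make_clicker(
    click_at: Callable[[int, int], None],
    targets: Dict[str, Tuple[int, int]],
    pause_range: Tuple[float, float] = (0.5, 1.5),
    sleep: Callable[[float], None] = time.sleep,
    rng=random,
) -> Callable[[str], None]:
    """Build a click(name) function backed by a fixed coordinate table."""

    def click(name: str) -> None:
        x, y = targets[name]  # raises KeyError for an unknown target
        click_at(x, y)
        # Pause for a random interval so actions are not evenly spaced.
        sleep(rng.uniform(*pause_range))

    return click
```

Keeping the coordinates in one table means a change in window layout only requires updating that table, not the bot logic.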

Vision Impaired Robots CAN Pretend to be Human ¯\_(ツ)_/¯

Ultimately the efforts paid off. We were able to create a proof-of-concept that was consistently able to bypass the CAPTCHA by completing a single audio challenge. A video demonstration and the final proof of concept code are available for reference below.

