CAPTCHA

Identifying humans

Paul Krzyzanowski

April 25, 2024

Introduction

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is not a technique to authenticate users but rather a technique to identify whether a system is interacting with a human being or with automated software. It is used primarily to prevent automated software, or bots, from performing actions such as spamming, creating accounts in bulk, or scraping large amounts of content from websites.

The main idea behind CAPTCHA is to create a test that computers find difficult to solve, but that humans can solve easily. CAPTCHAs typically involve tasks like identifying distorted text, solving simple puzzles, or recognizing objects in images, which require perceptual and cognitive skills that are still challenging for AI to mimic effectively.

History

The history of CAPTCHAs began in the late 1990s when the Internet started facing significant issues with automated bots. The first CAPTCHA system was developed by researchers at AltaVista in 1997 to prevent automated URL submissions to their search engine, which were skewing search engine rankings. This system, developed by Andrei Broder and his colleagues, involved asking users to identify and type the characters that appeared in a distorted image.

Shortly thereafter, in 2000, a more formalized version of the CAPTCHA was created by Luis von Ahn, Manuel Blum, Nicholas Hopper, and John Langford from Carnegie Mellon University. They termed their invention “CAPTCHA” as an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart.” This innovation was aimed at enhancing security systems and ensuring that the users were human by requiring them to solve tasks that computers were unable to perform efficiently at the time, such as recognizing distorted text. This system was focused on presenting distorted words from a relatively small set of words.

Around the same period, a similar concept called “BaffleText” was developed independently by researchers at Palo Alto Research Center. BaffleText focused specifically on preventing automated scripts from accessing web services by presenting randomly generated, linguistically based strings of text that were visually distorted.

These early developments laid the groundwork for various forms of CAPTCHAs used across the web today, evolving in complexity to counter increasingly sophisticated bots.

Problems with CAPTCHAs

Man-in-the-middle attacks

CAPTCHAs are susceptible to a form of man-in-the-middle attack in which the puzzle is relayed to low-cost (or unpaid) human workers whose job is to decipher CAPTCHAs. These operations are called CAPTCHA farms, and their goal is to apply pools of human labor to help the bots (i.e., to do the tasks that cannot be scripted effectively).

In this attack, when the bot is presented with a CAPTCHA test, it forwards the request to a CAPTCHA farm so it can be completed by a human. The human-generated response is sent back to the bot, which can present it to the website.

Accessibility

Traditional CAPTCHAs, particularly those that require users to decipher distorted text or images, can be challenging for individuals with visual impairments. Audio CAPTCHAs were created as an alternative, but they are of little use to people with hearing impairments and are often difficult for anyone to solve.

Improved image recognition

As machine learning and artificial intelligence technologies have advanced, so has the capability of bots to solve CAPTCHA challenges. Algorithms can interpret distorted text, recognize objects in images, and even solve audio CAPTCHAs, reducing their effectiveness as a security measure. This led to a race to create ever-more challenging puzzles.

User frustration

As image and audio processing abilities advanced and CAPTCHAs became more challenging, their presence degraded the user experience. A CAPTCHA is not only an extra step in interacting with a service; it has become a test that humans often fail, leading to multiple attempts and, in some cases, users abandoning the service.

Evolution

Ever-improving OCR technology also made text-based CAPTCHAs susceptible to attack. By 2014, Google reported that its AI techniques could solve distorted-text CAPTCHAs with 99.8% accuracy.

Getting value out of CAPTCHA: reCAPTCHA

reCAPTCHA is a variant of CAPTCHA that not only challenges users to prove they are human but also uses the interaction to help digitize text, annotate images, and build machine learning datasets. It takes advantage of the cases where OCR (optical character recognition) algorithms struggle.

The initial version of reCAPTCHA was developed by Luis von Ahn and his team at Carnegie Mellon University and was acquired by Google in 2009. This version presented users with two words: one that the computer knew and one that it didn’t. A correct answer for the known word indicated that the user was human, and the user's answer for the other word served as a candidate transcription. By solving these CAPTCHAs, users were helping to digitize books, newspapers, and old radio shows. Google then used this not just for digitizing content but also for parsing things such as house numbers in Google Street View.
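The two-word idea can be illustrated with a minimal sketch. This is not reCAPTCHA's actual implementation: the function names, the in-memory vote store, and the three-vote agreement threshold are all assumptions made for illustration. The user passes only if the known (control) word is typed correctly; in that case, the answer for the unknown word is recorded as one vote toward its transcription.

```python
from collections import Counter, defaultdict

# Hypothetical in-memory store: unknown-word image id -> tally of typed answers.
votes = defaultdict(Counter)

def normalize(text):
    """Compare answers case-insensitively and ignore surrounding spaces."""
    return text.strip().lower()

def check_two_word_response(control_answer, control_typed, unknown_id, unknown_typed):
    """Return True if the user passes; record their guess for the unknown word."""
    if normalize(control_typed) != normalize(control_answer):
        return False  # failed the known word, so the other answer is not trusted
    votes[unknown_id][normalize(unknown_typed)] += 1
    return True

def accepted_transcription(unknown_id, min_votes=3):
    """Return the leading transcription once enough users agree, else None.

    The threshold of 3 is an assumed value, not one taken from reCAPTCHA.
    """
    if not votes[unknown_id]:
        return None
    word, count = votes[unknown_id].most_common(1)[0]
    return word if count >= min_votes else None
```

In this sketch, a wrong answer on the control word discards the whole response, so a bot that guesses randomly contributes nothing to the digitized text.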

Beyond text recognition

An alternative to text-based CAPTCHAs is a CAPTCHA that involves image recognition, such as “select all images that have mountains in them” or “select all squares in an image that have street signs.” This can add a layer of difficulty for programmatic solving since it requires both parsing the request and solving the recognition problem. In the second form, the image is broken up into blocks, and a block may hold only part of the searched pattern (e.g., part of a street sign).
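One way a server might grade a "select all squares" challenge is sketched below. This is an assumed scheme, not a description of any deployed system: tiles that clearly contain the object are treated as required, tiles that hold only a sliver of it are treated as optional, and any other selected tile counts as an error. The function name and the required/optional split are hypothetical.

```python
def check_tile_selection(selected, required, optional=frozenset()):
    """Return True if every required tile is selected and no unrelated tile is.

    selected: tile indices the user clicked
    required: tiles that clearly contain the object
    optional: tiles that hold only part of the object (either answer accepted)
    """
    selected = set(selected)
    missing = set(required) - selected
    extra = selected - set(required) - set(optional)
    return not missing and not extra
```

For example, with required tiles {1, 2, 5} and optional tile {4}, the selections {1, 2, 5} and {1, 2, 4, 5} would both be accepted, while {1, 2} or {1, 2, 5, 8} would be rejected.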

Other solutions involve dragging a puzzle piece into place or rotating an object to a correct alignment.

noCAPTCHA

A more recent variation of CAPTCHA is Google’s No CAPTCHA reCAPTCHA. This simply asks users to check a box labeled “I’m not a robot.” This provides a user-friendly interaction in that there is no puzzle to solve. Behind the scenes, however, the system performs a complex analysis of the user’s engagement with the CAPTCHA checkbox and the entire website. It examines cues such as the user’s mouse movements, IP address, session duration, and cookies to distinguish typical human-driven behavior from automated scripts.

If the initial risk analysis is inconclusive—perhaps because the user behavior is unusual or indicative of potential automation—the system may present additional challenges. These are similar to traditional CAPTCHAs and may include image recognition tasks or more complex puzzles.
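The pass-or-challenge decision can be sketched as a simple scoring function. Google does not publish its risk analysis, so everything here is an assumption for illustration: the signal names, the point weights, and the threshold are invented, and a real system would use far richer signals and models.

```python
from dataclasses import dataclass

@dataclass
class Signals:
    mouse_events: int         # pointer movements observed before the click
    session_seconds: float    # time spent on the page so far
    has_prior_cookie: bool    # browser presented a cookie from earlier visits
    ip_flagged: bool          # address appears on a (hypothetical) abuse list

def risk_score(s):
    """Return a risk score from 0 (likely human) to 100 (likely bot).

    All weights are made-up values chosen only to show the idea.
    """
    score = 50
    if s.mouse_events >= 10:
        score -= 20
    if s.session_seconds >= 5:
        score -= 10
    if s.has_prior_cookie:
        score -= 20
    if s.ip_flagged:
        score += 40
    return max(0, min(100, score))

def decide(s, threshold=30):
    """Accept the checkbox click outright or fall back to a puzzle."""
    return "pass" if risk_score(s) <= threshold else "challenge"
```

A visitor with natural mouse movement, a reasonable session length, and an existing cookie would pass without a puzzle under these assumed weights, while a request with no pointer activity from a flagged address would be sent to a traditional challenge.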

invisible reCAPTCHA

The latest variation of this system is the invisible reCAPTCHA. The user doesn’t even see the checkbox: a frame is positioned tens of thousands of pixels above the origin, so the JavaScript code runs but the reCAPTCHA frame is out of view. If the server-based risk analysis does not get sufficient information from Google's cookies, it relocates the reCAPTCHA frame back down to a visible part of the screen.

As with noCAPTCHA, if the risk analysis part of the system fails, the software presents a CAPTCHA (recognize text on an image) or, for mobile users, a quiz to search for items within an image.

References

"Are you a robot? Introducing 'No CAPTCHA reCAPTCHA'," Google Search Central Blog, December 3, 2014.

Last modified May 1, 2024.