A human interaction proof (HIP), which is sometimes referred to as a “captcha,” is a mechanism that is used to distinguish human users from robots. Many services that are available on the web (e.g., e-mail, blogs, social networks, and access to patent databases) are gated by captchas. In a typical captcha scheme, letters and numbers are displayed on a screen as graphics in some way that is designed to obscure the letters and numbers. A user has to type the letters and numbers into a box as a form of proof that the user is human. The theory behind captchas is that recognizing symbols that intentionally have been obscured is a hard problem that demands the flexibility of the human brain. Thus, captchas are something akin to an applied Turing test.
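The challenge-and-response flow described above can be illustrated with a minimal sketch. The names `ALPHABET`, `new_challenge`, and `verify` are hypothetical and chosen only for illustration. The step that renders the symbols as an obscured graphic is deliberately omitted, and the case-insensitive comparison is an assumption rather than a property of any particular scheme.

```python
import secrets
import string

# Assumed symbol set: uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length=6):
    """Pick the random symbols that a service would render as an obscured image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def verify(expected, typed):
    """Accept the response if the typed text matches the challenge symbols.

    Case and surrounding whitespace are ignored (an illustrative choice).
    """
    return typed.strip().upper() == expected.upper()


if __name__ == "__main__":
    challenge = new_challenge()
    # A real service would show an obscured image of `challenge`, not the text.
    print("challenge symbols:", challenge)
    print("correct response accepted:", verify(challenge, challenge.lower()))
    print("wrong response accepted:", verify(challenge, challenge[:-1]))
```

In such a sketch the server keeps the expected symbols and sends only the obscured image to the user, so that passing the check depends on reading the image.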
A problem that arises with captchas is that they can be broken in various ways. Once a particular captcha scheme has been in use for some amount of time, the obscured symbols become recognizable in the sense that optical character recognition (OCR) systems can be trained to recognize them. OCR is thus an automated way of breaking captchas, and it can work as long as there is enough data on which to train the OCR system. The training data can be generated by human captcha solvers, or can even be generated just by guessing solutions and analyzing which guesses succeed and which ones fail. Since captchas themselves can be used as training data, a captcha scheme continues to generate training data that can be used to break the scheme for as long as the scheme is in use. Thus, captcha schemes generally have a limited shelf life, after which they are likely to have been broken. In addition to OCR, another way to break a captcha scheme is to use inexpensive human labor to solve captchas. Captchas can be transmitted electronically anywhere in the world (including places where labor is inexpensive), and teams of people can be employed to solve captchas. The solved captchas can be used in real time, or the solutions can be stored and used as training data for OCR systems, thereby allowing human breaking to feed the process of automated breaking.
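The point that guessing alone can generate training data may be easier to see in a toy simulation. The sketch below makes strong simplifying assumptions that are not drawn from any real scheme: each challenge is a single symbol, and a hypothetical scheme always draws a given symbol as the same opaque glyph identifier. The names `SYMBOLS`, `make_scheme`, and `collect_labels` are illustrative only.

```python
import random

# Assumed toy symbol set; real schemes use more symbols and varied distortions.
SYMBOLS = "ABCDEFGH"


def make_scheme(rng):
    """Hypothetical scheme: map each symbol to a fixed, opaque glyph id."""
    glyphs = list(range(len(SYMBOLS)))
    rng.shuffle(glyphs)
    return dict(zip(SYMBOLS, glyphs))  # symbol -> glyph id


def collect_labels(scheme, rounds, rng):
    """Guess answers and use only pass/fail feedback to label glyphs."""
    # Every glyph starts out as possibly being any symbol.
    candidates = {glyph: set(SYMBOLS) for glyph in scheme.values()}
    for _ in range(rounds):
        answer = rng.choice(SYMBOLS)   # symbol chosen by the service
        glyph = scheme[answer]         # what the guesser actually sees
        guess = rng.choice(sorted(candidates[glyph]))
        if guess == answer:
            candidates[glyph] = {guess}        # success: glyph is now labeled
        else:
            candidates[glyph].discard(guess)   # failure: rule out this guess
    # Keep only glyphs whose symbol has been pinned down.
    return {g: next(iter(c)) for g, c in candidates.items() if len(c) == 1}


if __name__ == "__main__":
    rng = random.Random(0)
    scheme = make_scheme(rng)
    labels = collect_labels(scheme, rounds=2000, rng=rng)
    print("labeled glyphs:", len(labels), "of", len(scheme))
```

Each labeled glyph is the kind of example on which an OCR system could be trained, which is why, under these assumptions, a scheme keeps leaking training data for as long as it stays in use.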
Since captchas are used to ensure, probabilistically, that services are being used by humans rather than machines, the captcha schemes have to be changed frequently in order for captchas to continue to serve their intended purpose. But changing the captcha scheme involves designing and testing a new scheme, which can be labor-intensive. Thus, new captcha schemes generally are not designed and deployed as frequently as they could be.