Since the adoption of the Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) as the primary way to distinguish legitimate users from computers, many types of CAPTCHA have been developed and proposed. The most basic type of CAPTCHA is question-based, such as asking the user to solve a simple math problem (Shirali-Shahreza and Shirali-Shahreza, Question-Based CAPTCHA, IEEE International Symposium on Signal Processing and Information Technology, December, 13-15, 2007) or asking a question about the question (“how many letters are in the third word of this question?”) (http://ha.ckers.org/blog/20070822/good-articles-on-captchas/#comments, viewed June 2008). Sound-based CAPTCHAs generate some kind of sound, typically speech, to be interpreted and, perhaps, understood. Examples include using text-to-speech (TTS) to present a simple math problem (Shirali-Shahreza and Shirali-Shahreza, CAPTCHA for Blind People, IEEE International Symposium on Signal Processing and Information Technology, pp. 995-998, December, 15-18, 2007) or words to be interpreted or replicated (von Ahn, et al., CAPTCHA: Using hard AI problems for security, Proceedings of Eurocrypt, 2003).
The most common form of CAPTCHA is the visual CAPTCHA. These tests typically present the user with one or more images and ask a question about the images. Visual CAPTCHA is similar to question-based CAPTCHA but adds the extra step of image recognition and interpretation to the test. One example of a visual CAPTCHA is BONGO (L. von Ahn, et al., CAPTCHA: Using hard AI problems for security, Proceedings of Eurocrypt, 2003), which displays a series of blocks and asks the user to specify what sets one of the blocks apart from the others. Other examples include the CAPTCHA of Liao (A Captcha mechanism by exchange image blocks. Proceedings of the 18th International Conference on Pattern Recognition 1179-1183, 2006), which displays a familiar image (such as a face) with whole blocks of the image swapped and asks the user to identify the image; PIX, disclosed by von Ahn, et al. (CAPTCHA: Using hard AI problems for security. Proceedings of Eurocrypt, 2003), which displays a series of similar objects and asks the user to identify what they are; Asirra (A CAPTCHA that exploits interest-aligned manual image categorization. Proc. of ACM CCS 2007, 366-374), which displays twelve photos of cats and dogs and asks the user to identify the cats; and KittenAuth (KittenAuth, http://www.thepcspy.com/kittenauth, viewed June 2008), which displays nine images of “cute” animals and asks the user to identify the kittens. Because part of the definition of CAPTCHA is “public”, any CAPTCHA that relies on a secret database of images is not considered to be a valid CAPTCHA, since any database is vulnerable to phishing attacks. Asirra uses the ever-changing image database of Petfinder.com, thus fulfilling the accepted definition of CAPTCHA.
A more advanced form of visual CAPTCHA is the interactive CAPTCHA, which requires the user to interact with the CAPTCHA in some way. For example, Shirali-Shahreza and Shirali-Shahreza (Drawing CAPTCHA. Proceedings of the 28th International Conference Information Technology Interfaces (ITI 2006), Cavtat, Dubrovnik, Croatia, 475-480, Jun. 19-22, 2006) propose a Drawing CAPTCHA, a “connect the dots” test that requires the user to locate three unique points and connect them with clicks of the mouse. Rui and Liu (Excuse me, but are you human? Proceedings of the 11th ACM International Conference on Multimedia (Berkeley, Calif.), ACM, New York, 462-463, 2003) propose HID, which displays a distorted image of a human face and asks the user to click on the corners of the eyes and mouth. Interactive puzzles typically require the user to understand some kind of instruction about the applet before executing the instruction.
Another advanced form of visual CAPTCHA is the animation-based CAPTCHA, which generates a random animation and presents the user with some kind of task. Athanasopoulos and Antonatos (Enhanced captchas: Using animation to tell humans and computers apart, Proceedings of the 10th IFIP Open Conference on Communications and Multimedia Security, October 2006) propose using an interactive form that requires the user to click on moving objects.
The form of CAPTCHA most commonly seen on the web today is the optical character recognition based (OCR-based) CAPTCHA, a variation of visual CAPTCHA. An OCR-based CAPTCHA presents the user with a randomly generated image of distorted text to be replicated in a text box. An example of OCR-based CAPTCHA is GIMPY, an implementation upon which most OCR-based CAPTCHAs (Yahoo!, Google, Microsoft, etc.) are based (Kolupaev and Ogijenko, CAPTCHAs: Humans vs. Bots. IEEE Security and Privacy, 6:1, 68-70, January-February, 2008).
Breaking OCR-based CAPTCHA is a computer vision problem, since artificial intelligence must be applied to locate and recognize cluttered and distorted text. Depending on the implementation, this involves recognizing letters amidst a cluttered background, letters that are warped and/or touching each other, and letters that are segmented with additional lines. Much research has gone toward recognition amidst clutter (Detecting Boundaries in Natural Images, http://www.cs.berkeley.edu/~fowlkes/project/boundary/index.html, viewed June 2008; Martin, et al., Learning to detect natural image boundaries using brightness and texture, Advances in Neural Information Processing Systems, vol. 14, 2002; and Martin, et al., Learning to detect natural image boundaries using local brightness, color, and texture cues. PAMI, 26(5):530-549, 2004) and toward the specific breaking of OCR-based CAPTCHA through research and open community efforts (Mori and Malik, Recognizing Objects in Adversarial Clutter: Breaking a Visual CAPTCHA, CVPR, 2003; and PWNtcha: The Open-source CAPTCHA Breaker. http://libcaca.zoy.org/wiki/PWNtcha, viewed June 2008). Wang et al. (CAPTCHA challenge tradeoffs: Familiarity of strings versus degradation of images. Proceedings of the 18th International Conference on Pattern Recognition, 164-167, Aug. 20-24, 2006) discussed the ease of use of OCR-based CAPTCHA from a human standpoint, finding that humans have a much easier time reading distorted text when it contains familiar words instead of random characters. The most effective tests, however, use random characters to resist dictionary-based attacks (Mori and Malik, supra).
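The tradeoff between familiar words and random characters can be made concrete with a rough count of the candidate strings an attacker has to discriminate among. The following is only a sketch under stated assumptions: the dictionary size, alphabet size, and string length are hypothetical values chosen for illustration and do not come from the cited references.

```python
# All three constants are illustrative assumptions, not figures from the
# cited references.
ASSUMED_DICTIONARY_SIZE = 50_000  # hypothetical word list used by the test
ASSUMED_ALPHABET_SIZE = 26        # lowercase Latin letters only
ASSUMED_LENGTH = 6                # characters shown in the distorted image


def candidate_count_random(alphabet_size: int, length: int) -> int:
    """Number of distinct strings when every character is chosen freely."""
    return alphabet_size ** length


def single_guess_probability(candidate_count: int) -> float:
    """Chance that one uniformly random guess matches the displayed string."""
    return 1.0 / candidate_count


if __name__ == "__main__":
    random_space = candidate_count_random(ASSUMED_ALPHABET_SIZE, ASSUMED_LENGTH)
    print("dictionary candidates:", ASSUMED_DICTIONARY_SIZE)
    print("random-string candidates:", random_space)  # 26**6 = 308915776
    print("ratio:", random_space / ASSUMED_DICTIONARY_SIZE)
```

Under these assumptions, a recognizer that can restrict its output to dictionary words faces a candidate set several thousand times smaller than one that must consider every six-letter string, which illustrates why random characters resist dictionary-based attacks at the cost of being harder for humans to read.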
Today's most powerful OCR-based CAPTCHAs, which rely on distorting and squeezing text and are used by Google, Yahoo, and Microsoft, are starting to be cracked by bots. A 20% success rate has been reported against Google (SecurityLabs.Websense.com, Google's CAPTCHA busted in recent spammer tactics, Posted on Feb. 22, 2008 by Sumeet Prasad), and a 90% success rate has been reported against Microsoft's segmentation method (Hotmail, Windows Live, etc.), with an estimated 60% effectiveness of a full crack implementation (Yan and Ahmad. A Low-cost Attack on a Microsoft CAPTCHA. Technical Report, School of Computing Science, Newcastle University, UK. April 2008).
Poor server-side implementation of OCR-based (and other) CAPTCHA systems is a common pitfall. Basing CAPTCHAs on a standard library of images, using poor or predictable naming schemes, and using poor or no session tracking all contribute to the cracking of CAPTCHA systems without the use of computer vision (Xato.com. These CAPTCHAs Are Just Not Working Out. Posted on Aug. 21, 2007 by mb. http://xato.com/bl/2007/08/21/these-captchas-are-just-not-working-out, viewed June 2008). Honeypots can also add to the security of the base implementation (Nedbatchelder.com. Stopping spambots with hashes and honeypots. Posted on Jan. 21, 2007 by Ned Batchelder. http://nedbatchelder.com/text/stopbots.html, viewed June 2008). The use of humans to farm answers to CAPTCHAs can be a problem for systems with poor session management and static databases (Ha.ckers.org. Human CAPTCHA Breaking. Posted on Tuesday, Mar. 11, 2008. http://ha.ckers.org/blog/20080311/human-captcha-breaking, viewed June 2008).
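The server-side pitfalls listed above (predictable names, missing session tracking, and no honeypot) can be contrasted with a minimal sketch of a challenge store that avoids them. This is an illustration only, not an implementation described in the cited references; the class name `ChallengeStore`, its method names, the in-memory dictionary, and the default time-to-live are all assumptions.

```python
import hmac
import secrets
import time


class ChallengeStore:
    """In-memory store of single-use CAPTCHA challenges (illustrative only)."""

    def __init__(self, ttl_seconds=120.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._pending = {}  # token -> (expected_answer, expiry_time)

    def issue(self, expected_answer):
        """Register a challenge and return an unpredictable token for it."""
        # A random token replaces sequential or guessable image names.
        token = secrets.token_urlsafe(16)
        self._pending[token] = (expected_answer, self._clock() + self._ttl)
        return token

    def verify(self, token, submitted_answer, honeypot_value=""):
        """Check an answer; every token is consumed on its first use."""
        # Removing the entry up front prevents replay of a solved challenge.
        entry = self._pending.pop(token, None)
        if entry is None:
            return False
        expected, expires_at = entry
        if self._clock() > expires_at:
            return False
        # Honeypot: a hidden form field that a human leaves empty.
        if honeypot_value:
            return False
        return hmac.compare_digest(
            expected.encode(), submitted_answer.strip().encode()
        )


if __name__ == "__main__":
    store = ChallengeStore()
    token = store.issue("xk7qpa")
    print(store.verify(token, "xk7qpa"))  # first use of a valid token
    print(store.verify(token, "xk7qpa"))  # same token replayed
```

Expiring each challenge and consuming its token on first use limits how long a farmed or previously recorded answer stays useful, although it does not by itself stop humans from solving challenges in real time.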
In sum, the available CAPTCHAs are being broken with increasing frequency by attackers using sophisticated techniques. To defeat such attackers, CAPTCHAs are becoming tougher for humans to solve, thereby reducing the utility of the CAPTCHA itself.