A system (such as a website) can allow or deny access to robots through the Robots Exclusion Protocol (REP). A system employing REP publishes a text file, accessible to other systems that connect over the Internet, that instructs robots reading the file not to access the website. A robot that complies with REP reads the file and then does not access the system/website.
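As an illustration of the REP check described above, the following is a minimal sketch of what a compliant robot might do before requesting a page. It uses the `urllib.robotparser` module from the Python standard library. The file contents, the robot name `ExampleBot`, and the `example.com` URLs are hypothetical and chosen only for illustration; the helper `may_fetch` is not part of any library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical exclusion file that tells every robot to stay away from the whole site.
DENY_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical exclusion file that only closes off one directory.
DENY_PRIVATE = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given exclusion rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(may_fetch(DENY_ALL, "ExampleBot", "https://example.com/index.html"))
    print(may_fetch(DENY_PRIVATE, "ExampleBot", "https://example.com/private/data"))
    print(may_fetch(DENY_PRIVATE, "ExampleBot", "https://example.com/public/page"))
```

The check is purely voluntary: nothing in this sketch prevents a robot from skipping the call to `may_fetch` and requesting the page anyway.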
However, not all robots comply with REP. To detect those non-compliant robots, websites typically rely on a Completely Automated Public Turing test to Tell Computers and Humans Apart (CAPTCHA). A CAPTCHA uses an image that contains a disjointed and/or distorted alphanumeric sequence. The system prompts a user to recognize the alphanumeric sequence and to enter it using the user's keyboard.
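The challenge-and-response part of the CAPTCHA flow described above can be sketched as follows. This is only an assumed outline, not a description of any particular CAPTCHA product: the function names, the six-character length, and the case-insensitive comparison are hypothetical choices, and the step that renders the sequence into a distorted image is deliberately left out.

```python
import hmac
import secrets
import string

# Characters the hypothetical challenge is drawn from.
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length: int = 6) -> str:
    """Generate a random alphanumeric sequence to be shown (distorted) to the user."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def verify_response(expected: str, typed: str) -> bool:
    """Check the typed answer against the expected sequence, ignoring case and outer spaces."""
    normalized = typed.strip().upper()
    # compare_digest runs in time independent of where the two values differ.
    return hmac.compare_digest(expected.encode(), normalized.encode())


if __name__ == "__main__":
    challenge = new_challenge()
    # A real system would render `challenge` as a distorted image instead of printing it.
    print("challenge:", challenge)
    print("accepted:", verify_response(challenge, challenge.lower()))
```

In this sketch the server keeps the expected sequence and sends only the image, so a robot has to recover the sequence from the image itself.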
For many years, a robot could not successfully employ Optical Character Recognition (OCR) technology to recognize the alphanumeric sequence in a CAPTCHA. Now, a robot can employ OCR technology to recognize the alphanumeric sequences in CAPTCHA images. For example, the CAPTCHA of Windows Live™ can be cracked in under one minute. As OCR technologies advance, the CAPTCHA approach to discriminating between a human user and a robot becomes less effective.
A sensitive website, such as a bank website, can receive from 10,000 to 100,000 attacks per hour from robots. Early determination of whether an access attempt comes from a human user or from a robot is necessary to let the human user access the website and to block the robot. Such discrimination can reduce non-human and potentially harmful requests.
Another difficulty is distinguishing a human user from a robot using a test that a human user can successfully and consistently pass in a limited amount of time.