A web site is a directory of files stored on one or more web servers that a client may access over a network (e.g., the Internet). Both individual users and non-human programmatic sources (referred to as “robots”) may request access to a web server. Individual users who access a web server according to the intended presentation of the web site are referred to as “direct users”. Direct users often purchase items or services from the web site and view advertisements and sponsorships displayed on the web site. For these and other reasons, access to a web server by direct users is highly desirable: direct users represent the primary source of revenue for companies that operate web sites.
Robots, on the other hand, retrieve and index documents contained within web sites and often deliver these documents elsewhere. Robots, which are also referred to as “spiders” or “web crawlers”, may be server-based or client-based and are employed for a variety of reasons, some legitimate and many fraudulent. Robots can also be part of computer viruses, which makes the source of the activity difficult to track or control. Robots impose costs on companies, both for the infrastructure that supports the web site and for any licensing costs involved in presenting the content of a web page, while defeating most of the mechanisms by which a company attempts to make a profit.
Robots are often used by search engines to maintain an index of web sites. Legitimate robots follow conventions that allow web sites to mark pages, directories, or whole sites as “off limits”; pernicious robots ignore these conventions. There is a keen financial interest in minimizing access to a web server by pernicious robots.
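One widely used convention of this kind is the Robots Exclusion Protocol, in which a web site publishes a robots.txt file listing the paths that robots should not request. The following is a minimal sketch, in Python, of the check a well-behaved robot might perform before requesting a page. The robots.txt content, the robot name "ExampleBot", the www.example.com URLs, and the helper names build_parser and may_fetch are all hypothetical and chosen only for illustration; the parsing itself relies on the standard library module urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, as a web site might publish it.
# It marks one directory and one page as "off limits" to all robots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /checkout.html
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a reusable rule set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the published rules permit this robot to request the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A legitimate robot consults the rules before every request.
allowed = may_fetch(parser, "ExampleBot", "https://www.example.com/index.html")
blocked = may_fetch(parser, "ExampleBot", "https://www.example.com/private/report.html")
```

Compliance with such a file is voluntary: the check runs inside the robot, not on the web server. A pernicious robot simply omits the check and requests the excluded pages anyway, which is why publishing exclusion rules alone does not keep such robots away from a web server.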