Web crawlers attempt to continually retrieve (i.e., scan) the content of specific websites, either by following HyperText Markup Language (HTML) links or by generating such links dynamically based on heuristic rules. Owners of website content who do not wish to be scanned and identified detect and block such scanning attempts. Such detection exploits a feature of conventional web crawlers whereby a high number of similar HyperText Transfer Protocol (HTTP) requests originate from the same IP address (i.e., the IP address of the computer on which the web crawler is running, or the IP address of that computer's Network Address Translation-enabled router). The aforementioned owners identify the IP address from which the large number of HTTP requests originates and block subsequent HTTP requests from the identified IP address, thereby frustrating subsequent scanning. Scanning of a website whose owners employ web crawler detection and blocking is nevertheless needed, for example, if a law enforcement agency is attempting to locate different types of illegal or potentially dangerous content available via the Internet. Thus, there exists a need to overcome at least one of the preceding deficiencies and limitations of the related art.
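By way of illustration only, the IP-based detection and blocking described above might be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular website's implementation: the class name `CrawlerDetector`, the parameters `max_requests` and `window_seconds`, and the sliding-window policy are all hypothetical, and the sketch counts every request from an IP address rather than testing whether the requests are similar.

```python
from collections import defaultdict, deque


class CrawlerDetector:
    """Hypothetical per-IP request counter using a sliding time window."""

    def __init__(self, max_requests=100, window_seconds=60.0):
        # Both thresholds are assumed values chosen for illustration.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # IP address -> recent request times
        self.blocked = set()               # IP addresses already blocked

    def allow(self, ip, now):
        """Return True if the request is served, False if it is blocked."""
        if ip in self.blocked:
            return False
        timestamps = self._recent[ip]
        timestamps.append(now)
        # Discard requests that have fallen outside the time window.
        while timestamps and now - timestamps[0] > self.window_seconds:
            timestamps.popleft()
        # Too many requests from one IP address: block it from now on.
        if len(timestamps) > self.max_requests:
            self.blocked.add(ip)
            return False
        return True
```

For example, with `max_requests=3` and `window_seconds=10.0`, a fourth request from the same IP address within ten seconds returns `False`, and every later request from that address is refused, while requests from other IP addresses are unaffected. Because the counter is keyed only on the source IP address, a crawler running behind a Network Address Translation-enabled router is identified by the router's address.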