In internet technology, web crawlers are used to find new web pages by collecting and following URLs (Uniform Resource Locators). When a crawler follows a URL and downloads the corresponding web page, the links within that web page can be added to the web crawler's URL collection. The downloaded web pages are stored for indexing and ranking by internet search engines. Internet search engines use web page ranking algorithms that relate the links within a web page to the relevance of the web page.
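The collect-and-follow loop described above can be illustrated with a minimal sketch. It is not the implementation of any particular crawler. The names `LinkExtractor`, `crawl`, `fetch`, and `max_pages` are hypothetical and chosen for illustration only. The `fetch` callable is assumed to return the HTML text for a URL, or `None` if the page cannot be downloaded, so that the sketch can be exercised without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl. `fetch(url)` is an assumed callable that
    returns HTML text, or None when the page cannot be downloaded."""
    frontier = deque(seed_urls)  # the crawler's URL collection
    seen = set(seed_urls)        # URLs already queued, to avoid revisits
    pages = {}                   # url -> HTML, stored for later indexing
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return pages
```

In this sketch the `max_pages` limit is the only bound on the work performed, so every newly discovered link consumes part of the crawler's budget regardless of whether the linked page is useful.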
The use of link popularity algorithms to rank web pages has led to the problem of “link farms”. To manipulate a web page's ranking, a large sub-web of interlinked web pages is created and linked to the web page so that the page receives a high search engine ranking. In addition to distorting web page rankings, link farms cause a web crawler to spend substantial resources following links and collecting web pages for eventual indexing into a search engine, even though many of these pages are created only to influence page ranking and are neither used by nor useful to humans.
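The effect of a link farm can be illustrated with a deliberately simplified model. The sketch below assumes that link popularity is a plain count of distinct pages linking to a target, which is far cruder than the ranking algorithms used by actual search engines. The names `inlink_counts` and `add_link_farm`, and the page labels, are hypothetical.

```python
from collections import Counter


def inlink_counts(link_graph):
    """Count, for each page, how many distinct other pages link to it.

    `link_graph` maps a page name to the list of pages it links to.
    """
    counts = Counter()
    for source, targets in link_graph.items():
        for target in set(targets):
            if target != source:  # ignore self-links
                counts[target] += 1
    return counts


def add_link_farm(link_graph, target, size):
    """Return a copy of the graph with `size` interlinked farm pages,
    each of which also links to `target`."""
    farm = [f"farm-{i}" for i in range(size)]
    grown = {page: list(links) for page, links in link_graph.items()}
    for page in farm:
        grown[page] = [other for other in farm if other != page] + [target]
    return grown
```

Under this assumed model, adding a farm of 50 pages raises the target's in-link count by 50 and leaves the counts of all other original pages unchanged. It also adds 50 pages that a crawler following links would have to download.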
What is required is a system, method, and computer-readable medium that provide enhanced web crawling.