Search engines have enabled users to quickly access information over the Internet. Specifically, a user can submit a query to a search engine and peruse ranked results returned by the search engine. For example, a user can provide a search engine with the query “Spider” and be provided with web pages relating to various arachnids, web pages relating to automobiles, web pages relating to films, web pages relating to web crawlers, and other web pages. Search engines may also be used to return images, academic papers, videos, and other information to an issuer of a query.
Operation of a search engine may include employment of web crawlers to locate and store a large amount of information (e.g., web pages) that is available on the World Wide Web. For example, web pages or information pertaining thereto may be stored in a search engine index, which is used (in connection with one or more search algorithms) when queries are received.
Conventionally, a search engine index is stored in several tiers, wherein different tiers provide different levels of performance. The tiering of the search engine index is analogous to the memory hierarchy used in computer architecture: overall storage capacity of the index is divided between different levels that vary in size, speed, latency, and cost. Higher tiers of the index typically have higher speed but have smaller capacity and higher cost. Accordingly, it is desirable to carefully assign web pages to the tiers of the index to maximize efficiency of the search engine.
One manner for tiering web pages that has been used is to select a tier of an index in which to place a web page as a function of the web page's relative importance as determined by some metric, such as a static rank of the web page. Specifically, a number of links to a web page may be used to select a tier of an index in which to locate the web page. The relative importance of the page, however, is not necessarily indicative of whether the page is frequently accessed, and thus tiering by relative importance may be suboptimal for indexing web pages in a search engine index. Moreover, evaluating a tier assignment is a difficult problem, because it is unclear which metrics capture the quality of a particular allocation of web pages to the tiers.
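By way of illustration only, the importance-based tiering described above can be sketched as follows. The sketch assumes that static rank is approximated by a raw count of inbound links and that each tier has a fixed page capacity; the names `Page`, `inlinks`, `assign_tiers`, and `capacities`, as well as the example URLs and counts, are hypothetical and are not drawn from any particular search engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    url: str
    inlinks: int  # assumed proxy for static rank: number of links to the page


def assign_tiers(pages, capacities):
    """Place pages into tiers by descending in-link count.

    capacities[0] is the size of the highest (fastest, smallest) tier.
    Pages that do not fit in any tier are left unassigned.
    """
    ranked = sorted(pages, key=lambda p: p.inlinks, reverse=True)
    tiers = []
    start = 0
    for capacity in capacities:
        tiers.append(ranked[start:start + capacity])
        start += capacity
    return tiers


# Hypothetical pages and link counts, for illustration only.
pages = [
    Page("a.example", 120),
    Page("b.example", 5),
    Page("c.example", 40),
    Page("d.example", 900),
]
tiers = assign_tiers(pages, capacities=[1, 2, 10])
for level, tier in enumerate(tiers, start=1):
    print(level, [p.url for p in tier])
```

In this sketch the assignment depends only on link counts, so a sparsely linked page that is nonetheless frequently accessed would be placed in a lower tier, which is the shortcoming noted above.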