The Internet presently comprises billions of web pages interconnected via hyperlinks. Users of the Internet typically employ web browsing applications (“browsers”) to navigate among these pages, either by selecting and clicking hyperlinks or by manually entering a “Uniform Resource Locator” (“URL”), which allows the browser to access a particular web page directly. Oftentimes, however, a user wishes to search the Internet for pages containing particular items of information. Because of the size of the Internet, it is impractical for a user to manually browse the Internet searching for relevant pages. Instead, users typically invoke search engines, which are computer applications developed for the purpose of searching the Internet. Search engines typically reside on server computing devices and accept queries from client users. A search engine is usually associated with an index of web pages and, in response to a user query, returns a list of pages satisfying the query.
Some modern search engines rank web pages in order to provide users with more relevant results. Many search engines represent the interconnection of web pages via a matrix, and finding a page ranking equates to finding the principal eigenvector of the matrix. Such a search engine is described by Page et al. in “The PageRank citation ranking: Bringing order to the web,” Stanford Digital Libraries Working Paper, January 1998, which is hereby incorporated by reference in its entirety for all that it teaches without exclusion to any part thereof. Generally, an iteration takes a ranking of the web pages and propagates it across the interconnection matrix, to obtain an updated ranking for the pages. Eventually, the rankings for all pages converge to fixed values, which are the entries of the principal eigenvector. This is equivalent to calculating the stationary distribution of a Markov chain. Due to the size of the matrices, computing the eigenvector—and thus the page ranks—is a computationally intensive task in existing systems, requiring several iterations of matrix manipulation before values for all pages converge to the eigenvector.
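The iterative propagation described above can be sketched as follows. This is an illustrative power-iteration example only: the four-page graph, the damping factor of 0.85, and all variable names are assumptions for demonstration and are not drawn from the cited reference.

```python
import numpy as np

# Toy link structure for 4 hypothetical pages: key j lists the pages
# that page j links to, each outgoing link weighted equally.
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
n = 4
M = np.zeros((n, n))  # column-stochastic interconnection matrix
for j, outs in links.items():
    for i in outs:
        M[i, j] = 1.0 / len(outs)

d = 0.85  # damping factor (an assumed, commonly used value)
G = d * M + (1 - d) / n * np.ones((n, n))

# Power iteration: take a ranking, propagate it across the matrix,
# and repeat until the entries converge to the principal eigenvector
# (equivalently, the stationary distribution of the Markov chain).
r = np.full(n, 1.0 / n)
for _ in range(200):
    r_next = G @ r
    if np.abs(r_next - r).sum() < 1e-10:
        break
    r = r_next

print(r)  # the converged entries are the page ranks
```

Note that the full matrix-vector product is recomputed for every page at every iteration, which is the computational burden the methods discussed below attempt to reduce.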
In order to compute the page rank more efficiently, researchers have attempted to exploit particular mathematical properties of the interconnection matrix to find methods of computing or approximating page rankings more quickly. One such method is described by Kamvar et al. in “Adaptive Methods for the Computation of PageRank,” in Numerical Solution of Markov Chains, pp. 31-44, 2003, which is hereby incorporated by reference for all that it teaches without exclusion to any part thereof. Kamvar et al. note that, during the iterative process of finding the eigenvector, the page rankings converge quickly for some pages but take longer for others. They provide a method to speed up the computation of page rankings by not recomputing rankings for those pages that have apparently already converged, based on the assumption that when a page's rank changes only slightly from one iteration to the next, it will continue to change only slightly in the future. Since a large percentage of the operations in calculating the stationary distribution involve pages whose changes are small, eliminating these calculations greatly increases the efficiency of the process. This assumption can fail, however: a page whose rank is momentarily stable may be frozen prematurely, a phenomenon termed “misconvergence.” To address this possibility, Kamvar et al. describe a heuristic method of pruning the link structure every few iterations. Because their method is a heuristic, it cannot guarantee that presently small changes will not become large changes later. The method of Kamvar et al. does not converge monotonically, so that during computation a page ranking may move very little and then very much; as a result, some updates to page rankings may be ignored, resulting in inaccurate page rankings. The method additionally requires processing the entire graph every few iterations, which can decrease overall performance. Furthermore, the method of Kamvar et al. requires the matrix multiplications to be performed sequentially; it does not allow the iterated matrix multiplications to be performed in a distributed, asynchronous, or incremental manner.
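The core adaptive idea can be sketched as follows. This is a minimal illustration under stated assumptions: the three-page matrix, the tolerance, and the function name are hypothetical, and the sketch deliberately omits the link-pruning heuristic described in the cited paper.

```python
import numpy as np

def adaptive_pagerank(G, tol=1e-6, max_iter=200):
    """Power iteration that freezes entries which appear to have converged."""
    n = G.shape[0]
    r = np.full(n, 1.0 / n)
    converged = np.zeros(n, dtype=bool)  # pages assumed to have converged
    for _ in range(max_iter):
        r_next = r.copy()
        # Recompute only pages not yet frozen; frozen pages keep their
        # current value. This saving is also the source of the
        # "misconvergence" risk: a frozen entry receives no further
        # updates even if its true value later drifts.
        active = ~converged
        r_next[active] = G[active] @ r
        converged |= np.abs(r_next - r) < tol
        r = r_next
        if converged.all():
            break
    return r

# Example on a tiny hypothetical 3-page graph (column-stochastic matrix).
M = np.array([[0.0, 0.5, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
G = 0.85 * M + 0.05 * np.ones((3, 3))  # 0.15/3 teleport weight per entry
r = adaptive_pagerank(G)
```

Note that each iteration here still depends on the complete result of the previous one, illustrating why such iterated multiplications are inherently sequential rather than distributed, asynchronous, or incremental.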