PageRank™ is currently the most widely deployed and influential algorithm for the ranking of Web pages and is the basis for the Internet search engine used by Google™. More particularly, and as described in U.S. Pat. No. 6,285,999, PageRank is a link analysis algorithm that assigns numerical weights to each element of a set of hyperlinked documents (e.g., Web pages) in order to measure each document's relative importance within the set of documents. The assigned weight is called the PageRank of a document D, and is denoted PR(D).
Ranking determines which links are displayed, and in what order, as results responsive to user queries submitted via search engines. PageRank, a numerical value computed for each Web page, is one of the key values determining the order in which the result links are displayed. Other factors also influence the relative order of results returned in response to a query, including the appearance of query keywords in the title or headings of a Web page, the frequency of their occurrence in the Web page, and keyword appearances in the anchor text of links on pages pointing to a given page. The result links are then presented in a list in order of descending PageRank value.
PageRank, PR, of a Web page, pgi, is defined as follows:
$$PR(pg_i) = \frac{1 - d}{N} + d \sum_{j} \frac{PR(pg_j)}{L(pg_j)}$$

Pages pgj denote Web pages pointing to page pgi, in other words pages containing links to page pgi. L(pgj) is the number of outgoing links on page pgj, N is the total number of Web pages, and d is a damping factor, which for PageRank is the probability that a random surfer will continue clicking on links on Web pages.
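By way of illustration only, the definition above can be evaluated by repeated substitution: start from a uniform value of 1/N for every page and reapply the formula until the values settle. The following Python sketch makes several assumptions that are not part of the definition itself: the function name `pagerank`, the three-page graph `toy_web`, the damping value d = 0.85, the fixed count of 50 iterations, and the requirement that every page has at least one outgoing link.

```python
def pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply PR(pg_i) = (1 - d)/N + d * sum_j PR(pg_j)/L(pg_j).

    `links` maps each page to the list of pages it links to. Every page must
    appear as a key, and (an assumption of this sketch) every page must have
    at least one outgoing link.
    """
    pages = list(links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}  # uniform starting values
    for _ in range(iterations):
        # Every page first receives the (1 - d)/N term.
        new_pr = {p: (1.0 - d) / n for p in pages}
        # Each page pg_j then passes d * PR(pg_j)/L(pg_j) along each of its links.
        for src, targets in links.items():
            share = d * pr[src] / len(targets)
            for dst in targets:
                new_pr[dst] += share
        pr = new_pr
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
toy_web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}
ranks = pagerank(toy_web)
```

In this toy graph, page C is pointed to by both A and B and therefore accumulates the largest value, while B, which receives only half of A's weight, accumulates the smallest.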
It can be shown that PageRank can be viewed as the probability that a random surfer will land on a particular page by following links on Web pages and choosing a link on a current page at random. Such a random surfer can be modeled as a Markov chain or Markov process. A random surfer can land on pages with no outgoing links, in which case the surfing cannot continue, or get stuck in a set of Web pages that form a cycle. In order to avoid such situations, at every step the random surfer can choose to jump to a random Web page with probability 1−d, where d is the damping factor, instead of following links. The random jumping and the damping factor are required to ensure the convergence of the process of computation of PageRank values.
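The random surfer described above can also be simulated directly, with the fraction of steps spent on each page serving as an estimate of the probability of landing there. The following Python sketch is a minimal illustration under stated assumptions: the function name `simulate_surfer`, the four-page graph `surfer_web`, the value d = 0.85, the step count, and the fixed seed are all hypothetical, and the rule that a surfer on a page with no outgoing links always jumps to a uniformly chosen page is one common convention rather than something the description above prescribes.

```python
import random


def simulate_surfer(links, d=0.85, steps=200_000, seed=0):
    """Estimate landing probabilities by simulating a random surfer.

    `links` maps each page to the list of pages it links to; a page with an
    empty list is a dead end.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    pages = list(links)
    visits = {p: 0 for p in pages}
    current = rng.choice(pages)
    for _ in range(steps):
        visits[current] += 1
        targets = links[current]
        if targets and rng.random() < d:
            # With probability d, follow a link chosen at random on the page.
            current = rng.choice(targets)
        else:
            # With probability 1 - d, or always at a dead end (an assumption
            # of this sketch), jump to a page chosen uniformly at random.
            current = rng.choice(pages)
    return {p: count / steps for p, count in visits.items()}


# Hypothetical four-page web; D has no outgoing links.
surfer_web = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["A"],
    "D": [],
}
freq = simulate_surfer(surfer_web)
```

Because of the random jumps, the surfer never remains stuck at the dead-end page D, and every page is visited with nonzero frequency regardless of where the walk begins.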