The current state of the art for a basic search engine is to index documents by keyword. When a user searches for one or more keywords, the engine retrieves the documents that contain them. The results are typically ranked by how frequently the keywords occur in each matching document. Some search engines consider other criteria in addition to keyword frequency.
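The keyword-indexing approach described above can be illustrated with a minimal sketch. It is not any particular engine's implementation. The function names (tokenize, build_index, search), the lowercase alphanumeric tokenizer, and the choice to require every query keyword to be present are assumptions made only for illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric keywords (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: keyword -> {doc_id: number of occurrences}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, query):
    """Return ids of documents containing every query keyword,
    ranked by total keyword frequency (highest first, ties broken by id)."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    # Assumption: a document matches only if it contains all keywords.
    matching = set(postings[0]).intersection(*postings[1:])
    scores = {doc_id: sum(p[doc_id] for p in postings) for doc_id in matching}
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

With this sketch, a document that repeats the query keywords more often is listed first, which is the frequency-based ranking the paragraph describes.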
One problem is that keyword frequency alone is a poor indicator of how interesting a document is. For example, if someone types “All work and no play makes Jack a dull boy” thousands of times and puts the text on a Web page, that page would rank high in a search for “play” or “Jack,” but it would not be a very interesting result.
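The weakness can be made concrete with a small sketch that scores two pages by raw keyword count. The page contents, the page names ("stuffed" and "genuine"), the repetition count of 2000, and the keyword_count helper are all hypothetical and chosen only to mirror the example above.

```python
import re


def keyword_count(text, keyword):
    """Count whole-word, case-insensitive occurrences of keyword in text."""
    pattern = r"\b" + re.escape(keyword.lower()) + r"\b"
    return len(re.findall(pattern, text.lower()))


# Hypothetical pages: one repeats a single sentence 2000 times,
# the other is a short passage that actually discusses play.
pages = {
    "stuffed": "All work and no play makes Jack a dull boy. " * 2000,
    "genuine": "Children learn through play. Jack, age six, builds a fort during play time.",
}

# Rank the pages for the keyword "play" by frequency alone.
ranked = sorted(pages, key=lambda name: keyword_count(pages[name], "play"), reverse=True)
```

Under a frequency-only score, the repeated sentence dominates the ranking even though it carries no more information than a single copy of it would.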
One well-known page-ranking algorithm is the one used by the popular Web search engine Google. Google's page rank algorithm relies on information about how frequently a document is referenced (linked to) from other documents. The rationale is that a document that is linked to by many other documents must be interesting, so its rank increases as the number of such external references increases.
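The link-based rationale can be sketched as a simple count of inbound references. This is a deliberately simplified illustration of the idea as described here, not Google's actual algorithm, which weights links rather than merely counting them. The function names, the dictionary representation of the link graph, and the decisions to ignore self-links and to count each linking document once are assumptions.

```python
from collections import Counter


def inbound_link_counts(links):
    """Count, for each page, how many other pages link to it.

    links maps a page to the pages it links to (a hypothetical link graph).
    """
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):   # count each linking document once
            if target != source:      # assumption: self-links do not count
                counts[target] += 1
    return counts


def rank_by_inbound_links(links, candidates):
    """Order candidate pages by inbound link count, highest first (ties by name)."""
    counts = inbound_link_counts(links)
    return sorted(candidates, key=lambda page: (-counts[page], page))
```

In this sketch, a page's position depends only on what other authors chose to link to, not on how often the page repeats a keyword.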
A disadvantage of Google's solution is that it uses derivative evidence to determine how interesting a document is. That is, a link is not evidence that the person doing the search will find the document interesting. Rather, it is evidence only that a content author or Webmaster (i.e., the person who creates the link, not the person who traverses it) found the document interesting.