An Internet search can be viewed conceptually as comprising two components: indexing the web pages and their contents, and ranking (ordering) the pages according to their relevance to a given query. Ranking is usually based on a combination of a grade for the textual match (or information retrieval score) between the query and the page, and a grade for page “importance.” Page importance is typically calculated from the link structure among web pages (roughly speaking, a page pointed to by many other pages is considered important).
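The combination of the two grades can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than a description of any particular search engine: the function names, the toy term-overlap text grade, the use of a normalized in-link count as the importance grade, the linear weighting, and the example pages are all hypothetical.

```python
from collections import defaultdict


def text_score(query, text):
    """Toy textual-match grade: fraction of query terms found in the page text."""
    terms = query.lower().split()
    words = set(text.lower().split())
    if not terms:
        return 0.0
    return sum(term in words for term in terms) / len(terms)


def importance_scores(links):
    """Toy importance grade: in-link count per page, scaled to the range 0..1.

    `links` maps each page to the list of pages it points to.
    Self-links and duplicate links from the same page are ignored.
    """
    counts = defaultdict(int)
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    top = max(counts.values(), default=0)
    return {page: (counts[page] / top if top else 0.0) for page in links}


def rank(query, pages, links, weight=0.5):
    """Order pages by a weighted sum of the textual grade and the importance grade.

    `weight` is the (assumed) share given to the textual grade.
    Ties are broken alphabetically by page name.
    """
    importance = importance_scores(links)
    scored = [
        (weight * text_score(query, text)
         + (1 - weight) * importance.get(page, 0.0), page)
        for page, text in pages.items()
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page for _, page in scored]


# Hypothetical example data: three pages and the links between them.
EXAMPLE_PAGES = {
    "a": "vacation policy for employees",
    "b": "vacation policy archive",
    "c": "cafeteria menu",
}
EXAMPLE_LINKS = {"a": ["b"], "b": [], "c": ["b", "a"]}

if __name__ == "__main__":
    print(rank("vacation policy", EXAMPLE_PAGES, EXAMPLE_LINKS))
```

In this sketch, pages "a" and "b" match the query equally well, so the importance grade decides their order. When `links` is empty, every importance grade is zero and the ordering depends on the textual grade alone.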
A disadvantage of algorithms that are based on analyzing the link structure of the web is that they are not applicable in situations where the link structure is non-existent (i.e., when searching plain text as opposed to hypertext), as is the case when searching books or other documents without links, or when searching a company's internal web (a corporate intranet search). Experience also shows that the link structure in corporate intranets is poorly suited to such link analysis. Another problem with the link-analysis approach is that it is typically slow to respond to dynamic changes, because updating web documents to reflect changes in preferences (including the appearance of good new pages) is a cumbersome and slow process.
When searching a company's internal web (or in intranet searches in general), Internet search engines that are otherwise successful (such as Google) provide less than satisfactory results. Thus, there is a need for improved intranet searching.