Any discussion of the prior art throughout the specification should in no way be considered as an admission that such prior art is widely known or forms part of common general knowledge in the field.
Internet search engines have become a significant part of the Internet landscape. Search engines provided by Google, Yahoo, Microsoft and others attempt to provide comprehensive and rapid search capabilities for users seeking information on particular topics within the labyrinth of the Internet or private intranets.
These search engines normally include three main parts. The first is a gathering mechanism collecting materials that will form part of the index. The second is an indexer for comprehensively indexing the gathered material, often by keywords, to produce a readily searchable inverted index of key words or phrases. The third part is a querying process for querying the inverted index and presenting the results of the query to a user.
A classic description of such an arrangement is provided by Sergey Brin and Lawrence Page, “The anatomy of a large-scale hypertextual Web search engine”, in Proceedings of WWW7, pages 107-117. It is assumed that the skilled person in the field is readily familiar with the construction of search engines.
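By way of illustration only, the three parts described above (gathering, indexing and querying) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and not a description of any particular search engine: the function names `gather`, `build_inverted_index` and `query`, the three-document corpus, and the whitespace tokenisation are all hypothetical and chosen purely for the example.

```python
from collections import defaultdict


def gather():
    """Stand-in for the gathering mechanism: returns a tiny hypothetical corpus."""
    return {
        "doc1": "search engines index the web",
        "doc2": "an inverted index maps words to documents",
        "doc3": "users query the index",
    }


def build_inverted_index(documents):
    """Indexer: map each keyword to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():  # naive whitespace tokenisation (assumption)
            index[word].add(doc_id)
    return index


def query(index, terms):
    """Querying process: return ids of documents containing every query term."""
    postings = [index.get(term.lower(), set()) for term in terms.split()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


index = build_inverted_index(gather())
print(query(index, "index"))           # all three documents contain "index"
print(query(index, "inverted index"))  # only doc2 contains both terms
```

The inverted index is keyed by word rather than by document, so a query needs only to look up and intersect the posting sets for its terms instead of scanning every gathered document.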
As part of the indexing process, a relative document importance is assigned to the material that has been crawled. Various mechanisms for assigning relative importance are known, and these can depend on the perceived value of the document and the perceived importance of the words within a document. One well-known technique for assigning importance is the PageRank algorithm.
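As a hedged illustration of link-based importance assignment, the following Python sketch computes a simplified PageRank by repeated iteration over a small link graph. It is an assumption-laden example rather than the method of any actual search engine: the function name `page_rank`, the four-page graph, the damping value of 0.85 and the fixed iteration count are illustrative choices, and every link target is assumed to appear as a key in the graph.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Simplified PageRank: links maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform distribution
    for _ in range(iterations):
        # Every page receives a baseline share independent of the link structure.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page passes its damped rank equally to the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "c" is linked to by three pages, "d" by none.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = page_rank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this sketch a document's importance depends on how many documents link to it and on the importance of those linking documents, so page "c" ranks highest and the unlinked page "d" ranks lowest.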
It is advantageous to a search engine that the indexer provides an efficient mechanism for indexing the crawled material, capable of supporting rapid and effective query responses.