The invention relates to retrieval of information from databases and servers.
Today, Web search engines are critical components of the Internet infrastructure that drives the information economy. It is believed that every day approximately 60 terabytes of new content is added to the World-Wide Web. Unfortunately, a significant portion of searchers are frustrated and disappointed by the performance of search engines when it comes to their ability to deliver the right result at the right time. One important reason for this is that the information retrieval techniques that form the core of Web search engines are not well suited to the reality of Web search. This may be because many of these techniques were originally developed for specialised search tasks performed by expert users over limited document collections. These shortcomings lead to the following inter-related problems:
The Coverage Problem: the continued growth of the Web means that no single search engine can hope to provide complete coverage.
The Indexing Problem: the heterogeneous nature of Web documents and the lack of any reliable quality control make indexing extremely difficult.
The Ranking Problem: ranking results on the basis of weighted overlaps with query terms has proven to be unsatisfactory in Web search.
The Query Problem: the preponderance of poorly formed, vague queries means that most searches are under-specified to begin with.
Recent years have seen a number of key developments in Web search, many of which take specific advantage of the unique characteristics of the Web, and the particular way that Web users search for information. For instance, researchers recognised the advantages of combining the results of many individual search engines in a meta-search engine to achieve improved coverage and accuracy. More recently, information about the Web's topology (the connectivity of individual pages) has been incorporated into search engines as a way to recognise and rank authoritative pages. Others have looked at how clustering techniques can be used to organise a flat list of results into a more structured collection of topical clusters. While this does not solve the query problem, it at least helps the search engine to separate out the different meanings of a vague query into collections of topically related results.
However, these developments have all tended to adopt a traditional information retrieval perspective, in the sense that they seek to improve the manner in which documents are represented, retrieved or ranked, with a focus at the level of an individual search session.
The invention is directed towards reducing the number of iterations required for information retrieval.