The present invention is related to the field of information retrieval, and in particular to the field of searching for information on network-based information systems such as the World Wide Web.
Searching for information in the ever-increasing universe of electronic information, for example as found on the World Wide Web (hereinafter referred to as the Web), can be both satisfying and frustrating. Making sense of very large collections of linked documents and foraging for information in such environments is difficult without specialized aids. Two sorts of aids have evolved to assist seekers of information. The first are structures or tools that abstract and cluster information in some form of classification system. Examples of such would be library card catalogs and the Yahoo! Web site. The second are systems that attempt to predict the information relevant to a user's needs and to order the presentation of information accordingly. Examples would include search engines such as Lycos, which take a user's specifications of an information need, in the form of words and phrases, and return ranked lists of documents that are predicted to be relevant to the user's need.
Another class of tools is recommender systems. Recommender systems provide a list of recommended subsequent Web pages worth viewing based on some predetermined filtering criteria. One such recommender tool is the "Recommend" feature provided on the Alexa Internet Web site. The "Recommend" feature provides a list of related Web pages that a user may want to retrieve and view based on the Web page that they are currently viewing.
Another recommender system is termed "the Knowledge Pump" and is described by Glance, N., Arregui, D. and Dardenne, M., "Knowledge Pump: Supporting the Flow and Use of Knowledge," in Information Technology for Knowledge Management, Eds. U. Borghoff and R. Pareschi, New York: Springer-Verlag, pp. 35-45, 1998. The "Knowledge Pump" was designed for use within organizations and has a key focus on the sharing of information in the form of documents.
A characteristic of recommender systems is that they are collaborative in the sense that they utilize knowledge gained from prior queries in making recommendations. In general, collaborative methods of searching seek to utilize the results of previous related queries. Collaborative methods for searching are still in their infancy. However, on the Web, search engines are beginning to incorporate simple techniques for collaborative search. The DirectHit Internet Web site has built a popularity engine, which operates using a very simple voting mechanism. Search engines that employ this popularity engine simply track the queries input by users and the links that the users follow. Users vote based on their subsequent viewing of the results, so that in the future, the same query will yield results whose ordering takes into account previous users' actions. Thus, entering a query into a search engine that employs DirectHit's popularity engine will return the most popular results for that query. DirectHit also has a related search technology that works by either broadening or narrowing the user's query (using a subset or superset, respectively, of the user's query terms).
The aforementioned Alexa recommender system works similarly, but to augment browsing rather than searching. When a user browses a Web page, similar pages are recommended to the user. The similar pages are obtained by tracking which pages other users have visited after visiting the page currently being viewed. Recommendations are personalized, as Alexa builds a profile for each user on the basis of users' ratings of pages.
Some researchers in the information retrieval community have also addressed collaborative search. See for example the following documents: "Document Vector Modification," in The Smart Retrieval System: Experiments in Automatic Document Processing, Ed. Gerald Salton, Englewood Cliffs, N.J., 1971; Fitzpatrick, Larry and Dent, Mei, "Automatic Feedback Using Past Queries: Social Searching?" in Proceedings of SIGIR '97, Philadelphia, Pa., 1997; and Raghavan, V. V., Sever, H., "On the Reuse of Past Optimal Queries," Proc. of SIGIR95, Seattle, Wash., 1995. Their approach has been to use similar past queries to automatically expand new queries, a kind of second-order relevance feedback (i.e., documents that correspond well to similar queries provide feedback on the original query as well). The measure of similarity between queries is a function of the overlap in documents returned by the queries. In all cases, the documents are analyzed linguistically to produce term-frequency vectors, which are then combined with query term-frequency vectors to improve the search process. These augmentation procedures are computationally expensive and generally increase the cost of a search greatly. As a result, these methods are viewed critically by on-line search systems.
The present invention describes a method and system to facilitate searching for information from network-attached information sources as may be found on the World Wide Web. The present invention takes advantage of the collective ability of Web users to create queries. First, a graph is constructed of all queries submitted to one or more search engines within a given period of time. Each node is a query. A link is created between two nodes whenever the two queries are judged to be related. The determination of relatedness depends on the documents returned by the queries, not on the actual terms in the queries themselves. For example, a criterion for relatedness could be that for the top ten documents returned for each query, the two lists have at least one document in common. When a new query is received, queries that are related are identified. Further described is a way to allow the user to peruse the network of related queries in an ordered way: following a path from a first cousin, to a second cousin, to a third cousin, etc., to a set of results.
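By way of illustration only, the graph construction described above may be sketched as follows. The sketch assumes that the results of each logged query are available as a ranked list of document identifiers. The names `are_related` and `build_query_graph`, the parameters `top_n` and `min_shared`, and the sample queries are hypothetical and are not part of the described system. The relatedness test is the example criterion given above: at least one document in common among the top ten results of each query.

```python
from itertools import combinations


def are_related(docs_a, docs_b, top_n=10, min_shared=1):
    """Example criterion: the top-N result lists share at least one document."""
    shared = set(docs_a[:top_n]) & set(docs_b[:top_n])
    return len(shared) >= min_shared


def build_query_graph(results, top_n=10, min_shared=1):
    """Build an undirected graph: one node per query, one link per related pair.

    `results` maps each query string to its ranked list of document ids.
    Relatedness is decided from the returned documents only, never from
    the query terms themselves.
    """
    graph = {query: set() for query in results}
    for q1, q2 in combinations(results, 2):
        if are_related(results[q1], results[q2], top_n, min_shared):
            graph[q1].add(q2)
            graph[q2].add(q1)
    return graph


# Hypothetical query log: query text -> ranked document ids.
results = {
    "jaguar car": ["d1", "d2", "d3"],
    "jaguar dealer": ["d3", "d4"],
    "used cars": ["d4", "d5"],
    "big cats": ["d6", "d7"],
}
graph = build_query_graph(results)
```

In this sample, "jaguar car" and "jaguar dealer" are linked through the shared document d3, even though the link is not derived from their common term. The pairwise comparison is quadratic in the number of queries; a practical system would more likely index queries by document, which is outside the scope of this sketch.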
The method of the present invention is generally comprised of the following steps: obtaining a set of queries; creating a graph of related queries based on the set of queries; responsive to a user query, identifying related queries based on the graph of related queries; and presenting the related queries to the user for their selection.
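The four steps may be illustrated end to end by the following minimal sketch, given under stated assumptions and not as a definitive implementation. It assumes the set of queries is a log mapping each query to its ranked document identifiers, that a user query is related to a logged query when their top-ten result lists overlap, and that related queries are grouped by link distance (first cousins, second cousins, and so on) using a breadth-first traversal. All function names, the `TOP_N` constant, and the sample data are hypothetical.

```python
from collections import deque
from itertools import combinations

TOP_N = 10  # assumed cutoff for the overlap test


def related(docs_a, docs_b):
    """True when the two top-N result lists share at least one document."""
    return bool(set(docs_a[:TOP_N]) & set(docs_b[:TOP_N]))


def build_graph(query_log):
    """Step 2: link every pair of logged queries whose results overlap."""
    graph = {q: set() for q in query_log}
    for a, b in combinations(query_log, 2):
        if related(query_log[a], query_log[b]):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def related_queries(user_docs, query_log, graph, max_degree=3):
    """Step 3: group logged queries by link distance from the user query."""
    first = sorted(q for q, docs in query_log.items() if related(user_docs, docs))
    levels = {1: first} if first else {}
    seen = set(first)
    frontier = deque((q, 1) for q in first)
    while frontier:
        query, degree = frontier.popleft()
        if degree == max_degree:
            continue  # do not expand beyond the requested degree
        for neighbor in sorted(graph[query]):
            if neighbor not in seen:
                seen.add(neighbor)
                levels.setdefault(degree + 1, []).append(neighbor)
                frontier.append((neighbor, degree + 1))
    return levels


# Step 1: obtain a set of queries (hypothetical log).
query_log = {
    "jaguar car": ["d1", "d2", "d3"],
    "jaguar dealer": ["d3", "d4"],
    "used cars": ["d4", "d5"],
    "big cats": ["d6", "d7"],
}
graph = build_graph(query_log)

# Steps 3 and 4: identify related queries for a new user query and present them.
levels = related_queries(["d1", "d9"], query_log, graph)
for degree in sorted(levels):
    print(f"degree {degree}: {', '.join(levels[degree])}")
```

Here the user query's results overlap only with those of "jaguar car", which is therefore a first cousin; "jaguar dealer" and "used cars" are reached at the second and third degree, and "big cats" is never presented because no chain of shared documents connects it to the user query.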