The present invention is related to searching for information, for example, searching for information on the Internet. In particular, the present invention is related to searching for information guided by a set of topics such as keywords, wherein the set of topics is not necessarily hierarchical, and wherein during any particular search, any search hierarchy of topics is created on the fly.
It is known to search for information that may reside locally or that may be distributed in a network or internetwork, even distributed over the Internet. Google and Yahoo, for example, have become synonymous with searching the Internet for information. The results of such a search are an ordered set of URLs to Web pages on the World Wide Web (the “Web”) or other items of information.
It is also known to categorize information by attaching categories or keywords, called topics herein, to each item of information. Yahoo, for example, started as a directory of the Web that allowed one to search guided by such topics. Such prior art categorization is explicitly hierarchical, in that topics have subtopics, and so forth, such that the set of topics may be structured as a tree or a graph. One problem with such hierarchical categorization is that once a first topic is selected, the only subtopics available for further searching are the children of that first topic. This may lead to missing some results, or to a search that is poorly directed by the categorization.
Therefore, structuring topics with a strict hierarchy may lead to unsuccessful searches.
It also is known how to classify search results automatically into a topic of a hierarchical set of topics. U.S. Pat. No. 5,924,090 to Krellenstein and the Northern Light Search Engine product (see “Northern Light Enterprise Search Engine Overview White Paper,” dated Jun. 15, 2004, by Northern Light Group LLC, Cambridge, Mass., also available online at www.northernlight.com) describe such automatic classification, but only on a set of topics that is pre-defined with a hierarchy. If a non-hierarchical pre-defined set of topics is used, no hierarchy of topics is generated. It is desirable, however, to have a hierarchy of topics to guide a search. That is, after selecting either a search term or a topic, it is desirable to generate candidate topics to further refine the search without the need to have a predefined hierarchy among topics.
It is also known to cluster search results on the fly without an already defined set of topics. See for example the Vivísimo Clustering Engine™, made by Vivísimo, Inc., of Pittsburgh, Pa. This clustering engine automatically organizes search or database query results into meaningful hierarchical folders on the fly. The clustering engine transforms a list of search results into categorized information without any pre-processing of the source documents. The categories, however, are not drawn from pre-defined subjects; their descriptions are created on the fly from the words and phrases contained in the search results themselves. No hierarchy of topics is generated.
See also B. D. Davison, A. Gerasoulis, K. Kleisouris, Y. Lu, H. Seo, W. Wang and B. Wu: “DiscoWeb: Applying Link Analysis to Web Search,” Proceedings of the Eighth International World Wide Web Conference, Toronto, Canada, page 148, 1999. See also Krishna Bharat and Monika R. Henzinger: “Improved algorithms for topic distillation in a hyperlinked environment,” Proceedings of SIGIR-98, 21st ACM International Conference on Research and Development in Information Retrieval, Melbourne, Australia, pages 104-111, 1998, for a discussion of how to analyze Web pages and rank them according to relevance and clusters.
Thus there is a need in the art for a search method that includes classifying potential search results under topics, with the set of topics not necessarily hierarchical, but with a hierarchy of topics generated on the fly to guide the search.
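By way of illustration only, the following minimal sketch shows one way in which candidate topics might be generated on the fly from a flat, non-hierarchical set of topics. It is not the method of any system cited above. The item names, the topic names, and the functions matching_items and candidate_topics are all hypothetical, and ranking candidates by how often they co-occur with the selection is an assumption made for the example.

```python
from collections import Counter

# Hypothetical data: each information item carries a flat set of topics.
# No parent/child relationship among topics is stored anywhere.
ATTACHMENTS = {
    "page1": {"cameras", "canon", "reviews"},
    "page2": {"cameras", "kodak"},
    "page3": {"canon", "printers"},
    "page4": {"cameras", "canon", "lenses"},
}


def matching_items(attachments, selected):
    """Return the items that are attached to every selected topic."""
    return {item for item, topics in attachments.items() if selected <= topics}


def candidate_topics(attachments, selected):
    """Return (topic, count) pairs for topics that co-occur with the
    current selection, most frequent first (ties broken alphabetically).

    The ranking rule is an assumption chosen for this illustration.
    """
    counts = Counter()
    for item in matching_items(attachments, selected):
        # Count only topics the searcher has not already selected.
        counts.update(attachments[item] - selected)
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Starting from "cameras", "canon" is offered as a refinement ...
    print(candidate_topics(ATTACHMENTS, {"cameras"}))
    # ... and starting from "canon", "cameras" is offered as a refinement.
    print(candidate_topics(ATTACHMENTS, {"canon"}))
```

In this sketch neither “cameras” nor “canon” is fixed as the parent of the other. Whichever topic is selected first becomes the root of the search hierarchy for that particular search, and the other may appear beneath it.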
Topic-guided searching is also known wherein after each search step, suggested topics for further searching are provided. For example, shopping Web sites such as BizRate.com, of Los Angeles, Calif., are known that suggest, as a result of a search, shopping topics for further searching. These topics, however, are pre-determined and have a hierarchical structure. For example, a topic “Computers & Software” exists in BizRate.com, and under this topic is the topic “Digital Cameras.” Under “Digital Cameras” are several topics, such as the brand names Canon, Kodak, etc., the different resolution ranges for digital cameras, etc.
It is desired to provide the same guidance as provided in topic-guided searching, but wherein topics do not have a hierarchical structure.
There also is a need to provide the ability for a searcher, e.g., one who is registered (a “user”), to define new topics to add to the set of topics, and to define attachments between information items and both the newly defined topics and previously defined topics.
Not all attachments between topics and information items are equally relevant. For example, one topic may be “better” or more applicable to a page on the Web than another. Thus there is a need in the art to measure the quality of an attachment between an information item and a topic.
Similarly, not all users are equally credible. Thus, there further is a need in the art to rate users according to a credibility measure.
There further is a need in the art for providing personalization for registered users. For example, a registered user may wish to have previous searches or previous traversals of topics recorded for re-use.