Information retrieval has become an increasingly complex task in the electronic and computer arts in view of recent advancements in technology and the proliferation of the Internet. Some estimates have indicated that the World Wide Web (WWW), the fastest-growing segment of the Internet, has grown at a rate of three thousand percent per year. Irrespective of the actual rate of growth, it suffices to say that the growth has greatly increased the amount of information available to an Internet user. As a practical matter, the increase in information also tends to increase the number of closely related topics, which can complicate identifying pertinent material on the Internet.
For example, a user may desire to find information related to Long Island, N.Y., and more specifically, information related to summer vacation activities there. If the user merely searches for information related to Long Island, N.Y., the information obtained will likely be voluminous, and the vast majority of it will be unrelated to summer vacation activities. The user will then have to invest additional time searching through the returned results to pick out what is relevant. Conversely, if the user attempts a narrowly defined search (such as Long Island, N.Y., summer vacation activities), the likelihood of obtaining relevant information may increase, but at the expense of missing other valuable information that does not fall within the search scope. Thus, a user frequently must choose between investing time parsing through highly generalized information and constraining a search such that only relevant information is obtained, at the expense of forgoing other valuable information.
As is well known in the art, clustering is a term used to describe the process of finding and arranging information in groups. The groups themselves are frequently referred to as clusters, and the members or elements of a cluster share a common property. As is understood in the art, the use of clusters aids in organizing highly generalized information based on common properties, topics, and themes.
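By way of illustration only, the grouping described above can be sketched in a few lines of Python. The sketch assumes each search result already carries one or more topic labels; the function name `cluster_by_topic`, the sample titles, and the labels are hypothetical and are not drawn from any particular prior system.

```python
from collections import defaultdict


def cluster_by_topic(results):
    """Group result titles so that every member of a cluster shares a topic.

    `results` is a list of (title, topics) pairs. The topic labels are
    assumed to be supplied already; how they are derived is out of scope.
    """
    clusters = defaultdict(list)
    for title, topics in results:
        for topic in topics:
            # A result with several topics becomes a member of several clusters.
            clusters[topic].append(title)
    return dict(clusters)


# Hypothetical results for a broad "Long Island, N.Y." query.
results = [
    ("Beach guide", ["summer activities", "beaches"]),
    ("Winery tours", ["summer activities", "dining"]),
    ("Seafood restaurants", ["dining"]),
]

clusters = cluster_by_topic(results)
for topic, titles in clusters.items():
    print(topic, "->", titles)
```

In this sketch a broad query still returns every result, but the results are arranged under shared topics, so a user can go directly to the group of interest.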
Previous practices have used clustering techniques to present information as a first set of clusters responsive to an initial search engine query. Thereafter, if a user wanted to refine a search, the previous practices implemented what is known in the art as “query refinement,” wherein an entirely new search is conducted, generating a second set of clusters. This method of refining search results has proved disadvantageous, however, because of the disconnect between the first set of search results and the second set. The result is a discontinuous search experience that can confuse a user who desires navigational help in exploring the current set of search results.
Thus, what is needed in the art is an improved technique for clustering search results.