The present invention relates to information retrieval, and more specifically, to an approach for presenting search results using dynamic categorization.
Information systems provide for the storage, retrieval and sometimes management of data. Information is typically retrieved from an information system by submitting a query to the information system, where the query specifies a set of retrieval criteria. The information system processes the query against a database and provides data that satisfies the retrieval criteria (the search results) to a user.
The form of search results depends upon the context in which a particular search is performed. For example, in the context of a database search, search results might consist of a set of rows from a table. In the context of the global information network known as the “Internet”, the search results might consist of links to web pages.
For the purpose of explanation, the specific data items against which a search query is executed are referred to herein as searchable data items. The set of all searchable data items against which a query is executed is referred to herein as the searchable data set. The specific searchable data items that satisfy a particular query are referred to herein as matching data items. The set of all matching data items for a given query is referred to herein as the search results of the query.
Processing a query containing general or generic search terms against a large searchable data set can result in a large number of unorganized matching data items, sometimes referred to as “hits.” For example, processing a query containing general or generic terms on the Internet can generate millions of hits.
On the Internet, search queries are processed by search tools known as “search engines” that typically present a sequential list of matching data items ranked by relevance, from most relevant to least relevant. As a result, the matching data items that best satisfy the search criteria are presented at the top of the list, with the other matching data items presented further down the list in order of decreasing relevance. For example, web pages or web sites with web pages that contain the greatest number of the search terms receive the highest relevance ranking and are presented at the top of the list.
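As a rough illustration of the term-count ranking described above, the following Python sketch scores each searchable data item by how often the query terms occur in it and lists the matching data items in order of decreasing score. The function name `rank_by_term_count`, the sample documents, the whitespace tokenization and the tie-breaking by identifier are all assumptions made for illustration; actual search engines may compute relevance quite differently.

```python
def rank_by_term_count(query, documents):
    """Return (doc_id, score) pairs, most relevant first.

    `documents` maps a hypothetical document identifier to its text.
    The score is the total number of occurrences of the query terms,
    which is only one possible reading of "greatest number of the
    search terms".
    """
    terms = query.lower().split()
    scored = []
    for doc_id, text in documents.items():
        words = text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:  # keep only matching data items
            scored.append((doc_id, score))
    # Highest score first; ties are broken by identifier for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


if __name__ == "__main__":
    sample = {
        "a": "baseball scores and baseball news",
        "b": "football news",
        "c": "baseball cards",
    }
    for doc_id, score in rank_by_term_count("baseball news", sample):
        print(doc_id, score)
```

Because the output is a single flat list, every additional matching data item simply lengthens the list the user has to read through.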
Because the search results are presented serially, with approximately ten to twenty hits per page, reviewing a large number of hits, for example several thousand or even only several hundred, is often impractical. This is not necessarily a problem in situations where the relevancy ranking drops off quickly after a relatively small number of hits, because a user will typically view only the most relevant matching data items. However, in situations where a large number of hits have a high relevancy ranking, it can be impractical to review all of the most relevant hits.
One alternative approach for presenting search results is the static category approach. The static category approach involves pre-assigning all searchable data items to predefined or “static” subject matter categories based upon their content. When a search is performed, a relatively small number of categories that satisfy the search criteria are displayed instead of, or in addition to, the actual matching data items. The members of those static categories (which may or may not satisfy the search criteria) can then be accessed through the categories.
In the context of the Internet, for example, all web pages and web sites containing subject matter relating to the topic of baseball would be statically assigned to a baseball category. When a query containing the term “baseball” is processed, the baseball category is displayed instead of, or in addition to, all of the individual web pages that satisfy the query terms. A user can then select the baseball category to view the web pages and web sites assigned to the baseball category. Categories containing a large number of searchable data items can be divided into sub-categories to create a statically-defined category hierarchy.
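The static category approach can be sketched in Python as a lookup against a table that is built before any query is processed. The table contents, the site identifiers and the function names `search_static_categories` and `browse_category` are hypothetical, and matching a category by its name alone is a simplifying assumption.

```python
# Assignments made in advance, independent of any particular query.
# Category names and site identifiers are hypothetical.
STATIC_CATEGORIES = {
    "baseball": ["site-1", "site-2"],
    "cooking": ["site-3"],
}


def search_static_categories(query, categories):
    """Return the names of predefined categories that match a query term."""
    terms = set(query.lower().split())
    return sorted(name for name in categories if name.lower() in terms)


def browse_category(name, categories):
    """Return the searchable data items pre-assigned to a category."""
    return list(categories.get(name, []))


if __name__ == "__main__":
    for name in search_static_categories("baseball scores", STATIC_CATEGORIES):
        print(name, browse_category(name, STATIC_CATEGORIES))
```

In this sketch the members returned by `browse_category` are whatever was assigned in advance; they are not checked against the query.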
Although the static category approach is helpful in allowing a user to navigate through a large number of searchable data items in an organized manner, it suffers from several drawbacks. First, if the amount of information being searched is large, a large amount of resources can be required to pre-assign all of the searchable data items to categories. Furthermore, when the searchable data set changes, the category assignments must be updated to reflect the changes. For example, if new searchable data items are added to the searchable data set and the categories are not updated to reflect the new searchable data items, then a user cannot access the new searchable data items through the categories. As a result, the new searchable data items that cannot be accessed through the categories are effectively lost.
Another drawback to the static category approach is that the statically-defined categories may not be helpful in finding information that does not fit squarely into the predefined categories. Thus, a search may result in the display of ten categories, where each of the ten categories has a relatively low degree of relevance.
These problems are particularly acute on the Internet for at least two reasons. First, the Internet provides access to a vast amount of information, so an enormous amount of resources is required to assign searchable data items to categories. Second, the information available through the Internet is constantly changing and new information is being added at an astounding rate. Consequently, a large amount of resources is required to maintain static categories that do not necessarily reflect all of the searchable data set.
Therefore, based upon the need to present a large number of matching data items in an organized manner and the limitations of prior approaches, an approach for presenting a large number of matching data items in an organized manner that does not suffer from the limitations of prior approaches is highly desirable.
According to one aspect of the invention, a method is provided for presenting search results using dynamic categorization. The method comprises the steps of receiving search results, dynamically establishing one or more search result categories based upon attributes of the search results and presenting one or more category identifiers corresponding to the one or more search result categories.
According to another aspect of the invention, a method is provided for presenting search results on a user interface using dynamic categorization. The method comprises the steps of dynamically establishing one or more search result categories based upon attributes of the search results and displaying on the user interface one or more interface objects corresponding to the one or more search result categories.
According to another aspect of the invention, a computer system is provided for presenting search results to a user using dynamic categorization. The computer system comprises a user interface, one or more processors and a memory coupled to the one or more processors. The memory contains one or more sequences of one or more instructions which, when executed by the one or more processors, cause the computer system to perform the steps of receiving search results, dynamically establishing one or more search result categories based upon attributes of the search results and displaying on the user interface one or more category indicators corresponding to the one or more search result categories.
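For the purpose of explanation only, the steps recited above (receiving search results, dynamically establishing search result categories based upon attributes of the search results, and presenting corresponding category identifiers) might be sketched in Python as follows. The text above does not specify which attributes are used or how categories are derived from them, so the `doc_type` attribute, the `uncategorized` fallback, the identifier format and the function names are assumptions rather than the claimed implementation.

```python
from collections import defaultdict


def establish_categories(search_results, attribute):
    """Group received search results by the value of one attribute.

    Categories are derived from the results themselves at query time,
    so no searchable data item has to be pre-assigned to a category.
    """
    categories = defaultdict(list)
    for item in search_results:
        # Results lacking the attribute go into a hypothetical fallback category.
        key = item.get(attribute, "uncategorized")
        categories[key].append(item)
    return dict(categories)


def present_category_identifiers(categories):
    """Return one identifier string per category, in alphabetical order."""
    return [f"{name} ({len(items)})" for name, items in sorted(categories.items())]


if __name__ == "__main__":
    results = [
        {"title": "Opening day recap", "doc_type": "news"},
        {"title": "League batting averages", "doc_type": "stats"},
        {"title": "Trade rumors", "doc_type": "news"},
        {"title": "Untitled page"},
    ]
    for identifier in present_category_identifiers(
        establish_categories(results, "doc_type")
    ):
        print(identifier)
```

Because the grouping is computed from each set of search results, a newly added searchable data item that matches a query appears under some category without any maintenance step.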