Frequently, people attempting to retrieve information from, e.g., an information database, are confronted with the problem of being provided with more information than they need, want, and/or are capable of reviewing in a reasonable amount of time. This problem is sometimes referred to as "information overload".
Examples of searches which can result in information overload include, e.g., Internet searches for sites which may include information about a particular topic of interest. Database searches for television shows, songs, or movies that a user might be interested in are additional examples of searches that may result in an excess of information.
Information overload is the result of the ever increasing size of modern databases and the difficulty associated with searching and efficiently retrieving desired information from a database. Desired information may include, e.g., the information that will be the most useful, relevant and/or interesting to a particular individual database user. Generally, information that is already known to the user, although potentially responsive to an information retrieval request, is of little value because it presents the user with no information that is new to the user. Accordingly, known information that is included in a list of search results may be thought of as unwanted "noise" which merely distracts the user from more useful material and/or wastes the user's time.
Various known information retrieval systems attempt to avoid the problem of information overload by performing a ranking, e.g., result prioritization, operation as part of an information retrieval operation. In such systems retrieved information is often ranked in some order which is intended to approximate how useful, interesting, and/or responsive the information is likely to be to the system user. Unfortunately, ranking of search results fails to address the noise problem resulting from the inclusion of known information in the search results.
In order to partially address the problem of including known information in search results, some information retrieval systems conduct searches in a manner that avoids including in the search results information that the user has already explicitly indicated is known to the user. Unfortunately, this approach frequently has little impact on the search results since it is difficult and often impractical for a user to identify, prior to a search, all or most of the database entries which are known to the user.
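By way of illustration only, the known approach described above, in which explicitly identified known items are excluded and the remaining results are ranked, can be sketched as follows. The function name `filter_and_rank`, the representation of results as (identifier, score) pairs, and the sample data are assumptions made for this sketch and are not taken from any particular retrieval system.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical representation: each search result is an (item_id, relevance_score) pair.
Result = Tuple[str, float]


def filter_and_rank(results: Iterable[Result], known_ids: Set[str]) -> List[Result]:
    """Drop items the user explicitly marked as known, then rank the rest.

    Items are ranked by descending relevance score. Only items whose
    identifiers appear in known_ids are removed, so any known item the
    user did not explicitly list remains in the output.
    """
    unknown = [(item_id, score) for item_id, score in results if item_id not in known_ids]
    return sorted(unknown, key=lambda pair: pair[1], reverse=True)


# Illustrative data: the user has explicitly marked only item "a" as known.
sample_results = [("a", 0.9), ("b", 0.7), ("c", 0.8)]
ranked = filter_and_rank(sample_results, known_ids={"a"})
```

The sketch also makes the limitation noted above apparent: the filter removes only what appears in the explicit list of known items, so its effect is bounded by how many entries the user was able to identify before the search.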
While ranking search results and eliminating items which a user has explicitly indicated as being known have helped to reduce the problem of information overload to some extent, people searching databases for information continue to be confronted with large amounts of data that can be time-consuming to review. Furthermore, as databases continue to grow in size, the problem of a user being provided with more information than can be reviewed in a reasonable amount of time is becoming increasingly severe.
Accordingly, there is a need for methods and apparatus which can be used to improve the results of ranking operations performed by existing information retrieval systems. In addition, there is a need for new systems which are more effective at selecting and/or prioritizing information to be presented to a user in response to an information retrieval request, e.g., a database search request. Furthermore, it is desirable that such methods and apparatus include features for taking into consideration a particular user's existing knowledge to reduce the risk of providing, or assigning a high ranking to, information which is already known to a user and which is therefore of little value.
It is also desirable that such methods and apparatus be capable of being used with a wide variety of existing information retrieval systems and search engines without requiring modifications thereto.