The present invention relates generally to the areas of computerized methods of information extraction from text documents and databases, statistical text analysis, keyword searching, and internet/web searching.
Developing technology capable of meeting individual information needs is challenging because the full context of a person's expertise is hard to summarize in the query presented to the system. There are several factors beyond the volume and type of information encountered in the past that contribute to a person's expertise. One is how well they digested and understood the information. Another factor is what was driving their interest in digesting the information in the first place. If different people encounter the same information with different motivations, they may acquire different types of expertise.
Current methods for helping people meet their individual information needs sometimes use tools based on cognitive psychology to understand how a person interacts with data and how reading that information affects their behavior during information search.
One common way to help people find surprising information on the World Wide Web is the search engine, of which Google™ is one of the most popular examples. To use a search engine, the person conveys an information need directly to the computer system as a keyword query, and the computer retrieves documents that contain those terms. In this case, the computer neither understands the user's needs nor detects what the user would find surprising. The user supplies the information about what would be relevant and hopes that the search results will also contain the information desired. A skilled search engine user can choose a combination of keywords that conveys both the general area of interest and the subset of documents in that area that is likely to include the desired information.
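By way of illustration only, the keyword retrieval described above can be sketched as a simple matching of query terms against document text. The function name `keyword_search`, the sample documents, and the rule that a document must contain every query term are assumptions made for this sketch; commercial search engines use far more elaborate indexing and ranking.

```python
def keyword_search(query, documents):
    """Return the ids of documents that contain every term of the query.

    Hypothetical helper for illustration: plain term matching with no
    understanding of the user's needs or of what the user would find surprising.
    """
    terms = query.lower().split()
    hits = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        # Keep the document only if all query terms appear in it.
        if all(term in words for term in terms):
            hits.append(doc_id)
    return hits


# Illustrative documents (invented for this sketch).
docs = {
    "d1": "statistical text analysis of patent documents",
    "d2": "keyword searching on the web",
    "d3": "web searching with statistical keyword models",
}

results = keyword_search("statistical keyword", docs)
```

The sketch makes the limitation concrete: the system only checks for the presence of terms, so the burden of choosing terms that isolate the desired documents rests entirely on the user.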
In contrast to a monolithic search engine like Google™, the field of personal information retrieval employs additional techniques to try to understand what would be of interest to a specific person. Two commonly employed techniques are mentioned here. The first method (i.e., information retrieval systems) generates statistics based on some background corpus of information that matches a person's domain expertise. In this case, a document is considered likely to contain relevant information if its terms meet some criteria related to the statistical composition of the corpus. The second method (also known as collaborative filtering) operates on implicit or explicit feedback from either the user or other users who are deemed to be similar. For example, on Amazon.com, when someone purchases a book, they are presented with a window indicating that other shoppers who purchased the same book also purchased certain other books. The assumption is that two people who buy the same book are likely to find relevant information in the other books that each of them has read.
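As a non-limiting illustration of the collaborative filtering approach described above, the following minimal sketch counts which other items were purchased by shoppers who bought a given item. The function name `also_bought`, the shopper names, and the book titles are hypothetical, and the simple co-purchase count is an assumption of this sketch rather than a description of how any particular retailer operates.

```python
from collections import Counter


def also_bought(item, purchases):
    """Rank other items by how many buyers of `item` also bought them.

    Hypothetical helper for illustration: `purchases` maps each shopper
    to the set of items that shopper bought.
    """
    counts = Counter()
    for basket in purchases.values():
        if item in basket:
            # Count every other item in the basket of a matching shopper.
            counts.update(other for other in basket if other != item)
    return [title for title, _ in counts.most_common()]


# Illustrative purchase histories (invented for this sketch).
purchases = {
    "alice": {"Book A", "Book B"},
    "bob": {"Book A", "Book B", "Book C"},
    "carol": {"Book C", "Book D"},
}

suggestions = also_bought("Book A", purchases)
```

Here the suggestion is driven entirely by the behavior of other shoppers deemed similar because they bought the same book; the content of the books themselves is never examined.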
Against this background, the present invention was developed.