The present invention relates to search engines which monitor user behavior in order to generate results with improved relevance.
Search engines are designed to explore data communication networks for documents of interest to a given user and then generate listings of results based on the documents identified in that search. The user specifies this interest by inputting a query, expressed as a “keyword” or set of “keywords,” into the search engine. The keywords are then compared with terms from documents previously indexed by the search engine in order to produce a set of matched documents. Finally, before being presented, the matched documents are ranked by any of a number of different algorithms designed to estimate how relevant each document is likely to be to the user. Those documents with the highest probability of being relevant to the user are typically presented first. The objective is to quickly point the user toward the documents most likely to satisfy the user's need.
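The match-then-rank flow described above can be illustrated with a minimal sketch. It is not the method of any particular search engine: the names `build_index` and `search`, the sample documents, and the scoring rule (summed keyword occurrence counts, ties broken by document identifier) are all assumptions chosen for brevity.

```python
from collections import Counter, defaultdict


def build_index(documents):
    """Map each term to the documents that contain it, with occurrence counts."""
    index = defaultdict(Counter)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term][doc_id] += 1
    return index


def search(index, query):
    """Return ids of documents matching any keyword, highest score first."""
    scores = Counter()
    for keyword in query.lower().split():
        # .get avoids creating empty entries for unknown keywords
        for doc_id, count in index.get(keyword, {}).items():
            scores[doc_id] += count
    # Assumed ranking rule: higher summed count first, then id for stable ties
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked]


# Hypothetical documents used only for illustration
documents = {
    "d1": "java island travel guide",
    "d2": "java programming language tutorial java",
    "d3": "coffee brewing guide",
}
index = build_index(documents)
results = search(index, "java guide")
```

Practical ranking algorithms weigh many more signals than raw term counts; the sketch shows only that ranking here depends on the keywords alone.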
On the internet (a popular, global data communication network), the number of indexed documents has grown rapidly, due predominantly to improvements in technology and growth in the quantity of information available; some queries now return millions of matched documents. As a result, the ability of internet search engines to help users identify documents relevant to a given query is hampered. In other words, while internet users have access to an increasing quantity of potentially relevant information, identifying relevant documents by running queries that use only the keywords entered by users has become more difficult.
Many search engines have thus begun employing strategies to combat this problem that go beyond simply improving the algorithms that rank relevancy. The major strategies include focusing on specific vertical segments; using artificial intelligence to perform contextualized searches; leveraging psychographic, demographic and geographic information; and mining the search behaviors of previous users. (Using the behavior of previous users to predict the relevancies of documents for future users has been covered by a number of U.S. patents and applications: 2006/0064411 A1 entitled “Search engine using user intent,” U.S. Pat. No. 6,738,764 B2 entitled “Apparatus and method for adaptively ranking search results,” and U.S. Pat. No. 6,370,526 B1 entitled “Self-adaptive method and system for providing a user-preferred ranking order of object sets,” to name a few.) Additional strategies include leveraging the previous search history of the particular user in order to customize future searches for that individual.
In spite of these new strategies, current retrieval systems continue to be far from optimal. A major deficiency of existing retrieval systems is that they generally lack user modeling and are not adaptive to individual users; those that do model users do not update their models in real time. This inherent non-optimality is seen clearly in the following two cases: (1) Different users may use exactly the same query (e.g., “Java”) to search for different information (e.g., the Java island in Indonesia or the Java programming language), but existing Information Retrieval (IR) systems return the same results for these users. Without considering the actual user, it is impossible to know which sense “Java” refers to in a query. (2) A user's information needs may change over time. The same user may sometimes use “Java” to mean the island in Indonesia and at other times to mean the programming language. Without recognizing the search context, it would again be impossible to recognize the correct sense, and the user will inevitably be presented with a non-optimal set of search results.
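The “Java” ambiguity can be made concrete with a small sketch. It is purely illustrative and does not represent any existing retrieval system: the function names, the sample documents, and the idea of boosting documents that share terms with a supplied set of `context_terms` are hypothetical assumptions. With no context, every user receives the same ordering; only when some signal about the user is available can the two senses be separated.

```python
def keyword_score(query, doc_text):
    """Count how many times the query keywords occur in the document."""
    words = doc_text.lower().split()
    return sum(words.count(keyword) for keyword in query.lower().split())


def rank(query, documents, context_terms=()):
    """Rank documents by keyword score plus a hypothetical context boost."""
    def score(doc_id):
        words = documents[doc_id].lower().split()
        base = keyword_score(query, documents[doc_id])
        # Assumed boost: one point per context term found in the document
        boost = sum(1 for term in context_terms if term in words)
        return base + boost

    # Higher score first; document id breaks ties deterministically
    return sorted(documents, key=lambda doc_id: (-score(doc_id), doc_id))


# Hypothetical documents, one for each sense of "Java"
documents = {
    "island": "java island indonesia travel",
    "language": "java programming language tutorial",
}

same_for_everyone = rank("java", documents)
for_traveler = rank("java", documents, context_terms=("travel",))
for_developer = rank("java", documents, context_terms=("programming",))
```

In this sketch the keyword-only ranking cannot distinguish the two documents and falls back on an arbitrary tie-break, whereas the two context-aware calls place a different document first.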
Once presented with such a non-optimal set of results, users' options are limited. They can scan page by page through a myriad of potentially irrelevant documents in an attempt to pick out the ones that matter, or they can modify their query by adding or substituting more specific keywords in an attempt to produce a new, and hopefully better, set of results. Depending on the nature of the search and the ingenuity of the user, this task can often be painstaking and frustrating, if not impossible.
In order to optimize retrieval accuracy, there is clearly a need to model the user appropriately and to personalize search for each individual user. The major goal of user modeling for IR is to accurately model a user's information need, which is a very difficult task. Indeed, it is hard even for a user to describe his or her information need precisely.
There is therefore a need for search engine technology capable of implicitly modeling the information need of the specific user conducting a search, at the moment that search is being executed, in order to immediately modify the search results “on the fly” with the purpose of ranking the matched documents in the most relevant order possible for the user's query.