Search engines are a commonly used tool for identifying desired documents in large electronic document collections, including the Internet and internal corporate networks. Conventional search methods typically use keyword searches to identify relevant documents. Documents that match more of the keywords in a search query are often considered more desirable and are typically returned at the beginning of the list of search results.
One limitation of keyword searching is the difficulty of providing a context for the keywords. For example, consider a search query containing the word “pizza.” Documents that contain this word typically also have other words in common, such as “delivery”, “pepperoni”, “sauce”, and “restaurant”. However, it is quite possible that some documents contain the word “pizza” prominently but have nothing to do with the more common use of the word. For instance, a new software technology called “pizza” might be invented by a startup and, therefore, be featured prominently on that company's web page. If this invention is new and not well known, then this use of the word “pizza” is unlikely to be what users intend when they enter the query “pizza”, so the results for this search query should not feature this page prominently. Unfortunately, a conventional search engine cannot distinguish between the new, uncommon usage of the word “pizza” and the usage that is probably desired by the person submitting the search query.
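The ranking behavior and the ambiguity described above can be illustrated with a minimal sketch. It is not taken from any particular search engine; the function names `tokenize` and `keyword_search`, the sample documents, and the scoring rule (a count of distinct query keywords found in each document) are assumptions made for illustration only.

```python
import re


def tokenize(text):
    """Lowercase the text and return its distinct alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_search(query, documents):
    """Return (doc_id, match_count) pairs, best keyword matches first.

    Hypothetical scoring rule: a document's score is the number of
    distinct query keywords that it contains.
    """
    keywords = tokenize(query)
    scored = []
    for doc_id, text in documents.items():
        matches = len(keywords & tokenize(text))
        if matches > 0:
            scored.append((doc_id, matches))
    # Highest match count first; ties are broken by doc_id so that
    # the ordering is deterministic.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Hypothetical three-document collection.
documents = {
    "menu": "Pepperoni pizza with extra sauce, delivery available",
    "startup": "Pizza is a new software technology from our company",
    "news": "Local weather report for the weekend",
}

# Two keywords: the restaurant page matches both and ranks first.
print(keyword_search("pizza delivery", documents))

# One keyword: both pages match once, so keyword counting alone
# cannot tell the common usage from the uncommon one.
print(keyword_search("pizza", documents))
```

With the single keyword “pizza”, the restaurant page and the software page receive the same score in this sketch, and their relative order is decided only by the arbitrary tie-break. The score contains nothing that reflects which usage the person submitting the query is likely to want.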
One way to provide context for a keyword search is to add further keyword search terms. However, the person submitting a search query may be either unwilling or unable to add enough keywords to provide context for the search. Additionally, simply adding one or more keywords may not adequately represent the content the user is actually interested in finding.
In a paper titled “Self Organization of a Massive Document Collection” (IEEE Transactions on Neural Networks, Vol. 11, No. 3, May 2000, page 574), a method is provided for constructing a self-organized 2-dimensional map to categorize documents. The categorized documents can be keyword searched. Additionally, the individual map units are indexed based on any keywords contained within the map unit.
What is needed is a system and method of performing keyword searches that incorporate a user's likely interests. The search system and method should be able to determine a user's likely interests based on past activity by the user. Based on the user's interests, the search system and method should be able to provide search results sorted to match the likely intended context for a search while maintaining a response time similar to that of conventional search methods. The system and method should also be able to store the information regarding a user's interests in a compact manner. Additionally, the system and method should be compatible with conventional search techniques.