1. Field of the Invention
This application relates to the field of telecommunications and more particularly to the field of on-line query tools.
2. Description of Related Art
Search queries using terms that appear with high frequency in retrieved documents and/or in user queries present special problems for information retrieval software. Information retrieval software typically maintains, for each search term, a list of document identification information for every document in the searched database that includes the term. Because these lists are longer for frequently used terms, such terms may require substantially more processing time than rare terms. The processing load is compounded by the fact that frequently used terms are also the subject of a disproportionate number of user queries. The problem can be especially acute when more than one high frequency term is used in the same query, such as a query for "Boston Restaurants."
One way to improve information retrieval techniques would be to perform an intersection of sets and perform a ranking of the related categories (e.g., Italian restaurants in Boston, French restaurants in Boston, etc.) or related listings (for specific Boston restaurants). Because the term list 836 for documents containing the term "Boston" (including all businesses in Boston) and the term list 836 for documents containing the term "restaurant" (including all restaurants, nationwide) are both very large, the processing involved in retrieving each list and performing an intersection in order to identify matching categories or documents can be substantial.
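By way of illustration only, the intersection described above can be sketched as a linear merge over two sorted lists of document identifiers. The function name `intersect_term_lists` and the sample identifiers are hypothetical and do not appear in this application; the sketch assumes each term list is kept sorted by document identifier.

```python
def intersect_term_lists(a, b):
    """Intersect two sorted lists of document IDs with a linear merge.

    Assumption: both inputs are sorted in ascending order and contain
    no duplicates. The cost grows with len(a) + len(b), which is why
    two very large term lists are expensive to intersect.
    """
    i = j = 0
    matches = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            matches.append(a[i])  # document contains both terms
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches


# Hypothetical term lists for "Boston" and "restaurant".
boston = [2, 5, 8, 13, 21, 34]
restaurant = [1, 5, 9, 13, 40]
common = intersect_term_lists(boston, restaurant)
```

Every element of both lists may be visited once, so the work is proportional to the combined length of the lists even when only a few documents match.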
Accordingly, a need exists for information retrieval and storage techniques that improve processing in situations where high frequency terms are used. Such techniques are referred to herein as "common term optimization."
Provided are methods and systems for providing improved searching in an on-line query tool. The methods and systems provide an interface by which a user may enter a user query having a term. Common terms are identified, and a special result set is pre-classified for such common terms. The result set may be located in a special place in memory. User queries may then be parsed for use of common terms and directed to the special location in memory, thus enhancing the speed of searches that use common terms.
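A minimal sketch of this flow follows, under stated assumptions: the class `QueryTool`, its methods, and the sample documents are hypothetical, and a Python dictionary stands in for the "special place in memory" in which the pre-classified result sets are kept. It is one possible arrangement, not a definitive implementation.

```python
from collections import defaultdict


class QueryTool:
    """Hypothetical query tool that pre-classifies results for common terms."""

    def __init__(self, documents, common_terms):
        # Ordinary term lists: word -> set of document IDs.
        self.term_lists = defaultdict(set)
        for doc_id, text in documents.items():
            for word in text.lower().split():
                self.term_lists[word].add(doc_id)

        # Pre-classified result sets, keyed by the sorted words of each
        # common term so that word order in the query does not matter.
        self.common_results = {}
        for term in common_terms:
            key = tuple(sorted(term.lower().split()))
            self.common_results[key] = self._intersect(key)

    def _intersect(self, words):
        sets = [self.term_lists.get(w, set()) for w in words]
        return sorted(set.intersection(*sets)) if sets else []

    def search(self, query):
        """Return (document IDs, how the result was obtained)."""
        key = tuple(sorted(query.lower().split()))
        if key in self.common_results:
            # Common term: read the stored result set directly.
            return self.common_results[key], "precomputed"
        # Rare term: fall back to intersecting term lists at query time.
        return self._intersect(key), "computed"


docs = {
    1: "Boston restaurants Italian",
    2: "Boston plumbers",
    3: "Chicago restaurants",
}
tool = QueryTool(docs, common_terms=["Boston restaurants"])
fast = tool.search("restaurants boston")
slow = tool.search("chicago restaurants")
```

In this sketch the intersection for a common term is computed once, when the tool is built, and each later query for that term is a single dictionary lookup.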
Identification of common terms may include identifying combinations of words wherein at least one of the words in the combination is used frequently in user queries and wherein the words in the combination appear frequently together. Pre-classifying the result set may include establishing a plurality of linked lists of business listings, each linked list corresponding to a common term and including business listings that include the common terms. The business listings or the linked lists may be expanded to include terms that are related to the common terms, such as synonyms.
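The two steps described above can be sketched as follows. All names, thresholds, and sample data are hypothetical. The sketch assumes that a log of past user queries is available, considers only two-word combinations, and uses ordinary Python lists in place of linked lists.

```python
from collections import Counter
from itertools import combinations


def find_common_combinations(query_log, min_word_count, min_pair_count):
    """Return word pairs that co-occur often and include a frequent word."""
    word_counts = Counter()
    pair_counts = Counter()
    for query in query_log:
        words = sorted(set(query.lower().split()))
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))
    return [
        pair
        for pair, n in pair_counts.items()
        if n >= min_pair_count
        and any(word_counts[w] >= min_word_count for w in pair)
    ]


def build_listing_lists(listings, common_terms, synonyms):
    """Build one list of business listings per common term.

    A listing is included if its text contains the common term or any
    of the related terms given for it in `synonyms`.
    """
    lists = {}
    for term in common_terms:
        accepted = {term} | set(synonyms.get(term, ()))
        lists[term] = [
            name
            for name, text in listings.items()
            if accepted & set(text.lower().split())
        ]
    return lists


log = [
    "boston restaurants",
    "boston restaurants",
    "boston hotels",
    "denver restaurants",
    "boston restaurants italian",
]
pairs = find_common_combinations(log, min_word_count=3, min_pair_count=3)

listings = {
    "Luigi's": "italian restaurant boston",
    "Chez Paul": "french bistro boston",
    "Ace Plumbing": "plumber boston",
}
synonyms = {"restaurant": ["bistro", "eatery"]}
lists = build_listing_lists(listings, ["restaurant"], synonyms)
```

Here "Chez Paul" is placed on the "restaurant" list only because "bistro" is treated as a related term, which illustrates the expansion to synonyms.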