The present invention relates generally to text searching and more particularly to a method and system for improving text searching.
The majority of text searching algorithms are based on analyzing the content of documents. Conventional text searching algorithms evaluate each document only individually, in a kind of competition to determine which document tops the list. For example, Yahoo.com searches within categories. Other web sites, such as Alta Vista, offer similar services. When a user submits a query, he or she is looking for a small set of documents that provide an answer. Text queries, however, tend to produce large answer sets and a one-size-fits-all relevancy ranking. These text searching algorithms typically include extracting words or phrases, creating indexing structures, and determining discriminators for calculating relevance. When a user submits a text query, the index identifies the candidate documents, the relevance is calculated for each document, the candidate documents are ordered by relevance, and the resulting list is returned to the user.
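The conventional pipeline described above (extract words, build an index, score relevance, order the candidates) can be illustrated with a minimal sketch. It assumes a simple term-frequency score as the relevance measure; the function names `tokenize`, `build_index`, and `search` are hypothetical and are not taken from any particular search engine.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Extract lowercase alphanumeric words from the text."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index (term -> document ids) and per-document term counts."""
    index = defaultdict(set)
    term_counts = {}
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        term_counts[doc_id] = counts
        for term in counts:
            index[term].add(doc_id)
    return index, term_counts


def search(query, index, term_counts):
    """Return candidate document ids ordered by an assumed term-frequency relevance."""
    terms = tokenize(query)
    # The index identifies the candidate documents.
    candidates = set()
    for term in terms:
        candidates |= index.get(term, set())
    # Relevance is calculated for each candidate independently of the others.
    scored = [(sum(term_counts[d][t] for t in terms), d) for d in candidates]
    # Order by descending relevance, breaking ties by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]
```

Because each document is scored in isolation, nothing in this sketch accounts for how the documents relate to one another.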
This approach is useful to a user when the list of candidate documents is relatively small. When the list becomes larger, other means of manipulating the list are needed. Even though the relevance ranking tries to give a good order to the list, that order may not be close to the criteria that the user has in mind. Another source of imprecision is that a word submitted in a text query can have multiple meanings. A search for "jack" can yield results for card games, bowling, a children's game, fish, rabbits, etc. There are over 15 definitions of "jack" (http://www.dictionary.com/cgi-bin/dict.pl?term=jack). A large list requires refinement to factor out the candidate documents that do not match the user's criteria for selection.
Accordingly, what is needed is a system and method for improving the text search for documents. The present invention addresses such a need.
A method and system for improving text searching is disclosed. The method and system provide a network of document relationships and utilize the network of document relationships to identify the region of documents that can be used to satisfy a user's request. In a preferred embodiment, the text searching method in accordance with the present invention augments a conventional text search by using information on document relationships. The text searching method and system improve upon conventional text search techniques by incorporating relationship metadata to define regions to search within. In the present invention, the definition of a region is not limited to categories; it also includes neighborhoods around individual documents and sets that have been user defined.
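One way to picture a region defined as a neighborhood around an individual document is sketched below. This is an illustration under stated assumptions, not the disclosed implementation: the relationship network is assumed to be a dictionary of directed links, a neighborhood is assumed to be every document reachable within a fixed number of hops, and the names `neighborhood` and `restrict_to_region` are hypothetical.

```python
from collections import deque


def neighborhood(links, start, max_hops):
    """Return the set of documents within max_hops links of start (breadth-first).

    links is an assumed representation: document id -> list of related document ids.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        doc, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the region boundary
        for related in links.get(doc, ()):
            if related not in seen:
                seen.add(related)
                queue.append((related, hops + 1))
    return seen


def restrict_to_region(ranked_ids, region):
    """Keep only the ranked results that fall inside the region, preserving order."""
    return [doc_id for doc_id in ranked_ids if doc_id in region]
```

A category or a user-defined set could be passed to `restrict_to_region` in the same way, since in this sketch a region is simply a set of document identifiers.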