The present invention relates to computerized text searching. In particular, the present invention relates to refining search results.
One of the most popular uses for computer systems is the storage and retrieval of large amounts of data. To retrieve data, especially text-based data, users either search through an index of the data to locate the data they desire or submit a search query to a computerized search tool that searches through the data based on the parameters of the query.
Typically, search queries take one of three forms. In one form, the user supplies stand-alone terms without using connecting terms to show the relationship between the stand-alone terms. In a second form, the user connects the terms in the query using logical operators such as the Boolean operators AND, OR, and NOT, or pseudo-Boolean proximity operators such as NEAR and W/x (within x words). In a third form, the user simply types their search goal in natural language.
In all of these forms, it is often difficult for users to precisely state their search goals. As a result, users' queries often generate results with poor precision by returning many documents that are irrelevant to the users' search. This is especially true of searches of the Internet.
One cause of poor precision is that users fail to indicate that the terms of the search query should appear in the same sentence. Although such a limitation is implicit in many search queries, users often do not know how to construct a query that imposes it. A major cause of this problem is that some search tools do not support same-sentence logical operators. However, even when using search tools that do support such operators, users often fail to include the same-sentence restriction in their search.
A method of computerized searching receives parameters of a search query from a user and adds a restriction to the parameters to require that at least two of the search terms of the search query appear in a same sentence in a document. A representation of a set of documents is then searched based on the parameters of the search query and the added restriction. Documents that meet the search parameters and the added restriction are thus identified.
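By way of illustration only, the method summarized above might be sketched in Python as follows. The sketch rests on several assumptions that are not part of the description: the function names are hypothetical, the "representation of a set of documents" is taken to be a plain dictionary of raw text, stand-alone terms are treated as an implicit AND, and sentences are split naively on terminal punctuation.

```python
import re


def split_sentences(text):
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(text):
    # Lowercased set of alphanumeric word tokens.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def meets_query(text, terms):
    # The user's own parameters, assumed here to mean "all terms present".
    return {t.lower() for t in terms} <= tokenize(text)


def meets_same_sentence_restriction(text, terms, min_terms=2):
    # The added restriction: some sentence holds at least `min_terms` query terms.
    wanted = {t.lower() for t in terms}
    if len(wanted) < min_terms:
        return True  # too few distinct terms for the restriction to apply
    return any(len(wanted & tokenize(s)) >= min_terms
               for s in split_sentences(text))


def search(documents, terms):
    # Identify documents meeting both the query and the added restriction.
    return [doc_id for doc_id, text in documents.items()
            if meets_query(text, terms)
            and meets_same_sentence_restriction(text, terms)]


documents = {
    "a": "The dog chased the cat. It rained.",
    "b": "The dog barked. The cat slept.",
}
result = search(documents, ["dog", "cat"])
```

In this sketch, document "b" contains both terms but never in one sentence, so only document "a" is identified. A production search tool would more likely evaluate the restriction against a positional index than against raw text.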
In another embodiment of the invention, a computerized search tool searches a representation of a set of documents based on the parameters of a user's search query. Based on this search, documents that meet the parameters of the search query are identified with preference given to those documents that have at least two terms from the search query in a same sentence.
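As a non-limiting illustration, the preference described in this embodiment might be sketched in Python as a ranking step rather than a filter. The names below are hypothetical, the document set is assumed to be a dictionary of raw text, the query is assumed to require all terms, and "preference" is assumed to mean that preferred documents are simply listed first.

```python
import re


def split_sentences(text):
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(text):
    # Lowercased set of alphanumeric word tokens.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank(documents, terms):
    wanted = {t.lower() for t in terms}
    hits = []
    for doc_id, text in documents.items():
        if not wanted <= tokenize(text):
            continue  # document does not meet the query parameters
        # Largest number of distinct query terms found in any one sentence.
        best = max((len(wanted & tokenize(s)) for s in split_sentences(text)),
                   default=0)
        hits.append((doc_id, best >= 2))
    # Stable sort: documents with a same-sentence co-occurrence come first,
    # and the original order is otherwise preserved.
    return [doc_id for doc_id, _ in sorted(hits, key=lambda h: not h[1])]


documents = {
    "b": "The dog barked. The cat slept.",
    "a": "The dog chased the cat.",
    "c": "Birds sing.",
}
ranked = rank(documents, ["dog", "cat"])
```

Unlike a hard restriction, this approach keeps document "b" in the results and only places it after document "a". A real search tool could instead fold the same-sentence signal into a numeric relevance score.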