Recent advances in mass storage technology and data base management techniques have increased the feasibility of storing vast numbers of documents in data processing systems, thereby providing an opportunity to use the processing power of these systems to facilitate retrieval of the stored documents.
Descriptions of representative prior art search systems can be found in C. Faloutsos, "Access Methods for Text", ACM Computing Surveys, vol. 17, no. 1, March 1985, pp. 49-74; D. Tsichritzis et al., "Message Files", ACM Trans. on Office Information Systems, vol. 1, no. 1, January 1983, pp. 88-98; and G. Salton, "The SMART Retrieval System--Experiments in Automatic Document Processing", Prentice Hall, 1971.
Search commands provided by most prior art text search facilities typically include a set of query words, together with specifications defining their contextual relationships. A library document is retrieved if it contains words that are identical or equivalent to the query words and whose occurrences satisfy the specified relationships. In these prior art search facilities, equivalent words are usually given the same degree of significance (weight). The contextual relationships are, basically, defined only in terms of Boolean logic and adjacency operators. As a result, these facilities usually offer very limited flexibility for expressing a desired search content, so that a given search may not be as accurate as one would desire.
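The kind of prior art matching described above can be illustrated with a minimal sketch. It is not taken from any of the cited systems; the function names, the equivalence table and the sample documents are all hypothetical. It assumes that equivalent words are treated exactly alike and that the only contextual relationships available are Boolean AND and a simple adjacency test.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def positions(tokens, word, equivalents):
    """Positions where the query word or any equivalent occurs.

    All equivalents count the same: there is no notion of weight.
    """
    accepted = {word} | set(equivalents.get(word, ()))
    return [i for i, tok in enumerate(tokens) if tok in accepted]

def contains(tokens, word, equivalents):
    """Boolean test: does the document contain the word or an equivalent?"""
    return bool(positions(tokens, word, equivalents))

def adjacent(tokens, first, second, equivalents, max_gap=1):
    """Adjacency test: `second` occurs within max_gap words after `first`."""
    second_positions = set(positions(tokens, second, equivalents))
    return any(
        p + d in second_positions
        for p in positions(tokens, first, equivalents)
        for d in range(1, max_gap + 1)
    )

# Hypothetical equivalence table and documents, for illustration only.
EQUIVALENTS = {"car": ["automobile", "vehicle"]}
doc_a = tokenize("The automobile engine failed during testing.")
doc_b = tokenize("The engine of the vehicle failed.")

# Query "car ADJ engine": doc_a satisfies it, doc_b does not.
adj_a = adjacent(doc_a, "car", "engine", EQUIVALENTS)
adj_b = adjacent(doc_b, "car", "engine", EQUIVALENTS)

# Query "car AND engine": both documents satisfy it equally.
and_a = contains(doc_a, "car", EQUIVALENTS) and contains(doc_a, "engine", EQUIVALENTS)
and_b = contains(doc_b, "car", EQUIVALENTS) and contains(doc_b, "engine", EQUIVALENTS)
```

In this sketch every result is a plain yes or no. A document that uses "vehicle" is indistinguishable from one that uses "automobile", and nothing indicates how closely either document matches the desired content, which is the limitation noted above.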
An object of this invention is to provide a text search facility that allows users to define more accurately and flexibly the scope within which documents with alternative expressions of a desired content will be retrieved, by allowing users to assign different weights to equivalent words and to define structures of words which they consider acceptable. To further enhance flexibility, it is another object of this invention to provide a search facility that can evaluate, based upon user-provided criteria, a value representing the relevance of a document. Moreover, since the relevance of a document depends on application, user and temporal factors, it is a further object of this invention to allow users to specify how relevance is to be measured, and also to provide a search facility whereby located documents can be ranked in accordance with their respective relevance.
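One way to picture these objects is the following minimal sketch. It should not be read as the implementation of the invention. The weight table, the scoring rule (the best weight found among a query word's equivalents) and the `combine` parameter standing in for a user-specified relevance measure are all assumptions made for illustration, and user-defined structures of words are not modelled.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Hypothetical user-assigned weights for equivalent words.
WEIGHTED_EQUIVALENTS = {
    "car": {"car": 1.0, "automobile": 0.9, "vehicle": 0.5},
    "engine": {"engine": 1.0, "motor": 0.8},
}

def term_score(tokens, query_word, weighted_equivalents):
    """Best weight among the query word's equivalents found in the document.

    Returns 0.0 when none occurs. This scoring rule is an assumption.
    """
    weights = weighted_equivalents.get(query_word, {query_word: 1.0})
    return max((weights[tok] for tok in tokens if tok in weights), default=0.0)

def rank(documents, query_words, weighted_equivalents, combine=sum):
    """Score each document and return (name, relevance) pairs, best first.

    `combine` is the user-provided criterion that turns the per-word scores
    into one relevance value, e.g. `sum` (any word helps) or `min` (every
    word is required). Documents with zero relevance are not retrieved.
    """
    scored = []
    for name, text in documents.items():
        tokens = tokenize(text)
        scores = [term_score(tokens, w, weighted_equivalents) for w in query_words]
        scored.append((name, combine(scores)))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [(name, score) for name, score in scored if score > 0]

# Hypothetical library documents, for illustration only.
DOCUMENTS = {
    "a": "The car engine failed.",
    "b": "The vehicle has a new motor.",
    "c": "Quarterly sales report.",
    "d": "The automobile was repainted.",
}

by_sum = rank(DOCUMENTS, ["car", "engine"], WEIGHTED_EQUIVALENTS, combine=sum)
by_min = rank(DOCUMENTS, ["car", "engine"], WEIGHTED_EQUIVALENTS, combine=min)
```

Under `sum`, documents a, b and d are retrieved in that order; under `min`, document d drops out because it mentions no equivalent of "engine". Changing the weights or the combining function changes both which documents are retrieved and how they are ranked, without altering the query words themselves.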