1. Field of the Invention
The present invention relates to a document search technique in which documents are searched efficiently by merging a plurality of query formulas.
2. Description of the Related Art
To provide a document search service stably, a search result must be returned within a certain time period even at service peak times. At peak times, a plurality of query formulas arrive at a search server in quick succession. Two methods for processing the plurality of query formulas are well known: a method in which the query formulas are processed sequentially in order of arrival (a sequential processing method), and a method in which the query formulas are processed in parallel by use of the time-sharing function of an OS (a parallel processing method). In each of these methods, however, the response time for each query formula increases in proportion to the number of query formulas to be processed at one time. In the parallel processing method, simultaneous processing capability is enhanced if more than one CPU is used. Even so, processing slows down when the number of query formulas to be processed at one time exceeds several tens.
In light of the above problems, a method has been proposed in which a plurality of query formulas are merged by an OR operation and a merged search is performed using the resulting formula (a merged processing method) (U.S. Pat. No. 5,454,105). The performance of the merged processing method varies depending on the document search method on which it is based. As an example, consider a search method in which the search is performed while scanning a document from the beginning (scan-type search). In scan-type search, the sequential processing method and the parallel processing method scan the same document repeatedly, whereas the merged processing method scans the document only once. However, because the query formulas are merged by the OR operation, it is necessary to check afterwards which of the query formulas a given document actually hits. Even so, the processing is faster than in the case where the document is scanned more than once.
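The difference in scan counts between the sequential processing method and the merged processing method in scan-type search can be illustrated by the following minimal sketch. It is an illustration under stated assumptions, not the implementation of the cited patent: each query formula is modelled simply as a set of terms that must all appear in a document, documents are split on whitespace, and the names sequential_search and merged_search are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a query formula is a set of terms that must ALL appear in a document.
Queries = Dict[str, FrozenSet[str]]
Documents = Dict[str, str]
Hits = Dict[str, List[str]]


def sequential_search(queries: Queries, documents: Documents) -> Tuple[Hits, int]:
    """Process query formulas one at a time; every document is scanned once per query."""
    hits: Hits = {qid: [] for qid in queries}
    scans = 0
    for qid, terms in queries.items():
        for doc_id, text in documents.items():
            scans += 1  # one full scan of this document for this query
            found = {tok for tok in text.split() if tok in terms}
            if terms <= found:
                hits[qid].append(doc_id)
    return hits, scans


def merged_search(queries: Queries, documents: Documents) -> Tuple[Hits, int]:
    """Merge all query formulas by OR; every document is scanned only once."""
    # The merged formula matches on any term of any query formula.
    merged_terms = set().union(*queries.values())
    hits: Hits = {qid: [] for qid in queries}
    scans = 0
    for doc_id, text in documents.items():
        scans += 1  # single scan of this document for all queries
        found = {tok for tok in text.split() if tok in merged_terms}
        # Afterwards, check which individual query formula the document hits.
        for qid, terms in queries.items():
            if terms <= found:
                hits[qid].append(doc_id)
    return hits, scans


# Hypothetical example data.
queries = {
    "q1": frozenset({"search", "server"}),
    "q2": frozenset({"index"}),
    "q3": frozenset({"server"}),
}
documents = {
    "d1": "the search server returns results",
    "d2": "an index is built",
    "d3": "server load at peak",
}

seq_hits, seq_scans = sequential_search(queries, documents)
mrg_hits, mrg_scans = merged_search(queries, documents)
print(seq_hits == mrg_hits, seq_scans, mrg_scans)
```

In this sketch both functions return the same hits, but the sequential version scans each document once per query formula, while the merged version scans each document once and then performs the per-query check on the terms found during that scan.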