1. Field of the Invention
The present invention relates to a document retrieval system and a document retrieval apparatus for retrieving a document from a plurality of documents stored in storage devices in accordance with a retrieval condition input by a user, and to a control method, a program, and a storage medium therefor.
2. Description of the Related Art
Conventionally, when a user searches for a desired document among a plurality of documents stored in a large-capacity storage device such as a document management server, the user inputs a retrieval condition and the retrieval is carried out in accordance with that condition. The retrieval condition may specify, for example, a part of a character string included in the target document as a retrieval keyword, or a retrieval logical expression representing a combination of such keywords.
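The two kinds of retrieval condition mentioned above, a keyword and a logical expression combining keywords, can be illustrated with a minimal sketch. The nested-tuple representation of the expression, the operator names, and the function names `matches` and `retrieve` are assumptions made only for this illustration; they are not taken from any particular retrieval system.

```python
from typing import Dict, List, Union

# Hypothetical representation: a condition is either a keyword string or a
# tuple ("AND" | "OR" | "NOT", operand, ...) combining other conditions.
Condition = Union[str, tuple]


def matches(text: str, cond: Condition) -> bool:
    """Return True if the document text satisfies the retrieval condition."""
    if isinstance(cond, str):
        # A keyword matches when it appears as a substring of the document.
        return cond in text
    op, *operands = cond
    if op == "AND":
        return all(matches(text, c) for c in operands)
    if op == "OR":
        return any(matches(text, c) for c in operands)
    if op == "NOT":
        return not matches(text, operands[0])
    raise ValueError("unknown operator: %r" % (op,))


def retrieve(documents: Dict[str, str], cond: Condition) -> List[str]:
    """Return the names of all documents that satisfy the condition."""
    return [name for name, text in documents.items() if matches(text, cond)]


# Illustrative data only.
documents = {
    "a.txt": "printer toner order",
    "b.txt": "scanner manual",
    "c.txt": "printer manual",
}
keyword_hits = retrieve(documents, "printer")
expression_hits = retrieve(documents, ("AND", "printer", ("NOT", "toner")))
```

Here a single keyword selects every document containing it, while the logical expression narrows the result to documents that contain "printer" but not "toner".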
In recent years, the storage capacity of storage devices such as the document management server has increased, making it possible to store a large number of documents therein. Also, for example, when a plurality of multifunction peripherals (MFPs), each provided with a storage device such as a hard disk drive (HDD), are connected on a network, it is possible to retrieve a desired document from the documents stored on the HDDs of the MFPs. In such environments, the number of documents to be set as the retrieval target can be very large.
Depending on the retrieval condition input by the user, the number of documents included in the retrieval results can be greater than the user expects, so that the desired document may not be found. In particular, a user who is not familiar with retrieval may have to input retrieval conditions repeatedly in order to obtain the expected retrieval result, and the retrieval must be carried out each time, which is time consuming.
To address the above-described problem, a technique is known for reducing the time taken when a retrieval is carried out in accordance with a retrieval condition input by a user. More specifically, after the user inputs the retrieval condition and starts the retrieval, the retrieval is interrupted when the number of retrieval results exceeds a predetermined threshold (refer to Japanese Patent Laid-Open No. 08-314965).
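The general idea of interrupting a retrieval once the number of results exceeds a threshold can be sketched as follows. This is only an illustration of the idea as summarized above, not the implementation disclosed in the cited publication; the function name `retrieve_with_limit`, the simple keyword match, and the sample data are all assumptions.

```python
from typing import Iterable, List, Tuple


def retrieve_with_limit(
    documents: Iterable[Tuple[str, str]], keyword: str, threshold: int
) -> Tuple[List[str], bool]:
    """Scan (name, text) pairs and stop once hits exceed the threshold.

    Returns the hits found so far and a flag telling whether the
    retrieval was interrupted before all documents were examined.
    """
    hits: List[str] = []
    for name, text in documents:
        if keyword in text:
            hits.append(name)
            if len(hits) > threshold:
                # Too many results: interrupt so the user can refine the condition.
                return hits, True
    return hits, False


# Illustrative data only: ten documents that all contain the keyword.
documents = [("doc%d" % i, "monthly report") for i in range(10)]
hits, interrupted = retrieve_with_limit(documents, "report", threshold=3)
```

Note that even in this sketch the result count becomes known only while the scan is running, so the retrieval has to be started before an excessive result set can be detected.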
However, the above-described conventional technique has the following problem. According to Japanese Patent Laid-Open No. 08-314965, when the number of retrieval results is too large, the retrieval is interrupted before completion so that a retrieval condition can be input again. Until the retrieval is started, however, it is impossible to find out how many documents are included in the retrieval results.
For that reason, even when the retrieval may ultimately be interrupted, the user needs to wait while the retrieval process proceeds. Also, when the retrieval results are extremely numerous and the user then creates a new retrieval condition, the retrieval engine that actually executes the retrieval process ends up having performed a wasted retrieval process. Furthermore, in an environment where the storage device storing the retrieval target documents, or the retrieval engine, is located externally and connected via a network, information such as retrieval requests and retrieval results must be sent over the network repeatedly. This can place a significant load on the network.