1) Field of the Invention
The present invention relates to an information retrieving method, an information retrieving system, and a retrieval managing unit for that system, all of which serve to search for and fetch necessary information from a database retaining various kinds of information.
2) Description of the Related Art
Recently, the quantity of documents converted into electronic form has been increasing rapidly with the progress of computer networks such as the Internet and intranets. Accordingly, services based upon information retrieving systems have been developed to derive necessary information from these documents.
For instance, the system shown in FIG. 8 is known as one prior information retrieving system. This information retrieving system, designated generally at numeral 100, is made up of a retrieval managing server 101, a plurality of (four in FIG. 8) retrieving servers 102 and a database 103.
The retrieval managing server 101 manages the retrieving operation in the plurality of retrieving servers 102 and, in response to a retrieval request from a client (not shown), instructs the retrieving servers 102 to conduct retrieval from the database 103. Each of the retrieving servers 102 has a sequential (serial) retrieval engine (not shown), conducts the retrieval from the database 103 in accordance with the instruction from the retrieval managing server 101, and then returns the retrieval result to the retrieval managing server 101. In the actual arrangement, the database 103 is held in a storage unit such as a disk unit.
In the information retrieving system 100 thus arranged, the plurality of retrieving servers 102 access the database 103, which exists as one large area, in parallel with each other (that is, simultaneously) for the retrieval. Usually, the operating speed of the storage unit retaining the database 103 is considerably lower than that of the CPU or the memory constituting each of the retrieving servers 102. For this reason, in the case that the plurality of retrieving servers 102 conduct the retrieval from one database 103 as mentioned above, each of the retrieving servers 102 frequently has to wait for the storage unit, which lowers the retrieval efficiency.
Therefore, to eliminate this problem, an information retrieving system 200 shown in FIG. 9 has hitherto been proposed. Like the above-described information retrieving system 100, this information retrieving system 200 is composed of a retrieval managing server 201 and a plurality of (four in FIG. 9) retrieving servers 202A to 202D. In addition, in this information retrieving system 200, the database undergoing the retrieval is divided into four partial aggregations corresponding to the number of retrieving servers 202A to 202D, with the four sections being coupled as databases 203A to 203D with the retrieving servers 202A to 202D, respectively.
In this case, the retrieval managing server 201 is for managing the retrieval operations in the retrieving servers 202A to 202D, and in response to a retrieval request from a client (not shown), gives an instruction to each of the retrieving servers 202A to 202D for the retrieval from the corresponding one of the databases 203A to 203D. In addition, the retrieving servers 202A to 202D independently accomplish the retrieval from the divided databases 203A to 203D in accordance with the instruction from the retrieval managing server 201, respectively. In the actual arrangement, the databases 203A to 203D are held in a storage unit such as a disk unit.
With this arrangement, in this information retrieving system 200, the plurality of retrieving servers 202A to 202D conduct the retrieval from the divided databases 203A to 203D independently and in parallel, respectively, which reduces the occurrence of the storage unit wait condition so that the retrieval efficiency can be improved.
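The divided arrangement described above can be illustrated by the following minimal sketch. It is not taken from the prior system itself: the names `search_partition` and `managed_search`, the keyword-matching rule, and the sample documents are all assumptions introduced for illustration, and a thread pool merely stands in for the four retrieving servers.

```python
from concurrent.futures import ThreadPoolExecutor


def search_partition(partition, keyword):
    """Sequentially scan one divided database (stands in for one retrieving server)."""
    return [doc for doc in partition if keyword in doc]


def managed_search(partitions, keyword):
    """Stand-in for the retrieval managing server: fan out the request, merge the results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        # One worker per divided database; map() keeps the partition order.
        per_partition = pool.map(
            search_partition, partitions, [keyword] * len(partitions)
        )
        return [hit for hits in per_partition for hit in hits]


# Hypothetical contents of four divided databases.
partitions = [
    ["network retrieval", "disk unit"],
    ["retrieval server", "client request"],
    ["database index"],
    ["parallel retrieval engine", "storage wait"],
]
hits = managed_search(partitions, "retrieval")
print(hits)
```

In this sketch each worker touches only its own partition, which corresponds to the reduced contention for the storage unit noted above.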
Meanwhile, in recent years, the need for improved information retrieving performance has been growing with the enlargement of the above-mentioned networks, and therefore, an information retrieving system which can meet this need is strongly desired.
The aforesaid information retrieving system 200 can enhance the retrieval performance by conducting the processing called fine-grain processing.
As in the information retrieving system 200, in the case that parallel processing is done through the use of a plurality of retrieving servers (processors) 202A to 202D, it is preferable, for enhancing the processing performance, to equalize the loads among the plurality of retrieving servers 202A to 202D. That is, the highest retrieval efficiency is attained when all the retrieving servers 202A to 202D always take charge of the same quantity of retrieval processing. However, the quantity of retrieval processing is usually almost never distributed equally among the retrieving servers 202A to 202D.
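The effect of unequal loads can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that a retrieval finishes only when the most heavily loaded server finishes, and it defines efficiency as useful work divided by the total server time occupied; the function name `parallel_efficiency` and the load figures are hypothetical.

```python
def parallel_efficiency(loads):
    """Useful work divided by (number of servers x time of the slowest server)."""
    finish_time = max(loads)  # every server waits for the slowest one
    return sum(loads) / (len(loads) * finish_time)


# Hypothetical quantities of retrieval processing per server.
balanced = parallel_efficiency([10, 10, 10, 10])
skewed = parallel_efficiency([25, 5, 5, 5])
print(balanced, skewed)
```

Under these assumptions the equal distribution gives an efficiency of 1.0, while the skewed distribution of the same total work gives 0.4, because three servers sit idle for most of the time.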
Accordingly, by conducting the fine-grain processing, which sets the unit of the retrieval processing by the retrieving servers 202A to 202D more finely, the loads are equalized among the retrieving servers 202A to 202D, so that the retrieval performance can be improved.
More specifically, when receiving a retrieval request from a client, the retrieval managing server 201 finely divides, in a predetermined unit, the data to be retrieved (which will be referred to hereinafter as retrieval data) within each of the databases 203A to 203D respectively coupled with the retrieving servers 202A to 202D, and successively allocates non-processed retrieval data to whichever of the retrieving servers 202A to 202D has completed the retrieval processing of its retrieval data in the predetermined unit. As a result, the loads among the retrieving servers 202A to 202D are made equal, thereby sharply heightening the retrieval efficiency.
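The allocation described above can be sketched as a small simulation. It is only an illustration under stated assumptions: each unit of retrieval data has a known processing cost, the next non-processed unit goes to whichever server becomes idle first, and communication cost is ignored. The function names and the cost figures are hypothetical.

```python
import heapq


def static_finish_time(partitions):
    """Each server handles only its own divided database."""
    return max(sum(costs) for costs in partitions)


def fine_grain_finish_time(partitions):
    """Hand the next non-processed unit to whichever server becomes idle first."""
    units = [cost for costs in partitions for cost in costs]
    # Heap of (time at which the server becomes idle, server id).
    servers = [(0, sid) for sid in range(len(partitions))]
    heapq.heapify(servers)
    for cost in units:
        idle_at, sid = heapq.heappop(servers)
        heapq.heappush(servers, (idle_at + cost, sid))
    return max(idle_at for idle_at, _ in servers)


# Hypothetical per-unit costs: the first database holds most of the retrieval data.
partitions = [[5, 5, 5, 5, 5], [5], [5], [5]]
print(static_finish_time(partitions), fine_grain_finish_time(partitions))
```

With these figures the static arrangement finishes at time 25, whereas the fine-grain allocation finishes at time 10, since each of the four servers ends up processing two units.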
However, the above-mentioned fine-grain processing requires communications among the retrieving servers 202A to 202D, as indicated by two-dot chain lines in FIG. 9, because the retrieving server (processor) to which the retrieval data belongs frequently differs from the retrieving server (processor) which conducts the retrieval processing of that retrieval data. These communications cause a large amount of overhead.
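The source of this overhead can be illustrated by counting how often a unit of retrieval data is processed by a server other than the one whose database holds it. The sketch below assumes a simple earliest-idle allocation rule and hypothetical per-unit costs; the function name `count_remote_units` is likewise an assumption, and the actual cost of each transfer is not modeled.

```python
import heapq


def count_remote_units(partitions):
    """Count units processed by a server other than the one whose database holds them."""
    units = [
        (owner, cost)
        for owner, costs in enumerate(partitions)
        for cost in costs
    ]
    # Heap of (time at which the server becomes idle, server id).
    servers = [(0, sid) for sid in range(len(partitions))]
    heapq.heapify(servers)
    remote = 0
    for owner, cost in units:
        idle_at, sid = heapq.heappop(servers)
        if sid != owner:
            remote += 1  # this unit must be communicated between servers
        heapq.heappush(servers, (idle_at + cost, sid))
    return remote


# Hypothetical per-unit costs, skewed toward the first database.
remote = count_remote_units([[5, 5, 5, 5, 5], [5], [5], [5]])
print(remote)
```

In this example three of the eight units are processed away from the server that holds them, and each such unit implies inter-server communication of the kind indicated in FIG. 9.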
In addition, when the fine-grain processing is conducted, the retrieval managing server 201 is required to always grasp the internal processing status of the sequential retrieval engine body constituting each of the retrieving servers 202A to 202D, which forms the base of the information retrieving system 200. Thus, it is difficult to use the sequential retrieval engines directly without modifying them; that is, the internal arrangement of each of the sequential retrieval engines needs to be modified in parallelizing the sequential retrieval engines.
Accordingly, constructing an information retrieving system requires detailed knowledge of the sequential retrieval engine body, and a large number of steps must be performed for parallelizing the sequential retrieval engines and for the fine-grain processing, so that developing the information retrieving system takes a very long time.
Moreover, for this reason, even if a new sequential retrieval engine is developed as the base of an information retrieving system, it is difficult to introduce the new sequential retrieval engine into the information retrieving system directly or immediately, with the result that the information retrieving system cannot keep up with improvements in the performance of the sequential retrieval engine.