In the offices of companies and public agencies, networking and digitization of the office environment are progressing. Various documents are now created and stored as electronic documents using software such as word processor software, spreadsheet software, or presentation software. Paper documents are also converted into electronic documents by a device such as a scanner, and the resulting electronic documents are stored.
Hence, many electronic documents must be stored in the office, and many offices introduce a document management system to realize efficient management of electronic documents (see, e.g., Japanese Patent Laid-Open No. 2000-315210).
However, the number of documents that can be processed in one database is limited, and when the number of registered documents increases, the documents are divided among a plurality of databases for management. Since searching a large number of documents in a single database decreases search efficiency (requires a long search time), a distributed database environment must be prepared. In addition, as an initial operation policy of the document management system, a separate database may be provided for each department or each type of document. Thus, it should be taken into consideration that a plurality of databases are searched for an electronic document in the document management system.
As a document management system which searches a plurality of distributed databases, the system shown in FIG. 3 has been proposed. In FIG. 3, reference numerals 301 to 303 denote document management servers; and 305, a client. The document management servers 301 to 303 and the client 305 are connected to each other via a network 304. In FIG. 3, the document management system and the database are in one-to-one correspondence for descriptive convenience, and no volume server is illustrated.
In this document management system, when the client 305 issues a search instruction, each document management server searches the database connected to it, and the client 305 presents, to the user, the combined set of search results from the document management servers as final search results. However, searching a plurality of databases by this search method poses the following problems.
When the document management system is designed to search the databases of the respective document management servers sequentially, the time taken for the search process increases with the number of databases to be searched. This problem can be avoided when the document management servers 301, 302, and 303 are designed to search their databases in parallel. In this case, however, the process must wait for the search result from the database which requires the longest search time, so the achievable reduction in process time is limited.
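The difference between the two designs can be illustrated with a minimal sketch. It assumes that the per-database search times are known and that network and merge overhead are ignored; the function names and the millisecond values are hypothetical and are not taken from the system of FIG. 3.

```python
from typing import Sequence


def sequential_search_time(per_db_ms: Sequence[int]) -> int:
    """Total time when the databases are searched one after another."""
    return sum(per_db_ms)


def parallel_search_time(per_db_ms: Sequence[int]) -> int:
    """Total time when all databases are searched at once.

    The client still has to wait for the slowest database.
    """
    return max(per_db_ms, default=0)


# Hypothetical search times (ms) for the databases of servers 301, 302, 303.
times_ms = [800, 1500, 4000]

print(sequential_search_time(times_ms))  # 6300: grows with every added database
print(parallel_search_time(times_ms))    # 4000: bounded below by the slowest database
```

Under these assumptions, adding further fast databases leaves the parallel time unchanged, but it can never fall below the time of the slowest database.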
When different scoring criteria (scoring algorithms) are used for search in different databases, the scores are not comparable between databases, so no accurate ranking (final search results) can be obtained by simply merging the search results in order of their scores when displaying the final search results on the client 305. In addition, when the number of databases is large, the number of final search results also becomes large (for example, in an environment where 10 databases are connected, when 100 search results are received from each database, 1,000 final search results are displayed).
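Both effects can be reproduced with a short sketch. It assumes, purely for illustration, that one database scores on a 0 to 1 scale and another on a 0 to 100 scale; the document identifiers, the scores, and the `naive_merge` helper are hypothetical.

```python
from typing import List, Tuple

Hit = Tuple[str, float]  # (document id, raw score reported by a database)


def naive_merge(result_sets: List[List[Hit]]) -> List[Hit]:
    """Concatenate all result sets and sort by raw score, highest first."""
    merged = [hit for hits in result_sets for hit in hits]
    return sorted(merged, key=lambda hit: hit[1], reverse=True)


# Hypothetical results: database A scores on 0..1, database B on 0..100.
db_a = [("a1", 0.95), ("a2", 0.40)]  # a1 is a strong match on A's scale
db_b = [("b1", 62.0), ("b2", 15.0)]  # b2 is a weak match on B's scale

ranking = [doc_id for doc_id, _ in naive_merge([db_a, db_b])]
print(ranking)  # ['b1', 'b2', 'a1', 'a2']: weak match b2 outranks strong match a1

# 10 databases returning 100 results each yield 1,000 final results.
ten_dbs = [[(f"db{d}-doc{i}", float(i)) for i in range(100)] for d in range(10)]
print(len(naive_merge(ten_dbs)))  # 1000
```

The ranking is dominated by whichever database happens to use the larger score scale, and the merged list grows linearly with the number of databases.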
These problems occur when the conventional search method is used for searching a plurality of databases. That is, first, the process time becomes longer as the number of databases to be searched increases. Second, no accurate ranking can be obtained for the final search results. Third, the number of final search results increases as the number of databases to be searched increases.