1. Field of the Invention
The present invention relates to a system, a method, and an apparatus for conducting a search in accordance with given criteria across horizontally-divided distributed databases, in which a number of data items are divided into clusters that each have a certain number of data items, and each of the clusters is entered and processed in one of the databases connected via a network.
2. Description of the Related Art
Recent technological advances have produced structured text databases, in which structured text information described in a language such as the Extensible Markup Language (XML) is stored and searched. In most cases, a query language called XQuery (XML Query), which is being standardized by the W3C (World Wide Web Consortium), is used to query the structured text database.
Unlike the Structured Query Language (SQL), which is a standard query language for a relational database (RDB) designed for data management in a table format, the XQuery search process targeted at XML data deals with list-structured sequence data as an interim result.
Meanwhile, a horizontally-divided distributed database system is well known, in which a number of data items are divided into clusters so that each cluster has the same number of items, and each of the clusters is entered and processed in one of the databases connected to one another via a network. When a search is conducted on such a distributed database system by use of XQuery, sequence data that serves as an interim result may be scattered as partial results (partial sequences) across different databases (physical DBs). Therefore, the partial sequences need to be dealt with as a logical sequence in which the partial sequences are logically integrated.
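The notion of a logical sequence can be illustrated with a minimal sketch. It assumes that each physical DB returns its partial sequence as an ordinary in-memory list; the class name `LogicalSequence` and the sample items are hypothetical and are introduced only for illustration, not taken from any particular implementation.

```python
from itertools import chain
from typing import Iterator, List, Sequence


class LogicalSequence:
    """Presents partial sequences held by different databases as one sequence."""

    def __init__(self, partials: List[Sequence[str]]) -> None:
        # Each entry stands in for the partial result held by one physical DB
        # (an assumption of this sketch; no network access is modeled).
        self._partials = partials

    def __len__(self) -> int:
        # The logical length is the sum of the partial lengths.
        return sum(len(partial) for partial in self._partials)

    def __iter__(self) -> Iterator[str]:
        # Iterate over the partial sequences in order without copying them.
        return chain.from_iterable(self._partials)

    def __getitem__(self, index: int) -> str:
        if index < 0:
            raise IndexError("negative indices are not supported in this sketch")
        # Walk the partial sequences until the one containing the index is found.
        for partial in self._partials:
            if index < len(partial):
                return partial[index]
            index -= len(partial)
        raise IndexError("logical sequence index out of range")


# Hypothetical partial results from two physical DBs.
partial_db1 = ["<item>a</item>", "<item>b</item>"]
partial_db2 = ["<item>c</item>"]
logical = LogicalSequence([partial_db1, partial_db2])
```

In this sketch, a caller can iterate over or index into `logical` without knowing which physical DB holds a given item.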
For an object database, a data management technology is known that manages structures similar to a logical sequence composed of partial sequences (see Japanese Patent No. 2827562). In this technology, an ID is assigned to each independent partial set, and a set of the assigned IDs is prepared. Then, an ID is assigned to the prepared set, and management information that hierarchically includes sets of sets is thereby obtained to manage the structures.
However, in logical sequence management according to a method such as that suggested by Japanese Patent No. 2827562, the management information needs to be referred to every time a sequence is formed or referred to, or whenever any other sequence-related operation is conducted. That is, processing related to partial sequences produces excess overhead, which would be unnecessary if a single database were targeted.
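The overhead described above can be sketched as follows. This is only an illustration of the general idea of ID-based hierarchical set management and is not the method of the cited patent; the class `SetManager`, its methods, and the `lookups` counter are hypothetical names introduced here to make the cost of consulting the management information visible.

```python
from typing import Dict, List, Tuple


class SetManager:
    """Keeps management information that maps IDs to partial sets or sets of IDs."""

    def __init__(self) -> None:
        self._next_id = 0
        # ID -> ("leaf", items) or ("group", member IDs)
        self._table: Dict[int, Tuple[str, list]] = {}
        # Counts how often the management information is consulted.
        self.lookups = 0

    def _register(self, kind: str, payload: list) -> int:
        set_id = self._next_id
        self._next_id += 1
        self._table[set_id] = (kind, payload)
        return set_id

    def register_partial(self, items: List[str]) -> int:
        # Assign an ID to an independent partial set.
        return self._register("leaf", list(items))

    def register_group(self, member_ids: List[int]) -> int:
        # Assign an ID to a set of previously assigned IDs (a set of sets).
        return self._register("group", list(member_ids))

    def resolve(self, set_id: int) -> List[str]:
        # Every access goes through the management table, even for a single leaf.
        self.lookups += 1
        kind, payload = self._table[set_id]
        if kind == "leaf":
            return list(payload)
        items: List[str] = []
        for member_id in payload:
            items.extend(self.resolve(member_id))
        return items


manager = SetManager()
first = manager.register_partial(["a", "b"])
second = manager.register_partial(["c"])
whole = manager.register_group([first, second])
resolved = manager.resolve(whole)
```

Resolving the group of two partial sets consults the table once for the group and once for each member, whereas a plain list in a single database would need no such lookups.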