1. Field of the Invention
The present invention relates to a parallel database processing system having a plurality of horizontally partitioned partial tables, and more particularly to a parallel database processing system suitable for managing indexes of primary and secondary keys of tables, and further to a retrieval method using the secondary key.
2. Description of the Related Art
Advances in parallel computer systems have brought parallel database systems into practical use. Research and development of parallel database systems operating on shared-nothing parallel computers, such as Bubba, Teradata DBC/1012, GAMMA, MDBS, and Tandem NonStop SQL, is progressing, as described, for example, in "Principles of Distributed Database Systems" by M. Tamer Ozsu and Patrick Valduriez, published by Prentice-Hall International Inc., Sec. 15 at page 466.
In a conventional parallel database system, one table having a number of records is divided into fragments (hereinafter called local tables) each having a plurality of records, and the local tables are distributed to a plurality of processors, enabling parallel processing of the database. Such division of a table is called horizontal partitioning, and a key for identifying each local table is called a primary key.
A parallel database processing system is configured by a global database management system (hereinafter called GDBMS) for managing the entire table, and local database management systems (hereinafter called LDBMS) for managing local tables.
FIG. 7 illustrates a table management method used by a conventional parallel database processing system.
As described above, a table is partitioned into a plurality of local tables 26. Each local table 26 is stored in the LDBMS of a corresponding local database processing apparatus 13. Each LDBMS has primary key indexes 25, each indicating the correspondence between a primary key and the position of the corresponding record in the local table 26. The primary key is a record identifier capable of uniquely identifying each record. An employee number is used as the record identifier in this example. For retrieval using a secondary key, to be described later, each LDBMS also has local secondary key indexes 52, each indicating the correspondence between a secondary key and the location of the corresponding record. Like the primary keys, the secondary keys are partitioned into plural sets of local secondary keys, one set in each LDBMS.
The GDBMS of a global database processing apparatus 11 has a primary key partition (fragmental) table 21. This table 21 stores each primary key group and its corresponding LDBMS number. By referring to this table 21, it is possible to know in which LDBMS the record represented by a given primary key is stored. The primary key partition table 21 may be implemented by a B-tree, a hash table, or a function.
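The lookup performed on the primary key partition table 21 can be sketched as follows. This is only an illustrative sketch, not the implementation of any particular system: it assumes that each primary key group is a contiguous range of keys, and it substitutes a binary search over a sorted list for the B-tree or hash table mentioned above. The class name, method name, and key ranges are hypothetical.

```python
import bisect


class PrimaryKeyPartitionTable:
    """Hypothetical stand-in for the primary key partition table 21.

    Assumes each primary key group is a contiguous key range, which the
    description above does not require.
    """

    def __init__(self, upper_bounds, ldbms_numbers):
        # upper_bounds[i] is the largest primary key stored in the LDBMS
        # numbered ldbms_numbers[i]; bounds must be in ascending order.
        if len(upper_bounds) != len(ldbms_numbers):
            raise ValueError("one LDBMS number is required per key group")
        if list(upper_bounds) != sorted(upper_bounds):
            raise ValueError("upper bounds must be sorted in ascending order")
        self.upper_bounds = list(upper_bounds)
        self.ldbms_numbers = list(ldbms_numbers)

    def lookup(self, primary_key):
        """Return the number of the LDBMS that stores primary_key."""
        # Binary search for the first group whose upper bound >= key.
        i = bisect.bisect_left(self.upper_bounds, primary_key)
        if i == len(self.upper_bounds):
            raise KeyError(primary_key)
        return self.ldbms_numbers[i]


# Illustrative employee-number ranges: up to 1000 -> LDBMS 0,
# 1001..2000 -> LDBMS 1, 2001..3000 -> LDBMS 2.
table = PrimaryKeyPartitionTable([1000, 2000, 3000], [0, 1, 2])
target = table.lookup(1500)  # only this LDBMS needs to receive the query
```

With such a table, a query on employee number 1500 would be sent to a single LDBMS rather than to all of them.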
In this system, the primary key or secondary key can be used to select a local table and identify a target index.
A table selection query with the primary key is processed while referring to the primary key partition table 21 stored in the GDBMS and the primary key indexes 25 stored in each LDBMS. In this case, the target LDBMS having the data to be retrieved can be selected by referring to the primary key partition table 21 in the GDBMS, and the table selection query is issued only to the target LDBMS.
How a table selection query with the secondary key is processed in this system will be described next with reference to FIG. 6. In the case of a selection query with the secondary key, it is not known which group of local secondary key indexes 52 contains the secondary key in question. Therefore, the GDBMS 50 of the global database processing apparatus issues the selection query to all LDBMSs 51 of the local database processing apparatuses (indicated at 501 in FIG. 6). Each LDBMS searches for the secondary key in question while referring to its local secondary key indexes 52 (indicated at 502 and 503). An LDBMS 51 storing the secondary key in question accesses its local table 26 (indicated at 504) to locate the target record (indicated at 505).
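The conventional secondary key selection described above can be sketched as follows. This is a simplified, single-process sketch under stated assumptions: the class and function names are hypothetical, a department name is assumed as the secondary key, a dictionary stands in for the local secondary key indexes 52, and a sequential loop stands in for the broadcast over the network.

```python
class LocalDBMS:
    """Hypothetical stand-in for an LDBMS 51 holding one local table 26."""

    def __init__(self, number, records):
        self.number = number
        self.local_table = list(records)
        # Stand-in for the local secondary key indexes 52:
        # secondary key -> positions of matching records in the local table.
        self.secondary_index = {}
        for position, record in enumerate(self.local_table):
            self.secondary_index.setdefault(record["dept"], []).append(position)

    def select_by_secondary_key(self, key):
        # Search the local index, then access the local table for each hit.
        positions = self.secondary_index.get(key, [])
        return [self.local_table[p] for p in positions]


def global_select_by_secondary_key(ldbms_list, key):
    """Issue the query to every LDBMS, since the holder is not known."""
    contacted = []
    results = []
    for ldbms in ldbms_list:  # sequential stand-in for the broadcast
        contacted.append(ldbms.number)
        results.extend(ldbms.select_by_secondary_key(key))
    return results, contacted


ldbms_list = [
    LocalDBMS(0, [{"emp_no": 10, "dept": "sales"}, {"emp_no": 20, "dept": "design"}]),
    LocalDBMS(1, [{"emp_no": 1010, "dept": "design"}]),
    LocalDBMS(2, [{"emp_no": 2010, "dept": "sales"}]),
]
records, contacted = global_select_by_secondary_key(ldbms_list, "design")
```

In this sketch every LDBMS is contacted, including LDBMS 2, which holds no record with the requested secondary key.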
In the conventional parallel database system, a selection query using the secondary key is issued to every LDBMS having a local table of the table from which data is retrieved. It is therefore necessary for the GDBMS to broadcast the selection query to all LDBMSs via a communications network 12 interconnecting the processors of the LDBMSs, and it is further necessary for all LDBMSs to return responses to the GDBMS. This broadcasting greatly hinders other message communication on the network. In addition, the need to process the selection query at every LDBMS is one factor that lowers the throughput of a conventional parallel database system. These drawbacks become more conspicuous as the number of LDBMSs increases.
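The growth in network traffic can be illustrated with a rough count. The sketch below assumes, purely for illustration, that each contacted LDBMS receives one query message and returns one response message, and that a primary key query is routed to exactly one LDBMS; the function name is hypothetical.

```python
def messages_per_query(num_ldbms, uses_primary_key):
    """Rough message count for one selection query.

    Assumes one query message and one response message per contacted
    LDBMS; real systems may batch or split messages differently.
    """
    if num_ldbms < 1:
        raise ValueError("at least one LDBMS is required")
    # A primary key query goes to one LDBMS; a secondary key query goes to all.
    contacted = 1 if uses_primary_key else num_ldbms
    return 2 * contacted


for n in (4, 16, 64):
    print(n, messages_per_query(n, True), messages_per_query(n, False))
```

Under these assumptions the primary key query stays at two messages regardless of the number of LDBMSs, while the secondary key query requires two messages per LDBMS.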
Furthermore, a conventional parallel database processing system uses proprietary LDBMSs and proprietary interfaces between the GDBMS and the LDBMSs. No parallel database system configured by standard LDBMSs is yet known, particularly one configured by different types of LDBMSs interconnected by a network.