1. Field of the Invention
This invention relates to a join processing system and method which joins relations based on join fields in a relational database.
2. Description of the Related Art
FIG. 22 and FIG. 23 are block diagrams showing a related art join processing system in a relational database, which is described in Published Japanese Patent Application Hei 3-156571, for example. In FIG. 22 and FIG. 23, a master processor 1000 processes information and controls slave processors 30a-30c. Slave disk drives 40a-40c are connected to the slave processors 30a-30c, and relations 50a-50c and 60a-60c are divided in record units and stored respectively in the slave disk drives 40a-40c. Join fields F1 of the relations 50a-50c and 60a-60c are shown as 70a-70c and 80a-80c, respectively. Other join fields F2 of the relations 50a-50c are shown as 9a-9c and still other join fields F3 of the relations 60a-60c are shown as 10a-10c. The record addresses of the relations 50a-50c and 60a-60c are shown as 11a-11c and 12a-12c, respectively. The first digit of each record address is a number which identifies the slave processor. Address tables 13a-13c and 14a-14c store the join fields F1 (70a-70c and 80a-80c) and the record addresses 11a-11c and 12a-12c which are retrieved from the relations 50a-50c and 60a-60c. The address tables 13a-13c and 14a-14c are sorted according to the values in the join fields F1 (70a-70c and 80a-80c). Each of the address tables 13a-13c and 14a-14c is transferred from the slave processors 30a-30c to the master processor 1000. The master processor 1000 merge-sorts the contents of the transferred address tables 13a-13c and 14a-14c in accordance with the values in the join fields F1 (70a-70c and 80a-80c), and generates the address tables 15 and 16, respectively. To generate address tables 17a-17c and 18a-18c, the join fields F1 (70a-70c and 80a-80c) in the address tables 15 and 16 are compared with each other to find the record addresses of records which satisfy a join condition. The record addresses found are sorted in accordance with the numbers of the slave processors in the record addresses, and the address tables 17a-17c and 18a-18c are generated.
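The construction and merge-sorting of the address tables described above can be illustrated with a minimal sketch. It is not taken from the cited application: the toy records, the three-digit addresses, and the function names `slave_of`, `build_address_table`, and `merge_address_tables` are assumptions made for illustration only.

```python
import heapq

# Hypothetical toy data for one relation, partitioned over three slaves.
# Each partition maps a record address to a record. By assumption, the
# first digit of an address is the number of the slave that stores it.
relation_r = {
    1: {101: {"F1": 7, "F2": "a"}, 102: {"F1": 3, "F2": "b"}},
    2: {201: {"F1": 5, "F2": "c"}, 202: {"F1": 7, "F2": "d"}},
    3: {301: {"F1": 1, "F2": "e"}},
}


def slave_of(address):
    """Return the slave number encoded in the first digit of an address."""
    return int(str(address)[0])


def build_address_table(partition):
    """Slave side: collect (F1 value, record address) pairs, sorted by F1."""
    return sorted((record["F1"], address) for address, record in partition.items())


def merge_address_tables(tables):
    """Master side: merge the already-sorted tables into one sorted table."""
    return list(heapq.merge(*tables))


tables = [build_address_table(partition) for partition in relation_r.values()]
merged = merge_address_tables(tables)
print(merged)
```

Because each slave sorts its own table first, the master only has to merge sorted runs rather than sort from scratch, which is the division of work the related art relies on.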
The address tables 17a-17c and 18a-18c are transferred from the master processor 1000 to the slave processors 30a-30c. The slave processors 30a-30c search the relations 50a-50c and 60a-60c in the slave disk drives 40a-40c with reference to the record addresses 11a-11c and 12a-12c in the address tables 17a-17c and 18a-18c. Sets of records 19a-19c and 20a-20c, which satisfy the join condition, are read from the relations 50a-50c and 60a-60c. The sets of records 19a-19c and 20a-20c in the slave processors 30a-30c are transferred to the master processor 1000 and merged into sets of records 21 and 22. A set of records 23 is then obtained by join processing.
The join processing system and method of the related art stores the relations of the relational database across the slave disk drives 40a-40c in record units. The join fields of the joining relations are retrieved from each of the slave disk drives 40a-40c and are then sorted in each of the slave processors. Both the retrieval and the sorting are performed in parallel in the slave processors 30a-30c.
Therefore, in this related art system, each of the slave processors 30a-30c reads the join fields from the slave disk drives 40a-40c in parallel for each of the joining relations in the relational database, and creates address tables which consist of the join fields and the record addresses, the record addresses including the numbers of the slave processors 30a-30c. The slave processors sort the contents of the address tables in accordance with the values in the join fields and transfer the address tables to the master processor 1000. The master processor 1000 selects the record addresses of the records which satisfy the join condition based on the address tables transferred from the slave processors 30a-30c, and transfers the result back to the slave processors 30a-30c. For each of the relations which are joined, each of the slave processors 30a-30c reads the necessary fields of the joining records from the slave disk drives 40a-40c, according to the record addresses transferred from the master processor 1000. Each of the slave processors 30a-30c transfers sets of records, which consist of the join fields and the necessary fields, to the master processor 1000. The master processor 1000 performs join processing based on the sets of records transferred from the slave processors 30a-30c.
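The selection, transfer back, and final join summarized above might look roughly like the following sketch. It assumes an equality join on F1 and reuses the convention that the first digit of a record address names the slave. The data, the storage layout, and all function names are hypothetical and do not come from the cited application.

```python
from collections import defaultdict

# Hypothetical storage: slave number -> {record address: record}.
storage_r = {
    1: {101: {"F1": 7, "F2": "a"}, 102: {"F1": 3, "F2": "b"}},
    2: {201: {"F1": 5, "F2": "c"}},
}
storage_s = {
    1: {111: {"F1": 7, "F3": "x"}},
    2: {211: {"F1": 5, "F3": "y"}, 212: {"F1": 9, "F3": "z"}},
}


def address_table(storage):
    """Merged (F1 value, address) table as the master would hold it."""
    return sorted((rec["F1"], addr) for part in storage.values()
                  for addr, rec in part.items())


def select_matching(table_r, table_s):
    """Master: keep entries whose F1 value occurs in both tables."""
    keys = {k for k, _ in table_r} & {k for k, _ in table_s}
    return ([e for e in table_r if e[0] in keys],
            [e for e in table_s if e[0] in keys])


def group_by_slave(entries):
    """Master: regroup selected addresses by the slave that owns them."""
    groups = defaultdict(list)
    for _, addr in entries:
        groups[int(str(addr)[0])].append(addr)
    return dict(groups)


def fetch(storage, groups):
    """Slaves: read only the records named by the returned addresses."""
    return [storage[s][a] for s in sorted(groups) for a in groups[s]]


def join(records_r, records_s):
    """Master: join the fetched record sets on F1."""
    by_key = defaultdict(list)
    for rec in records_s:
        by_key[rec["F1"]].append(rec)
    return [{**r, **s} for r in records_r for s in by_key.get(r["F1"], [])]


keep_r, keep_s = select_matching(address_table(storage_r), address_table(storage_s))
result = join(fetch(storage_r, group_by_slave(keep_r)),
              fetch(storage_s, group_by_slave(keep_s)))
print(result)
```

Note that even in this sketch every matching step runs on the master: the slaves only sort and fetch.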
In a relational database which operates in a multiprocessor join processing system of the related art, a master processor first requests the data of a first table from a slave processor and then requests the data of a second table. The master processor checks the data against a join key, discards unnecessary data, and joins the remaining data. In this system and method, the data transferred for the second table includes unnecessary data from the outset, thus resulting in a low processing efficiency.
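The inefficiency described above can be made concrete with a small sketch in which the master receives every row of both tables and filters afterwards. The tables, the key name `id`, and the function `master_side_join` are assumptions chosen for illustration.

```python
def master_side_join(first_table, second_table, key):
    """Master receives all rows of both tables, then filters and joins.

    Returns the joined rows and the number of second-table rows that were
    transferred to the master only to be discarded.
    """
    wanted = {row[key] for row in first_table}
    kept = [row for row in second_table if row[key] in wanted]
    discarded = len(second_table) - len(kept)
    joined = [{**a, **b} for a in first_table for b in kept if a[key] == b[key]]
    return joined, discarded


# Hypothetical tables held by a slave processor.
first = [{"id": 1, "name": "p"}, {"id": 2, "name": "q"}]
second = [{"id": 2, "qty": 10}, {"id": 3, "qty": 20}, {"id": 4, "qty": 30}]

joined, discarded = master_side_join(first, second, "id")
print(joined, discarded)
```

In this toy case two of the three second-table rows cross to the master and contribute nothing to the result.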
Also, in the relational database which operates in the multiprocessor system, the master processor initially requests the slave processors to retrieve the join keys and record addresses of the two joining tables and transfer them to the master processor, and the master processor checks the record addresses using the join keys. The master processor then requests the slave processors to retrieve the necessary records, and joins the retrieved records. In this related art system, the join keys and record addresses of all the records are sent to the master processor. Therefore, the master processor has to check all the records against the join keys.
Additionally, the master processor retrieves the table data through each of the slave processors, checks the table data against the join key, and joins the data. Therefore, the advantage offered by the parallel operations in the slave processors is limited by the centralized data processing of the master processor.