1. Field of the Invention
The present invention relates to a data processing system, and more particularly to a method of storing a data base and to file access processing for use in a relational data base management system.
2. Description of the Prior Art
To aid understanding of the invention, the conventional art of relational data base management is first reviewed. In a relational data base management system, a data base takes the form of a collection of tables, as typically shown in FIG. 6. An individual table is commonly called a relation 11, each item (column) of a table is called an attribute 18, and each record (row) actually holding data under the attributes is called a tuple 17.
Referring now to FIG. 7, there is shown a general scheme of a relational data base management system in which a plurality of data processing units 28a-28d are connected through a network 29 to a plurality of disc storage units 2g-2j. One relation is stored in fragments across the plurality of disc storage units 2g-2j so that the data processing units 28a-28d may access, in parallel, any data of the relation stored in these disc storage units. Partitioning a single relation in this way, tuple by tuple, is called horizontal partitioning.
With horizontal partitioning, when the relation to be stored has no clustered index, the relation can in practice be stored so that each disc storage unit holds a generally equal number of tuples, by placing each tuple to be stored into the disc storage unit that currently holds the smallest number of tuples.
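By way of illustration only, the placement rule described above may be sketched as follows. The function name place_tuples, the use of a heap to find the least-loaded disc storage unit, and the tie-breaking by lowest unit number are assumptions made for this sketch and are not taken from the conventional system itself.

```python
import heapq


def place_tuples(tuples, num_discs):
    """Assign each tuple to the disc storage unit currently holding the fewest tuples."""
    # Heap entries are (tuple_count, disc_id); the smallest count is popped first,
    # and ties are broken by the lower disc_id (an assumption of this sketch).
    heap = [(0, disc_id) for disc_id in range(num_discs)]
    heapq.heapify(heap)
    discs = [[] for _ in range(num_discs)]
    for t in tuples:
        count, disc_id = heapq.heappop(heap)
        discs[disc_id].append(t)
        heapq.heappush(heap, (count + 1, disc_id))
    return discs


# Ten hypothetical tuples spread over four disc storage units.
discs = place_tuples(range(10), 4)
counts = [len(d) for d in discs]
print(counts)
```

Under this rule the tuple counts of any two disc storage units never differ by more than one, which is the even distribution relied upon in the description.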
With a relation partitioned evenly in this manner, the data processing units 28a-28d can read data from the respective disc storage units 2g-2j in generally equal time. Consequently, it does not occur during a data handling operation that a certain data processing unit is still reading data while the others have already finished, and the speed of data processing is increased accordingly.
This is the typical construction of conventional horizontal partitioning in a relational data base management system, and it is essentially applicable only to a relation having no clustered index. It cannot be applied to the partitioning of a relation having a clustered index, and consequently there is a problem that high speed access to certain tuples in a relation, which would take advantage of the clustered index, cannot be attained in practice.
Next, reference is made to FIG. 8, which schematically shows a typical conventional data processing unit. In this figure, there are shown an electronic computer or main frame designated by the reference numeral 15, a disc storage unit 16 operatively connected to the main frame, a relation 11 contained in a data base held in the disc storage unit 16, and a clustered index 12 attached to the relation 11. This clustered index 12 is, for instance, of the type disclosed in J. D. Ullman's "Principles of Database Systems", paragraph 2.4, published by Computer Science Press Inc. (Japanese translation: "Database system no genri", translated by Toshiyasu Kunii (phonetic), published by Nippon Computer Kyokai; p. 71, line 15 through p. 79, line 17). While this literature makes no particular reference to the term "clustered index", what it describes as a "B-tree" evidently corresponds to the clustered index.
In general, the tuples in a relation 11 are sorted by the key value of the clustered index 12 and stored in that order in the disc storage unit 16. FIG. 9 shows an example in which a clustered index is given to an attribute (key) having integer values ranging from 1 to 1000. The relation 11 is sorted by key value, partitioned into eight pages 13a-13h, and stored in the disc storage unit 16. A pointer 14a holds the number of the page in which a pointer 14b is stored, and the pointer 14b holds the page numbers of the pages 13a-13h of the relation 11. With this arrangement, when a specific key value of a tuple is specified, the number of the page in which that tuple is stored can be known immediately by referring to the pointers 14a and 14b.
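The lookup through the pointers 14a and 14b may be pictured with the following minimal sketch. The page boundaries in FIRST_KEYS, the names INDEX_PAGES, POINTER_14A and find_page, and the use of a binary search are all assumptions made for illustration; the only boundary taken from this description is that key values 99 to 190 lie in the page 13b.

```python
from bisect import bisect_right

# Hypothetical lowest key value held on each data page of the relation.
# Chosen so that key values 99 to 190 fall on page 13b, as in the FIG. 9 example.
PAGE_IDS = ["13a", "13b", "13c", "13d", "13e", "13f", "13g", "13h"]
FIRST_KEYS = [1, 99, 191, 320, 450, 580, 710, 850]

# Pointer 14b: an index page listing (lowest key, data page) entries.
# Pointer 14a: names the index page in which pointer 14b is stored.
INDEX_PAGES = {"index_page_0": list(zip(FIRST_KEYS, PAGE_IDS))}
POINTER_14A = "index_page_0"


def find_page(key):
    """Return the data page holding the tuple with the given key value."""
    if not 1 <= key <= 1000:
        raise KeyError(key)
    entries = INDEX_PAGES[POINTER_14A]           # follow pointer 14a to pointer 14b
    first_keys = [first for first, _ in entries]
    slot = bisect_right(first_keys, key) - 1     # last entry whose lowest key <= key
    return entries[slot][1]


print(find_page(150))
```

Only the index entries are consulted to locate the page; no data page of the relation needs to be read until its number is known.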
In the relational data base reviewed above, all data is managed in the form of tables. Such a table is referred to as a relation, the data contained in each row of a relation is called a tuple, and each item (column) of a relation is called an attribute. In the example shown in FIG. 6, there are shown one relation designated by the reference numeral 11, tuples 17a-17j, and attributes 18a-18d. In a relational data base, processing is commonly directed to a group of tuples defined by a range specified for a certain attribute or combination of attributes (hereinafter referred to as "keys"). For instance, in the relation 11 shown in FIG. 6, such processing would be to obtain the average of the attribute 18d "Ages" over the tuples having the value "General Section" under the attribute 18b "Name of Section". In this example, the attribute 18b "Name of Section" is the key.
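The kind of processing described in this example may be sketched as below. The attribute names "Name of Section" and "Ages" are taken from the description; the tuple values, the section name "Sales Section", the omission of the other attributes, and the function name average_age are hypothetical and serve only to illustrate the selection of tuples by key.

```python
from statistics import mean

# Hypothetical tuples of the relation; only two of the attributes are modelled.
relation = [
    {"Name of Section": "General Section", "Ages": 30},
    {"Name of Section": "Sales Section", "Ages": 25},
    {"Name of Section": "General Section", "Ages": 40},
    {"Name of Section": "Sales Section", "Ages": 35},
    {"Name of Section": "General Section", "Ages": 50},
]


def average_age(tuples, section):
    """Average the "Ages" attribute over tuples whose key equals the given section."""
    ages = [t["Ages"] for t in tuples if t["Name of Section"] == section]
    return mean(ages) if ages else None


result = average_age(relation, "General Section")
print(result)
```

Without an index on the key, every tuple of the relation has to be examined to perform this selection, as the list comprehension in the sketch does.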
In the conventional data processing unit typically shown in FIG. 8, when processing is performed on a cluster of tuples defined by a range of key values of the clustered index as noted above, the main frame 15 first refers to the pointers 14a, 14b of the clustered index 12, determines in which page the relevant cluster of tuples is stored, and reads the tuples page by page out of the disc storage unit 16 for processing. Since the cluster of tuples defined by a range of key values of the clustered index is gathered page by page and stored in a physically restricted area of the disc storage unit 16, it suffices to read only the relevant pages from the disc storage unit 16, which makes processing substantially quicker than in the case having no clustered index. For instance, in FIG. 9, since the cluster of tuples with key values in the range from 99 to 190 exists in the page 13b alone, it is enough to read only the page 13b from the disc storage unit 16, resulting in quicker processing.
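The saving obtained for a range of key values may be illustrated by the following sketch, which determines the pages to be read for a given range. The page boundaries in FIRST_KEYS and the function name pages_for_range are assumptions; they are chosen only so that the range from 99 to 190 falls in the page 13b alone, as in the FIG. 9 example.

```python
from bisect import bisect_right

# Hypothetical lowest key value on each of the eight pages of FIG. 9.
PAGE_IDS = ["13a", "13b", "13c", "13d", "13e", "13f", "13g", "13h"]
FIRST_KEYS = [1, 99, 191, 320, 450, 580, 710, 850]


def pages_for_range(low, high):
    """Return the pages that must be read to cover key values low..high inclusive."""
    low, high = max(low, 1), min(high, 1000)    # key values are assumed to lie in 1..1000
    if low > high:
        return []
    first = bisect_right(FIRST_KEYS, low) - 1   # page containing the lowest key of the range
    last = bisect_right(FIRST_KEYS, high) - 1   # page containing the highest key of the range
    return PAGE_IDS[first:last + 1]


pages = pages_for_range(99, 190)
print(pages, "read", len(pages), "of", len(PAGE_IDS), "pages")
```

Because the tuples are stored in key order, the pages covering any range are contiguous, so the number of pages read grows with the width of the range rather than with the size of the relation.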
While high speed processing may be attained by adopting the clustered index in the conventional data processing unit, demands on data base management have recently increased and demands for quicker data processing have grown extensively, and it is difficult to make data processing still quicker with conventional data processing managed by a single main frame alone.
In consideration of such drawbacks particular to the conventional construction of a relational data base management system, it is desired to attain an efficient solution for overcoming these problems.
The present invention is essentially directed to the provision of a proper solution to the inconveniences and difficulties in practice outlined above.