(1) Field of the Invention
The present invention generally relates to a file storage management system for a database that employs clustering, and more particularly to a file storage management system in which data items are arranged in a data storage file so as to be close to each other in accordance with the relevance among a plurality of relations, each of which is a set of data items in a database.
In an engineering application field such as CAD (Computer Aided Design), high-speed responses are required while a large amount of data is being processed and computed. A relational database management system (RDBMS) designed for business processing, in which simple numerical computation is performed on data such as employee information and character strings are handled, is therefore not suited for the engineering application field.
(2) Description of the Related Art
In a conventional relational database management system (RDBMS), relations, each of which is a set of data items, are managed so as to be independent of each other. In the engineering application field, on the other hand, many data items depend on one another as components. Since overhead occurs when relations coupled to each other by such dependent relationships are loaded into a main storage, the RDBMS is not suited for the engineering application field.
An object-oriented database system (OODBMS) has recently been proposed for the engineering application field. In the OODBMS, data items relevant to each other are arranged so as to be physically close to each other, and as many of them as possible are loaded into the main storage at once, so that the response of the database is improved. Examples include "O2", an OODBMS developed at Altair in France (Fernando Velez, Guy Bernard, Vineeta Darnis: The O2 Object Manager: An Overview, Proc. 15th VLDB Conf., 1989), and "WiSS", a storage management system developed at the University of Wisconsin in the U.S. (H-T. Chou, David J. DeWitt, Randy H. Katz, Anthony C. Klug: Design and Implementation of the Wisconsin Storage System, Software Practice and Experience, Vol. 15 (10), pp. 943-962, 1985). In these systems, the above relationships among data items are realized as follows.
(1) Clustering regarding records is used, in which records having data items relevant to each other are arranged in the same page as far as possible.
(2) The chaining of data items in the same relation is formed by using address linkage.
A description will now be given, with reference to FIG. 1, of the concept of the clustering.
In the example shown in FIG. 1, each of the data items used in a CAD system for circuit design belongs to one of three kinds of relations: a "symbol" representing components in an electrical circuit, a "pin" representing pins of each of the components, and a "net" representing connection of signal lines to the pins of each of the components.
The "pin" depends on the "symbol", so that a relationship of "pin is a part of symbol" exists. When a symbol (an owner) is made, a set of pins (members) dependent thereon is also made, and when the symbol (the owner) is withdrawn from the file, the set of pins (members) dependent thereon is also withdrawn. That is, the "symbol" and the "pin" are statically and closely related to each other. On the other hand, although a plurality of pins may depend on a net in a given connecting state of signal lines and pins, the dependent relationship between the "pin" and the "net" changes whenever the connecting state of the signal lines and pins is changed. Thus, data items belonging to the "symbol" and the "pin" should be arranged so as to be close to each other in the file, but it is not preferable that data items belonging to the "pin" and the "net" be arranged so as to be close to each other.
In the example in which the database is formed in a secondary storage device (e.g. a magnetic disk device), as shown in the bottom portion of FIG. 1, data items of "net 1, net 2, . . . " belonging to the "net" are stored in a page (n) different from those in which data items belonging to the "symbol" and the "pin" are stored. As to the "symbol" and the "pin", the clustering regarding the record is applied to the file so that data items of "symbol 1 and symbol 2" and the sets of pins dependent on the symbols 1 and 2 are stored in a page (7). However, pins dependent on data items of "symbol 3 and symbol 4" are stored in a different page if the page (7) does not have enough remaining area to store them. In this example, the pins dependent on the data items of "symbol 3 and symbol 4", other data items "symbol 5, . . . " and pins dependent on the other data items are stored in a page (8) separated from the page (n).
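The record-level placement described above can be illustrated with a minimal sketch. The function name `cluster_records`, the fixed page capacity counted in records, and the pin names are assumptions made only for illustration; they are not taken from FIG. 1 or from the cited systems. The sketch writes each owner followed by its members into the current page and spills to a new page when the current page is full.

```python
def cluster_records(owners, page_capacity):
    """Place each owner record and its member records in consecutive slots.

    owners: list of (owner_name, [member_names]); hypothetical input shape.
    page_capacity: assumed number of records that fit in one page.
    Returns a list of pages, each page being a list of record names.
    """
    pages = [[]]
    for owner, members in owners:
        for record in [owner] + list(members):
            # Spill to a new page when the current page has no room left.
            if len(pages[-1]) == page_capacity:
                pages.append([])
            pages[-1].append(record)
    return pages


# Illustrative data only: three symbols with two pins each.
owners = [
    ("symbol 1", ["pin 1a", "pin 1b"]),
    ("symbol 2", ["pin 2a", "pin 2b"]),
    ("symbol 3", ["pin 3a", "pin 3b"]),
]
pages = cluster_records(owners, page_capacity=7)
```

With these assumed sizes, "symbol 3" fills the last slot of the first page and its pins land in the second page, which corresponds to the spill-over case described for the page (7).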
However, in the above clustering regarding the record, there is no guarantee that the pages in which data items belonging to the "symbol" and the "pin" are stored are located close to each other, or that a page in which data items belonging to the "net" are stored is separated from those pages. That is, the clustering regarding the page is not guaranteed.
In addition, in each of relations such as the "symbol", "pin" and "net", the chaining of data items is represented by the "address-linkage" using record addresses or page addresses. Although there is no order relationship among the relations, the "address-linkage" is used. If the order relationship is needed among the relations, the address-linkage is generally represented by a secondary index such as a B-tree.
In a case where data items belonging to each relation, or belonging to a plurality of relations relevant to each other, are loaded into the main storage at once, a navigation of the address-linkage is performed. In this case, there is no guarantee that addresses in the address-linkage are arranged in the physical order of the pages. The address-linkage represents the chaining of data items in only a single relation, so that the navigation of the address-linkage must be repeated a number of times corresponding to the number of relations in a case where data items belonging to a plurality of relations are loaded into the main storage.
However, it takes a long time to perform the physical operation required for an access to a single record in the secondary storage unit (e.g. the magnetic disk unit), which holds many records in which data items belonging to the relations relevant to each other are stored. Thus, it is desirable that the number of accesses to the secondary storage unit (e.g. the magnetic disk unit) be as small as possible. For this reason, areas in which data items are to be stored must be arranged so as to be as close to each other as possible, or so as to be separated from each other, in accordance with the owner-member relationship/inclusion relationship of classes.
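The navigation described above can be sketched as follows. The representation of a record address as a (page number, slot) pair, the function name `walk_chain`, and the example chains are assumptions made for illustration only. Each relation has its own chain, so the chain is walked once per relation, and the pages are visited in chain order rather than in physical order.

```python
from typing import Dict, List, Optional, Tuple

# Assumed address form: (page number, slot within the page).
Address = Tuple[int, int]


def walk_chain(next_of: Dict[Address, Optional[Address]],
               head: Optional[Address]) -> List[int]:
    """Follow the address-linkage from head and return the pages visited, in order."""
    pages_visited = []
    addr = head
    while addr is not None:
        pages_visited.append(addr[0])
        addr = next_of[addr]
    return pages_visited


# Hypothetical chains: one address-linkage per relation.
symbol_chain = {(7, 0): (9, 0), (9, 0): (8, 0), (8, 0): None}
pin_chain = {(8, 1): (7, 1), (7, 1): (9, 1), (9, 1): None}

# One navigation per relation; two relations require two walks.
symbol_pages = walk_chain(symbol_chain, (7, 0))
pin_pages = walk_chain(pin_chain, (8, 1))
page_visits = symbol_pages + pin_pages
```

In this assumed data the same three pages are each visited twice, and neither walk follows ascending page order.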
In light of the above, the conventional system of "clustering regarding the record" has the following disadvantages.
(1) Pages in which data items belonging to relations relevant to each other are stored are not always arranged so as to be close to each other.
(2) When data items belonging to relations are loaded into the main storage at once, the data items cannot be read out from the secondary storage unit in the order in which they are physically arranged in the secondary storage unit. Thus, the number of seek operations in the secondary storage unit is increased.
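The effect of disadvantage (2) can be illustrated with a rough cost model. The model below, in which the cost is the total distance travelled between consecutively read page numbers, is a simplifying assumption and not a description of any particular disk unit; the function name `head_travel` and the page numbers are likewise hypothetical.

```python
def head_travel(page_order):
    """Sum of distances between consecutively read pages (assumed cost model)."""
    return sum(abs(b - a) for a, b in zip(page_order, page_order[1:]))


# Hypothetical order in which an address-linkage visits pages.
chain_order = [7, 12, 8, 13, 9]
# The same pages read in the order in which they are physically arranged.
physical_order = sorted(chain_order)

chain_cost = head_travel(chain_order)        # 5 + 4 + 5 + 4 = 18
physical_cost = head_travel(physical_order)  # 1 + 1 + 3 + 1 = 6
```

Under this assumed model, reading the same five pages in chain order costs three times as much movement as reading them in physical order.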