1. Field of the Invention
The invention relates to an object-oriented database management system that performs join operations in order to efficiently process a query whose conditional expression requires tracing a pointer reference.
2. Description of the Related Art
An object-oriented database is a database that manages objects, each of which combines data and the procedures operating on that data into one unit. Each object has an identifier that uniquely identifies it, called an object identifier. Objects are grouped into categories whose members share common attributes and operation procedures, and such a category is called a class. In the database, a subset of, or the entire set of, the objects belonging to a single class is stored and managed as a meaningful set. Processing that selects a particular set of objects from the sets of objects stored in the database is called query processing. A request for query processing designates the objects to be selected by means of a conditional expression.
Meanwhile, object-oriented database management systems have been developed in which a pointer, that is, the object identifier of another object, can be written into a query conditional expression. Since an object identifier is generally produced from the address of the disk area in which the identified object is located, it also indicates the physical page on which that object resides.
For example, in the case that Hotel A has a reservation instance for each reservation, a pointer (object identifier of guest instance) to a guest instance is set in the reservation instance and a pointer to the guest name is set in the guest instance, a set of names of guests who have made a reservation at Hotel A can be described by using a mark "→" which indicates the pointer reference, for instance, as follows:
Hotel·set of reservations→guests→names
And, in the case that Company B has an employee instance for each employee and a pointer to an employee name is set in the employee instance, a set of names of the employees of the company can be described by using the pointer reference, for instance, as follows:
Company·set of employees→names
Consequently, a query conditional expression "to retrieve an employee of Company B who has made a reservation at Hotel A" can be expressed as follows:
Hotel·set of reservations→guests→names == Company·set of employees→names
When a query whose conditional expression contains pointer references as described above is processed, each time the name of a guest who has made a reservation at Hotel A is retrieved by tracing the pointers, it is checked whether that name matches the name of an employee of Company B. However, this check requires accessing all employee names of Company B for each guest name of Hotel A, so when the query is made against large-scale object sets, processing efficiency is quite poor.
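The cost of this pointer-tracing check can be illustrated with a minimal sketch. The class names (`Guest`, `Reservation`, `Employee`), the function `naive_query`, and the sample names are hypothetical and chosen only for illustration. For simplicity the name is held as a plain attribute rather than behind a further pointer.

```python
from dataclasses import dataclass


@dataclass
class Guest:
    name: str


@dataclass
class Reservation:
    guest: Guest  # stands in for the pointer (object identifier) to a guest instance


@dataclass
class Employee:
    name: str


def naive_query(reservations, employees):
    """For each guest name reached by tracing pointers, scan every employee."""
    matches = []
    comparisons = 0
    for reservation in reservations:
        guest_name = reservation.guest.name  # trace reservation -> guest -> name
        for employee in employees:           # all employee names are accessed
            comparisons += 1
            if employee.name == guest_name:
                matches.append(employee)
    return matches, comparisons


reservations = [Reservation(Guest(n)) for n in ("Sato", "Suzuki", "Tanaka")]
employees = [Employee(n) for n in ("Tanaka", "Ito", "Sato", "Kato")]
matches, comparisons = naive_query(reservations, employees)
print([e.name for e in matches], comparisons)
```

With 3 reservations and 4 employees the sketch performs 3 × 4 = 12 comparisons, and the count grows as the product of the two set sizes.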
In view of the above circumstances, the invention employs a join operation, which concatenates matching elements of two sets, to efficiently process a query that includes a pointer reference in a conditional expression. Further, exploiting the characteristic of the object-oriented database that a conditional expression can contain a pointer reference, each set that forms input information for the join operation is made up not of the join key values themselves but of pointers to the join keys, thereby reducing the memory area required to hold the input information during the join operation.
The join operation is also used in relational databases, and the following methods are known.
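The idea of joining sets of pointers rather than sets of key values can be sketched as follows. This is only an illustration under stated assumptions, not the data structure of the invention: the dictionary `store`, the integer object identifiers, and the functions `deref` and `pointer_join` are all hypothetical. The join inputs and the hash buckets hold only object identifiers, and a key value is fetched by dereferencing at the moment it is compared.

```python
# Hypothetical object store: object identifier (OID) -> stored name value.
store = {
    101: "Sato", 102: "Suzuki", 103: "Tanaka",  # guest names
    201: "Tanaka", 202: "Ito", 203: "Sato",     # employee names
}


def deref(oid):
    """Follow a pointer (OID) to the key value it identifies."""
    return store[oid]


def pointer_join(left_oids, right_oids, n_buckets=8):
    """Join two sets of pointers on the values they point to.

    The buckets hold OIDs only; key values are never copied into them.
    """
    buckets = [[] for _ in range(n_buckets)]
    for oid in right_oids:
        buckets[hash(deref(oid)) % n_buckets].append(oid)

    pairs = []
    for left in left_oids:
        key = deref(left)
        for right in buckets[hash(key) % n_buckets]:
            if deref(right) == key:  # compare by dereferencing the pointer
                pairs.append((left, right))
    return pairs


print(pointer_join([101, 102, 103], [201, 202, 203]))
```

The result is a list of OID pairs, so the names themselves are read only when a comparison or final output needs them.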
(A) nested loop method
(B) sort-merge method
(C) hashing method
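The three methods listed above can be sketched in simplified, in-memory form as follows. The function names and sample data are hypothetical, and the sketch assumes that keys are unique within each input. Real systems operate on disk pages rather than Python lists.

```python
def nested_loop_join(left, right):
    """(A) Compare every element of left with every element of right."""
    return [l for l in left for r in right if l == r]


def sort_merge_join(left, right):
    """(B) Sort both inputs, then advance two cursors in step."""
    left, right = sorted(left), sorted(right)
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1
        elif left[i] > right[j]:
            j += 1
        else:  # keys match; unique keys assumed, so advance both cursors
            out.append(left[i])
            i += 1
            j += 1
    return out


def hash_join(left, right):
    """(C) Build a hash table on one input, then probe it with the other."""
    table = set(right)                      # build phase
    return [l for l in left if l in table]  # probe phase


guests = ["Sato", "Suzuki", "Tanaka"]
employees = ["Tanaka", "Ito", "Sato", "Kato"]
print(nested_loop_join(guests, employees))
print(sort_merge_join(guests, employees))
print(hash_join(guests, employees))
```

All three return the same matches for this input. They differ in how many key comparisons are needed to find them.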
Among the above methods, the hashing method requires the fewest data comparisons during the join operation. A paper entitled "Hashing Method based on Dynamic Processing Bucket Selection Method and Evaluation of Its Performance" (Information Processing Society of Japan, Collection of Papers Vol. 30, No. 8, pp. 1024-1032) introduces three classical hashing methods, namely a simple GRACE method, a multi-split GRACE method, and a hybrid hashing method. It also introduces, as an improved method, a dynamic processing bucket selection type join operation processing method.
This dynamic processing bucket selection type join operation processing method adopts an algorithm that combines the multi-split GRACE method, which splits the data into a large number of buckets in advance, with the hybrid hashing method, which partially overlaps the split phase processing and the join phase processing to reduce the I/O cost. With this combination, a bucket exceeding the main memory size is prevented from being produced even when the data distribution varies. Whereas the hybrid hashing method statically determines the bucket on which the overlap processing is performed, the improved method selects that bucket dynamically during the split phase, keeping processing efficiency high regardless of the data distribution of each bucket. Furthermore, the increase in I/O cost for fragment pages, which arises when a large number of split buckets is provided in advance or a multi-split is made, is resolved by performing integration processing prior to the join phase processing.
However, such conventional join operation technology using a hash function does not dynamically allocate the pages used by each bucket, and therefore has the disadvantage that main memory is used inefficiently.
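The split phase and join phase that these hashing methods share can be sketched in a simplified GRACE-style form. This is not the dynamic processing bucket selection method of the cited paper: the sketch has no overlap processing, no dynamic bucket selection, and no integration processing. In-memory lists stand in for buckets that a real system would write to disk, and the function name `grace_hash_join` and the parameter `n_buckets` are hypothetical.

```python
def grace_hash_join(left, right, n_buckets=4):
    """Simplified GRACE-style hash join with separate split and join phases."""
    # Split phase: partition both inputs with the same hash function, so that
    # matching keys always land in buckets with the same index.
    left_buckets = [[] for _ in range(n_buckets)]
    right_buckets = [[] for _ in range(n_buckets)]
    for key in left:
        left_buckets[hash(key) % n_buckets].append(key)
    for key in right:
        right_buckets[hash(key) % n_buckets].append(key)

    # Join phase: join each pair of corresponding buckets independently.
    # Only one bucket pair has to be held in main memory at a time.
    matches = []
    for left_bucket, right_bucket in zip(left_buckets, right_buckets):
        table = set(right_bucket)
        matches.extend(key for key in left_bucket if key in table)
    return matches


guests = ["Sato", "Suzuki", "Tanaka"]
employees = ["Tanaka", "Ito", "Sato", "Kato"]
print(sorted(grace_hash_join(guests, employees)))
```

The order of the matches depends on which bucket each key hashes to, so the usage line sorts the result before printing. If the data distribution is uneven, one bucket may receive far more keys than the others, which is the situation the multi-split and dynamic selection techniques are meant to address.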