This invention relates to data processing in a system having a data structure in which a plurality of objects are correlated by pointers, and in which a plurality of processes that refer to, update, insert, and delete the objects operate in parallel.
1. Related Art 1
A data management system that provides quick access to data by means of an index, to which the present invention relates, provides the following four functions as the basic methods of accessing the data.
(1) reference function: reference to the data correlated to a specified key value
(2) insertion function: insertion of a specified key value and insertion of the specified data correlated to the specified key value
(3) deletion function: deletion of a specified key value and the data correlated to the specified key value
(4) update function: update of the data correlated to a specified key value to the specified data
A plurality of these functions may be requested for processing in parallel. The processing methods for realizing the respective functions are called reference processing, insertion processing, deletion processing, and update processing.
Conventional techniques used for realizing these functions will be described with reference to FIG. 1 and FIG. 2. Numeral 101 in FIG. 1 shows an exemplary data structure: an object group T comprising three objects A, B, and C. An object is the unit of data, and regions are secured and released in object units. Furthermore, one data item is stored in one object.
The object B stores the key value 10 and the data correlated to the key value 10. Similarly, the object C stores the key value 20 and the data correlated to the key value 20. The object A stores the key held by the object B (namely the key value 10), the storage position information of the object B (referred to as pointer hereinafter), and the key held by the object C (namely the key value 20) and the pointer of the object C.
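The structure 101 and the four functions can be pictured with a minimal single-process sketch. This is an illustration only, not the implementation of the related art: the class and method names are hypothetical, a Python reference stands in for the pointer, the data strings are made up, and no locking is performed.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DataObject:
    """An object such as B or C: one key value and the data correlated to it."""
    key: int
    data: Any


@dataclass
class OriginObject:
    """The origin object A: key values paired with pointers (Python references)."""
    entries: Dict[int, DataObject] = field(default_factory=dict)

    def reference(self, key: int) -> Optional[Any]:
        obj = self.entries.get(key)
        return None if obj is None else obj.data

    def insert(self, key: int, data: Any) -> None:
        if key in self.entries:
            raise KeyError(f"key {key} already exists")
        # Secure a region for the new object, then set the key and the pointer in A.
        self.entries[key] = DataObject(key, data)

    def delete(self, key: int) -> None:
        # Removal of the pointer; the object is then unreachable from A.
        del self.entries[key]

    def update(self, key: int, data: Any) -> None:
        self.entries[key].data = data


def build_group_t() -> OriginObject:
    """State 101 of FIG. 1: A points to B (key 10) and C (key 20)."""
    a = OriginObject()
    a.insert(10, "data-10")  # hypothetical data values
    a.insert(20, "data-20")
    return a
```

Inserting the key value 30 and deleting the key value 20 on the result of `build_group_t()` leaves the key values 10 and 30, which corresponds to 104 of FIG. 1.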
A method for realizing the above-mentioned four functions will be described hereinunder. Locks must be set and released in order to operate a plurality of the functions in parallel. A method for realizing the lock is described in, for example, “Gray, J. Reuter, A. TRANSACTION PROCESSING: CONCEPTS AND TECHNIQUES, Morgan Kaufmann Publishers, Inc., 1993, p 449-484”. Herein, the S, X, IS, and IX lock modes described in the above-mentioned literature are used. An S-mode lock can be held in parallel with S-mode and IS-mode locks; an X-mode lock cannot be held in parallel with a lock of any mode; an IS-mode lock can be held in parallel with S-mode, IS-mode, and IX-mode locks; and an IX-mode lock can be held in parallel with IS-mode and IX-mode locks. The locks are set hierarchically. In detail, the IS-mode lock is set to the object group T in the case of the reference processing, and the IX-mode lock is set to the object group T in the case of the insertion, deletion, and update processing. Thereafter, the S-mode lock is set to an object that is accessed for reference, and the X-mode lock is set to an object that is, or may be, updated. In other words, the lock is first set to the object group, and thereafter the lock is set to a specified object.
A resource name that is known in common to all the processing is assigned for use when the lock is set to the object group. Furthermore, when the lock is set to an object, a resource name corresponding to the storage position of the object (position information of the region where the object is stored, for example, the pointer value) is assigned. In other words, it is possible to set the lock to an object if its storage position is known. All the processing knows the storage position of the object located at the origin of the object group (object A in FIG. 1), and this storage position does not move.
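The compatibility rules and the hierarchical lock strategy can be written down as a small table. The following is a sketch under assumptions: the names `COMPATIBLE`, `can_grant`, and `lock_plan` are hypothetical, and the compatibility of two S-mode locks follows the usual scheme of the cited literature.

```python
# For each requested mode, the set of modes that other processing may already hold.
COMPATIBLE = {
    "IS": {"IS", "IX", "S"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "X": set(),
}

# Lock mode set to the object group T, per kind of processing.
GROUP_MODE = {"reference": "IS", "insertion": "IX", "deletion": "IX", "update": "IX"}
# Lock mode set afterwards to each object that is accessed.
OBJECT_MODE = {"reference": "S", "insertion": "X", "deletion": "X", "update": "X"}


def can_grant(requested, held):
    """Return True if `requested` is compatible with every mode in `held`."""
    return all(mode in COMPATIBLE[requested] for mode in held)


def lock_plan(processing):
    """Hierarchical order: first the object group, then the specified object."""
    return [("T", GROUP_MODE[processing]), ("object", OBJECT_MODE[processing])]
```

For example, `can_grant("IX", ["IX"])` is true, which is why an insertion and a deletion can both hold their locks on the object group T at the same time, while `can_grant("X", ["IS"])` is false.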
The time charts of the respective processing are shown in FIG. 2. Time on these charts elapses from left to right. A line segment having black circles on both ends represents the period during which the lock is set to the object shown on the left side of the line segment; the lock mode is shown above the line segment and the processing is shown under it. A line segment having white circles on both ends represents the period during which the object shown on the left side of the line segment is being accessed (without setting the lock), and the processing is shown under the line segment. FIG. 2A shows a time chart of reference processing, for an example in which the object having the key value of 10 is referred to. FIG. 2B shows a time chart of insertion processing, for an example in which the object having the key value of 30 is inserted into the group T. FIG. 2C shows a time chart of deletion processing, for an example in which the object having the key value of 20 is deleted. FIG. 2D shows a time chart of update processing, for an example in which the object having the key value of 10 is updated.
To illustrate parallel processing, an exemplary case in which insertion processing of the key value of 30 and deletion processing of the key value of 20 are operated simultaneously is described. In the insertion processing 202, the IX-mode lock is first set to the object group T. The object region for storing the new data is allocated (object D), and the key value 30 and the correlated data are set in the region. In the deletion processing 203, the IX-mode lock is similarly set to the object group T. Because the IX mode is compatible with the IX-mode lock held by the insertion processing, both locks are granted and neither processing must wait. Next, both the insertion processing 202 and the deletion processing 203 set the X-mode lock to the object A. Whichever processing sets the lock first proceeds, and the other processing must wait until the first processing releases the lock.
First, the case in which the deletion processing 203 successfully sets the lock prior to the insertion processing 202 will be described. The insertion processing 202 must wait until the deletion processing 203 releases the lock of the object A. The deletion processing 203 accesses the object A to acquire the pointer to the object C corresponding to the key value of 20, sets the X-mode lock to the object C, and deletes the key value of 20 and the pointer to the object C from the object A (referred to as removal of the pointer hereinafter). It then releases the lock of the object A (at this time point, the waiting of the insertion processing 202 ends), releases the object region that stores the object C, releases the lock of the object C, and finally releases the lock of the object group T. This final state is shown in 102 of FIG. 1.
When the deletion processing 203 releases the lock of the object A, the insertion processing 202 sets the lock of the object A successfully. The insertion processing 202 accesses the object A, sets the key value 30 and the storage position of the object D in the object A, releases the lock of the object A, and releases the lock set to the object group T. This state is shown in 104 of FIG. 1.
On the other hand, in the case that the insertion processing 202 successfully sets the lock prior to the deletion processing 203, the deletion processing 203 must wait until the insertion processing 202 releases the lock of the object A. The insertion processing 202 is carried out in the same manner as described hereinabove, and the state 103 shown in FIG. 1 is brought about. Thereafter, the deletion processing 203 is executed, and the state 104 shown in FIG. 1 is brought about.
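The two orderings described above can be exercised with threads. This is a rough sketch, not the procedure of the related art itself: `threading.Lock` stands in for the X-mode lock, the IX-mode lock on the object group T is omitted because the two IX-mode locks never block each other, and all names and data values are hypothetical.

```python
import threading


class Obj:
    """An object with its own lock; threading.Lock stands in for an X-mode lock."""

    def __init__(self, key=None, data=None):
        self.key, self.data = key, data
        self.entries = {}            # used only by the origin object A
        self.lock = threading.Lock()
        self.released = False        # True once the region has been released


def insertion(a, key, data):
    d = Obj(key, data)               # allocate object D before locking A
    with a.lock:                     # X-mode lock on A
        a.entries[key] = d           # set the key value and the pointer to D


def deletion(a, key):
    a.lock.acquire()                 # X-mode lock on A
    target = a.entries[key]          # acquire the pointer to the target object
    target.lock.acquire()            # lock the target while A is still locked
    del a.entries[key]               # removal of the pointer
    a.lock.release()                 # a waiting insertion can proceed from here
    target.released = True           # release the region of the target
    target.lock.release()


def run_in_parallel():
    a = Obj()
    a.entries = {10: Obj(10, "data-10"), 20: Obj(20, "data-20")}  # state 101
    c = a.entries[20]
    threads = [
        threading.Thread(target=insertion, args=(a, 30, "data-30")),
        threading.Thread(target=deletion, args=(a, 20)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(a.entries), c.released
```

Whichever thread locks the object A first, `run_in_parallel()` ends with the key values 10 and 30 in the object A and the region of the object C released, which corresponds to 104 of FIG. 1.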
The reference processing (key value of 10) is shown in 201 of FIG. 2A, and the update processing (key value of 10) is shown in 204 of FIG. 2D. In the reference processing 201 and the update processing 204, the lock of the object B is set before the lock of the object A is released, in order to prevent erroneous operation when deletion processing of the object B runs in parallel. In detail, if the lock of the object B were set after the lock of the object A is released, the deletion processing could release the region of the object B in the interval between the two operations. The released region could then be re-allocated to another use, and the data set in the region would be mistaken for the data of the object B, causing erroneous operation.
As a simpler alternative, the X-mode lock may be set to the object group T for insertion, deletion, and update processing, and the S-mode lock may be set to the object group T for reference processing. With this method, however, parallel processing is possible only when reference processing is executed in parallel with other reference processing. Resources such as disks and processors are therefore used less effectively than in the above-mentioned example, and the throughput and response time are poor.
As described hereinbefore, the related art 1 is advantageous in that the region of a deleted object can be released, because an object is deleted while locks are set to two objects simultaneously (for example, the object B is locked while the object A is locked). However, the related art 1 is disadvantageous in that the parallel execution performance is somewhat poor.
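The hazard can be replayed deterministically in a single thread. The sketch below fixes one interleaving of a reader and a deleter and compares the two lock orders; the storage positions, the `locked` set, and the function name are assumptions made purely for illustration.

```python
def read_key_10(couple_locks):
    """Replay one fixed interleaving of a reader and a deleter of key 10.

    Returns the data that the reader finally sees.
    """
    regions = {0: ("B", "data-10")}  # storage position 0 holds object B
    a_entries = {10: 0}              # object A: key 10 -> pointer to position 0
    locked = set()                   # resources that the reader currently locks

    # Reader: lock A and fetch the pointer for key 10.
    locked.add("A")
    ptr = a_entries[10]
    if couple_locks:
        locked.add(ptr)              # lock B before releasing A
    locked.discard("A")              # release A

    # Deleter tries to run now: it needs A and then the object at `ptr`.
    if "A" not in locked and ptr not in locked:
        del a_entries[10]                            # removal of the pointer
        regions[ptr] = ("other", "unrelated data")   # region released and re-allocated
    # Otherwise the deleter must wait until the reader releases B.

    locked.add(ptr)                  # reader locks B by storage position (no-op if held)
    value = regions[ptr][1]
    locked.discard(ptr)
    return value
```

With `couple_locks=True` the reader sees the data of the object B. With `couple_locks=False` the same interleaving lets the region be re-allocated first, and the reader sees data that does not belong to the object B.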
2. Related Art 2
The B-tree index in a DB management system has a data structure that extends that of the related art 1. 701 in FIG. 7 shows an exemplary three-level B-tree index comprising seven pages (a page is equivalent to an object in the related art 1). The pages P4, P5, P6, and P7 located at the bottom of 701 are called leaf pages. Each leaf page stores one or more pairs of a key (the values that appear at the bottom of a page, for example, the key values 50 and 70 on the page P6) and the storage position information of the corresponding data; the pointer to the page on its right side (except for the rightmost page P7); and the in-page Max key value (the value that appears at the upper right of the leaf page), which is the maximum key value of the data stored in the page (for example, the key value 20 is the in-page Max key value of the page P4). A key range to be stored is set for each leaf page, and it is given by the in-page Max key value of the page itself and that of the page positioned on its left side. For example, the page P5 stores keys that are larger than 20, the in-page Max key value of the page P4 on its left side, and smaller than or equal to 40, the in-page Max key value of the page P5. Similarly, the page P6 stores keys larger than 40 and smaller than or equal to 70, and the page P7 stores keys larger than 70. The page P4, which has no page on its left side, stores keys smaller than or equal to its in-page Max key value (smaller than or equal to 20).
Pages called upper pages, or nodes (the pages P1, P2, and P3 of 701 in FIG. 7), are arranged above the leaf pages. An upper page stores one or more pairs of a key value and a pointer to a page located immediately under it, and the pointer to the page located on its right side (if any). Herein, the key value is identical with the in-page Max key value of the page pointed to by the paired pointer. In particular, the page P1 is called the root page. To find the leaf page in which a certain key value is stored, the search starts from the root page and, in each upper page, follows the pointer that is paired with the minimum key value that is not smaller than the target key value. Regions are secured in page units.
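The key ranges of the leaf pages follow directly from the in-page Max key values. A minimal sketch, assuming that the rightmost page P7 has no upper bound (represented here by infinity, which the text does not state) and using hypothetical names:

```python
import math
from bisect import bisect_left

LEAF_NAMES = ["P4", "P5", "P6", "P7"]
# In-page Max key values from left to right; infinity for P7 is an assumption.
LEAF_MAX = [20, 40, 70, math.inf]


def key_range(index):
    """Return (exclusive lower bound, inclusive upper bound) of a leaf page."""
    low = -math.inf if index == 0 else LEAF_MAX[index - 1]
    return low, LEAF_MAX[index]


def leaf_for(key):
    """Return the leaf page whose key range contains `key`."""
    return LEAF_NAMES[bisect_left(LEAF_MAX, key)]
```

For instance, `key_range(1)` gives `(20, 40)` for the page P5, and `leaf_for(20)` gives the page P4 because the upper bound of each range is inclusive.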
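The descent from the root page can be sketched as follows. The table mirrors 701 of FIG. 7 as far as the text describes it; the key values paired with the pointers to P3 and P7 are not given, so infinity is assumed for them, and the names `UPPER` and `find_leaf` are hypothetical. The comparison is "not smaller than", so that the key value 20 leads to the page P4 as in the deletion example; locking and right-side pointers are left out.

```python
import math

# Upper pages: (key value, page located immediately under), ascending by key value.
# The key values math.inf for P3 and P7 are assumptions.
UPPER = {
    "P1": [(40, "P2"), (math.inf, "P3")],
    "P2": [(20, "P4"), (40, "P5")],
    "P3": [(70, "P6"), (math.inf, "P7")],
}


def find_leaf(target, root="P1"):
    """Return the pages visited from the root page down to the leaf page."""
    path = [root]
    page = root
    while page in UPPER:                 # leaf pages are not in UPPER
        for key, child in UPPER[page]:
            if key >= target:            # minimum key value not smaller than target
                page = child
                break
        else:
            raise KeyError(f"no entry covers key {target} in page {page}")
        path.append(page)
    return path
```

`find_leaf(20)` visits P1, P2, and P4, and `find_leaf(60)` visits P1, P3, and P6, which are the access paths used in the deletion and insertion examples.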
The deletion processing in which the key values in the range from 20 to 40 are deleted will be described with reference to 801 in FIG. 8A. First, the IX-mode lock is set to the whole B-tree index T and the S-mode lock is set to the root page P1. The pointer to the page P2 is acquired; it is paired with the key value 40, which is the minimum key value in the page P1 that is not smaller than the key value 20, the lower limit of the specified range. The lock of the page P1 is then released. The S-mode lock is set to the page P2, the pointer to the page P4, which is paired with the key value 20 (the minimum key value in the page P2 that is not smaller than the key value 20), is acquired, and the lock of the page P2 is released. The X-mode lock is set to the page P4, and all the keys in the page P4 that fall in the target range from 20 to 40 are deleted.
At that time, because only the key value 20 exists in the page P4, no key remains in the page; nevertheless, the region of the page P4 is not released. The main reason is that releasing the page would require resetting the pointers to the page while suitable locks are held, and this resetting degrades the parallel execution performance.
Because the in-page Max key value of the page P4 is 20, it is known that the keys of the target range from 20 to 40 that are larger than 20 are stored in the pages on the right side. Therefore the pointer to the right page (the pointer to the page P5) is acquired, and the lock of the page P4 is released. The X-mode lock is set to the page P5, and all the keys in the page P5 that fall in the target range from 20 to 40 are deleted. At that time, because only the key value 40 exists in the page P5, no key remains in the page, but the region of the page P5 is not released. Because the in-page Max key value of the page P5 is 40, it is known that all the keys in the range from 20 to 40 have been deleted. The lock of the page P5 is released, and finally the lock of the whole B-tree index T is released to complete the deletion processing. The state in which the deletion processing has been completed is shown in 702 in FIG. 7.
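The walk along the leaf pages can be sketched as below. It is a simplified illustration: `threading.Lock` stands in for the X-mode page lock, the locks on the index T and on the upper pages are omitted, and the class and function names are hypothetical. Only one page is locked at any time, and emptied pages stay allocated and linked.

```python
import threading


class Leaf:
    """A leaf page: keys, in-page Max key value, and pointer to the right page."""

    def __init__(self, name, keys, max_key):
        self.name, self.keys, self.max_key = name, list(keys), max_key
        self.right = None
        self.lock = threading.Lock()   # stands in for the X-mode page lock


def build_leaves():
    """Leaf pages P4 to P6 of 701, with the keys mentioned in the text."""
    p4, p5, p6 = Leaf("P4", [20], 20), Leaf("P5", [40], 40), Leaf("P6", [50, 70], 70)
    p4.right, p5.right = p5, p6
    return p4, p5, p6


def delete_range(start, low, high):
    """Delete keys in [low, high], moving right one locked page at a time."""
    visited = []
    page = start
    while page is not None:
        with page.lock:                # the lock is released before moving right
            page.keys = [k for k in page.keys if not (low <= k <= high)]
            visited.append(page.name)
            # Continue only if larger keys of the range may be in the right page.
            nxt = page.right if page.max_key < high else None
        page = nxt                     # the emptied page is not released
    return visited
```

Calling `delete_range(p4, 20, 40)` visits the pages P4 and P5 and leaves both empty but still linked, which corresponds to 702 in FIG. 7.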
Furthermore, the procedure for carrying out the insertion processing (key value of 60) starting from the state of 702 in FIG. 7 will be described with reference to 802 shown in FIG. 8. The IX-mode lock is set to the whole B-tree index T, the pages P1 and P3 are accessed in the same manner as described hereinabove, the X-mode lock is set to the page P6, and the page P6 is accessed. It is assumed that the data of the key values 50 and 70 has already been stored in the page P6, and that the page P6 does not have sufficient space for storing the data of the key value 60. In this case, the page division processing called splitting is carried out. First, a page (assumed to be the page P8) is secured additionally. Herein, the split key value is assumed to be 50. In detail, the data of the key value 50 remains in the page P6, and the data of the key value 70 and the data of the key value 60 to be inserted are transferred to the page region that has been secured additionally. The required settings are made in the page P8, the in-page Max key value of the page P6 is reset to 50, and the pointer to the right page in the page P6 is reset to point to the page P8. Thereafter, the lock of the page P6 is released, the X-mode lock is set to the page P3 to reset the pointer, the lock of the page P3 is released, and finally the lock of the whole B-tree index T is released to complete the insertion processing. The state in which the insertion processing has been completed is shown in 703 in FIG. 7.
As described hereinbefore, to maintain the high parallel execution performance of the B-tree index, the region of a page is not released even when no data remains in the page as a result of deletion. Accordingly, in general, the storage efficiency deteriorates as insertion and deletion of data are repeated, and the access performance deteriorates with it. To suppress this deterioration, a method called rearrangement has generally been employed, in which access is inhibited at a suitable time, the data stored in the index is temporarily written out to another region, and the data is re-packed.
As described hereinabove, two pages are not locked simultaneously in the related art 2. Therefore, the parallel execution performance is high. On the other hand, because a released page could be referred to by mistake, as described for the related art 1, the region of a page cannot be released even when its data has been deleted.
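The splitting step can be sketched as follows. This is an illustration under assumptions: a page capacity of two keys, a parent page represented as a list of (key value, page) pairs, infinity as the key value for the page P7, and hypothetical names throughout. Locking is omitted; in the procedure above the page P6 and the page P3 are locked one after the other, never together.

```python
import math

CAPACITY = 2  # assumed number of keys that fit in one leaf page


class Page:
    def __init__(self, name, keys, max_key, right=None):
        self.name, self.keys = name, list(keys)
        self.max_key, self.right = max_key, right


def insert_with_split(leaf, parent, key, new_name, split_key):
    """Insert `key` into `leaf`; split at `split_key` if the page is full.

    `parent` is a list of (key value, page) pairs in ascending order.
    Returns the newly secured page, or None if no split was needed.
    """
    all_keys = sorted(leaf.keys + [key])
    if len(leaf.keys) < CAPACITY:
        leaf.keys = all_keys
        return None
    # Secure a new page that takes over the upper part of the key range.
    new_page = Page(new_name, [k for k in all_keys if k > split_key],
                    leaf.max_key, leaf.right)
    leaf.keys = [k for k in all_keys if k <= split_key]
    leaf.max_key = split_key           # reset the in-page Max key value
    leaf.right = new_page              # reset the pointer to the right page
    # Reset the parent: new key value for the old page, new pair for the new page.
    idx = next(i for i, (_, child) in enumerate(parent) if child is leaf)
    parent[idx] = (split_key, leaf)
    parent.insert(idx + 1, (new_page.max_key, new_page))
    return new_page
```

Starting from a page P6 with the keys 50 and 70, inserting 60 with the split key value 50 leaves 50 in P6 and places 60 and 70 in the new page P8, which corresponds to 703 in FIG. 7.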