In a conventional database, some techniques as presented below are used in multiple execution of transactions (such as reference operation or update operation) in order to guarantee integrity and consistency of data while ensuring simultaneous executability.
(1) Technique for control reference and update operations by locking database resources such as rows
(2) Technique for managing the row in multiple versions and enabling reference to data before update during the update
(3) Technique for managing the row in two or more versions and retiring unnecessary records
In the technique (1), data consistency is ensured by applying share lock to a target resource in a reference operations and exclusive lock to a target resource in an update operation. For a share-locked resource, only a share lock is permitted and other operations are not permitted. For an exclusive-locked resource, any type of lock operation is locked. In this manner, presently referred and/or updated resources can be prevented from being updated, thereby ensuring the data consistency. However, since the reference operation and the update operation are serialized (lock suspension), there arises an object of improved concurrency (parallelism). In addition, as a result of locking of database resources, lock suspension may occur, which may make it difficult to design lock mechanisms in operation construction.
In the technique (2), multiple versions may be generated for each record. In other words, when a record is updated, a new version of the record is generated. In a reference operation, a referable version of the record is referred to. Accordingly, different versions of a target resource are applied in the reference operation and the update operation. As a result, lock suspension can be suppressed in the reference operation and the update operation. However, since it is not identified when an earlier version of data should be discarded, the earlier version of data cannot be discarded, which may lead to an increase in data areas.
In the technique (3), a record is managed in two versions plus something extra in order to enable a reference operation and an update operation to be performed concurrently, and a reference counter is attached to each of the versions of the record. Specifically, the reference counter for a target physical record is incremented by one at access to the physical record after start of a reference operation whereas the reference counter is decremented by one at completion of the reference operation. Then, it is determined which earlier version of the record (earlier record) has not been referred to and can be deleted based on the reference counter, that is, which earlier version of the record has the reference counter equal to 0, and the determined version of the record is deleted to suppress the increase in data areas. In a fixed number of versions, however, if a number of versions more than the fixed number of versions are generated, the earliest version would be deleted. In this case, there arises a risk that no operation with the earliest version can be performed. Thus, if there is a referred earlier record, that is, an earlier record with the reference counter equal to one or more, at completion of an update transaction, a fake deletion called “delayed deletion” is performed. In the delayed deletion, a flag called a deletion bit attached to a physical record is set to indicate that a fake deletion has been performed. The delayed deletion may disable the associated earlier record to be accessed except for reference operations for referring to that earlier record. After all the reference operations are completed, that is, after the reference counter reaches 0, the record subjected to the delayed deletion is deleted. See Japanese Laid-open Patent Publication No. 2007-501468 and an online document “Postgresql Documentation Manuals PostgreSQL 7.1 Multi-VersionConcurrency Control”, searched on Apr. 25, 2008, over the URL “http://www.postgresql.org/docs/7.1/static/mvcc.html”.
In the technique (3), however, a reference counter is attached to a physical record and updated for each reference. As a result, whenever the reference counter is updated, an I/O (Input/Output) operation may occur for rewriting in a database, which may reduce the speed of the reference operation.
In addition, if a DBMS (DataBase Management System) fails, the reference counter and/or others written in physical records must be modified at recovery, for example, resetting of the reference counter for all records to zero. If such modification is not made, the reference counter is not reset to zero after recovery and thus unnecessary records may remain. As a result, some input/output (I/O) operations to/from CPU, memories, secondary storages and/or others may occur.
Furthermore, in the technique (3), once an earlier referable version of record becomes unnecessary after completion of updating, the unnecessary version of record is retired and cleaned up through reference operations by resetting the reference counter to zero. As a result, the reference operations is made slower.
Still furthermore, in the technique (3) in addition to the normal two versions, a version is used for disabling access for operations other than a presently conducted reference operation in delayed deletion. For this reason, when capacity of database resources is designed, capacity for the delayed deletion must be taken into account. For example, if a certain presently referred record is updated twice, capacity for three versions of record, that is, the earliest version of record in delayed deletion, a referable version of record and a presently updated version of record, must be reserved during the second update. Since it is difficult to comprehend in advance when and/or how much the delayed deletion will be conducted, it may be hard to estimate the capacity of database resources.