1. Field of the Invention
The invention relates generally to methods and apparatus for managing access to shared data in a data processing system. More particularly, the invention relates to methods and apparatus which allow multiple accessors of shared data to read and modify such data in a fast and orderly manner while assuring the integrity of the shared data.
2. Description of the Related Art
Many techniques are known for managing access to shared data in data processing systems. The general objects of such techniques (commonly referred to as data base management techniques) are to prevent shared data from being modified by two (or more) different accessors at the same time, and to prevent data from being read while it is being modified. These objects are typically achieved by using some sort of system synchronization function, commonly called "Lock" or "Seize".
The choice of the data base management technique used may depend on how the data is organized, what the system throughput requirements are, what the projected activity is with respect to shared item accesses, etc. Conversely, these factors can be affected by the choice of data base management technique as well. For example, the granularity at which the aforementioned synchronization is performed can have a significant effect on system performance as measured in terms of system overhead and throughput.
In systems like the commercially available IBM S/38 and AS/400, synchronization is performed at the data base file level where a file is a (potentially) large number of discrete data entities called records. This is efficient in terms of synchronization overhead, since many records can be processed with one synchronization. However, if there are many accessors each wishing to operate on different records within the same file, each is serialized, and throughput suffers. This contention can lead to severe performance problems, particularly when operations on very active objects must be serialized.
Other known commercially available systems, like DB2 and SQL/DS, which run on the IBM System/370, synchronize at either a page or record level. This reduces (compared with file level synchronization techniques) contention among multiple accessors that wish to operate on separate areas of the file; however, additional overhead is introduced, since each page or record must be synchronized individually in order to process many records sequentially. In most systems, the processing overhead required to acquire and release a lock can be very significant and hence, synchronizing at the page or record level may be impractical or undesirable.
The aforementioned synchronization function itself may be carried out in different ways, each of which has its own impact on system performance. For example, synchronization may be accomplished through the use of a lock manager that uses "shared" and "exclusive" locks to protect shared data.
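The granularity trade-off described above can be illustrated with a minimal sketch. The function name `lock_acquisitions`, its parameters, and the figures used in the example are hypothetical and chosen only for illustration; the sketch simply counts how many synchronizations a sequential pass over every record would need at each granularity.

```python
import math


def lock_acquisitions(num_records, records_per_page, granularity):
    """Count the synchronizations needed to process every record once.

    Hypothetical helper for illustration only; it models no real system.
    """
    if granularity == "file":
        # One synchronization covers the whole file.
        return 1
    if granularity == "page":
        # One synchronization per page touched by the scan.
        return math.ceil(num_records / records_per_page)
    if granularity == "record":
        # One synchronization per record.
        return num_records
    raise ValueError("granularity must be 'file', 'page', or 'record'")


# Illustrative figures only: 10,000 records stored 50 to a page.
for level in ("file", "page", "record"):
    print(level, lock_acquisitions(10_000, 50, level))
```

The coarser the granularity, the fewer synchronizations a sequential scan needs, but the more accessors are serialized behind a single lock.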
It is well known to protect data from being changed during a read operation with a "read" (shared) lock that excludes writes, but which allows shared access to (i.e., allow other reads of) data protected by the lock. Such a lock would assure that no modification to protected data is possible while the read lock is in effect.
"Read/write" (exclusive) locks typically exclude all read and write access to the portion of storage managed by the lock, other than read or write accesses initiated by the lock holder.
Shared and exclusive locking techniques employed at the record level are particularly costly from an overhead point of view. As the size of the data base goes up, significant amounts of time can be expended locking and unlocking individual data records, as indicated hereinabove with reference to DB2 and SQL/DS. Degradation of system performance (due to increased CPU utilization) is likely to occur in certain instances as a result of utilizing such techniques.
Many articles and inventions have been directed to improving data base management techniques for computer systems generally, and particularly those that organize data in the form of discrete records. For example, in the IBM Technical Disclosure Bulletins Volume 25, No. 7B, published in December of 1982, pp. 3725-3729, and Volume 25, No. 11A, published in April of 1983, pp. 5460-5463, techniques for locking portions of tree form indices are described, where the record oriented trees provide keyed access to a data base.
U.S. Pat. Nos. 4,480,304 and 4,399,504, in the context of a data base system distributed across a loosely coupled multiple processor architecture, describe lock managers that use hash tables to maintain the lock status of discrete shared resources. In particular, the hash tables are used as part of a shared/exclusive access system which records (in the hash tables) which locks are held by the various potential accessors.
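The shared and exclusive lock semantics described above can be sketched as follows. This is a minimal, single-threaded illustration under stated assumptions: the class name `SharedExclusiveLock` and its methods are hypothetical, acquisition attempts return immediately rather than waiting, and no real lock manager is being modeled.

```python
class SharedExclusiveLock:
    """Hypothetical non-blocking model of one shared/exclusive lock."""

    def __init__(self):
        self.readers = set()   # accessors currently holding the lock shared
        self.writer = None     # accessor currently holding the lock exclusive

    def try_acquire_shared(self, who):
        # A shared (read) lock is refused only while another accessor
        # holds the lock exclusively.
        if self.writer is not None and self.writer != who:
            return False
        self.readers.add(who)
        return True

    def try_acquire_exclusive(self, who):
        # An exclusive (read/write) lock is refused while any other
        # accessor holds the lock in either mode.
        if self.writer is not None and self.writer != who:
            return False
        if self.readers - {who}:
            return False
        self.writer = who
        return True

    def release(self, who):
        # Drop whatever this accessor holds on the lock.
        self.readers.discard(who)
        if self.writer == who:
            self.writer = None


lock = SharedExclusiveLock()
print(lock.try_acquire_shared("A"))      # two readers may share the lock
print(lock.try_acquire_shared("B"))
print(lock.try_acquire_exclusive("C"))   # refused while readers remain
```

Every read in this model pays for an acquire and a release, which is the per-access overhead that becomes costly when the lock protects an individual record.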
The systems described in the U.S. Pat. Nos. 4,480,304 and 4,399,504 references require the expenditure of lock manager overhead for every access of shared data (again, very expensive when management is performed at the record level), and the further expenditure of lock manager overhead in the event of hash table "collisions" (two sets of lock information mapping into a single hash table entry). In particular, data chaining off a hash table entry is used whenever a hash collision occurs to preserve data necessary in implementing the lock management schemes taught by the references.
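The collision handling described above can be sketched as a hash table whose entries each head a chain of lock records. This is an illustrative assumption about how such chaining generally works, not a reproduction of the schemes taught by the cited references; the class name `ChainedLockTable`, its methods, and the use of integer resource identifiers are hypothetical.

```python
class ChainedLockTable:
    """Hypothetical lock-status table using chaining to resolve collisions."""

    def __init__(self, num_buckets=8):
        # Each hash table entry holds a chain (list) of lock records.
        self.buckets = [[] for _ in range(num_buckets)]

    def _chain(self, resource_id):
        # Integer resource ids are assumed so the mapping is easy to follow.
        return self.buckets[resource_id % len(self.buckets)]

    def record(self, resource_id, holder, mode):
        # Colliding resources simply extend the same chain.
        self._chain(resource_id).append((resource_id, holder, mode))

    def holders(self, resource_id):
        # The whole chain must be walked to find the matching records.
        return [(h, m) for (r, h, m) in self._chain(resource_id)
                if r == resource_id]

    def chain_length(self, resource_id):
        return len(self._chain(resource_id))


table = ChainedLockTable(num_buckets=4)
table.record(1, "A", "shared")
table.record(5, "B", "exclusive")   # 5 % 4 == 1, so this collides with 1
print(table.holders(1), table.chain_length(1))
```

Each collision lengthens the chain that must be searched on later accesses, which is the additional lock manager overhead noted above.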
U.S. Pat. No. 4,604,694 describes a fast means of acquiring a shared lock and recording the shared lock acquisition. Once the lock is obtained, the shared resource can be examined without risk of interference from another exclusive lock holder. The techniques taught in the U.S. Pat. No. 4,604,694 patent require the acquisition of a shared lock to assure data integrity even though no exclusive seize necessarily occurs during the performance of the read. In instances where the shared lock is acquired and no exclusive seize intervenes, the overhead associated with acquiring the shared lock is, in effect, wasted.
Still other shared data management schemes are known that are not lock oriented. For example, U.S. Pat. No. 4,627,019 accomplishes data base access control by "versioning", i.e., by maintaining multiple copies of data rather than by locking. Such techniques can be problematic because of storage requirements, copy maintenance overhead, etc.
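The general idea of versioning, and the storage cost it carries, can be sketched as follows. This is a generic illustration only and makes no claim about the method of the cited patent; the class name `VersionedRecord` and its methods are hypothetical.

```python
class VersionedRecord:
    """Hypothetical record that keeps every version instead of locking."""

    def __init__(self, value):
        self.versions = [value]          # one stored copy per version

    def snapshot(self):
        # A reader notes the newest version number when it starts.
        return len(self.versions) - 1

    def read(self, version):
        # Reads against a snapshot are unaffected by later writes.
        return self.versions[version]

    def write(self, value):
        # A write never overwrites; it appends a new copy.
        self.versions.append(value)
        return len(self.versions) - 1


rec = VersionedRecord("balance=100")
seen = rec.snapshot()
rec.write("balance=80")
print(rec.read(seen), rec.read(rec.snapshot()), len(rec.versions))
```

The reader is never blocked, but every write adds a stored copy that must later be maintained or discarded, which is the storage and copy maintenance burden noted above.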
None of the known techniques for assuring the integrity of shared data, particularly none of the techniques that employ lock managers to oversee and implement shared and exclusive locking schemes, effectively deal with the aforementioned system overhead and throughput performance problems.
Accordingly, it would be desirable to provide methods and apparatus which improve the overall system performance of those systems that utilize shared and exclusive locks to manage access to shared data. In particular, it would be desirable to be able to provide methods and apparatus which only cause seizes to be initiated in certain predefined instances for reads of shared data and which, in other instances, are able to assure the integrity of a shared read without having to incur the significant overhead typically associated with a seize.
Furthermore, it would be desirable to be able to provide methods and apparatus, for use in systems having shared data resources, which reduce contention in those systems that traditionally seize or lock at the file or object level.
Still further, it would be desirable to be able to reduce the processing overhead (and thereby improve the throughput of) those systems that lock or seize at the page or record level.
Further yet, it would be desirable to be able to provide methods and apparatus which can be useful in improving system performance in situations where shared and exclusive access modes must be provided for a large number of objects on an individual basis. This is particularly desirable in situations where a great deal more read (shared) access is expected to be performed than write (exclusive) access.