(1) Field of the Invention
The present invention relates to the field of computer processing. More specifically, the present invention relates to a method for preventing resource conflicts in a multiprocessing computer system.
(2) Art Background
In distributed data processing systems, data objects such as database tables, indexes and other data structures are often shared by multiple processes. If a data object is accessed by two or more processes during the same time interval, problems may arise depending on the nature of the access. For example, if one process attempts to write a data object while another process is reading the data object, an inconsistent set of data may be obtained by the reading process. Similarly, if two processes try to write the same data object during the same time interval, data corruption may result. In both cases, the accessing processes are said to have a "resource conflict", the resource being the shared data object.
Resource Locking
Resource locking is one technique used to avoid resource conflicts in multiprocessed applications. Resource locking is a protocol in which processes signal their use of a resource by "acquiring a lock" on the resource from a lock manager. These resources may include data objects, processes, memory regions, registers, I/O devices, etc. Before using a resource, a requesting process checks to see if the corresponding lock is available. If so, the requesting process acquires the lock and uses the resource, releasing the lock when finished. If the lock is not available, the requesting process waits for the lock to be released by the prior acquiring process before using the resource.
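The acquire/use/release protocol described above can be sketched as follows. This is a minimal illustrative sketch, not the locking mechanism of any particular system; the `LockManager` class and its method names are assumptions introduced here for illustration.

```python
import threading

class LockManager:
    """Illustrative lock manager: a process acquires a named lock
    before using the corresponding resource and releases it after."""

    def __init__(self):
        self._locks = {}                      # resource name -> threading.Lock
        self._registry = threading.Lock()     # protects the lock table itself

    def _lock_for(self, resource):
        with self._registry:
            return self._locks.setdefault(resource, threading.Lock())

    def acquire(self, resource, timeout=None):
        # If the lock is held by another process, the caller waits
        # (blocks) until it is released or the timeout expires.
        return self._lock_for(resource).acquire(
            timeout=timeout if timeout is not None else -1)

    def release(self, resource):
        self._lock_for(resource).release()

# A requesting process acquires the lock, uses the resource,
# and releases the lock when finished.
manager = LockManager()
manager.acquire("table_T")
# ... operate on the shared resource ...
manager.release("table_T")
```

A second acquirer of `"table_T"` would wait at `acquire` until the first caller invokes `release`, which is the waiting behavior the protocol relies on.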
In most resource locking systems, multiple processes may concurrently lock a resource so long as their respective uses of the resource are compatible. Generally, a process indicates the nature of its intended use of a resource by acquiring a lock to the resource in a particular mode. For example, a process that needs to write to a resource would typically acquire the corresponding lock in an exclusive mode to indicate that no other process should have read or write access to the resource. Once a resource is locked by an exclusive lock, requests to obtain the lock for read or write purposes are denied. On the other hand, a process that merely needs to read the resource could acquire the lock in a non-exclusive mode to indicate that other processes may concurrently read the resource. Since the locking mode is indicative of the use of the resource, it is common to refer to locks themselves as being compatible or incompatible with each other. For example, exclusive locks are incompatible with any other resource access lock, while non-exclusive locks are generally compatible with other non-exclusive locks.
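The compatibility rules just described (exclusive locks incompatible with any other access lock, non-exclusive locks compatible with one another) can be expressed as a small table. The mode names `"X"` and `"S"` and the helper `can_grant` are illustrative assumptions, not terms from the invention.

```python
# Illustrative lock-mode compatibility table: exclusive ("X") conflicts
# with every other access lock, while non-exclusive/shared ("S") locks
# are compatible with one another.
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

def can_grant(requested_mode, held_modes):
    """A requested lock is granted only if its mode is compatible
    with every mode already held on the resource."""
    return all(COMPATIBLE[(requested_mode, held)] for held in held_modes)

# Two readers may share the resource; a writer is denied
# while any other lock is held.
print(can_grant("S", ["S", "S"]))  # True
print(can_grant("X", ["S"]))       # False
```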
One undesirable situation that can arise in a resource locking scheme is a deadlock. A deadlock occurs when a first process attempts to acquire a lock on a resource that is already incompatibly locked by a second process, and the second process likewise attempts to acquire a lock on a resource that is already incompatibly locked by the first process. Since neither process is able to release the lock sought by the other until acquiring the lock held by the other, neither process can proceed. Deadlocks are typically resolved by terminating one of the deadlocked processes to allow the other process to continue. As discussed below in the context of database management, depending on the amount of processing accomplished by a deadlocked process prior to its termination, considerable time and processing effort may be wasted as a result of the deadlock.
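The circular wait described above can be detected by building a "wait-for" graph, with an edge from each waiting process to the process holding the lock it waits on, and searching the graph for a cycle. This is a standard detection technique sketched here with illustrative names; the source does not specify that the invention uses this mechanism.

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph, where wait_for maps each
    process to the set of processes whose locks it is waiting on.
    A cycle means no process in it can ever proceed: a deadlock."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / in progress / done
    color = {p: WHITE for p in wait_for}

    def visit(p):
        color[p] = GRAY
        for q in wait_for.get(p, ()):
            if color.get(q, BLACK) == GRAY:       # back edge: cycle found
                return True
            if color.get(q, BLACK) == WHITE and visit(q):
                return True
        color[p] = BLACK
        return False

    return any(color[p] == WHITE and visit(p) for p in wait_for)

# P1 waits on a lock held by P2, and P2 waits on a lock held by P1.
print(has_deadlock({"P1": {"P2"}, "P2": {"P1"}}))  # True
print(has_deadlock({"P1": {"P2"}, "P2": set()}))   # False
```

Processes that hold locks but wait on nothing are treated as finished (`BLACK` by default), since they cannot be part of a circular wait.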
Resource Locking in a Database Management System
One application for resource locking is database management. In a database, data is typically stored in tables where it is organized into rows (entries) and columns (categories). Before a process operates on a particular database table or a particular set of entries in the database table, it is often desirable for the process to lock the table or a particular set of entries therein to prevent other processes from performing incompatible operations on the table. Then, after the process has completed operating on the database table, the lock is released.
One type of database table that presents particularly challenging locking issues is called a "partitioned" table. A partitioned table is a table that has been decomposed into smaller, more manageable pieces called partitions. In some cases a partitioned table includes entries that have been distributed among two or more computers of a computer network. When a request to perform an operation on a partitioned table is received, a process called a "coordinator process" is invoked to identify the different table partitions to be operated on and to initiate a number of worker processes to carry out the necessary operations. From a resource locking standpoint, the challenge is for the worker processes to lock different table partitions without deadlocking with one another or with other processes. The difficulty of meeting this challenge depends on the manner in which the coordinator process allocates work to the worker processes.
A coordinator process may allocate work (requests to operate on partitions) to worker processes according to one of two work allocation models: static and dynamic. In the static work allocation model, the coordinator process assigns each of the necessary partition operations to the worker processes before the workers begin processing. The advantage of static work allocation is that the coordinator process can determine at the outset the resources needed by each worker process to accomplish its respective work assignment. This enables the coordinator process to allocate work in such a way as to avoid deadlocks between worker processes.
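Static allocation as described above might be sketched as follows: every partition operation is assigned to a worker, here round-robin, before any worker begins processing. The function name and round-robin policy are illustrative assumptions, not details taken from the invention.

```python
def allocate_static(partitions, num_workers):
    """Static work allocation: all partition operations are assigned
    before processing begins. Because each worker's full set of needed
    resources is known at the outset, the coordinator can inspect the
    assignments for lock conflicts and avoid deadlocks up front."""
    assignments = [[] for _ in range(num_workers)]
    for i, partition in enumerate(partitions):
        assignments[i % num_workers].append(partition)
    return assignments

print(allocate_static(["p1", "p2", "p3", "p4", "p5"], 2))
# [['p1', 'p3', 'p5'], ['p2', 'p4']]
```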
A major disadvantage of static work allocation, however, is that optimum parallel processing by workers is often not achieved. This is because workers do not necessarily complete their assigned work at the same rate (due to a number of factors, including difference in processor speed, network connection delay, unequal work load, etc.), so that one worker process may finish significantly faster than other worker processes. Since all of the work has already been allocated by the coordinator process, however, the first-finished worker process will remain idle (or terminate) when it potentially could have been assigned work not yet completed by another worker process. Consequently, in the static work allocation model, parallel processing is reduced, increasing the time required to complete the requested database operation.
In the dynamic work allocation model, the coordinator process allocates less than all of the work to an initial set of worker processes. Then, when one worker process of the initial set finishes its assigned work, the coordinator process allocates more work to that worker. The advantage of dynamic work allocation is that a set of N worker processes is kept operating concurrently until the overall database operation is complete or nearly complete. This is in contrast to the static work allocation model, in which N worker processes may operate concurrently for only a small portion of the time required to complete the overall operation.
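Dynamic allocation can be sketched with a shared work queue: the coordinator enqueues partition operations, and each worker pulls the next one as soon as it finishes its current one. This is a simplified sketch using threads in one process; a real coordinator would hand out work to separate processes over interprocess channels, and the names here are assumptions.

```python
import queue
import threading

def run_dynamic(partitions, num_workers, operate):
    """Dynamic work allocation: workers pull the next partition from
    a shared queue as they finish, so all N workers stay busy until
    the overall operation is complete or nearly complete."""
    work = queue.Queue()
    for p in partitions:
        work.put(p)

    def worker():
        while True:
            try:
                p = work.get_nowait()   # ask for the next piece of work
            except queue.Empty:
                return                  # nothing left: worker finishes
            operate(p)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

# Two workers drain four partition operations between them.
done = []
run_dynamic(["p1", "p2", "p3", "p4"], 2, done.append)
print(sorted(done))  # ['p1', 'p2', 'p3', 'p4']
```

Note the trade-off the passage goes on to describe: because which worker ends up operating on which partition is decided only at run time, the coordinator cannot pre-check the workers' lock requests for conflicts.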
One disadvantage of dynamic work allocation is that, because it is uncertain at the outset what resources may need to be locked by any given worker process, deadlocks become much more likely. Worse, since the initially scheduled work will tend to be free from deadlock, any deadlocks which arise will likely occur after substantial work has already been performed. Since, in a database management system, terminating a deadlocked process usually requires all processing performed up to the termination point to be undone ("rolled back"), late-detected deadlocks often require substantial roll back before the overall operation can begin again. This can result in many hours of lost processing and considerable expense.
What is needed is a method for allowing work to be dynamically allocated to worker processes, but with early detection of deadlock conditions. This way, the efficiency obtained by parallel processing in worker processes could be maximized and any deadlock conditions could be detected before significant processing has taken place. This advantage and others are achieved by the present invention.