Data processing systems have long used locking mechanisms as a means for ensuring data integrity during, for instance, write and update operations.
These locking mechanisms typically include locks, which are used to associate state information with the name of a resource to be locked. The state information indicates accessing privileges (e.g., shared, exclusive), ownership, etc.
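As an illustration of how a lock associates state information with a resource name, the following minimal sketch models a lock as a record keyed by name. The names `Mode`, `LockState` and `acquire`, and the compatibility rule (shared requests are compatible only with other shared requests), are assumptions made for this example; they are not taken from any particular system.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Set


class Mode(Enum):
    SHARED = "shared"
    EXCLUSIVE = "exclusive"


@dataclass
class LockState:
    """State information associated with one resource name (hypothetical layout)."""
    mode: Mode                                      # accessing privilege
    owners: Set[str] = field(default_factory=set)   # ownership


def acquire(locks: Dict[str, LockState], name: str, user: str, mode: Mode) -> bool:
    """Try to obtain the lock on `name` for `user`; return True if granted."""
    state = locks.get(name)
    if state is None:
        # No state is recorded for this name yet, so the lock is free.
        locks[name] = LockState(mode, {user})
        return True
    if state.mode is Mode.SHARED and mode is Mode.SHARED:
        # Assumed rule: any number of users may share a lock.
        state.owners.add(user)
        return True
    # Any combination involving an exclusive request or holder is refused.
    return False
```

In this sketch, a refused request simply returns `False`; a real lock manager would typically queue the requester or notify it later.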
One example of a system using locks to ensure data integrity is a Parallel Sysplex™ configuration offered by International Business Machines Corporation. The Parallel Sysplex™ configuration conforms to the S/390® architecture described, for example, in the IBM publication Enterprise Systems Architecture/390 Principles of Operation, SA22-7201-04, June 1997, which is hereby incorporated herein by reference in its entirety.
The IBM Parallel Sysplex configuration includes two or more processors interconnected via a coupling facility to form what is known as a "sysplex" (system complex). The coupling facility contains storage accessible by the processors, performs operations requested by programs in the processors and enables the sharing of data by the processors coupled thereto. The data are stored in storage structures, such as cache structures and/or list structures. In one example, the list structures include locking structures, such as lock tables. The lock tables include the locks used to ensure data integrity during write and update operations.
When a user of a coupling facility lock structure (e.g., a lock table within a list structure) terminates or disconnects while holding locks in the coupling facility, a cleanup process is used to clean up the locks held by the terminated or disconnected user. This cleanup process is performed by all of the surviving users of the lock structure.
In the current process, each surviving user cleans up the lock table by issuing a coupling facility command that searches the lock table entries in ascending order until it finds an entry indicating a lock held by the disconnected or terminated user. When such an entry is found, the coupling facility returns the lock table entry number for which cleanup is needed and the contents of that lock table entry. This command may time out and need to be redriven multiple times. When the command times out, it returns the lock table entry number of the first lock table entry that has not yet been scanned. When a lock held by the terminated or disconnected user is found, a different coupling facility command is used to free that lock. This procedure continues until the entire lock table has been scanned and cleaned up by each and every surviving user. The process of cleaning up the locks held by the terminated or disconnected user can take a minute or more, since the lock table typically contains millions of locks.
While the cleanup process is being performed, the lock structure is in a quiesced state, such that lock requests against the structure from the surviving users cannot be processed and lock requesters are made to wait. The data protected by the locks are therefore unavailable to the surviving users for the duration of the cleanup. Many customer applications have response time goals for the transactions they process, and these goals may not be achievable because of the long duration of the cleanup process. From the customer's perspective, the failure of a single user causes a temporary case of sysplex-wide sympathy sickness, during which data are not available and response time goals are not met.
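The scan-and-free loop described above can be sketched as follows for a single surviving user. This is a simplified in-memory model, not the actual coupling facility interface: the functions `scan_for_user` and `free_lock`, the list-based lock table, and the `budget` parameter (which stands in for the command time-out) are all hypothetical names introduced for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for a lock table: each entry holds the id of the
# user owning the lock, or None if the entry is free.
LockTable = List[Optional[str]]


def scan_for_user(table: LockTable, start: int, failed_user: str,
                  budget: int) -> Tuple[str, int, Optional[str]]:
    """Model of the scan command: examine at most `budget` entries in
    ascending order from `start`. Returns (status, entry_number, contents)."""
    end = min(start + budget, len(table))
    for i in range(start, end):
        if table[i] == failed_user:
            # Entry needing cleanup: return its number and contents.
            return ("found", i, table[i])
    if end < len(table):
        # Time-out: report the first entry that has not yet been scanned.
        return ("timeout", end, None)
    return ("end", len(table), None)


def free_lock(table: LockTable, index: int) -> None:
    """Model of the separate command that frees one lock."""
    table[index] = None


def cleanup(table: LockTable, failed_user: str, budget: int = 4) -> int:
    """Scan the whole table, freeing each lock held by `failed_user`.
    Returns the number of commands issued (scans plus frees)."""
    commands = 0
    pos = 0
    while pos < len(table):
        status, idx, _contents = scan_for_user(table, pos, failed_user, budget)
        commands += 1
        if status == "found":
            free_lock(table, idx)
            commands += 1
            pos = idx + 1   # restart the scan with the next entry
        else:
            pos = idx       # redrive after a time-out, or stop at the end
    return commands
```

In the described process every surviving user runs such a loop over the entire table, so the scanning work is repeated once per survivor.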
The above-described cleanup process may result in a significant amount of redundant scanning, because every surviving user scans the lock table in order to guarantee that at least one user completes the cleanup. Furthermore, its one-entry-at-a-time approach to cleanup processing may result in a huge number of coupling facility commands being performed. This is due to the fact that every time a matching lock table entry is returned, two coupling facility commands are performed: one to free the lock that was found to be held by the terminated or disconnected user, and another to restart the scan process starting with the next entry. Thus, the overhead of a large number of coupling facility commands significantly slows the overall cleanup process.
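A rough sense of this overhead can be obtained from a back-of-the-envelope estimate such as the one sketched below. The function `estimate_cleanup_cost` and its simplifying assumptions are illustrative only: it ignores command time-outs (each of which would add a further scan command) and assumes that a single user encounters every lock held by the failed user.

```python
from typing import Tuple


def estimate_cleanup_cost(table_size: int, held_locks: int,
                          survivors: int) -> Tuple[int, int]:
    """Return (entries_examined, commands_for_one_user) under the
    simplifying assumptions stated above."""
    # Redundant scanning: every surviving user scans the whole table.
    entries_examined = survivors * table_size
    # One initial scan command, then two commands per matching entry:
    # one to free the lock and one to restart the scan.
    commands_for_one_user = 1 + 2 * held_locks
    return entries_examined, commands_for_one_user
```

For example, with a hypothetical table of one million entries, 50,000 locks held by the failed user and three survivors, the estimate is three million entries examined in total and 100,001 commands issued by a user that frees every lock.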
Based on the foregoing, a need exists for a cleanup process capability that does not require each surviving user to scan the entire lock table. In addition, a cleanup process capability is needed that requires significantly fewer coupling facility accesses. A further need exists for a cleanup capability that reduces the response time impact that is visible to customers and to applications. A yet further need exists for a cleanup capability that enhances system performance.