This invention relates, in general, to data processing within a distributed computing environment, and in particular, to the duplexing of structures, such as structures of one or more coupling facilities.
Some distributed computing environments, such as Parallel Sysplexes, today provide a non-volatile shared storage device, called the coupling facility, that includes multiple storage structures of either the cache or list type. These structures provide unique functions for the operating system and middleware products employed for the efficient operation of a Parallel Sysplex. For example, the cache structures provide directory structures and cross-invalidation mechanisms to maintain buffer coherency for multisystem databases, as well as a fast write medium for database updates. These are used by, for instance, the data sharing versions of DB2 and IMS, offered by International Business Machines Corporation, Armonk, N.Y.
The list structures provide many diverse functions. One such list structure function is to provide for high-performance global locking, and this function is exploited by such products as the IMS Resource Lock Manager (IRLM) and the Global Resource Serialization (GRS) function in OS/390, offered by International Business Machines Corporation, Armonk, N.Y. Another list structure function is to provide a message passing mechanism with storage for maintaining multiple messages on a per system basis and a mechanism for notifying a system of the arrival of new messages. This function is exploited by the XCF component of OS/390, which in turn is exploited by numerous multisystem applications for providing a capability to pass messages between their various instances. A third list structure function is to provide for shared queue structures that can be ordered and accessed by LIFO/FIFO ordering, by key, or by name. Workload Manager (WLM), IMS Shared Message Queues and MQ Series, all offered by International Business Machines Corporation, Armonk, N.Y., are examples of exploiters of this feature. While these functions provide examples of the list structure uses, other uses exist.
Various components of a Parallel Sysplex have been documented in numerous applications/patents, which are listed above and hereby incorporated herein by reference in their entirety. The capabilities defined in some of those patents provide the basic system structure to create and manage cache and list structure instances. Additionally, various of the applications/patents listed above provide extensions to the base functions of the Parallel Sysplex.
In many situations, a failure of the coupling facility that contains various structures requires significant recovery actions to be taken by the owning applications. For example, for database caches and queues, this may require using backup log data sets and/or tapes. This is a time-consuming process that results in a loss of access to the application during the recovery operation. Other structures, such as lock tables, may require reconstruction of partial lock tables from in-storage copies, along with failures of in-flight transactions. Still other structures, such as message-passing structures, may lose all data and require re-entry from the application. As a result, there is a proliferation of diverse recovery schemes with different recovery times and impacts. Moreover, since the failure of a coupling facility results in all resident structures failing, the diverse recovery actions occur concurrently, which can cause serious disruptions in the Parallel Sysplex.
Thus, a need exists for a configuration of a Parallel Sysplex that provides fewer disruptions. In particular, a need exists for a high-availability coupling facility, which improves on the recovery times and impacts of existing recovery techniques, while also providing a consistent recovery design across various structure types. As a particular example, a need exists for one or more capabilities that facilitate duplexing of structures in separate coupling facilities coupled to one another.
The shortcomings of the prior art are overcome and additional advantages are provided through the provision of a method of resolving potential deadlocks. The method includes, for instance, detecting a potential deadlock between a plurality of commands attempting to latch a plurality of distributed instances of a resource; determining a command of the plurality of commands that has priority; and making available an instance of the plurality of distributed instances of the resource for the command determined to have priority.
In one example, the making available an instance of the plurality of distributed instances of the resource includes suspending execution of one or more commands of the plurality of commands. As one example, the suspending execution of a command of the one or more commands is in response to receiving a request for suppression. In yet a further embodiment, the method further includes executing at least one command of the one or more commands having suspended execution. In one example, the at least one command is executed when a command sequence number associated with the command indicates it has priority.
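By way of illustration only, the deadlock resolution summarized above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names Command, Instance, potential_deadlock, resolve, complete, and resume are hypothetical, the resource is assumed to have exactly two distributed instances, and a lower command sequence number is assumed to indicate priority.

```python
from dataclasses import dataclass


@dataclass
class Command:
    name: str
    seq: int                 # command sequence number (assumed: lower wins)
    suspended: bool = False


class Instance:
    """One distributed instance of a resource, guarded by a single latch."""

    def __init__(self, name):
        self.name = name
        self.holder = None   # command currently holding the latch, if any

    def try_latch(self, cmd):
        """Latch the instance for cmd; succeed if free or already held by cmd."""
        if self.holder is None:
            self.holder = cmd
        return self.holder is cmd

    def release(self, cmd):
        """Release the latch only if cmd is the current holder."""
        if self.holder is cmd:
            self.holder = None


def potential_deadlock(first, second):
    """True when the two instances are latched by different commands,
    so that neither command can latch both instances."""
    return (first.holder is not None and second.holder is not None
            and first.holder is not second.holder)


def resolve(first, second):
    """Suspend the lower-priority holder and make its instance
    available to the command determined to have priority."""
    if not potential_deadlock(first, second):
        return None
    winner = min(first.holder, second.holder, key=lambda c: c.seq)
    for inst in (first, second):
        loser = inst.holder
        if loser is not winner:
            loser.suspended = True
            inst.release(loser)
            inst.try_latch(winner)
    return winner


def complete(cmd, instances):
    """Release every latch held by a command that has finished."""
    for inst in instances:
        inst.release(cmd)


def resume(cmd, instances):
    """Re-execute a suspended command once it can latch all instances."""
    if all(inst.try_latch(cmd) for inst in instances):
        cmd.suspended = False
        return True
    for inst in instances:   # back out any partial latching
        inst.release(cmd)
    return False
```

In this sketch, when two commands each hold the latch on a different instance, resolve suspends the command with the higher sequence number and hands both latches to the other command; the suspended command is executed later through resume, after complete has released the latches.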
System and computer program products corresponding to the above-summarized methods are also described and claimed herein.
Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered a part of the claimed invention.