1. Technical Field
The present invention relates generally to a system and computer program product for managing a data processing environment. More particularly, the present invention relates to a system and computer program product for tolerating failures using concurrency in a cluster.
2. Description of the Related Art
Data processing systems can be divided into logical partitions (LPARs). Such data processing systems are also known as logically partitioned data processing systems. A logical partition is also known simply as a “partition.” Each partition operates as a separate data processing system independent of the other partitions. Generally, a partition management firmware component connects the various partitions and provides the network connectivity among them. A hypervisor is an example of such partition management firmware.
Workload partitioning is a technology that separates users and applications by employing software techniques instead of forming separate hardware partitions. In other words, a data processing system can be configured to allow one or more virtual partitions to operate within the data processing system's operating system. Such a virtual partition is called a workload partition, or WPAR.
A high availability (HA) system is a data processing system configured to ensure a threshold level of operational continuity during a given period. Availability refers to the ability of the users and applications to access the data processing system, whether to submit new work, update or alter existing work, or collect the results of previous work. If a user or application cannot access the system, the system is said to be unavailable. Generally, the term downtime is used to refer to periods when a system is unavailable. HA systems are often employed in business organizations to deliver business-critical applications and services.
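By way of illustration only, the relationship between downtime, a given period, and a threshold level of operational continuity can be sketched as follows. The function names `availability` and `meets_threshold`, the use of seconds as the unit, and the expression of the threshold as a fraction between 0 and 1 are assumptions made for this sketch and are not taken from any particular HA product.

```python
def availability(period_seconds: float, downtime_seconds: float) -> float:
    """Return the fraction of the period during which the system was accessible.

    Hypothetical helper: both arguments are assumed to be in seconds.
    """
    if period_seconds <= 0:
        raise ValueError("period must be positive")
    if not 0 <= downtime_seconds <= period_seconds:
        raise ValueError("downtime must lie within the period")
    return (period_seconds - downtime_seconds) / period_seconds


def meets_threshold(period_seconds: float, downtime_seconds: float,
                    threshold: float) -> bool:
    """Return True if availability over the period is at least the threshold.

    The threshold is assumed to be a fraction, e.g. 0.999 for "three nines".
    """
    return availability(period_seconds, downtime_seconds) >= threshold
```

For example, under these assumptions, one hour of downtime in a 365-day period satisfies a threshold of 0.999, whereas five seconds of downtime in a 100-second period does not satisfy a threshold of 0.99.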
An HA system can be configured using one or more physical or logical data processing systems. For example, one HA system may include several standalone physical data processing systems configured to operate in unison. As another example, several logical data processing systems, such as LPARs, may be configured to operate together to form an HA system.
As another example, a combination of one or more WPARs, LPARs, and physical data processing systems may also form a part of an HA system. Such a combination is called a cluster. HA systems or clusters therein may further include other components, systems, or devices. For example, a cluster may include an array of data storage devices, such as a storage area network (SAN). As another example, an HA system or a cluster therein may also include a networking device, such as a switch.