1. Field of the Invention
The present invention relates generally to an improved data processing system, and in particular, to a computer implemented method for managing system disruption in data processing systems. Still more particularly, the present invention relates to a computer implemented method, system, and computer usable program code for predictively managing failover in high availability data processing systems.
2. Description of the Related Art
Data processing systems can be configured in a variety of ways. For example, the components in a data processing system may be configured to operate in a manner such that the data processing system behaves as a single data processing unit. The memory in such a configuration operates to support data manipulation for the single data processing unit.
As another example, data processing systems can be divided into logical partitions (LPARs). Such data processing systems are also known as logical partitioned data processing systems. A logical partition is also known simply as a “partition.” Each partition operates as a separate data processing system independent of the other partitions. Generally, a partition management firmware component connects the various partitions and provides the network connectivity among them. A Hypervisor is an example of such partition management firmware.
Workload partitioning is a technology that separates users and applications by employing software techniques instead of forming separate hardware partitions. In other words, a data processing system can be configured to allow one or more virtual partitions to operate within the data processing system's operating system. Such a virtual partition is called a workload partition, or WPAR.
A WPAR shares the operating system and resources of the host data processing system. Resources accessible to the operating system of the host data processing system are said to belong to a “global space”. Conversely, a resource in the global space can be accessed by the operating system of the host data processing system. One or more WPARs can be configured in a data processing system, such as an LPAR.
A high availability (HA) system is a data processing system configured to ensure a threshold level of operational continuity during a given period. Availability refers to the ability of the users and applications to access the data processing system, whether to submit new work, update or alter existing work, or collect the results of previous work. If a user or application cannot access the system, the system is said to be unavailable. Generally, the term downtime is used to refer to periods when a system is unavailable. HA systems are often employed in business organizations to deliver business critical applications and services.
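The notion of a threshold level of operational continuity over a given period can be illustrated with a short calculation. The following is a minimal sketch, assuming availability is measured as the fraction of the period during which the system is accessible; the function names and the example threshold values are hypothetical and are not taken from the present description.

```python
def availability(period_seconds: float, downtime_seconds: float) -> float:
    """Fraction of the period during which the system was accessible.

    Assumption: availability = (period - downtime) / period.
    """
    if period_seconds <= 0:
        raise ValueError("period must be positive")
    if not 0 <= downtime_seconds <= period_seconds:
        raise ValueError("downtime must lie within the period")
    return (period_seconds - downtime_seconds) / period_seconds


def meets_threshold(period_seconds: float, downtime_seconds: float,
                    threshold: float) -> bool:
    """True if measured availability reaches the (hypothetical) threshold."""
    return availability(period_seconds, downtime_seconds) >= threshold


# Illustrative values: one hour of downtime over a 365-day period.
YEAR_SECONDS = 365 * 24 * 3600
one_hour_down = availability(YEAR_SECONDS, 3600)
```

Under these assumed figures, one hour of downtime in a year would satisfy a threshold of 0.999 but not a threshold of 0.9999.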
An HA system can be configured using one or more physical or logical data processing systems. For example, one HA system may include several standalone physical data processing systems configured to operate in unison. As another example, several logical data processing systems, such as LPARs, may be configured to operate together to form an HA system.
As another example, a combination of one or more WPARs, LPARs, and physical data processing systems may also form a part of an HA system. Such a combination is called a cluster. HA systems or clusters therein may further include other components, systems, or devices. For example, a cluster may include an array of data storage devices, such as a storage area network (SAN). As another example, an HA system or a cluster therein may also include a networking device, such as a switch.
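The composition of a cluster described above can be pictured as a simple data structure. The following is an illustrative sketch only; the class names, field names, and example member names are hypothetical and do not correspond to any particular product or to an implementation disclosed herein.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class NodeKind(Enum):
    """Kinds of data processing systems that may participate in a cluster."""
    PHYSICAL = "physical"
    LPAR = "lpar"
    WPAR = "wpar"


@dataclass
class Node:
    name: str
    kind: NodeKind


@dataclass
class Cluster:
    """A combination of WPARs, LPARs, and physical systems, plus other devices."""
    name: str
    nodes: List[Node] = field(default_factory=list)
    # Other components, e.g. a SAN or a network switch (hypothetical labels).
    devices: List[str] = field(default_factory=list)

    def count_by_kind(self) -> Dict[NodeKind, int]:
        """Number of member nodes of each kind."""
        counts: Dict[NodeKind, int] = {kind: 0 for kind in NodeKind}
        for node in self.nodes:
            counts[node.kind] += 1
        return counts


# Hypothetical example: two LPARs, one WPAR, and one standalone system,
# together with a SAN and a switch.
example = Cluster(
    name="example-cluster",
    nodes=[
        Node("lpar-a", NodeKind.LPAR),
        Node("lpar-b", NodeKind.LPAR),
        Node("wpar-1", NodeKind.WPAR),
        Node("server-1", NodeKind.PHYSICAL),
    ],
    devices=["san", "switch"],
)
```

An HA system, in this sketch, would comprise one or more such clusters along with any additional components.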