1. Field of the Invention
This invention relates to operation of automated data processing equipment and, more specifically, to failover processing of automated data processing equipment that utilizes redundant processors and resources.
2. Description of Related Art
Automated data processing applications often execute on processing systems that have multiple processors. These multiple processors are frequently organized as groups, and all of the members or nodes within a group operate in a cooperative manner. An example of a tightly integrated group of processors is a multiple processor computing cluster. One or more of these processors within a group can be referred to as a “node,” where a node is defined as one or more processors that are executing a single operating system image. A node that is part of a group is referred to herein as a member of the group or a member node. The various members within a group are connected by a data communications system that supports data communications among all of the group members.
The members within a group are sometimes divided among different physical locations. A particular member that is part of a physically dispersed group generally has direct access to resources, such as data storage devices, printers, and other shared peripheral devices, that are collocated with and electrically connected to that member. The resources that are used in current group operations are referred to as primary resources. These groups often also maintain redundant resources, referred to as backup resources, that contain duplicates, or mirrors, of the primary resources and that can be quickly configured to become primary resources if required. Maintaining redundant resources in a group avoids single points of failure in the group's operation. Computing system groups communicate all data changes in a primary resource to one or more backup resources in order to maintain a consistent mirror of the primary resource at each backup resource.
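The mirroring arrangement described above can be illustrated with a minimal sketch. The class names `Resource` and `MirroredResource`, and the resource names used, are hypothetical and are introduced only for illustration; the sketch assumes synchronous, in-memory mirroring and is not a description of any particular conventional system.

```python
class Resource:
    """Hypothetical stand-in for a resource (e.g., a data storage unit)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class MirroredResource:
    """Pairs one primary resource with one or more backup resources."""

    def __init__(self, primary, backups):
        self.primary = primary
        self.backups = list(backups)

    def write(self, key, value):
        # Apply the change to the primary resource first.
        self.primary.data[key] = value
        # Communicate the same change to every backup resource so that
        # each one remains a consistent mirror of the primary.
        for backup in self.backups:
            backup.data[key] = value


primary = Resource("disk_primary")
backups = [Resource("disk_backup_1"), Resource("disk_backup_2")]
mirrored = MirroredResource(primary, backups)
mirrored.write("record_1", "value")
```

Because every change is forwarded to each backup resource, any backup resource in this sketch holds the same data as the primary resource and could be configured to take its place.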
A group that has a number of members typically defines one member to be the primary member for that group. The primary member is the primary point of access for the group and hosts the primary resources used by the group.
Groups sometimes maintain multiple backup resources, such as backup data storage units, for each primary resource. This further improves reliability and allows for greater geographical dispersion of backup resources. Conventional group processing is configured to efficiently handle the replacement of the primary member (e.g., a computing node) with a backup member, i.e., to perform failover processing. The use of conventional group processing, although useful, is not without its problems.
One problem is the handling of failures of backup members and backup resources. Failures of primary resources generally result in the failover of a primary member to a backup member. However, a failure of a backup member generally results in the loss of backup processing for the group.
Another problem is the failover of a primary resource with mirroring to a backup resource. The failover of a primary resource frequently causes the mirroring data routing, which is configured to communicate mirroring data from the failed primary to the one or more backup members, to become obsolete, which often requires manual reconfiguration of the mirroring data processing.
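The obsolete routing problem described above can be illustrated with a minimal sketch. The routing table layout, the node names, and the functions `mirror_targets` and `reroute` are hypothetical and are introduced only for illustration; the sketch assumes the mirroring routes are keyed by the member that hosts the primary resource.

```python
def mirror_targets(routes, source):
    """Return the members that receive mirroring data from the given source."""
    return routes.get(source, [])


def reroute(routes, failed, new_primary):
    """Sketch of the manual reconfiguration step: move the failed primary's
    route to the new primary and drop the new primary from its own targets."""
    updated = dict(routes)
    targets = [t for t in updated.pop(failed, []) if t != new_primary]
    updated[new_primary] = targets
    return updated


# Mirroring data is routed from the primary (node_a) to two backup members.
routes = {"node_a": ["node_b", "node_c"]}

# node_a fails and node_b takes over as primary, but the routes are unchanged.
new_primary = "node_b"
stale_targets = mirror_targets(routes, new_primary)

# Only after reconfiguration does the new primary mirror to the remaining backup.
fixed_routes = reroute(routes, "node_a", new_primary)
fixed_targets = mirror_targets(fixed_routes, new_primary)
```

In this sketch the unchanged routes still name the failed member as the source, so the new primary has no mirroring targets until the routes are rewritten.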
Therefore, a need exists to overcome the problems with the prior art as discussed above, and particularly for a way to more efficiently handle failures of resources and backup nodes in group computing environments.