1. Technical Field
This invention generally relates to clustering computers, and more specifically relates to distress signaling for cluster communications.
2. Background Art
Society depends upon computer systems for many types of information in this electronic age. Based upon various combinations of hardware (e.g., semiconductors, circuit boards, etc.) and software (e.g., computer programs), computer systems vary widely in design. Many computer systems today are designed to "network" with other computer systems. Through networking, a single computer system can access information stored on and processed by other computer systems. Thus, networking results in greater numbers of computer systems having access to greater numbers of electronic resources.
Networking is made possible by physical "routes" between computer systems, and the use of agreed upon communications "protocols." Which protocol is chosen depends upon factors including the number of networked computer systems, the distances separating the computer systems, and the purposes of information exchange between the computer systems. Communications protocols can be very simple if only a few computer systems are networked together in close proximity. However, these communications protocols become more sophisticated as greater numbers of computer systems are added, and as computer systems are separated by greater distances.
The sophistication of communications protocols also varies with the type of information exchange. For instance, some protocols emphasize accuracy in sending large amounts of information, while others emphasize the speed of information transfer. The communications requirements of the applications running on a computer system network determine what type of protocol is chosen. An example of a computer application requiring real-time, reliable information transfer is a "cluster" management application.
Clustering is the networking of computer systems for the purpose of providing continuous resource availability and for sharing workload. A cluster of computer systems appears as one computer system from a computer system user's perspective, but actually is a network of computer systems backing each other up. In the event of an overload or failure on one computer system in a cluster, cluster management applications automatically reassign processing responsibilities for the failing computer system to another computer system in the cluster. Thus, from a user's perspective there is no interruption in the availability of resources.
Typically, one node in the cluster is assigned primary responsibility for an application (e.g., database, server) and other nodes are assigned backup responsibility. When the primary node for an application fails, the backup nodes in the cluster take over responsibility for that application. This ensures the high availability of that application.
Clustering is made possible through cluster management application programs running on each computer system in a cluster. These applications relay cluster messages back and forth across the cluster network to control cluster activities. Cluster messaging is also used to distribute updates about which computer systems in the cluster have which primary and backup responsibilities.
To ensure the high availability of applications running on the cluster, the cluster needs to be able to keep track of the status of all the nodes on a cluster. To do this, each computer system in a cluster continuously monitors each of the other computer systems in the same cluster to ensure that each is alive and performing the processing assigned to it. Thus, when a node on a cluster fails, its primary responsibilities can be assigned to the backup nodes.
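The monitoring described above can be illustrated with a minimal sketch. It assumes a simple heartbeat scheme in which each node records the time it last heard from every other node and treats a node as silent once a timeout elapses. The class name `HeartbeatMonitor`, its methods, and the timeout value are hypothetical and are not taken from any particular cluster product.

```python
class HeartbeatMonitor:
    """Tracks when each peer node was last heard from (illustrative only)."""

    def __init__(self, nodes, timeout):
        # timeout: seconds of silence after which a node is considered silent
        self.timeout = timeout
        self.last_seen = {node: 0.0 for node in nodes}

    def record_heartbeat(self, node, now):
        # Called whenever any cluster message arrives from `node`.
        self.last_seen[node] = now

    def silent_nodes(self, now):
        # Nodes that have not been heard from within the timeout window.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(["A", "B", "C"], timeout=5.0)
monitor.record_heartbeat("A", now=10.0)
monitor.record_heartbeat("B", now=10.0)
monitor.record_heartbeat("C", now=3.0)
silent = monitor.silent_nodes(now=12.0)
```

Note that such a monitor can only report that a node is silent. It cannot, by itself, say why the node is silent.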
Unfortunately, it is not always possible to tell that a node in the cluster has failed. For example, if the network connection between one node and the rest of the cluster fails, the cluster will no longer be able to tell if that node is operating properly. If a node is still operating but its network connection to other nodes in the cluster has failed, then the node is said to have been "partitioned" from the cluster. When a node unexpectedly stops communicating with the rest of the cluster, it cannot be easily determined whether the node has failed or instead has merely been partitioned from the rest of the cluster. If the cluster incorrectly assumes the node has failed, and assigns the backup node primary responsibility for the application, the cluster will have two nodes both believing that they are the primary node. This can result in data inconsistencies in the database as both nodes respond to requests to the cluster. If, on the other hand, the cluster incorrectly assumes the node is still performing its primary applications and has only been partitioned from the cluster, and does not assign primary responsibility to the backup node, then those applications will no longer be available to the clients of the cluster. Thus, in many cases the cluster is unable to correctly respond to a non-communicating node without manual intervention by administrators.
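The first failure mode above, in which two nodes both believe they are primary, can be shown with a small simulation. This is a sketch under stated assumptions: the `Node` class, the `simulate_false_failover` function, and the key and values written are all hypothetical and serve only to make the data inconsistency concrete.

```python
class Node:
    """A cluster node that accepts writes only while it believes it is primary."""

    def __init__(self, name, is_primary):
        self.name = name
        self.is_primary = is_primary
        self.data = {}

    def handle_write(self, key, value):
        # Backup nodes refuse writes; a primary applies them locally.
        if not self.is_primary:
            return False
        self.data[key] = value
        return True


def simulate_false_failover():
    a = Node("A", is_primary=True)
    b = Node("B", is_primary=False)

    # The link between A and B fails. A is still running (partitioned),
    # but B hears nothing, wrongly assumes A has failed, and promotes itself.
    b.is_primary = True

    # Clients on each side of the partition now reach a different "primary".
    a.handle_write("balance", 100)
    b.handle_write("balance", 250)
    return a, b


a, b = simulate_false_failover()
```

After the simulated partition, both nodes report themselves as primary and hold different values for the same key, which is the inconsistency the text describes.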
As more resources become accessible across computer system networks, the demand for continuous access to such network resources will grow. The demand for clusters as a means to provide continuous availability to such network resources will grow correspondingly. Without improved methods for determining the status of cluster nodes, the continuous availability of these resources will not be fully realized.
According to the present invention, a cluster node distress system is provided that improves the reliability of a cluster. The cluster node distress system provides a cluster node distress signal when a node on the cluster is about to fail. This allows the cluster to better determine whether a non-communicating node has failed or has merely been partitioned from the cluster. The preferred cluster node distress system is embedded deeply into the operating system and provides a pre-built node distress signal that can be quickly sent to other nodes in the cluster when an imminent failure of that node is detected. This improves the probability that the node distress signal will get out before the node totally fails. When the node distress signal is successfully sent to the cluster, the cluster can accurately determine that the node has failed and has not merely been partitioned from the cluster. This allows the cluster to respond correctly, i.e., by assigning other nodes primary responsibility, and requires less intervention by administrators. Thus, the preferred embodiment provides improved cluster reliability and decreased reliance on administrators.
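The idea of a pre-built distress signal can be sketched as follows. This is an illustration only, not the preferred embodiment, which is described as embedded in the operating system. The JSON message layout, the class names `DistressSender` and `PeerView`, and the `send` callback are assumptions introduced for the example. The point shown is that the message is constructed once, ahead of time, so that sending it at the moment of imminent failure requires as little work as possible.

```python
import json
from enum import Enum


class NodeStatus(Enum):
    ALIVE = "alive"
    FAILED = "failed"
    UNKNOWN = "unknown"  # silent: either failed or partitioned


class DistressSender:
    """Holds a distress message built in advance (hypothetical format)."""

    def __init__(self, node_id, send):
        # Build the message at startup so nothing needs to be formatted
        # or allocated while the node is failing.
        self._message = json.dumps(
            {"type": "distress", "node": node_id}
        ).encode("utf-8")
        self._send = send  # assumed transport callback taking bytes

    def on_imminent_failure(self):
        self._send(self._message)


class PeerView:
    """What another node in the cluster knows about its peers."""

    def __init__(self):
        self.distressed = set()

    def receive(self, raw):
        message = json.loads(raw.decode("utf-8"))
        if message.get("type") == "distress":
            self.distressed.add(message["node"])

    def classify(self, node_id, is_silent):
        if node_id in self.distressed:
            return NodeStatus.FAILED   # distress received: safe to fail over
        if is_silent:
            return NodeStatus.UNKNOWN  # cannot tell failure from partition
        return NodeStatus.ALIVE


peer = PeerView()
sender = DistressSender("A", send=peer.receive)
before = peer.classify("A", is_silent=True)
sender.on_imminent_failure()
after = peer.classify("A", is_silent=True)
```

In this sketch a silent node is classified as unknown until its distress signal arrives, after which the peer can treat it as failed and reassign its primary responsibilities.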
The foregoing and other features and advantages of the invention will be apparent from the following more particular description as set forth in the preferred embodiments of the invention, and as illustrated in the accompanying drawings.