1. Field of the Invention
The present invention relates to cluster-based computing and, more particularly, to command propagation in a computing cluster.
2. Description of the Related Art
Computing clusters have become common in the field of high-availability and high-performance computing. Cluster-based systems exhibit three fundamental properties: reliability, availability and serviceability. Each of these properties is of paramount importance when designing the software and the hardware of a new, robust clustered system. As opposed to symmetric multi-processing (SMP) systems, whose scalability can be limited and which can yield substantially diminished returns upon the addition of processors to the system, a cluster-based system consists of multiple computers that are connected over high-speed network communicative linkages.
Each computer in a cluster has its own memory, possibly its own disk space, and hosts its own local operating system. Each node within the cluster system can be viewed as a processor-memory module that cooperates with other nodes so as to provide system resources and services to user applications. Nodes in a cluster system, however, are not limited to physical computing systems. Rather, nodes in a cluster system also can include virtual machines operating in a physical host environment.
Clusters can be characterized by increased availability since the failure of a particular node does not affect the operation of the remaining nodes. Rather, any one failed node can be isolated and no longer utilized by the cluster-based system until the node can be repaired and incorporated again within the cluster. Additionally, the load of a failed node within a cluster can be equitably shared among the functional nodes of the cluster. Thus, clusters have proven to be a sensible architecture for deploying applications in a distributed environment, and clusters are now the platform of choice in scalable, high-performance computing.
Individual computing nodes of a cluster can be managed at the command line or programmatically. Generally, commands are issued from a command source within a process execution space and can be directed to one or more targeted nodes. Typical commands include node reboot, update, launch and shut down, to name a few. In some instances, the same command can be directed to multiple different nodes within a cluster.
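By way of illustration only, directing one command to several targeted nodes can be sketched as follows. The names Command and fan_out, and the node identifiers, are hypothetical and are not drawn from any particular cluster management product; the sketch merely pairs a single command with each targeted node in the order given.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Command:
    """Hypothetical command record, e.g. "reboot", "update", "launch", "shutdown"."""
    name: str


def fan_out(command: Command, targets: Iterable[str]) -> List[Tuple[str, Command]]:
    """Pair one command with each targeted node.

    Iteration order of `targets` is preserved and duplicate node
    identifiers are dropped, so each node is addressed at most once.
    """
    seen = set()
    messages = []
    for node in targets:
        if node in seen:
            continue
        seen.add(node)
        messages.append((node, command))
    return messages


# Example: the same "reboot" command directed to three (assumed) nodes.
messages = fan_out(Command("reboot"), ["node-a", "node-b", "node-c"])
```

Each resulting pair represents one individual transmission from the command source to a targeted node.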
When the same command is directed to multiple different nodes within a cluster, the sequence in which the commands are individually transmitted is arbitrary. Moreover, depending upon network conditions, those commands may be received in each of the targeted nodes at different times. In the case of high network latency for some of those nodes, the gap in time of receipt between the node receiving the command soonest and the node receiving the command latest can be intolerable and can result in an unwanted system condition.
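The gap in time of receipt described above can be illustrated with a minimal sketch. The Delivery record, the receipt_gap_ms function and the latency figures are assumptions chosen purely for illustration; the sketch models the time of receipt at each node as the time of transmission plus an assumed one-way network latency.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Delivery:
    """Hypothetical record of one command transmission to one node."""
    node: str
    sent_at_ms: float   # time at which the command source transmitted
    latency_ms: float   # assumed one-way network latency to this node

    @property
    def received_at_ms(self) -> float:
        # Simplified model: receipt time = send time + latency.
        return self.sent_at_ms + self.latency_ms


def receipt_gap_ms(deliveries: Sequence[Delivery]) -> float:
    """Return the gap between the soonest and the latest time of receipt."""
    if not deliveries:
        return 0.0
    times = [d.received_at_ms for d in deliveries]
    return max(times) - min(times)


# Illustrative figures only: node-c sits behind a high-latency link.
deliveries = [
    Delivery("node-a", sent_at_ms=0.0, latency_ms=2.0),
    Delivery("node-b", sent_at_ms=1.0, latency_ms=3.0),
    Delivery("node-c", sent_at_ms=2.0, latency_ms=250.0),
]
gap = receipt_gap_ms(deliveries)  # 252.0 - 2.0 = 250.0 ms
```

In this example the commands are transmitted within two milliseconds of one another, yet the high-latency node receives its command a quarter of a second after the first node does.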