1. Field of the Invention
The present invention relates to high availability in a network architecture and, more particularly, to inventory management in a clustered computing environment.
2. Description of the Related Art
Computing clusters have become common in the field of high-availability and high-performance computing. Cluster-based systems exhibit three fundamental properties: reliability, availability and serviceability. Each of these properties is of paramount importance when designing the software and the hardware of a new, robust clustered system. In contrast to symmetric multi-processing (SMP) systems, whose scalability can be limited and in which the addition of processors can yield substantially diminished returns, a cluster-based system consists of multiple computing nodes coupled to one another over high-speed communicative linkages. Each node in the cluster has its own memory address space and possibly its own disk space, and hosts its own local operating system. Thus, each computing node within the cluster can be viewed as a processor-memory module that cooperates with other nodes to provide system resources and services to user applications.
Clusters can be characterized by increased availability since the failure of a particular computing node does not affect the operation of the remaining computing nodes. Rather, a failed computing node can be isolated and excluded from use by the cluster-based system until the node is repaired and incorporated again within the cluster. Additionally, the load of a failed computing node can be shared equitably among the functional nodes of the cluster. Thus, clusters have proven to be a sensible architecture for deploying applications in a distributed environment, and clusters are now the platform of choice in scalable, high-performance computing.
When a cluster of computing nodes is configured for high availability, the network configuration for each node ordinarily includes a boot IP address and a service IP address. The boot IP address is the address of the network interface at which the computing node itself is accessed. The service IP address, in turn, is the IP address at which a service executing within the computing node can be accessed. The service IP address thus acts as an alias to the boot IP address of whichever computing node in the cluster currently supports execution of the service. As such, upon a failover condition in the computing node supporting the current execution of the service, the service IP address can be changed to alias a different computing node in the cluster that has not failed.
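The aliasing relationship described above can be illustrated with a minimal sketch. The `Cluster` class, its method names, the addresses, and the rule for choosing a surviving node are all hypothetical and chosen only for illustration; an actual clustering application would select the takeover node according to its own policy.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Cluster:
    """Hypothetical model: each service IP aliases the boot IP of one node."""
    boot_ips: Set[str]
    service_alias: Dict[str, str] = field(default_factory=dict)  # service IP -> boot IP
    failed: Set[str] = field(default_factory=set)

    def assign(self, service_ip: str, boot_ip: str) -> None:
        # A service IP may only alias a known node that has not failed.
        if boot_ip not in self.boot_ips or boot_ip in self.failed:
            raise ValueError(f"{boot_ip} is not an available node")
        self.service_alias[service_ip] = boot_ip

    def fail_node(self, boot_ip: str) -> None:
        # Mark the node failed and re-alias its service IPs to a survivor.
        self.failed.add(boot_ip)
        survivors = sorted(self.boot_ips - self.failed)
        if not survivors:
            raise RuntimeError("no surviving node to take over")
        for service_ip, host in list(self.service_alias.items()):
            if host == boot_ip:
                # Assumed policy: pick the first survivor in sorted order.
                self.service_alias[service_ip] = survivors[0]


cluster = Cluster(boot_ips={"192.0.2.1", "192.0.2.2"})
cluster.assign("192.0.2.100", "192.0.2.1")
cluster.fail_node("192.0.2.1")
print(cluster.service_alias)  # the service IP now aliases the surviving node
```

Note that clients continue to use the same service IP address throughout; only the mapping from service IP address to boot IP address changes.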
When first establishing a highly available computing environment, the centralized management platform charged with managing the computing environment oftentimes takes “inventory” of the different computing systems intended to form a computing cluster. The inventory, however, is generated prior to the initialization and execution of the clustered computing application charged with creating and configuring a cluster of nodes from the different computing systems managed by the platform. Consequently, the inventory lacks the relevant boot IP and service IP addresses for the computing nodes and remains incomplete until corrected through manual intervention. Likewise, after a failover condition has arisen in the clustered computing environment, the inventory of the management platform will be incorrect with respect to any service IP address that has changed as a result of the failover condition to alias the boot IP address of a different node.
In both instances, tedious manual intervention will be required. Alternatively, the management platform can be configured to update the inventory periodically; however, each update consumes precious computing resources and is unnecessary absent a failover condition. The period between updates can be lengthened to minimize the unnecessary consumption of processing resources, but doing so creates the risk that, subsequent to a failover condition, the inventory will remain incorrect for a period of time.
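The trade-off between update cost and inventory staleness can be made concrete with a short sketch. It assumes, purely for illustration, that updates run at fixed times 0, T, 2T and so on, that each update completes instantly, and that the function names and the example interval values are hypothetical.

```python
import math


def stale_seconds(failover_at: float, poll_interval: float) -> float:
    """Seconds the inventory stays incorrect after a failover.

    Assumes updates occur at 0, T, 2T, ... and complete instantly.
    """
    next_poll = math.ceil(failover_at / poll_interval) * poll_interval
    return next_poll - failover_at


def polls_per_day(poll_interval: float) -> float:
    """Number of inventory updates performed in 24 hours."""
    return 86400 / poll_interval


# A failover occurring 130 seconds after the schedule starts:
for interval in (60, 600):
    print(interval, polls_per_day(interval), stale_seconds(130, interval))
```

Under these assumptions, lengthening the interval from 60 to 600 seconds reduces the daily update count tenfold, but the inventory in this example remains incorrect for 470 seconds rather than 50.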