“Clustering” generally refers to a computer system organization where multiple computers, or nodes, are networked together to cooperatively perform computing tasks. Clustering is often used in relatively large multi-user computer systems where high performance and reliability are of concern. For example, clustering may be used to provide redundancy, or fault tolerance, so that, should any node in a cluster fail, the operations previously performed by that node will be handled by other nodes in the cluster. Clustering is also used to increase overall performance, since multiple nodes can often handle a larger number of tasks in parallel than a single computer otherwise could. Often, load balancing can also be used to ensure that tasks are distributed fairly among nodes, preventing individual nodes from becoming overloaded and thereby maximizing overall system performance. One specific application of clustering, for example, is in providing multi-user access to a shared resource such as a database or a storage device, since multiple nodes can handle a comparatively large number of user access requests, and since the shared resource is typically still available to users even upon the failure of any given node in the cluster.
To further enhance system availability, it would be desirable in many clustered computer systems to also incorporate the concept of “switchable” hardware resources that are capable of being managed, or functionally “owned,” by different nodes at different times, so that access to a particular hardware resource can be maintained even in the event of a failure or shutdown of the node that principally manages the operation of that hardware resource. In many clustering environments, for example, resources are required to be owned or managed by only one node at a time, irrespective of whether such resources are shareable from an access standpoint.
For example, in the AS/400 or iSeries eServer clustering environment available from International Business Machines Corporation, it may be desirable to define cluster resource groups (CRGs) that manage cluster resources such as direct access storage devices (DASDs) and other hardware components. CRGs support the ability to define primary and backup nodes through which resource management is performed, such that, in response to a shutdown or failure in the primary node, the backup node will automatically assume management of a resource that was previously being managed by the primary node.
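The primary/backup behavior described above can be illustrated with a minimal sketch. This is not the AS/400 or iSeries CRG interface; the class `ClusterResourceGroup`, its fields, and the node names are hypothetical and chosen only to show how management of a resource might pass from a failed primary node to the next available backup node.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ClusterResourceGroup:
    """Hypothetical model of a CRG (illustrative only, not an IBM API)."""
    resource: str
    nodes: List[str]                 # index 0 is the primary; the rest are backups, in order
    active: Set[str] = field(default_factory=set)

    def owner(self) -> Optional[str]:
        """Return the first active node in priority order, or None if none is active."""
        for node in self.nodes:
            if node in self.active:
                return node
        return None

    def node_failed(self, node: str) -> Optional[str]:
        """Mark a node as failed and return the node that now manages the resource."""
        self.active.discard(node)
        return self.owner()


crg = ClusterResourceGroup("DASD1", ["NODE_A", "NODE_B"], {"NODE_A", "NODE_B"})
print(crg.owner())                  # the primary manages the resource while it is active
print(crg.node_failed("NODE_A"))    # the backup assumes management after the primary fails
```

In this sketch only one node is reported as the owner at any time, which mirrors the requirement that a resource be managed by a single node at a time.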
In order to effectively switch over hardware resources, however, certain information about such hardware resources typically must be known by those nodes in a clustered computer system that are capable of managing such resources. For example, in an AS/400 or iSeries eServer midrange computer such as that used in the aforementioned clustering environment, the Input/Output (I/O) infrastructure of each computer typically maintains configuration data for each logical and physical hardware entity accessible by that computer. Whenever a switchable resource is being managed by a computer that functions as a node in a clustered computer system, therefore, configuration data for that switchable resource must be maintained within that computer.
In traditional non-clustered environments, the logical and physical hardware entities represented in the I/O infrastructure of a computer are all under the domain of that computer, i.e., the entities are all interfaced directly with and controlled by a single computer. However, when clustering is introduced, a difficulty arises in obtaining configuration data for resources that are outside of the domain of a particular computer, e.g., when that configuration data is only available from certain entities in the system.
From the perspective of inter-node communication, many clustered computer environments require that configuration data regarding the other nodes in a clustered computer system be represented within the I/O infrastructure of each node. For example, configuration data regarding input/output (I/O) adaptors that physically couple nodes together over a communication network may be maintained in a node for the purpose of establishing a logical communication channel between two nodes and thereafter directing communications over the channel. Automated functionality is typically provided in such clustering environments to distribute such configuration data among the various nodes, e.g., during initial startup of a cluster or whenever a new node is added to a cluster.
For switchable hardware resources, however, distribution of configuration data is not as straightforward. In particular, in many environments, the configuration data for a switchable resource may only be accessible from a node that has a particular relationship with that resource, e.g., due to the node's functional ownership or other controlling relationship over the resource. However, given that nodes in a cluster may come and go dynamically, the configuration data for a particular resource may not always be remotely accessible from the appropriate node. Keeping a current copy of the configuration data for a particular switchable resource on each node capable of managing that resource is thus important to ensuring the continued availability of the resource.
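The availability concern described above can be made concrete with a minimal sketch, under the assumption that the owning node pushes its configuration record to every other node capable of managing the resource while that owner is still reachable. The `Node` class, the `distribute_config` function, and the record contents are hypothetical illustrations, not part of any particular clustering product.

```python
class Node:
    """Hypothetical cluster node holding per-resource configuration records."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.config = {}  # resource name -> configuration record


def distribute_config(owner, candidates, resource):
    """Copy the owner's record for `resource` to each online candidate node.

    Returns the names of the nodes that received a copy.
    """
    record = owner.config[resource]
    updated = []
    for node in candidates:
        if node is not owner and node.online:
            node.config[resource] = dict(record)  # each node keeps its own copy
            updated.append(node.name)
    return updated


primary, backup = Node("NODE_A"), Node("NODE_B")
primary.config["DASD1"] = {"bus": 3, "slot": 7}   # assumed example data

distribute_config(primary, [primary, backup], "DASD1")

# If the primary later leaves the cluster, the backup still holds the data
# it needs in order to take over management of the resource.
primary.online = False
print(backup.config["DASD1"])
```

The point of the sketch is the ordering: the copy is made before the owner becomes unavailable, so a takeover does not depend on reaching the departed node.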
Therefore, a significant need exists in the art for a manner of managing switchable resources in a clustered computer environment, and in particular, a manner of distributing configuration data associated with a switchable resource to the nodes capable of managing the resource.