A clustered computing system is a collection of interconnected computing elements that provide processing to a set of client applications and, to a large extent, can be viewed as though the computing elements are a single computer. Each of the computing elements is referred to as a node. A node may be a computer interconnected to other computers, or a server blade interconnected to other server blades in a grid. A group of nodes in a clustered computing system that have shared access to storage (e.g., have shared disk access to a set of disk drives or other non-volatile storage) and that are connected via interconnects is referred to herein as a cluster.
A clustered computing system is used to host clustered servers. Resources from multiple nodes in a clustered computing system can be allocated to running a server's software. Each allocation of the resources of a particular node for the server is referred to herein as a server instance, or simply an instance. A database server can be clustered, in which case its server instances may be collectively referred to as a cluster. Each instance of a database server facilitates access to the same database, and the integrity of the data is managed by a global lock manager. The collection of server instances, and the resources used by the servers, are typically managed by a “clusterware” software application.
FIG. 1 is a block diagram that illustrates a two-node clustered computing system. Clusterware 102 is software that allows clusters of networked computers, such as Node A and Node B, to operate or be controlled as if they were one. Clusterware 102 operates between two or more nodes in a group of computers, typically at a layer just on top of the operating system 104. One function of clusterware 102 is to manage applications 106 running on the cluster nodes, including the cluster resources used by the various applications 106 running on the cluster. Typical behavioral goals of clusterware include, for example, ensuring high-availability fail-over processing within the cluster and balancing the workload across the nodes in the cluster. Various events may change the management behavior of the clusterware relative to certain cluster applications and/or cluster resources. For example, a change in the value of a resource's attribute could change the manner in which the clusterware manages and otherwise handles that resource, e.g., in the event of a node and/or cluster crash.
Resources managed by clusterware 102 can have attributes that may need to be changed at any point in time, and the values of such attributes may not be known when the clusterware is configured. If an attribute of an online resource needs to be modified, the system administrator has to stop the resource, modify the particular attribute, and start the resource again. This approach therefore involves stopping not only the resource but also all the other resources that may depend on it, which may lead to relatively long periods of outage time for numerous resources. Such outages run counter to the objective of a high-availability system, which is to keep the cluster resources that the clusterware 102 manages continuously available.
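By way of illustration only, the stop, modify, and restart sequence described above may be sketched as follows. This is a minimal sketch and not an implementation of any particular clusterware: the `Resource` class, the `depends_on` field, and the functions `dependents_of` and `modify_attribute_offline` are hypothetical names introduced solely for this example, and the sketch tracks only which resources are taken offline, not the order in which they are stopped or how long they remain offline.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical model of a clusterware-managed resource (illustrative only)."""
    name: str
    attributes: dict = field(default_factory=dict)
    depends_on: list = field(default_factory=list)  # names of resources this one requires
    online: bool = True


def dependents_of(resources, name):
    """Return the names of all resources that depend, directly or
    transitively, on the resource called `name`."""
    found = []
    stack = [name]
    while stack:
        current = stack.pop()
        for res in resources.values():
            if current in res.depends_on and res.name not in found:
                found.append(res.name)
                stack.append(res.name)
    return found


def modify_attribute_offline(resources, name, attribute, value):
    """Change one attribute using the stop/modify/start approach.

    Returns the names of every resource that had to be taken offline,
    i.e. the target resource plus everything that depends on it.
    """
    affected = [name] + dependents_of(resources, name)

    # Stop the target and all of its dependents (stop ordering is not modelled).
    for res_name in affected:
        resources[res_name].online = False

    # The attribute can only be changed while the resource is offline.
    resources[name].attributes[attribute] = value

    # Bring everything back online.
    for res_name in affected:
        resources[res_name].online = True

    return affected


if __name__ == "__main__":
    # Hypothetical resources: an application depends on a listener,
    # which depends on a database; a backup resource is independent.
    cluster = {
        "db": Resource("db", {"restart_attempts": 1}),
        "listener": Resource("listener", depends_on=["db"]),
        "app": Resource("app", depends_on=["listener"]),
        "backup": Resource("backup"),
    }
    print(modify_attribute_offline(cluster, "db", "restart_attempts", 3))
```

In this sketch, changing a single attribute of the hypothetical `db` resource also takes `listener` and `app` offline, while the independent `backup` resource is unaffected. This illustrates how the outage spreads through the dependency chain.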
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.