Distributed computing systems have found application in a number of different computing environments, particularly those requiring high performance and/or high availability and fault tolerance. In a distributed computing system, multiple computers connected by a network are permitted to communicate and/or share workload. Distributed computing systems support practically all types of computing models, including peer-to-peer and client-server computing.
One particular type of distributed computing system is referred to as a clustered computing system. “Clustering” generally refers to a computer system organization where multiple computers, or nodes, are networked together to cooperatively perform computer tasks. An important aspect of a computer cluster is that all of the nodes in the cluster present a single system image—that is, from the perspective of a client or user, the nodes in a cluster appear collectively as a single computer, or entity. In a client-server computing model, for example, the nodes of a cluster collectively appear as a single server to any clients that attempt to access the cluster.
Clustering is often used in relatively large multi-user computing systems where high performance and reliability are of concern. For example, clustering may be used to provide redundancy, or fault tolerance, so that, should any node in a cluster fail, the operations previously performed by that node will be handled by other nodes in the cluster. Clustering is also used to increase overall performance, since multiple nodes can often handle a larger number of tasks in parallel than a single computer otherwise could. Load balancing is often used as well, to distribute tasks fairly among nodes so that no individual node becomes overloaded, thereby maximizing overall system performance. One specific application of clustering, for example, is in providing multi-user access to a shared resource such as a database or a storage device, since multiple nodes can handle a comparatively large number of user access requests, and since the shared resource is typically still available to users even upon the failure of any given node in the cluster.
In many clustered computer systems, the services offered by such systems are implemented as managed resources. Some services, for example, may be singleton services, which are handled at any given time by one particular node, with automatic failover used to move a service to another node whenever the node currently hosting the service encounters a problem. Other services, often referred to as distributed services, enable multiple nodes to provide a service, e.g., to handle requests for a particular type of service from multiple clients.
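The failover behavior of a singleton service can be illustrated with a minimal sketch. The class names (`SingletonService`, `Cluster`), the node names, and the rule of moving a service to the first surviving node are assumptions made purely for illustration; they are not drawn from any particular clustering product.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SingletonService:
    """A service hosted by exactly one node at a time (hypothetical model)."""
    name: str
    host: Optional[str] = None


@dataclass
class Cluster:
    """Tracks healthy nodes and the current host of each singleton service."""
    healthy_nodes: list[str]
    services: dict[str, SingletonService] = field(default_factory=dict)

    def start(self, name: str) -> SingletonService:
        # Place the service on the first healthy node (an arbitrary choice).
        service = SingletonService(name, host=self.healthy_nodes[0])
        self.services[name] = service
        return service

    def node_failed(self, node: str) -> None:
        # Drop the failed node, then fail over every service it was hosting.
        self.healthy_nodes.remove(node)
        for service in self.services.values():
            if service.host == node:
                service.host = self.healthy_nodes[0] if self.healthy_nodes else None


cluster = Cluster(healthy_nodes=["node-a", "node-b", "node-c"])
lock_manager = cluster.start("lock-manager")
cluster.node_failed("node-a")
print(lock_manager.host)  # the service now runs on a surviving node
```

A distributed service would differ in that several nodes host it concurrently, so the loss of one node reduces capacity rather than triggering a move.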
Resources such as cluster-provided services are typically managed through the use of various types of policies, each of which governs some aspect of a resource's existence. A policy, in general, is any set of rules that may be used to manage the existence and operation of one or more resources, and includes, for example, activation or high availability policies, security policies, rights policies, and other types of management policies. An activation policy may be used, for example, to select the node or nodes that host a service, and/or to manage how failover occurs in response to a node failure. A security policy may be used, for example, to determine what resources particular users are permitted to access and/or what types of operations those users are permitted to perform. A rights policy may be used, for example, to control access to digital content.
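As a rough illustration of an activation policy, the sketch below models a policy as an ordered list of preferred hosts. The `ActivationPolicy` class, its `preferred_nodes` field, and the `select_host` method are hypothetical names chosen for this example, and real activation policies typically carry additional rules.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActivationPolicy:
    """Hypothetical activation policy: hosts listed in order of preference."""
    preferred_nodes: tuple[str, ...]

    def select_host(self, healthy_nodes: set[str]) -> Optional[str]:
        # Return the most-preferred node that is currently healthy, if any.
        for node in self.preferred_nodes:
            if node in healthy_nodes:
                return node
        return None


policy = ActivationPolicy(preferred_nodes=("node-a", "node-b", "node-c"))

# Initial placement: every node is healthy, so the first preference wins.
initial_host = policy.select_host({"node-a", "node-b", "node-c"})

# Failover: node-a has failed, so the next preferred healthy node is chosen.
failover_host = policy.select_host({"node-b", "node-c"})

print(initial_host, failover_host)
```

The same rule set thus answers both questions named above: where to host the service initially, and where to move it after a node failure.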
Many conventional policy-based resource management systems require each managed resource to be associated with one policy of a particular type. For example, in a high availability environment, it is often a requirement that each managed resource be controlled by a single activation policy. Otherwise, the applicability of multiple policies to a given managed resource could introduce conflicts between different policies.
Furthermore, many conventional policy-based resource management systems also require the converse relationship, namely that each policy of a particular type apply to a single managed resource. Put another way, there is a one-to-one mapping between managed resources and policies. As such, every time a new managed resource is created or otherwise added to the system, a new policy must be created for that resource.
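The one-to-one mapping can be sketched as a registry that creates a dedicated policy whenever a resource is added. The `Policy` and `OneToOneRegistry` classes, the resource names, and the resource count are illustrative assumptions rather than a description of any specific system.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical policy record tied to exactly one resource."""
    policy_id: str
    preferred_nodes: list[str]


@dataclass
class OneToOneRegistry:
    """Registry enforcing one policy per resource and one resource per policy."""
    policy_by_resource: dict[str, Policy] = field(default_factory=dict)

    def add_resource(self, resource: str, preferred_nodes: list[str]) -> Policy:
        # A resource may be controlled by only a single policy of this type.
        if resource in self.policy_by_resource:
            raise ValueError(f"{resource} already has a policy")
        # Adding a resource always means creating a new, dedicated policy.
        policy = Policy(f"policy/{resource}", list(preferred_nodes))
        self.policy_by_resource[resource] = policy
        return policy


registry = OneToOneRegistry()
for i in range(1000):
    registry.add_resource(f"service-{i}", ["node-a", "node-b"])

policies = list(registry.policy_by_resource.values())
print(len(policies))  # one policy exists for every resource

# Retiring node-a requires touching every policy individually.
edits = 0
for policy in policies:
    if "node-a" in policy.preferred_nodes:
        policy.preferred_nodes.remove("node-a")
        edits += 1
print(edits)
```

In this sketch, a single administrative change fans out into as many policy edits as there are managed resources.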
When a one-to-one mapping between resources and policies is required, however, policy management can become complex and unwieldy as more and more managed resources are added. In many high availability environments, for example, thousands of managed resources may exist, thus requiring a comparable number of policies. While policies may be automatically generated in some circumstances when managed resources are added, the large number of policies that result can still become an administrative nightmare whenever it is desirable to modify or remove policies. The large number of policies can also constrain system scalability.
Therefore, a significant need continues to exist in the art for a manner of associating policies with managed resources with reduced management overhead and complexity, and improved flexibility and scalability.