“Clustering” generally refers to a computer system organization where multiple computers, or nodes, are networked together to cooperatively perform computer tasks. An important aspect of a computer cluster is that all of the nodes in the cluster present a single system image—that is, from the perspective of a user, the nodes in a cluster appear collectively as a single computer, or entity.
Clustering is often used in relatively large multi-user computer systems where high performance and reliability are of concern. For example, clustering may be used to provide redundancy, or fault tolerance, so that, should any node in a cluster fail, the operations previously performed by that node will be handled by other nodes in the cluster. Clustering is also used to increase overall performance, since multiple nodes can often handle a larger number of tasks in parallel than a single computer otherwise could. Often, load balancing can also be used to ensure that tasks are distributed fairly among nodes, preventing individual nodes from becoming overloaded and thereby maximizing overall system performance. One specific application of clustering, for example, is in providing multi-user access to a shared resource such as a database or a storage device, since multiple nodes can handle a comparatively large number of user access requests, and since the shared resource is typically still available to users even upon the failure of any given node in the cluster.
Clusters typically handle computer tasks through the performance of “jobs” or “processes” within individual nodes. In some instances, jobs being performed by different nodes cooperate with one another to handle a computer task. Such cooperative jobs are typically capable of communicating with one another, and are typically managed in a cluster using a logical entity known as a “group.” A group is typically assigned some form of identifier, and each job in the group is tagged with that identifier to indicate its membership in the group.
Member jobs in a group typically communicate with one another using an ordered message-based scheme, where the specific ordering of messages sent between group members is maintained so that every member sees messages sent by other members in the same order as every other member, thus ensuring synchronization between nodes. Requests for operations to be performed by the members of a group are often referred to as “protocols,” and it is typically through the use of one or more protocols that tasks are cooperatively performed by the members of a group.
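The ordered messaging described above can be illustrated with a minimal sketch. It assumes a single in-process sequencer that stamps every message with a sequence number before delivering it to all members; the names `Group`, `Member`, `join`, and `send` are hypothetical and are not taken from any particular clustering product, and a real cluster would distribute this logic across nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """A hypothetical member job that records messages in delivery order."""
    name: str
    delivered: list = field(default_factory=list)

    def deliver(self, seq, sender, payload):
        self.delivered.append((seq, sender, payload))


class Group:
    """A hypothetical group; one sequencer imposes a single message order."""

    def __init__(self, group_id):
        self.group_id = group_id   # identifier shared by every member job
        self.members = []
        self._next_seq = 0

    def join(self, member):
        self.members.append(member)

    def send(self, sender, payload):
        # Assign the next sequence number, then deliver to every member
        # (including the sender) so that all members see the same order.
        seq = self._next_seq
        self._next_seq += 1
        for member in self.members:
            member.deliver(seq, sender.name, payload)


group = Group("G1")
a, b, c = Member("a"), Member("b"), Member("c")
for m in (a, b, c):
    group.join(m)

group.send(a, "start protocol")
group.send(b, "ack")
```

After these two sends, every member holds the same list of delivered messages in the same order, which is the property the ordered messaging scheme is intended to guarantee.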
While the member jobs in a group utilize ordered messaging to communicate with one another to cooperatively perform tasks, typically a clustered computer system also requires support for entities that are external to a group to send a request to the group to perform various group operations. Conventionally, external access to a group has been supported through assigning a specific network address (e.g., a TCP/IP address) to the group, such that an external entity wishing to access a group can send a request to that specific address. This technique is sometimes called N+1 addressing, where N addresses are assigned to the N nodes in a group, plus one additional address for the group itself.
As with other conventional network addressing protocols, typically a name service is provided in a conventional clustered computer system to map network addresses of groups to “group names”. A name can generally be any form of shorthand identifier or alias for a particular entity that is accessible over a network. An advantage to using a name in lieu of the direct address to access a networked entity is that since a network address assigned to an entity may change from time to time, the entity can always be accessed by the name even if the mapping of the name is modified.
The address of an entity on a network, including that of a cluster node or a group, is typically obtained in a conventional clustered computer system by accessing a network name server such as a Domain Name System (DNS) server resident on the network. Thus, should an entity desire to access another entity on a network, the accessing entity typically resolves the name of the entity to be accessed through the network name server, and then sends a message to the network address returned by the server. Accordingly, in the case of an external access to a group, an entity wishing to send a request to the group resolves the group name through the network name server, and sends a message to the group address that is returned by the server.
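The resolve-then-send sequence described above can be sketched as follows. This is an illustration only: the `NameServer` class is an in-memory stand-in for a network name server, and the group name, the addresses (drawn from a documentation address range), and the `send_request` helper are all hypothetical.

```python
class NameServer:
    """In-memory stand-in for a network name server (assumption)."""

    def __init__(self):
        self._table = {}

    def register(self, name, address):
        self._table[name] = address

    def resolve(self, name):
        return self._table[name]


def send_request(name_server, group_name, request, transport):
    # Step 1: resolve the group name to the group's network address.
    address = name_server.resolve(group_name)
    # Step 2: send the request to whatever address the server returned.
    transport(address, request)
    return address


sent = []  # records (address, request) pairs instead of using a real network


def record(address, request):
    sent.append((address, request))


ns = NameServer()
ns.register("payroll-group", "192.0.2.10")
send_request(ns, "payroll-group", "status", record)

# The group's address changes; callers keep using the same name.
ns.register("payroll-group", "192.0.2.20")
send_request(ns, "payroll-group", "status", record)
```

The second request reaches the new address without any change on the caller's side, which is the benefit of addressing a group by name. The sketch also shows that any caller able to reach the name server can obtain the group address.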
The use of an external name server in connection with accessing a group presents a number of problems. First, a significant concern is presented that a node or other entity outside of a cluster could send messages to a group that could interfere with the group's operation. Particularly given the security risks presented by viruses, Trojan horses, and other malicious programs, and coupled with the increasing use of the Internet, the ability to access a group simply by accessing a network address associated with that group presents a significant security risk for a clustered computer system.
Second, in many instances, it may be desirable to implement multiple clusters, or cluster “instances”, on a given clustered computer system or network, e.g., in a logically partitioned system where multiple cluster instances may execute concurrently in different logical computer systems that execute on the same physical system. Where multiple clusters exist, however, the same group name cannot exist in each cluster, since the clusters often share a common name server that cannot resolve a single group name to different network addresses. Conventionally, clusters can avoid these problems by requiring a separate dedicated Local Area Network (LAN) for each cluster, and by prohibiting any cluster from spanning subnets. However, it is often desirable to implement a clustered computer system in a wide variety of network topologies, including geographically dispersed implementations where nodes may be interconnected with one another over large distances, and implementations where nodes are coupled over a public network such as the Internet. Consequently, restricting a cluster to a dedicated LAN is not desirable in many circumstances.
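The group-name limitation noted above can be illustrated with a small sketch. It assumes a common name server that holds exactly one address per name, modeled here as a dictionary; the cluster names, group name, and addresses are hypothetical.

```python
# A common name server that maps each name to exactly one address (assumption).
shared_names = {}


def register_group(cluster, group_name, address):
    """Register a group address; return any mapping it displaces."""
    # The cluster argument is informational only: the shared table is keyed
    # by group name alone, which is the source of the collision.
    previous = shared_names.get(group_name)
    shared_names[group_name] = address
    return previous


register_group("cluster-A", "db-group", "192.0.2.10")
displaced = register_group("cluster-B", "db-group", "192.0.2.20")
```

Here the registration from the second cluster displaces the mapping of the first, so a request addressed to the name `db-group` can reach only one of the two clusters.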
Therefore, a significant need exists in the art for a manner of supporting external accesses to groups resident in a clustered computer system, and in particular a mechanism for supporting external access to groups that is capable of limiting access only to authorized entities.