The Internet has evolved into a ubiquitous network that has inspired many companies to rely upon it as a major resource for doing business. For example, many businesses use the Internet, and similar networking infrastructures, to manage critical applications, access content servers, automate assembly and production lines, and implement complex control systems. This reliance has driven demand for stronger protection and availability guarantees for resources accessed over the network.
In response to the need for a networking infrastructure that provides both high availability of system resources and protection from failures, a cluster architecture was developed. A cluster can be defined as multiple loosely coupled network devices that cooperate to provide client devices access to a set of services, resources, and the like, over the network. A cluster is configured such that, in many respects, it can be viewed by client devices as though it were a single computer.
A variety of different types of clusters have evolved, including high availability (HA) clusters, high performance clusters, load balanced clusters, and the like. Examples of clustering systems include the Veritas™ Cluster Server, HP Serviceguard, and Microsoft Cluster Server. HA clusters are a class of coupled distributed systems that provide high availability for applications, typically by using hardware redundancy to recover from single points of failure. HA clusters typically include multiple nodes that interact with each other to provide users with various applications and system resources as a single entity. Each node typically runs a local operating system kernel.
In a typical cluster, at least one of the nodes is designated as a master (or coordinating) node of the cluster, while the other nodes are typically known as members (or sometimes, slaves) of the cluster. The master node is configured to manage scheduling of tasks to members within the cluster, coordinate membership in the cluster, and handle related network management issues. The members of the cluster are typically configured to perform scheduled tasks for a client device.
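The division of roles described above can be illustrated with a minimal sketch. The names used here (`Node`, `Cluster`, `schedule`, `fail`) are hypothetical, and the rule of choosing the lowest live node identifier as master is an assumption made purely for illustration; it is not a method attributed to any particular clustering system or to the present invention.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Node:
    """A single cluster node (hypothetical model for illustration)."""
    node_id: int
    alive: bool = True
    tasks: list = field(default_factory=list)


class Cluster:
    """Toy cluster with one master node and the remaining nodes as members."""

    def __init__(self, node_ids):
        self.nodes = {i: Node(i) for i in node_ids}
        self.master_id = self._select_master()

    def _select_master(self):
        # Assumption for illustration only: the live node with the
        # lowest identifier is designated master.
        live = [n.node_id for n in self.nodes.values() if n.alive]
        if not live:
            raise RuntimeError("no live nodes remain in the cluster")
        return min(live)

    def members(self):
        # Every live node other than the master, in identifier order.
        return sorted(
            (n for n in self.nodes.values()
             if n.alive and n.node_id != self.master_id),
            key=lambda n: n.node_id,
        )

    def schedule(self, tasks):
        # The master hands tasks to members in round-robin order.
        members = self.members()
        if not members:
            raise RuntimeError("no members available to run tasks")
        for task, member in zip(tasks, cycle(members)):
            member.tasks.append(task)

    def fail(self, node_id):
        # Mark a node as failed; if it was the master, designate a new one.
        self.nodes[node_id].alive = False
        if node_id == self.master_id:
            self.master_id = self._select_master()
```

For example, `Cluster([3, 1, 2])` would designate node 1 as master under the assumed rule, and a call to `schedule(["a", "b", "c"])` would distribute the tasks across nodes 2 and 3. Calling `fail(1)` would then cause node 2 to take over as master.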
Selection of the master, whether when a cluster is initially formed or when an existing master node within a cluster fails, is of major concern to a cluster architecture. However, such selection often remains complex and time-consuming, resulting in lost time and money until the cluster becomes useful to client devices. Traditional static approaches to initially forming or re-establishing a cluster often suffer from a single point of failure, and may require human intervention to maintain. Traditional dynamic approaches may be difficult to achieve due to the burdens they often impose on network resources or a lack of universal uniqueness on the network. Thus, it is with respect to these considerations, and others, that the present invention has been made.