Distributed computing systems are an increasingly important part of research, governmental, and enterprise computing. Among the advantages of such systems is their ability to handle a variety of different computing scenarios, including large computational problems, high volume data processing situations, and high availability situations. For applications that require the computer system to be highly available, e.g., able to be maintained while still providing services to system users, a cluster of computer systems is a useful implementation of the distributed computing model. In the most general sense, a cluster is a distributed computer system whose members work together as a single entity to cooperatively provide processing power and mass storage resources. With a cluster, the processing load of the computer system is typically spread over more than one computer, thereby eliminating single points of failure. Consequently, programs executing on the cluster can continue to function despite a problem with one computer in the cluster. In another example, one or more computers of the cluster can be ready for use in the event that another computer in the cluster fails. While each computer in a cluster typically executes an independent instance of an operating system, additional clustering software is executed on each computer in the cluster to facilitate communication and desired cluster behavior.
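The failover behavior described above, in which a surviving computer takes over when another computer in the cluster fails, can be illustrated with a minimal sketch. The sketch is only an assumption-laden model and not any particular clustering product: the names `Node`, `Cluster`, `start`, and `fail` are hypothetical, and real clustering software would also handle health detection, shared state, and communication between computers.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One computer in the cluster (hypothetical model)."""
    name: str
    healthy: bool = True


@dataclass
class Cluster:
    """Tracks which node currently hosts each service (hypothetical model)."""
    nodes: list[Node]
    placement: dict[str, str] = field(default_factory=dict)  # service -> node name

    def start(self, service: str) -> str:
        """Place a service on the first healthy node and return that node's name."""
        for node in self.nodes:
            if node.healthy:
                self.placement[service] = node.name
                return node.name
        raise RuntimeError("no healthy node available")

    def fail(self, node_name: str) -> None:
        """Mark a node as failed and restart its services on a surviving node."""
        for node in self.nodes:
            if node.name == node_name:
                node.healthy = False
        for service, host in list(self.placement.items()):
            if host == node_name:
                self.start(service)


cluster = Cluster([Node("node_a"), Node("node_b")])
cluster.start("mail")            # initially placed on node_a
cluster.fail("node_a")           # node_a fails; "mail" moves to node_b
print(cluster.placement["mail"])
```

In this sketch the service remains available after one node fails because a second node is ready to host it, which is the property that removes the single point of failure.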
FIG. 1 illustrates a simplified example of a computing system 100 that is not operating as a cluster. The members of the computing system 100 include host 130 and host 140. As members of computing system 100, hosts 130 and 140 are typically individual computer systems having some or all of the software and hardware components illustrated. For example, hosts 130 and 140 operate as servers and as such include well-known hardware and software components such as: file system software, storage virtualization software (volume management software), an operating system, various drivers for the server's hardware, and platform hardware such as various interface devices. As servers, hosts 130 and 140 also include server applications (132 and 142) such as communication and collaboration servers (e.g., e-mail servers), web servers, file servers, database management systems (DBMS), media servers, enterprise application servers, and the like. When operating on a server computer system, such server applications typically utilize some amount of local storage (134 and 144) for server application configuration information, software components, operational information, and data. FIG. 7 (described below) illustrates some of the features common to such computer systems, and those having ordinary skill in the art will understand the wide variation in hardware and software components of such systems.
In support of various applications and operations, hosts 130 and 140 can exchange data over, for example, network 120, typically a local area network (LAN), e.g., an enterprise-wide intranet, or a wide area network (WAN) such as the Internet. Additionally, network 120 provides a communication path for various client computer systems 110 to communicate with hosts 130 and 140. In addition to network 120, hosts 130 and 140 can communicate with each other over a private network (not shown).
Other elements of computing system 100 include storage area network (SAN) 150 and storage devices such as tape library 160 (typically including one or more tape drives), a group of disk drives 170 (i.e., “just a bunch of disks” or “JBOD”), and intelligent storage array 180. As shown in FIG. 1, both hosts 130 and 140 are coupled to SAN 150. Such coupling can be through single or multiple paths. SAN 150 is conventionally a high-speed network that allows the establishment of direct connections between storage devices 160, 170, and 180 and hosts 130 and 140. SAN 150 can be implemented using a variety of different technologies including SCSI, fibre channel arbitrated loop (FCAL), fibre channel switched fabric, IP networks (e.g., iSCSI), InfiniBand, etc. SAN 150 can also include one or more SAN-specific devices such as SAN switches, SAN routers, SAN hubs, or some type of storage appliance. Thus, SAN 150 is shared between the hosts and allows the hosts to share storage devices, providing greater availability and reliability of storage. Although hosts 130 and 140 are shown connected to storage devices 160, 170, and 180 through SAN 150, this need not be the case. Shared resources can be directly connected to some or all of the hosts in the computing system, and computing system 100 need not include a SAN. Other storage schemes include the use of shared direct-attached storage (DAS) over shared SCSI buses. Alternatively, hosts 130 and 140 can be connected to multiple SANs.
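The storage arrangements described above can be modeled as a simple reachability question: which storage devices can every host reach, whether through the SAN or through a direct attachment? The following is a minimal sketch under stated assumptions. The identifiers (`host_130`, `jbod_170`, and so on) merely echo the reference numerals of FIG. 1, the functions `reachable_storage` and `shared_storage` are hypothetical, and the model assumes that every storage device on the SAN is reachable from every host on the SAN, ignoring zoning, multipathing, and access control.

```python
from __future__ import annotations

# Hypothetical identifiers echoing the storage devices of FIG. 1.
STORAGE = {"tape_library_160", "jbod_170", "array_180"}


def reachable_storage(host: str, san_members: set[str],
                      direct_attached: dict[str, set[str]]) -> set[str]:
    """Storage devices a host can reach via the SAN or a direct attachment."""
    devices = set(direct_attached.get(host, set()))
    if host in san_members:
        # Assumption: every storage device on the SAN is reachable
        # from every host on the SAN.
        devices |= STORAGE & san_members
    return devices


def shared_storage(hosts: list[str], san_members: set[str],
                   direct_attached: dict[str, set[str]]) -> set[str]:
    """Devices that every listed host can reach (candidates for shared use)."""
    reach = [reachable_storage(h, san_members, direct_attached) for h in hosts]
    return set.intersection(*reach) if reach else set()


# Both hosts coupled to the SAN, as in FIG. 1.
san = {"host_130", "host_140"} | STORAGE
print(sorted(shared_storage(["host_130", "host_140"], san, {})))
```

Passing a non-empty `direct_attached` mapping and a smaller `san_members` set models the alternative in which shared resources are directly connected to some of the hosts and no SAN is required.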
Hosts 130 and 140 can be designed to operate completely independently of each other as shown, or may interoperate to form some manner of cluster. As members of a cluster, servers or hosts are often referred to as “nodes.” Thus, as is well known in the art, a node in a computer cluster is typically an individual computer system having some or all of the software and hardware components illustrated.
In order to operate servers such as hosts 130 and 140 as a cluster, both the underlying system software (e.g., operating system, file system, volume management, server-to-server communication) and any server applications operating on the servers must be designed and/or configured to operate in a cluster. Installing such systems typically occurs from the ground up, i.e., first basic system software is installed, then system software needed to support clustering operations, and finally cluster-aware and/or cluster-compatible server application software is installed. In many cases, cluster-aware and/or cluster-compatible server application software is specifically designed to operate only in particular (typically proprietary) cluster environments. For example, Microsoft Exchange 2000 Server (a common e-mail server application) is designed to support cluster operation only in conjunction with clustering services provided by the Microsoft Windows 2000 Server operating system.
However, there are many instances where it is desirable to transform an existing standalone server application installation (such as those illustrated in FIG. 1) into a clustered installation. The primary advantage of such a transformation is that it obviates the need to undergo the laborious process of saving application data and configuration information, uninstalling the server application, upgrading/reinstalling/activating clustering system software, reinstalling the server application, and reconfiguring the server application. Moreover, it is desirable to be able to implement such server applications in clustering environments for which they were not necessarily designed, and to extend to these server applications clustering functionality that might not otherwise be available in the clustering systems with which they have been designed to operate.