1. Field of the Invention
This invention relates to clustered computer systems. More particularly, this invention relates to efficiencies in the deployment of applications on a new node of a cluster.
2. Description of the Related Art
Cluster technology now supports computer system availability, and has become indispensable to large business operations. A cluster is a collection of one or more complete systems that cooperate to provide a single, unified computing capability. Typically, the cluster members are interconnected by a data network, such as an IP network. From the perspective of the end user, the cluster operates as though it were a single system. Clusters provide redundancy, such that any single outage, whether planned or unplanned, in the cluster does not disrupt services provided to the end user. End user services can be distributed among the cluster members, and are automatically relocated from system to system within the cluster in a relatively transparent fashion by a cluster engine, in accordance with policies established by the cluster resource group management.
Bringing a new node into an existing cluster is time consuming. Classically, an entire image would be installed onto the node, or at least cloned from an existing node. Disk cloning products, such as Ghost™, available from Symantec Corporation, 20330 Stevens Creek Blvd., adopt this approach. However, in the dynamic environment of web server applications, this is too time consuming to be practical.
The RPM product, available from Red Hat, Inc., 2600 Meridian Parkway, Durham, N.C. 27713, reduces the complexity of application installation by packaging applications with installation scripts and a list of dependencies. The RPM utility performs the dependency check, unpacks the applications, and runs the installation scripts.
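By way of illustration only, the three steps just described (dependency check, unpacking, and running of installation scripts) can be sketched as follows. This is a simplified model and not the RPM implementation; the names `missing_dependencies`, `install`, `unpack`, and `run_script`, and the dictionary layout of a package, are hypothetical and chosen for the example.

```python
def missing_dependencies(package, installed):
    """Return the declared dependencies of `package` that are not installed."""
    return sorted(set(package["requires"]) - set(installed))


def install(package, installed, unpack, run_script):
    """Check dependencies, then unpack the package and run its script.

    `unpack` and `run_script` are caller-supplied callables (hypothetical
    stand-ins for the real unpacking and scripting steps).
    """
    missing = missing_dependencies(package, installed)
    if missing:
        # Refuse to install when any declared dependency is absent.
        raise RuntimeError("unmet dependencies: " + ", ".join(missing))
    unpack(package)
    run_script(package)
    # Return the updated set of installed package names.
    return set(installed) | {package["name"]}
```

Note that every node running such a procedure ends up with its own independent copy of the application.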
Both of the above approaches assume that applications and data are installed on each machine of the cluster as independent copies. This requirement has a serious drawback: content management becomes difficult, because many copies of the same data must be maintained and kept coherent.
It is possible to employ a shared file system to store applications and data in order to reduce the application priming time. Ideally, one symbolic link to a subdirectory of the shared file system would be sufficient to fully enable applications on the new node. However, this technique is effective only for applications that do not use any files located outside application-specific directories. Many applications employ files that are located in system directories, for example the directory /etc. Such applications would require multiple symbolic links, which would be cumbersome and time consuming to establish. Furthermore, symbolic links can be used only for certain types of files. They are inapplicable, for example, to directories that are created during installation and are used only for local data. For example, the installation of the Apache web server, available from Red Hat, Inc., requires a directory /var/log/http, in which each node of the cluster is meant to keep a local log of http activity. Creating such a directory on a shared file system is problematic, since different instances of the application will overwrite one another's files.
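The single-link technique and its limitation can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed implementation: the function names `link_shared_app` and `files_outside_app_dir`, the application name `webapp`, and the directory layout used in `demo` are all hypothetical.

```python
import os
import tempfile


def link_shared_app(shared_root, local_root, app_name):
    """Expose an application stored on the shared file system at a local path."""
    target = os.path.join(shared_root, app_name)
    link = os.path.join(local_root, app_name)
    os.makedirs(local_root, exist_ok=True)
    os.symlink(target, link, target_is_directory=True)
    return link


def files_outside_app_dir(app_files, app_dir):
    """Return the files that a single link to `app_dir` would not cover."""
    prefix = app_dir.rstrip("/") + "/"
    return [f for f in app_files if not f.startswith(prefix)]


def demo():
    """Create a throwaway shared directory and link it into a mock node."""
    with tempfile.TemporaryDirectory() as tmp:
        shared = os.path.join(tmp, "shared")
        local = os.path.join(tmp, "node1", "opt")
        os.makedirs(os.path.join(shared, "webapp"))
        with open(os.path.join(shared, "webapp", "index.html"), "w") as fh:
            fh.write("hello")
        link = link_shared_app(shared, local, "webapp")
        # The file is read through the link, not copied to the node.
        with open(os.path.join(link, "index.html")) as fh:
            return os.path.islink(link), fh.read()
```

Applied to a hypothetical file list such as /opt/webapp/bin/run, /etc/webapp.conf, and /var/log/http/access.log, `files_outside_app_dir` with the directory /opt/webapp reports the latter two, each of which would need separate handling on every node.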