1. Field of the Invention
Embodiments of the present invention relate to the field of networked computing. Specifically, embodiments of the present invention relate to distributed application environment deployment in a multi-computer system.
2. Background
Over the years, as the internet has expanded and computers have multiplied, the need for clustered computing such as High Performance Computing (HPC) has increased. Clustered computing involves multiple compute nodes, usually a server grid, that work together to achieve a common task. For example, several (typically hundreds of) compute nodes may be clustered together to share the load of serving a high-traffic website. In large-scale systems such as this, a trend in software deployment is to centralize data management on a globally accessible file system with stateless computing nodes. A common example of this is Operating System (OS) software image management, where the compute nodes are activated with the distributed application environment by either diskless booting protocols or remote software installation to local storage. Under this architecture, a boot image is required for each compute node in the cluster. The boot image necessarily contains the kernel; it may additionally contain the application software that is intended to be run on the compute node.
The primary concern in clustered computing is low cluster bring-up time. The software that provides the boot images for the cluster typically stores a master boot image. It may then either pre-create clones of this master image for each server in the cluster, or it may create them “on the fly.”
Creating a boot image on the fly involves copying the entire contents of the master image, which are typically in the range of 5-15 GB. Even with a significant amount of bandwidth by today's standards, this method results in a long bring-up time, because the full image must be copied for every compute node that boots.
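As a rough illustration of the scale involved, the following sketch estimates the data volume and transfer time when a full copy of the master image is made for every compute node. Only the 5-15 GB image size comes from the description above; the node count of 100, the single shared 1 Gbit/s link, and the function name `clone_cost` are assumptions chosen for illustration, and the estimate ignores protocol overhead and disk throughput.

```python
def clone_cost(image_gb, nodes, link_gbit_per_s):
    """Estimate the cost of copying one full master image per node.

    Returns (total_gb, seconds), assuming every copy crosses a single
    shared link and nothing else limits throughput.
    """
    total_gb = image_gb * nodes                  # one full copy per node
    seconds = total_gb * 8 / link_gbit_per_s     # gigabytes -> gigabits
    return total_gb, seconds


# Hypothetical cluster: 100 nodes behind one 1 Gbit/s link.
for image_gb in (5, 15):
    total_gb, seconds = clone_cost(image_gb, nodes=100, link_gbit_per_s=1.0)
    print(f"{image_gb} GB image: {total_gb} GB copied, "
          f"about {seconds / 60:.0f} minutes on the link")
```

Under these assumptions the copy time grows linearly with both image size and node count, which is why full-copy creation on the fly scales poorly.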
Pre-creating a boot image for each server is advantageous from the point of view of cluster bring-up time. However, since one often does not know in advance how many servers will ever be booted, this scheme may result in wasted disk space.
Both of the preceding methods suffer from the same major problem: updating the boot image(s) for the cluster is cumbersome, as it means updating a number of copies of the boot image.
Additionally, once some compute nodes have booted, they will often engage in redundant activities with respect to each other. For example, assume that a cluster involves 20 compute nodes, each running the same operating system and using substantially similar hardware. Among the 20 compute nodes, there will be a great deal of commonality between their boot images (e.g., common operating system files, common drivers, common library files, common applications, etc.). Moreover, when each of the 20 compute nodes runs virus scans (e.g., weekly) on its image, a large portion of the data scanned will be the same from one compute node to the next. Thus, to the extent that there is redundancy in the operations of the compute nodes (such as virus scanning), CPU resources, disk space, and data bus bandwidth are wasted.
A Branching Store File System, as described in patent application Ser. No. 11/026,622 entitled “BRANCHING STORE FILE SYSTEM” filed Dec. 30, 2004, pending, and assigned to the assignee hereof, was developed as a solution to the boot image update problem. In a branching store file system, a read-only base image (or “root” image) of the application environment is created. The root image is accessible by all compute nodes in the cluster. Changes made by a compute node to the root image are stored in a “leaf” image unique to that compute node. A filter operating between the compute nodes and the file system(s) merges the changes recorded on the leaf images with the root image and delivers the result to the appropriate compute node. From the point of view of the compute node, it is running its own unique and cohesive instance of the application environment. While this system allows for creation of boot images on the fly without severely diminishing bring-up time, a separate version of the system must be created for each unique operating system. Thus, migrating a computing cluster from one operating system to another is much more complicated than simply installing a new root image containing the new OS.
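The root/leaf arrangement described above can be pictured with a minimal sketch. This is a toy in-memory model and not the implementation of the referenced application: the class name `BranchingStore`, its methods, and the file paths are hypothetical, and files are modeled as whole entries in a dictionary instead of blocks on a storage device.

```python
class BranchingStore:
    """Toy model: one shared read-only root image plus one leaf per node."""

    def __init__(self, root_files):
        self._root = dict(root_files)   # shared base image, never modified
        self._leaves = {}               # node id -> {path: changed content}

    def write(self, node, path, data):
        # Changes are recorded only in the writing node's own leaf.
        self._leaves.setdefault(node, {})[path] = data

    def read(self, node, path):
        # The "filter": prefer the node's leaf, fall back to the root.
        leaf = self._leaves.get(node, {})
        if path in leaf:
            return leaf[path]
        return self._root[path]

    def view(self, node):
        # Merged image as seen by one node: root overlaid with its leaf.
        merged = dict(self._root)
        merged.update(self._leaves.get(node, {}))
        return merged


store = BranchingStore({"/etc/hostname": "generic", "/bin/sh": "<shell>"})
store.write("node1", "/etc/hostname", "node1")

print(store.read("node1", "/etc/hostname"))   # node1 sees its own change
print(store.read("node2", "/etc/hostname"))   # node2 still sees the root
```

In this model, adding a node costs nothing until that node writes, and each node's merged view looks like a private, complete image even though unchanged files exist only once in the root.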