1. Field of the Invention
The invention relates to cluster applications, resource management, and virtualization techniques.
2. Background Art
The power of computer technology, including CPU, memory, storage, and network, has been growing faster than the needs of many applications. Many users of computer systems, and more specifically clusters, place a single application on each system. This already results in vastly underutilized computer systems. People are willing to use one system per application for several reasons:
    Security: placing applications on their own systems ensures the isolation of application data and application processing.
    Resource Management: the user of the application can clearly see what resources are being used, and system managers can readily assign costs.
    Application Fault Isolation: some application failures require that the entire machine be rebooted in order to clear the problem. The placement of applications on their own machines ensures that the failure of one application does not impact other applications.
The new generation of CPU, memory, storage, and network technology is even more powerful relative to the needs of many computer applications. This will result in computer systems that are mostly idle. Cost factors motivate people to look for ways to better utilize this equipment.
The use of virtualization is increasing. In general, virtualization relates to creating an abstraction layer between software applications and physical resources. There are many approaches to virtualization.
One existing operating system virtualization technique is SOLARIS Containers, available in the SOLARIS operating system from Sun Microsystems, Inc., Santa Clara, Calif. SOLARIS Containers includes several different technologies that are used together to consolidate servers and applications. With server virtualization, applications can be consolidated onto fewer servers. For example, multiple virtual servers may exist on a single physical server.
The SOLARIS Containers approach to implementing virtualization involves a technology referred to as SOLARIS zones and a technology referred to as SOLARIS resource pools. Zones are separate environments on a machine that logically isolate applications from each other. Each application receives a dedicated namespace. Put another way, a zone is a type of sandbox. A resource pool is a set of physical resources such as, for example, processors. The SOLARIS pools facility is used to partition the system resources into a plurality of resource pools for the purposes of resource management. The SOLARIS zones facility virtualizes the operating system to improve security and to provide isolation and administrative delegation.
When consolidating applications with SOLARIS Containers, physical resources are partitioned into a number of resource pools. A zone may be created for each application, and then one or more zones are assigned to each resource pool.
Another technology involved in SOLARIS Containers is called the Fair Share Scheduler (FSS). The Fair Share Scheduler is used when multiple zones are assigned to the same resource pool. The scheduler software enables resources in a resource pool to be allocated proportionally to applications, that is, to the zones that share the same resource pool.
In an existing implementation of SOLARIS Containers, the pools facility is static. That is, the pool configurations must be defined in advance. However, SOLARIS zones are dynamic. There can be many zones defined; the zones may not all be running at a particular time. Zones can be rebooted or even moved to a new host.
In the SOLARIS Containers approach to virtualization, zones and resource pools provide application containment. Within an application container, the application believes that it is running on its own server; however, the kernel and a number of system libraries are shared between the various containers. As well, the physical resources are shared in accordance with the configured resource pools.
FIGS. 1-3 illustrate an existing implementation of SOLARIS Containers, showing how virtualization allows multiple applications and servers to be consolidated onto a single physical server using application containers composed of zones and resource pools. As shown in FIG. 1, a single physical server 10, using server virtualization, allows the consolidation of an email application 12, a first web server 14, and a second web server 16. The single physical server 10 includes multiple virtual servers such that, after consolidation, each of the email application, first web server, and second web server exists on its own virtual server on server 10.
As best shown in FIG. 2, in order to create the application containers, each application has its own zone 22, 24, and 26. FIG. 3 illustrates the completed example including first and second resource pools 30 and 32, respectively. Zones 22, 24, and 26 are non-global zones; the global zone is indicated at 34. Global zone 34 is the original SOLARIS operating system instance.
With continuing reference to FIG. 3, zone 22 has a dedicated resource pool, pool 32. Zone 24, zone 26, and the global zone 34 share resource pool 30. The Fair Share Scheduler (FSS) proportionally allocates resources to zone 24, zone 26, and global zone 34 in accordance with assigned numbers of shares.
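The proportional allocation described for resource pool 30 can be illustrated with a small worked example. The following is a minimal sketch, not the scheduler's actual implementation: the share counts (2, 1, 1) and the function name `fss_entitlements` are hypothetical, chosen only for illustration, and the sketch ignores runtime behavior such as which zones are actually competing for CPU at a given moment.

```python
from fractions import Fraction


def fss_entitlements(shares):
    """Return each zone's fraction of a pool, proportional to its shares.

    `shares` maps a zone name to its assigned number of shares.
    This is a simplified model of proportional allocation only.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one zone must hold a positive number of shares")
    return {zone: Fraction(n, total) for zone, n in shares.items()}


# Hypothetical share assignment for the zones that share resource pool 30.
pool_30_shares = {"zone 24": 2, "zone 26": 1, "global zone 34": 1}

entitlements = fss_entitlements(pool_30_shares)
for zone, fraction in entitlements.items():
    print(f"{zone}: {fraction} of pool 30")
```

Under these assumed shares, zone 24 would be entitled to half of pool 30 and the other two zones to a quarter each. Zone 22 does not appear because it has pool 32 to itself.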
As shown, there are four application containers. The first container is composed of zone 22 and resource pool 32. The second container is composed of zone 24 and resource pool 30. The third container is composed of zone 26 and resource pool 30. The fourth container is composed of global zone 34 and resource pool 30.
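The four containers of FIG. 3 can be written down as zone and resource pool pairs. The sketch below is only an illustrative data model: the `Container` class and the `zones_by_pool` helper are hypothetical names and do not correspond to any SOLARIS interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    """An application container: one zone bound to one resource pool."""
    zone: str
    pool: str


# The four containers described for FIG. 3.
containers = [
    Container("zone 22", "pool 32"),
    Container("zone 24", "pool 30"),
    Container("zone 26", "pool 30"),
    Container("global zone 34", "pool 30"),
]


def zones_by_pool(items):
    """Group zone names by the resource pool they are assigned to."""
    grouped = {}
    for item in items:
        grouped.setdefault(item.pool, []).append(item.zone)
    return grouped


pools = zones_by_pool(containers)
# A pool with exactly one zone is dedicated; otherwise it is shared.
dedicated = [pool for pool, zones in pools.items() if len(zones) == 1]
shared = [pool for pool, zones in pools.items() if len(zones) > 1]
print("dedicated pools:", dedicated)
print("shared pools:", shared)
```

Grouping by pool makes the distinction in the text explicit: pool 32 is dedicated to zone 22, while pool 30 is shared by three zones and is therefore where the Fair Share Scheduler applies.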
Sun Microsystems, Inc. introduced SOLARIS zones in the SOLARIS 10 Operating System. In summary, SOLARIS zones provides:
    Security: an application or user within a zone can only see and modify data within that zone.
    Resource Management: the system administrator can control the allocation of resources at the granularity of the zone. The system administrator can assign specific resources, such as file systems, to a zone. The system administrator can effectively control the percentage of some resources, such as CPU power, allocated to a zone.
    Application Fault Isolation: when an application error condition necessitates a reboot, that reboot becomes a zone reboot when the application resides within a zone. The reboot of one zone does not affect any other zone. Hence, the failure of an application in one zone does not impact applications in other zones.
Many customers are now using zone technology to safely consolidate applications from separate machines onto a single machine. In the existing implementation, zones are limited to a single machine, and do not address the needs of cluster applications. Other existing operating system virtualization technologies also target single machines, and do not address the needs of cluster applications.
Cluster applications are often divided into two categories:
    Failover Application: one instance of the application runs on one node at a time. If the machine hosting the application fails, the cluster automatically restarts the application on another node. Failover applications can move between nodes for reasons of load balancing, hardware maintenance, or the whims of the administrator.
    Scalable Application: different instances of the application can be running simultaneously on different nodes of the cluster.
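The difference between the two categories above can be sketched as two placement rules over the set of operational nodes. This is a simplified illustration under stated assumptions: the node names, the preference-ordered node list, and the functions `place_failover` and `place_scalable` are hypothetical and are not taken from any cluster product.

```python
def place_failover(preferred_nodes, operational):
    """Return the single node that hosts a failover application.

    Assumed policy: the first operational node in preference order.
    Returns None if no candidate node is operational.
    """
    for node in preferred_nodes:
        if node in operational:
            return node
    return None


def place_scalable(candidate_nodes, operational):
    """Return every operational candidate node; each runs one instance."""
    return [node for node in candidate_nodes if node in operational]


nodes = ["node1", "node2", "node3"]      # hypothetical cluster nodes
operational = {"node1", "node2", "node3"}

before = place_failover(nodes, operational)   # hosted on the first node

operational.discard("node1")                  # the hosting machine fails
after = place_failover(nodes, operational)    # restarted on another node
instances = place_scalable(nodes, operational)

print("failover host before failure:", before)
print("failover host after failure:", after)
print("scalable instances after failure:", instances)
```

Both rules take the set of operational nodes as input, which is why a cluster application needs to know which potential host machines are currently operational.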
Safely consolidating cluster applications requires keeping these applications separate, while accounting for the fact that they are spread across multiple machines and will move dynamically between machines.
Many cluster applications require information about the status of potential host machines; in other words, these applications need an identification of the machines that are operational.
Background information relating to SOLARIS Containers technology may be found in Joost Pronk van Hoogeveen and Paul Steeves, “SOLARIS 10 How To Guides: Consolidating Servers and Applications with SOLARIS Containers,” 2005, Sun Microsystems, Inc., Santa Clara, Calif.
Further background information may be found in “System Administration Guide: Solaris Containers-Resource Management and Solaris Zones,” Part No.: 817-1592, 2006, Sun Microsystems, Inc., Santa Clara, Calif.
One existing clustering technique is Sun Cluster, available in the SOLARIS operating system from Sun Microsystems, Inc., Santa Clara, Calif.
Background information relating to Sun Cluster technology may be found in Angel Camacho, Lisa Shepherd, and Rita McKissick, “SOLARIS 10 How To Guides: How to Install and Configure a Two-Node Cluster,” 2007, Sun Microsystems, Inc., Santa Clara, Calif.
Further background information may be found in “Sun Cluster System Administration Guide for Solaris OS,” Part No.: 817-6546, 2004, Sun Microsystems, Inc., Santa Clara, Calif.
Further background information may be found in “Sun Cluster Software Installation Guide for Solaris OS,” Part No.: 819-0420, 2005, Sun Microsystems, Inc., Santa Clara, Calif.
Another existing approach to virtualization involves what are referred to as virtual machines. In this approach to virtualization, software running on the host operating system (or in some cases below the host operating system) allows one or more guest operating systems to run on top of the same physical hardware at the same time. In this approach, the guest operating system is a full operating system, including the kernel and libraries. Existing virtual machine technologies support multiple operating system images on a single machine. However, virtual machines, when compared to virtual operating systems, place a significant burden on a physical machine and impose significant overhead on virtualized resources.