Clusters are groups of computers that use redundant computing resources to provide continued service when individual system components fail. More specifically, clusters eliminate single points of failure by providing multiple servers, multiple network connections, redundant data storage, etc. Clustering systems are often combined with storage management products that provide additional useful features, such as journaling file systems, logical volume management, multipath input/output (I/O) functionality, etc.
In a high-availability clustering system, the failure of a server (or of a specific computing resource used thereby, such as a network adapter, storage device, etc.) is detected, and the application that was running on the failed server is automatically restarted on another computing system. This process is called “failover.” The high-availability clustering system can also detect the failure of the application itself, and fail the application over to another node. In effect, the high-availability clustering system monitors applications, the servers the applications run on, and the resources used by the applications, to ensure that the applications remain highly available.
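The failover behavior described above can be illustrated with a minimal sketch. It is not drawn from any particular clustering product: the `Node` class, its `healthy` flag, and the `fail_over` function are hypothetical names introduced only for illustration, and failure detection is assumed to have already set the flag.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node; `healthy` is assumed to be set by a monitor."""
    name: str
    healthy: bool = True
    applications: set = field(default_factory=set)


def fail_over(nodes, app):
    """Return the node running `app`, moving it to a healthy node if needed."""
    current = next((n for n in nodes if app in n.applications), None)
    if current is None:
        raise LookupError(f"{app!r} is not running on any node")
    if current.healthy:
        return current  # no failure detected, nothing to do

    # The current node has failed: pick any healthy node as the target.
    target = next((n for n in nodes if n.healthy), None)
    if target is None:
        raise RuntimeError("no healthy node available for failover")

    # "Restart" the application on the target node.
    current.applications.discard(app)
    target.applications.add(app)
    return target
```

A real clustering system would also monitor the application and its resources directly, and would choose the target node according to configured policy rather than taking the first healthy node.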
Virtualization of computing devices can be employed in high-availability clustering and in other contexts. One or more virtual machines (VMs or guests) can be instantiated at a software level on physical computers (host computers or hosts), such that each VM runs its own operating system instance. Just as software applications, including server applications such as databases, enterprise management solutions and e-commerce websites, can be run on physical computers, so too can these applications be run on virtual machines. VMs can be deployed such that applications being monitored by the high-availability clustering system run on and are failed over between VMs, as opposed to physical servers. In order to provide an application with high availability in a cloud environment, the application can be run on a virtual machine which is in turn running on a high-availability cluster. The virtual machine provides the desired mobility and isolation of the application, whereas the underlying high-availability cluster provides the highly available computing infrastructure.
For these reasons, enterprises and other organizations that require high availability for applications such as databases, enterprise management solutions and e-commerce websites often enter into service level agreements with a high-availability cluster provider to host their applications and guarantee a specific level of availability. In these cases, the high-availability cluster provides the underlying infrastructure from which to serve applications to organizational customers over a network (e.g., as a cloud service), where the organizational customer requires high availability of the application, either to make it available to its own customers (e.g., an e-commerce web service) or for internal organizational use (e.g., a critical database application).
At the level of the high-availability cluster, a specific application that is being made highly available to an organizational customer is associated with a logical grouping of hardware and software resources and underlying infrastructure. Using an example scenario of an instance of an Oracle database application being made highly available to a given enterprise, the group of high-availability cluster level resources could include: the instance of the database application itself and associated code libraries; a given VM the application executes on; a share of the processing resources of the physical host the VM executes on; virtual network resources of the VM, which are in turn mapped to underlying physical network resources (e.g., the network card(s) used to export the database application service, one or more IP addresses associated with the network cards, etc.); a database whose table spaces are files; the virtual file system of the VM; a mount point to underlying storage media, which may itself be logical or physical (e.g., disk groups on which the data is stored, a logical volume built in the disk group and a file system using the volume); physical storage resources allocated to the application, which may be distributed across various physical media and/or sites with various levels of redundancy; additional processing and network infrastructure guaranteed to be available for the application according to the service agreement; and the various relationships and dependencies between these components, including protocols for starting, stopping, restarting and monitoring the application.
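To make the role of the dependencies concrete, the following sketch models a simplified version of such a resource grouping as a dependency graph and derives a start order and a stop order from it. The resource names and the `start_order` and `stop_order` helpers are hypothetical, chosen only to mirror the database example above. They are not taken from any clustering product, and the sketch assumes Python 3.9 or later for the standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical resource grouping: each resource maps to the set of
# resources it depends on (which must be started before it).
dependencies = {
    "database_instance": {"file_system", "ip_address"},
    "file_system": {"logical_volume"},
    "logical_volume": {"disk_group"},
    "ip_address": {"network_card"},
    "disk_group": set(),
    "network_card": set(),
}


def start_order(deps):
    """Return resources ordered so that dependencies start first."""
    return list(TopologicalSorter(deps).static_order())


def stop_order(deps):
    """Return resources ordered so that dependents stop first."""
    return list(reversed(start_order(deps)))


if __name__ == "__main__":
    print("start:", start_order(dependencies))
    print("stop: ", stop_order(dependencies))
```

Under these assumptions the database instance is always started last and stopped first, since every other resource in the grouping is one of its direct or indirect dependencies.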
At the level of the high-availability clustering system, such groupings of resources need to be identified, configured and maintained in order to provide specific application services to organizational customers at agreed levels. Such a group of resources can be thought of as a high-availability cluster level operational representation of the application. These operational representations of applications are present in various clustering technologies. Different clustering products use different terminology to refer to such groupings. For example, these operational representations are termed service groups in Veritas Cluster Server, whereas in Microsoft Cluster Server they are called resource groups.
As useful as they are at the high-availability cluster level, such operational representations do not give visibility into the actual service provided by the application, which is how consumers of the application conceive of and interact with it. In other words, consumers of application services (e.g., organizational customers contracting with high-availability cluster providers) identify the application not by its operational representation, but by the service it provides and its connection endpoints. This lack of visibility creates a disconnect for the consumer in identifying the multiple tiers and components of the provided business service, an identification that is critical to information technology management and continuity. Conventionally, these identifications are made manually at the IT administrative level, and are made available at the business service level through listing information such as a service catalog.
It would be desirable to address this issue.