1. Field of the Invention
The invention disclosed and claimed herein generally pertains to a method and apparatus wherein a hypervisor is linked to one or more other hypervisors to form a high availability (HA) cluster. More particularly, the invention pertains to a method and apparatus of the above type wherein each hypervisor may enable multiple guest operating systems, or guest virtual machines (VMs), to run concurrently on a host computing platform.
2. Description of the Related Art
Certain virtualization management products maintain the availability of guest VMs by including or embedding an HA cluster product in their product offerings. Typically, these products work by forming the underlying hypervisors, each of which runs on a physical machine, into a high availability cluster. Heartbeating is then performed between the hypervisors. When a member of the cluster fails the heartbeat, due to either hypervisor failure or physical server failure, the embedded HA clustering technology restarts the guest VMs on alternate servers, thus maintaining their availability.
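By way of illustration only, the heartbeat-and-restart behavior described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular product: the names Hypervisor, failed_members and fail_over, the 5 second timeout, and the rule of placing each guest on the surviving hypervisor with the fewest guests are all hypothetical and do not come from the products discussed. The sketch examines only hypervisor heartbeat times and nothing inside the guest VMs.

```python
from dataclasses import dataclass, field


@dataclass
class Hypervisor:
    """Hypothetical cluster member; the field names are illustrative assumptions."""
    name: str
    last_heartbeat: float                       # time the last heartbeat was received
    guests: list = field(default_factory=list)  # names of guest VMs hosted here


def failed_members(cluster, now, timeout):
    """Return the hypervisors whose last heartbeat is older than `timeout` seconds."""
    return [h for h in cluster if now - h.last_heartbeat > timeout]


def fail_over(cluster, now, timeout=5.0):
    """Move the guests of failed hypervisors onto surviving ones.

    Placement rule (an assumption): each guest goes to the survivor that
    currently hosts the fewest guests. Returns a list of
    (guest, failed_host, new_host) tuples describing the restarts.
    """
    failed = failed_members(cluster, now, timeout)
    failed_names = {h.name for h in failed}
    survivors = [h for h in cluster if h.name not in failed_names]
    moves = []
    if not survivors:
        return moves  # nowhere to restart the guests
    for dead in failed:
        while dead.guests:
            vm = dead.guests.pop(0)
            target = min(survivors, key=lambda h: len(h.guests))
            target.guests.append(vm)
            moves.append((vm, dead.name, target.name))
    return moves


if __name__ == "__main__":
    cluster = [
        Hypervisor("hv1", last_heartbeat=100.0, guests=["vm-a", "vm-b"]),
        Hypervisor("hv2", last_heartbeat=109.0, guests=["vm-c"]),
        Hypervisor("hv3", last_heartbeat=108.0),
    ]
    for vm, src, dst in fail_over(cluster, now=110.0):
        print(f"restart {vm}: {src} -> {dst}")
```

In this sketch a hypervisor is treated as failed purely on the basis of a missed heartbeat, so a guest whose application has hung on a healthy hypervisor would never appear in the list of restarts.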
This approach has several limitations. For example, it does not detect and recover from the failure of the guest VM systems themselves, beyond a full crash of the guest's operating system. Such an approach detects and recovers only from the failure of the underlying hypervisor and its physical server. Nor does it detect and recover from the failure of applications running inside the guest VMs. Thus, applications can fail while running within a guest VM without the hypervisor-based cluster taking any notice. In this case, the guest is still up, but it is not providing service. This places a significant limitation on the achievable availability of virtualized systems, since failures are frequently due to operating system problems and to application crashes and hangs. Moreover, more complex critical business applications require operations at the application level in order to take advantage of certain built-in data replication technology. Without any visibility into the guest VM, it is not possible to invoke these operations and take advantage of the built-in features.
In addition, users who wish to take advantage of the HA features both at the hypervisor level and within the guest VM typically must become expert in, and must install, both hypervisor level and application level HA managers. At the same time, such users must ensure that policies expressing relationships between resources are maintained by both the hypervisor level and application level HA systems. Such policies ensure, for example, that filesystems are mounted where the application is started, or that the receiver of a data replication pair is started on a different physical system than the sender. However, managing this level of complexity is generally beyond the capabilities of most users.