Virtualization software, commonly referred to as a hypervisor, enables multiple virtual machines to execute on a host hardware platform. The hypervisor manages the allocation of computing resources to each virtual machine on the host. Additionally, the hypervisor may receive a heartbeat data signal from each virtual machine as an indication that the virtual machine is operating normally. If the hypervisor has not received the heartbeat for a period of time, it may be inferred that the virtual machine is experiencing a problem. After the period of time has expired, the hypervisor resets the virtual machine in an effort to return it to normal operation. Failure to receive a heartbeat, however, is not always due to a problem that requires the hypervisor to reset the virtual machine. For example, the failure may be caused by a fault or delay in the virtual machine software agent responsible for transmitting the heartbeat rather than by a problem with the virtual machine's operating system. The period of time that elapses without a heartbeat may therefore include a delay to address this uncertainty. Such a delay gives the heartbeat an opportunity to recover, e.g., if the problem that prevented transmission of the heartbeat does not require a reset of the virtual machine, and thereby prevents unnecessary resets. This delay, however, also slows recovery when the problem does require a reset of the virtual machine.
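The trade-off described above can be illustrated with a minimal sketch of a hypervisor-side heartbeat monitor. It is not taken from any particular hypervisor. The class name HeartbeatMonitor, the parameters timeout_s and grace_s, and the three status labels are hypothetical names introduced only for illustration. Time is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Hypothetical watchdog that tracks the heartbeat of one virtual machine."""

    timeout_s: float          # assumed: silence after which the heartbeat counts as missed
    grace_s: float            # assumed: extra delay that lets the heartbeat recover
    last_beat_s: float = 0.0  # time at which the most recent heartbeat arrived

    def record_heartbeat(self, now_s: float) -> None:
        """Note that a heartbeat arrived at time now_s."""
        self.last_beat_s = now_s

    def status(self, now_s: float) -> str:
        """Classify the virtual machine based on how long it has been silent."""
        silence = now_s - self.last_beat_s
        if silence <= self.timeout_s:
            return "healthy"
        if silence <= self.timeout_s + self.grace_s:
            # Heartbeat missed, but the delay has not yet expired: no reset.
            return "suspect"
        # Timeout plus delay have both elapsed: the hypervisor would reset.
        return "reset"


monitor = HeartbeatMonitor(timeout_s=30.0, grace_s=60.0)
monitor.record_heartbeat(now_s=0.0)
print(monitor.status(now_s=20.0))   # within the timeout
print(monitor.status(now_s=45.0))   # heartbeat missed, still inside the delay
print(monitor.status(now_s=100.0))  # timeout and delay both expired
```

In this sketch, a larger grace_s avoids resetting a virtual machine whose heartbeat agent was merely slow, but a virtual machine that has genuinely failed remains down for the full timeout_s + grace_s before the reset begins.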
A virtual machine may also be reset (or restarted) in response to a user command through the virtual machine's operating system. If the virtual machine is overloaded or the operating system is failing to function properly, however, the user's command may fail or the reset/restart may take longer than desired.
Furthermore, an external management server may be used to provision, update, patch, and secure the virtual machines across multiple hosts. The external management server may transmit a request to the hypervisor to reset and/or move a virtual machine, e.g., for load balancing or in response to input from an administrator or user. Using an external management server to reset the virtual machine, however, adds a component to the critical path for the reset. Adding a component to the critical path increases the chance of delays, errors, and problems arising from lost connections with the management server.