There are a variety of ways to achieve fault tolerant computing. Specifically, fault tolerant hardware and software may be used either alone or together. As an example, it is possible to connect two (or more) computers, such that one computer, the active computer or host, actively makes calculations while the other computer (or computers) is idle or on standby in case the active computer, or a hardware or software component thereon, experiences some type of failure. In these systems, the information about the state of the active computer must be saved periodically to the standby computer so that the standby computer can substantially take over at the point in the calculations where the active computer experienced a failure. This example can be extended to the modern-day practice of using a virtualized environment as part of a cloud or other computing system.
Virtualization is used in many fields to reduce the number of servers or other resources needed for a particular project or organization. Present-day virtual machine computer systems utilize virtual machines (VMs) operating as guests within a physical host computer. Each virtual machine includes its own virtual operating system and operates under the control of a managing operating system or hypervisor executing on the host physical machine. Each virtual machine executes one or more applications and accesses physical data storage and computer networks as required by the applications. In addition, each virtual machine may in turn act as the host computer system for another virtual machine.
Multiple virtual machines may be configured as a group to execute one or more of the same programs. Typically, one virtual machine in the group is the primary or active virtual machine, and the remaining virtual machines are the secondary or standby virtual machines. If something goes wrong with the primary virtual machine, one of the secondary virtual machines can take over and assume its role in the fault tolerant computing system. This redundancy allows the group of virtual machines to operate as a fault tolerant computing system. The primary virtual machine executes applications, receives and sends network data, and reads and writes to data storage while performing automated or user initiated tasks or interactions. The secondary virtual machines have the same capabilities as the primary virtual machine, but do not take over the relevant tasks and activities until the primary virtual machine fails or is affected by an error.
For such a collection of virtual machines to function as a fault tolerant system, the operating state, memory and data storage contents of a secondary virtual machine should be equivalent to the final operating state, memory and data storage contents of the primary virtual machine. If this condition is met, the secondary virtual machine may take over for the primary virtual machine without a loss of any data. To assure that the state of the secondary machine and its memory is equivalent to the state of the primary machine and its memory, it is necessary for the primary virtual machine to transfer its state and memory contents to the secondary virtual machine periodically.
The periodic transfer of data to maintain synchrony between the states of the virtual machines is termed checkpointing. A checkpoint defines a point in time when the data is to be transferred. During a checkpoint, the processing on the primary virtual machine is paused, so that the final state of the virtual machine and associated memory is not changed during the checkpoint interval. Once the relevant data is transferred, both the primary and secondary virtual machines are in the same state. The primary virtual machine is resumed at the earliest possible point in the process and continues to run the application until the next checkpoint, when the process repeats.
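By way of illustration only, the pause, transfer, and resume sequence of a checkpoint can be sketched as follows. The `VirtualMachine` class, its `write` method, and the `checkpoint` function are hypothetical names introduced for this sketch, and copying only the entries changed since the last checkpoint is an assumption made for brevity rather than a requirement of any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    """Hypothetical stand-in for a virtual machine: memory plus a running flag."""
    memory: dict = field(default_factory=dict)
    running: bool = True
    dirty: set = field(default_factory=set)  # keys written since the last checkpoint

    def write(self, key, value):
        """Modify memory; refused while the machine is paused."""
        if not self.running:
            raise RuntimeError("virtual machine is paused")
        self.memory[key] = value
        self.dirty.add(key)


def checkpoint(primary: VirtualMachine, secondary: VirtualMachine) -> int:
    """Pause the primary, copy its changed state to the secondary, then resume.

    Returns the number of entries transferred.
    """
    primary.running = False  # pause so the state cannot change during the transfer
    try:
        transferred = len(primary.dirty)
        for key in primary.dirty:
            secondary.memory[key] = primary.memory[key]
        primary.dirty.clear()
    finally:
        primary.running = True  # resume at the earliest possible point
    return transferred
```

In this sketch, after `checkpoint` returns, the memory of the secondary matches that of the primary, and the primary continues to run until the next checkpoint.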
A checkpoint can be triggered either by the passage of an upper-limit amount of elapsed time since the last checkpoint or, sooner, by the occurrence of some event, such as: the number of memory pages modified since the last checkpoint (termed dirty pages) reaching a limit; the occurrence of a network event (such as a network acknowledgement that is output from the primary virtual machine); or the occurrence of excessive buffering on the secondary virtual machine (as compared to available memory) during the execution of the application. An idle primary virtual machine, for instance, would rely on the upper-limit elapsed timer to perform a periodic checkpoint, while a busy machine would likely trigger one of the mentioned events. This event-based approach is considered dynamic or variable-rate checkpointing.
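As a minimal sketch of such a trigger decision, the following function reports which condition, if any, calls for a checkpoint. The `CheckpointPolicy` class, the `checkpoint_reason` function, the threshold values, and the order in which the conditions are tested are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckpointPolicy:
    """Hypothetical thresholds; the values are illustrative only."""
    max_interval_s: float = 0.5         # upper limit on time between checkpoints
    dirty_page_limit: int = 1000        # dirty pages tolerated before a checkpoint
    buffer_limit_bytes: int = 1 << 20   # buffering tolerated on the secondary


def checkpoint_reason(
    policy: CheckpointPolicy,
    elapsed_s: float,
    dirty_pages: int,
    pending_network_output: bool,
    secondary_buffered_bytes: int,
) -> Optional[str]:
    """Return the name of the condition that triggers a checkpoint, or None."""
    if pending_network_output:
        return "network"        # e.g. an acknowledgement waiting to be sent
    if dirty_pages >= policy.dirty_page_limit:
        return "dirty_pages"
    if secondary_buffered_bytes >= policy.buffer_limit_bytes:
        return "buffering"
    if elapsed_s >= policy.max_interval_s:
        return "timer"          # an idle machine falls through to the timer
    return None
```

Under these assumptions, an idle machine reaches only the `"timer"` branch, whereas a busy machine is likely to return one of the event-based reasons before the timer expires.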
Outbound network traffic can cause an immediate checkpoint cycle to ensure lower-latency exchanges between the primary virtual machine and the computer on the network receiving the transmission from the virtual machine. This is desirable for file-level operations such as folder enumeration, file deletion, attribute manipulation, and even single-threaded transaction exchanges. Latency-sensitive exchanges of this kind, such as certain client requests and server responses, benefit from a rapid and very responsive checkpoint rate.
However, excessive checkpointing can lead to performance degradation of the primary virtual machine. In turn, this can result in decreased levels of network throughput, which can compromise the utility of a fault tolerant system. This is particularly likely in an event-based approach when streaming network loads are present.
Adding a fixed minimum delay to each checkpoint cycle is one effective way to improve throughput and, as the delay is increased, further improvement can be obtained under certain streaming-load conditions. However, this type of delay harms the latency-sensitive loads mentioned earlier. In addition, the right delay for one streaming load may not work well for other streaming loads.
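The trade-off can be illustrated with a toy model. The sketch below assumes that every outbound packet requests a checkpoint, that a packet is released only when the next checkpoint completes, and that checkpoints themselves take no time; the `simulate` function and its parameters are hypothetical and serve only to show the direction of the effect.

```python
def simulate(packet_times, min_delay):
    """Toy model of a fixed minimum delay between checkpoints.

    packet_times: sorted times at which outbound packets are produced.
    min_delay: minimum spacing enforced between consecutive checkpoints.
    Returns (checkpoint_count, worst_case_packet_latency).
    """
    checkpoints = 0
    max_latency = 0.0
    scheduled = None                    # time of the most recently scheduled checkpoint
    last_checkpoint = float("-inf")     # no checkpoint has happened yet
    for t in packet_times:
        if scheduled is not None and t <= scheduled:
            # The packet joins the checkpoint that is already scheduled.
            max_latency = max(max_latency, scheduled - t)
            continue
        if scheduled is not None:
            last_checkpoint = scheduled
        # Checkpoint immediately, unless the minimum delay has not yet elapsed.
        scheduled = max(t, last_checkpoint + min_delay)
        checkpoints += 1
        max_latency = max(max_latency, scheduled - t)
    return checkpoints, max_latency


# Ten packets, one per time unit, as a stand-in for a streaming load.
stream = list(range(10))
no_delay = simulate(stream, min_delay=0)
with_delay = simulate(stream, min_delay=5)
```

In this model, the delay reduces the number of checkpoints taken for the same stream, which stands in for improved throughput, while some packets wait longer to be released, which stands in for the harm to latency-sensitive exchanges.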
Therefore, a need exists for ways to vary the checkpointing of a system dynamically while meeting the requirements of the relevant applications and system users.
Embodiments of the invention address this need and others.