There are a variety of ways to achieve fault tolerant computing. Specifically, fault tolerant hardware and software may be used either alone or together. As an example, it is possible to connect two (or more) computers, such that one computer, the active computer or host, actively makes calculations while the other computer (or computers) is idle or on standby in case the active computer, or hardware or software component thereon, experiences some type of failure. In these systems, the information about the state of the active computer must be saved periodically to the standby computer so that the standby computer can substantially take over at the point in the calculations where the active computer experienced a failure. This method can be extended to the modern day practice of using a virtualized environment as part of a cloud or other computing system.
Virtualization is used in many fields to reduce the number of servers or other resources needed for a particular project or organization. Present day virtual machine computer systems utilize virtual machines (VMs) operating as guests within a physical host computer. Each virtual machine includes its own virtual operating system and operates under the control of a managing operating system or hypervisor executing on the host physical machine. Each virtual machine executes one or more applications and accesses physical data storage and computer networks as required by the applications. In addition, each virtual machine may in turn act as the host computer system for another virtual machine.
Multiple virtual machines may be configured as a group to execute one or more of the same programs. Typically, one virtual machine in the group is the primary or active virtual machine and the remaining virtual machines are the secondary or standby virtual machines. If something goes wrong with the primary virtual machine, one of the secondary virtual machines can take over and assume its role in the fault tolerant computing system. This redundancy allows the group of virtual machines to operate as a fault tolerant computing system. The primary virtual machine executes applications, receives and sends network data, and reads and writes to data storage while performing automated or user initiated tasks or interactions. The secondary virtual machines have the same capabilities as the primary virtual machine, but do not take over the relevant tasks and activities until the primary virtual machine fails or is affected by an error.
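The primary and standby roles described above can be illustrated with a minimal sketch. The names `Member`, `FaultTolerantGroup`, and `fail_over`, as well as the rule of promoting the first healthy standby, are hypothetical and chosen only for illustration; they are not taken from any particular system.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Hypothetical stand-in for one virtual machine in the group."""
    name: str
    healthy: bool = True


class FaultTolerantGroup:
    """Tracks which member is primary; all other members are standbys."""

    def __init__(self, members):
        if not members:
            raise ValueError("group needs at least one virtual machine")
        self.members = list(members)
        self.primary = self.members[0]  # assumption: first member starts as primary

    def standbys(self):
        # Every member that is not currently the primary.
        return [m for m in self.members if m is not self.primary]

    def fail_over(self):
        """Return the current primary, promoting a healthy standby if needed."""
        if self.primary.healthy:
            return self.primary
        for candidate in self.standbys():
            if candidate.healthy:
                self.primary = candidate
                return candidate
        raise RuntimeError("no healthy standby available")
```

In this sketch a standby does nothing until `fail_over` promotes it, mirroring the description that secondary virtual machines do not take over tasks until the primary fails.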
For such a collection of virtual machines to function as a fault tolerant system, the operating state, memory and data storage contents of a secondary virtual machine must be equivalent to the operating state, memory and data storage contents of the primary virtual machine. If this condition is met, the secondary virtual machine may take over for the primary virtual machine without any loss of data. To ensure that the state of the secondary machine and its memory is equivalent to the state of the primary machine and its memory, it is necessary for the primary virtual machine to periodically transfer its state and memory contents, or at least the changes to the memory contents since the last update, to the secondary virtual machine.
The periodic transfer of data to maintain synchrony between the states of the virtual machines is termed checkpointing. A checkpoint defines a point in time when the data is to be transferred. During a checkpoint, the processing on the primary virtual machine is paused, so that the final state of the virtual machine and associated memory is not changed during the checkpoint interval. Once the relevant data is transferred, both the primary and secondary virtual machines are in the same state. The primary virtual machine is then resumed and continues to run the application until the next checkpoint, when the process repeats.
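The pause, transfer, and resume cycle described above can be sketched as follows. This is a toy model under stated assumptions: `ToyVM` and `checkpoint` are hypothetical names, memory is modeled as a dictionary of page numbers to bytes, and only pages written since the previous checkpoint are copied. A real system would also transfer processor and device state.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVM:
    """Hypothetical virtual machine: memory maps page number -> contents."""
    memory: dict = field(default_factory=dict)
    dirty: set = field(default_factory=set)   # pages written since last checkpoint
    running: bool = True

    def write(self, page: int, data: bytes) -> None:
        if not self.running:
            raise RuntimeError("VM is paused")
        self.memory[page] = data
        self.dirty.add(page)


def checkpoint(primary: ToyVM, secondary: ToyVM) -> int:
    """Pause the primary, copy its dirty pages to the secondary, then resume.

    Returns the number of pages transferred.
    """
    primary.running = False                    # pause so state cannot change mid-transfer
    try:
        changed = sorted(primary.dirty)
        for page in changed:
            secondary.memory[page] = primary.memory[page]
        primary.dirty.clear()                  # next checkpoint starts from a clean slate
    finally:
        primary.running = True                 # resume even if the transfer fails
    return len(changed)
```

After `checkpoint` returns, the two memory dictionaries are equal, which corresponds to both virtual machines being in the same state at the end of the checkpoint.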
Checkpoints can be triggered either by the passage of a fixed amount of elapsed time since the last checkpoint or by the occurrence of some event, such as the number of modified memory pages (termed dirty pages) reaching a threshold; the occurrence of a network event (such as a network acknowledgement output from the primary virtual machine); or the occurrence of excessive buffering on the secondary virtual machine (as compared to available memory) during the execution of the application. Elapsed time checkpointing is considered fixed checkpointing, while event based checkpointing is considered dynamic or variable-rate checkpointing.
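The distinction between fixed and dynamic checkpointing can be illustrated with a small decision function. The class name `CheckpointPolicy`, the threshold values, and the order in which the conditions are tested are all assumptions made for this sketch; the text above does not specify any of them.

```python
from dataclasses import dataclass


@dataclass
class CheckpointPolicy:
    """Hypothetical policy combining fixed and event based triggers."""
    interval_s: float = 0.05           # assumed fixed checkpoint interval
    max_dirty_pages: int = 1000        # assumed dirty page threshold
    max_buffer_fraction: float = 0.8   # assumed share of secondary memory used for buffering

    def reason(self, elapsed_s, dirty_pages, ack_pending, buffer_fraction):
        """Return a label for why a checkpoint is due, or None if none is due."""
        # Event based (dynamic) triggers are checked first in this sketch.
        if ack_pending:
            return "network_event"
        if dirty_pages >= self.max_dirty_pages:
            return "dirty_pages"
        if buffer_fraction >= self.max_buffer_fraction:
            return "secondary_buffering"
        # Fixed trigger: enough time has passed since the last checkpoint.
        if elapsed_s >= self.interval_s:
            return "elapsed_time"
        return None
```

A purely fixed scheme would consult only the `elapsed_s` comparison, while a variable-rate scheme would rely on the three event conditions.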
Checkpointing is a resource intensive operation. It proceeds through distinct operating periods, and the demand for processing cycles is uneven across them, increasing more in some periods than in others. These processor demanding stages can result in increased network latency for out-bound traffic from the VM or other system being checkpointed. A need therefore exists for ways to reduce the cost of checkpoint processing during certain demanding periods and in turn reduce the network latency of out-bound traffic.
Embodiments of the invention address this need and others.