1. Field
The present disclosure pertains to the field of computing and computer systems, and, more specifically, to the field of error detection in computer systems using virtual machine monitors.
2. Background
Some computer systems may be susceptible to processing errors during operation. For example, transient errors (“soft errors”) caused by exposure of a computer system to radiation or electromagnetic fields may corrupt data being transmitted throughout the computer system, causing incorrect or undesirable computing results. Soft errors may, for instance, result in incorrect data being passed between a software application running on a processor and the input/output (I/O) data stream generated by the software application within a computer system. In such a case, the soft errors may exist in the application software, the operating system, the system software, or the I/O data itself.
The problem of soft errors in computer systems has been addressed through techniques such as redundant software execution, in which a segment of software is processed two or more times, sometimes on different processing hardware, to produce results that can be compared with each other to detect an error. Redundant software processing, although somewhat effective at detecting soft errors in a computer system, can require extra computing resources, such as redundant hardware, to process the software redundantly.
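The redundant execution technique described above can be illustrated with a minimal sketch, given here only as an aid to understanding and not as a description of any particular system. The names `run_redundantly`, `SoftErrorDetected`, and `checksum` are hypothetical, and the sketch assumes that the software segment is deterministic, so that any disagreement between results indicates an error.

```python
from typing import Any, Callable, Sequence


class SoftErrorDetected(Exception):
    """Raised when redundant executions of a segment disagree (hypothetical name)."""


def run_redundantly(segment: Callable[..., Any], args: Sequence[Any], copies: int = 2) -> Any:
    """Process `segment` `copies` times on the same inputs and compare the results.

    Assumes `segment` is deterministic; a mismatch is treated as a detected error.
    """
    if copies < 2:
        raise ValueError("at least two executions are needed for comparison")
    # Each extra execution costs additional computing resources.
    results = [segment(*args) for _ in range(copies)]
    first = results[0]
    if any(result != first for result in results[1:]):
        raise SoftErrorDetected(f"results disagree: {results!r}")
    return first


def checksum(data: bytes) -> int:
    """Example deterministic segment: a simple additive checksum."""
    return sum(data) % 65521
```

A call such as `run_redundantly(checksum, [b"abc"])` returns the common result when both executions agree. In this sketch the executions run sequentially on the same hardware; running them on different processing hardware, as mentioned above, would require additional resources that the sketch does not model.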
Another technique used in some computer systems is to virtualize the hardware in software and redundantly process various code segments within redundant virtual versions of the hardware in order to detect soft errors. Redundant virtual hardware, or redundant “virtual machines” (RVMs), can provide a software representation of underlying processing hardware, such that software code can be redundantly processed on the RVMs in parallel.
FIG. 1 illustrates a redundant virtual machine environment, in which software segments, such as software threads, can be processed redundantly in order to detect soft errors in the software. In particular, FIG. 1 illustrates two virtual machines (VMs) representing the same processing hardware in which a software thread can be processed redundantly and in parallel. The results from the redundant copies of one or more operations in the software thread can be compared with each other in order to detect a soft error before or after the software thread has actually been committed to hardware context state.
However, to ensure that software is processed equivalently on both VMs, a software module, such as the replication management layer (RML), must control (or manage) the execution path of the code so that it is the same through both VMs. The RML may also need to compare the outputs of the two VMs. Unfortunately, the RML, or an equivalent software module, can introduce additional processing overhead that degrades the performance of the computer system. Furthermore, the RML may itself contain soft errors and therefore be unreliable.
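The role of a replication management layer can be sketched as follows, purely for illustration. The classes `ToyVM` and `ToyRML` and the function `accumulate` are hypothetical stand-ins rather than an actual RML implementation. The sketch assumes that each VM can be modeled as a deterministic state machine that emits one output per input.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ToyVM:
    """Hypothetical stand-in for a virtual machine: keeps state, emits one output per input."""
    step_fn: Callable[[int, int], int]
    state: int = 0

    def step(self, value: int) -> int:
        self.state = self.step_fn(self.state, value)
        return self.state


@dataclass
class ToyRML:
    """Hypothetical replication management layer driving two VMs in lockstep."""
    vm_a: ToyVM
    vm_b: ToyVM
    comparisons: int = 0  # counts the extra work added by the layer

    def run(self, inputs: Iterable[int]) -> List[int]:
        committed = []
        for step_index, value in enumerate(inputs):
            # Feed the same input to both VMs so their execution paths match.
            out_a = self.vm_a.step(value)
            out_b = self.vm_b.step(value)
            # Compare outputs before they are released.
            self.comparisons += 1
            if out_a != out_b:
                raise RuntimeError(
                    f"output mismatch at step {step_index}: {out_a} != {out_b}"
                )
            committed.append(out_a)
        return committed


def accumulate(state: int, value: int) -> int:
    """Example deterministic VM behavior: running sum of the inputs."""
    return state + value
```

For example, `ToyRML(ToyVM(accumulate), ToyVM(accumulate)).run([1, 2, 3])` releases one output per step after three comparisons. The comparison logic in `ToyRML.run` is itself unchecked software, which mirrors the concern that the management layer adds overhead and may be unreliable.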