1. Technical Field
The present disclosure relates to a fault tolerant system using virtual machines and a method for providing fault tolerance using virtual machines.
2. Related Art
A fault tolerant system is a system that is able to continue operating without the overall system crashing even when a defect occurs in a part of the system configuration, and is particularly applied to systems that require high availability and uninterrupted operation. For example, a server computer employing a fault tolerant system is capable of outputting correct data, without a communication error, in response to a network access from a client application of an external device even when a hardware failure occurs.
As disclosed in JP-A-2009-080695, there has been known a technique that realizes the fault tolerant system in virtual machines running on two computers that communicate with each other. The fault tolerant system using the virtual machines synchronizes the execution states of those two virtual machines so that they execute the same operation. When a failure occurs in one of the computers, the virtual machine that operates in the other computer takes over processing, to thereby continuously provide a service of the system without any interruption.
In the fault tolerant system using virtual machines, one of the virtual machines is set as a primary virtual machine, and the other of the virtual machines is set as a secondary virtual machine. The primary virtual machine is configured to execute the same operation ahead of the secondary virtual machine and to be in charge of the control of input/output with respect to external devices.
In general, when two computers that execute the same program receive an input from an external device precisely at the same timing, the two computers execute the same operation, and output the same data. Therefore, when an interrupt is generated based on an external input in the fault tolerant system using the virtual machines, the fault tolerant system allows the primary virtual machine to transmit a timing at which the interrupt is generated to the secondary virtual machine as synchronization information. Then, the secondary virtual machine that is running with a delay generates a virtual interrupt at the same timing as the timing notified by the synchronization information, whereby the primary virtual machine and the secondary virtual machine execute the same operation in synchronization with each other.
FIG. 6 is a block diagram illustrating a configuration of a conventional fault tolerant system using the virtual machines. As illustrated in FIG. 6, a fault tolerant system 60 includes a primary machine 600 and a secondary machine 700 which are connected to each other on a network.
In the primary machine 600, a primary hypervisor 620 is running on a primary hardware 610 which is a physical computer environment, and a primary virtual machine 630 is configured. In the primary virtual machine 630, a primary guest OS (Operating System) 640 is running, and an application 650 is executed on the primary guest OS 640.
The primary hardware 610 is equipped with a variety of devices such as a CPU (Central Processing Unit), a memory, a network interface card (NIC), and a storage.
The primary virtual machine 630 is allocated a part of the hardware resources of the primary hardware 610, and is in charge of the control of input/output with respect to the external device in a virtual computer environment. The primary virtual machine 630 is managed by the primary hypervisor 620.
Likewise, in the secondary machine 700, a secondary hypervisor 720 is running on a secondary hardware 710 which is a physical computer environment, and a secondary virtual machine 730 is configured. In the secondary virtual machine 730, a secondary guest OS 740 is running, and an application 750 is executed on the secondary guest OS 740.
The secondary hardware 710 is equipped with a variety of devices such as a CPU, a memory, a network interface card (NIC), and a storage.
The secondary virtual machine 730 is allocated a part of the hardware resources of the secondary hardware 710, and operates in synchronization with the primary virtual machine 630 in a virtual computer environment. The secondary virtual machine 730 is managed by the secondary hypervisor 720.
In the conventional fault tolerant system 60, the execution states of the primary virtual machine 630 and the secondary virtual machine 730 are synchronized with each other in the following procedure.
Upon receiving an external interrupt from the primary hardware 610, the primary hypervisor 620 inputs the external interrupt to the primary virtual machine 630.
Then, the primary virtual machine 630 inputs a virtual interrupt to the primary guest OS 640. Now, an input of the virtual interrupt from the primary virtual machine 630 to the primary guest OS 640 will be described.
When a virtual machine context switching event such as an external interrupt, a privileged instruction, or an exception occurs during processing of the primary guest OS 640, the processing of the primary guest OS 640 is suspended, a guest OS context is switched to a virtual machine context, and the processing is transitioned to the primary virtual machine 630.
If the primary virtual machine 630 needs to input a virtual interrupt to the primary guest OS 640 at that timing in response to such an event, the primary virtual machine 630 configures the virtual interrupt. Once the virtual interrupt is configured, the primary virtual machine 630 terminates its processing, and when the processing returns to the primary guest OS 640, which was suspended when the event occurred, the virtual interrupt is input to the primary guest OS 640.
When the primary virtual machine 630 inputs the virtual interrupt to the primary guest OS 640, the primary virtual machine 630 transmits the synchronization information to the secondary virtual machine 730. The synchronization information includes identification information on the virtual interrupt, and synchronization timing information for inputting the virtual interrupt.
The synchronization timing information is information for inputting the virtual interrupt to the secondary guest OS 740 at the same timing as that of the virtual interrupt input to the primary guest OS 640, and includes information indicating an execution suspension position and the number of execution instructions specific to the CPU.
As the execution suspension position, a value of a program counter, which indicates an address of an instruction executed when the virtual interrupt is input, may be used. The number of execution instructions may be measured by a CPU execution instruction number counter of a performance counter provided in the CPU.
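The contents of the synchronization information described above can be pictured with a minimal sketch. The class names, field names, and sample values below are hypothetical and chosen only for illustration; the disclosure does not specify a concrete data layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncTiming:
    """Synchronization timing information (hypothetical field names)."""
    program_counter: int    # execution suspension position
    instruction_count: int  # instructions executed since the previous virtual interrupt


@dataclass(frozen=True)
class SyncInfo:
    """Synchronization information sent from the primary to the secondary."""
    interrupt_id: int       # identification information on the virtual interrupt
    timing: SyncTiming


def make_sync_info(interrupt_id: int, program_counter: int, instruction_count: int) -> SyncInfo:
    # Bundle the interrupt identity with the point at which it was input.
    return SyncInfo(interrupt_id, SyncTiming(program_counter, instruction_count))


# Made-up sample values: interrupt 5 input at address 0x4010 after 1234 instructions.
info = make_sync_info(interrupt_id=5, program_counter=0x4010, instruction_count=1234)
print(info)
```

Keeping the two timing fields together in one record reflects the point made in the text that neither value is sufficient on its own.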
In measuring the number of execution instructions, when the primary virtual machine 630 inputs the virtual interrupt to the primary guest OS 640, the CPU execution instruction number counter is cleared to zero, and the CPU execution instruction number counter is enabled before restarting the execution of the primary guest OS 640. As a result, the number of instructions executed by the primary guest OS 640 since a previous virtual interrupt input is counted.
When only the execution suspension position is used as the synchronization timing information, the timing of the virtual interrupt input cannot be specified in a case where the instruction indicated by the execution suspension position is included in loop processing or in a conditional branch destination, because the instruction is executed every time the loop or the conditional branch is processed.
Also, when only the number of execution instructions is used as the synchronization timing information, the secondary guest OS 740 cannot be suspended exactly at the designated number of execution instructions because of speed-up techniques such as pipeline processing, and is unavoidably suspended after exceeding that number. Accordingly, the virtual interrupt cannot be input to the secondary guest OS 740 at the same timing as that of the primary guest OS 640.
Under these circumstances, the execution suspension position and the number of execution instructions are combined together as the synchronization timing information, and the number of execution instructions is confirmed every time the instruction indicated by the execution suspension position in the secondary guest OS 740 is processed. As a result, the secondary guest OS 740 is suspended at the same timing as that when the virtual interrupt is input in the primary guest OS 640.
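The ambiguity of the execution suspension position inside a loop, and how the number of execution instructions resolves it, can be illustrated with a toy model. The sketch assumes that guest execution is represented as a plain list of instruction addresses; the addresses, the loop length, and the function names are made up for illustration.

```python
# Addresses of the instructions actually executed, in order: a three-instruction
# loop body that runs four times (all values are made up for illustration).
LOOP_BODY = [0x100, 0x104, 0x108]
trace = LOOP_BODY * 4


def candidates_by_position(trace, pc):
    """Indices at which execution is about to run the instruction at `pc`."""
    return [n for n, addr in enumerate(trace) if addr == pc]


def candidates_by_position_and_count(trace, pc, count):
    """Keep only the hit whose number of already executed instructions equals `count`."""
    return [n for n in candidates_by_position(trace, pc) if n == count]


# Suppose the primary side was suspended at address 0x104 after 7 instructions.
by_pc = candidates_by_position(trace, 0x104)
by_both = candidates_by_position_and_count(trace, 0x104, 7)
print(by_pc)    # every loop iteration reaches 0x104, so there are several candidates
print(by_both)  # the instruction count singles out one of them
```

In this model the position alone matches once per loop iteration, whereas the position combined with the count matches exactly one point in the execution.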
Therefore, when the primary virtual machine 630 inputs the virtual interrupt to the primary guest OS 640 after suspending the execution of the primary guest OS 640, the primary virtual machine 630 acquires a value of a program counter at the time of suspending the execution by the primary guest OS 640, and a value of the CPU execution instruction number counter to generate the synchronization timing information. Then, the primary virtual machine 630 transmits the identification information on the virtual interrupt and the synchronization timing information to the secondary virtual machine 730 as the synchronization information.
The secondary virtual machine 730 that received the synchronization information suspends the execution of the secondary guest OS 740 according to the synchronization timing information. The operation of the secondary virtual machine 730 in this situation will be described with reference to a flowchart of FIG. 7.
A break instruction is embedded in a program position designated by the execution suspension position of the synchronization timing information (S401), and the secondary guest OS 740 is restarted (S402). Then, when the secondary guest OS 740 stops (yes in S403), the CPU execution instruction number counter is confirmed, and if the counter value matches the designated number of execution instructions (yes in S404), the secondary guest OS 740 is suspended at the stop position (S405). If the counter value does not match the designated number of execution instructions (no in S404), the secondary guest OS 740 is restarted (S402), and the confirmation of the number of execution instructions is repeated.
When the secondary virtual machine 730 suspends the secondary guest OS 740, the secondary virtual machine 730 configures the virtual interrupt according to the virtual interrupt identification information of the synchronization information (S406), and restarts the secondary guest OS 740 (S407). As a result, the virtual interrupt is input to the secondary guest OS 740 at the same timing as that of the primary guest OS 640, and the execution states of the primary virtual machine 630 and the secondary virtual machine 730 are synchronized with each other.
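The flow of steps S401 to S407 can be sketched as a small simulation. This is not the implementation of the conventional system: the `SimulatedGuest` class, its attributes, and the `inject_at` function are hypothetical, the guest is modeled as a fixed list of instruction addresses, and the `executed` attribute merely stands in for the CPU execution instruction number counter.

```python
class SimulatedGuest:
    """Toy stand-in for the secondary guest OS: replays a fixed list of addresses."""

    def __init__(self, trace):
        self.trace = trace
        self.executed = 0       # stands in for the execution instruction number counter
        self.breakpoint = None  # address at which a break instruction is embedded
        self.delivered = []     # (instruction count, interrupt id) pairs

    def run_until_break(self, step_over=False):
        """Run until the next instruction sits at the breakpoint; True if stopped there."""
        if step_over and self.executed < len(self.trace):
            self.executed += 1  # execute the instruction under the breakpoint once
        while self.executed < len(self.trace):
            if self.trace[self.executed] == self.breakpoint:
                return True
            self.executed += 1
        return False            # the trace ended without reaching the breakpoint


def inject_at(guest, pc, count, interrupt_id):
    guest.breakpoint = pc                                # S401: embed break instruction
    stopped = guest.run_until_break()                    # S402, S403: restart, wait for stop
    while stopped and guest.executed != count:           # S404: counter does not match
        stopped = guest.run_until_break(step_over=True)  # back to S402
    if not stopped:
        return False
    guest.breakpoint = None                              # S405: suspended at the stop position
    guest.delivered.append((guest.executed, interrupt_id))  # S406: configure virtual interrupt
    return True                                          # S407: the caller restarts the guest


guest = SimulatedGuest([0x100, 0x104, 0x108] * 4)
ok = inject_at(guest, pc=0x104, count=7, interrupt_id=5)
print(ok, guest.delivered)
```

In this run the guest stops at address 0x104 on each loop iteration, and only the stop at which the counter equals the designated value leads to the virtual interrupt being configured.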
When a hardware failure occurs in either one of the primary machine 600 and the secondary machine 700, the synchronization of the execution states is disturbed. When the synchronization of the execution states is disturbed, the values of the output data differ between the primary virtual machine 630 and the secondary virtual machine 730. For this reason, the fault tolerant system 60 checks an output of the primary virtual machine 630 against an output of the secondary virtual machine 730, and if the values of the output data are different from each other, it is determined that a hardware failure has occurred.
For the purpose of executing this failure determination processing, the secondary virtual machine 730 is equipped with an output data checking unit 731, which collects the output data of the primary virtual machine 630, and checks the collected output data against the output data of the secondary virtual machine 730.
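The checking performed by the output data checking unit 731 can be pictured with a minimal sketch. The disclosure does not state how the outputs are compared, so the element-by-element comparison, the function name, and the sample byte strings below are assumptions made for illustration.

```python
def first_output_mismatch(primary_outputs, secondary_outputs):
    """Return the index of the first differing output pair, or None if all pairs agree.

    Only pairs available on both sides are compared; since the secondary runs
    with a delay, it may not yet have produced every output of the primary.
    """
    for index, (p, s) in enumerate(zip(primary_outputs, secondary_outputs)):
        if p != s:
            return index
    return None


# Made-up output data for illustration.
healthy = first_output_mismatch([b"ack", b"data-1"], [b"ack", b"data-1"])
faulty = first_output_mismatch([b"ack", b"data-1"], [b"ack", b"data-X"])
print(healthy, faulty)
```

A return value other than `None` corresponds to the case in which the text determines that a hardware failure has occurred.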
As described above, when the primary virtual machine 630 inputs the virtual interrupt to the primary guest OS 640, the primary guest OS 640 should be suspended at a position at which the secondary guest OS 740 can also be suspended.
However, when the primary guest OS 640 is suspended due to a virtual machine context switching event such as an external interrupt, and the control is transferred to the primary virtual machine 630, the suspension position of the primary guest OS 640 may be in a critical section. In the present specification, the critical section is a section of a program in which processing breaks down if a plurality of processes operate on a single resource at the same timing, and in which the program performs exclusive control such as break instruction disablement.
In this case, when the virtual interrupt is input to the primary guest OS 640 at a suspension position that is within the critical section, the secondary guest OS 740 cannot be suspended at the same position even if the break instruction is embedded there, and is instead suspended after processing the critical section. As a result, the virtual interrupt position is shifted, and the execution states fall out of synchronization.
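The shift of the virtual interrupt position can be illustrated with a toy model. It assumes that positions are instruction indices and that each critical section is a half-open index range in which the secondary guest OS cannot be stopped; the function name and the sample ranges are hypothetical.

```python
def secondary_stop_position(requested, critical_sections):
    """Position at which the secondary guest actually stops in this toy model.

    `critical_sections` is a list of (start, end) half-open index ranges. A stop
    requested inside such a range is deferred until the section has been processed.
    """
    for start, end in critical_sections:
        if start <= requested < end:
            return end
    return requested


# Made-up example: instructions 5 to 9 form a critical section.
sections = [(5, 10)]
outside = secondary_stop_position(3, sections)
inside = secondary_stop_position(7, sections)
print(outside, inside)
```

When the requested position lies outside every critical section, both guests stop at the same point; when it lies inside one, the secondary stops later than the primary, which is the loss of synchronization described above.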