The present invention relates to an input/output control device and method applied to a checkpoint rollback-type fault-resilient computer system.
Periodically saving a consistent state of the system (a checkpoint), returning the system to that state when a failure occurs, and then restarting the interrupted processing is known as backward recovery or rollback.
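By way of illustration only, the checkpoint and rollback cycle described above may be sketched as follows. The class `Machine` and its methods `take_checkpoint` and `rollback` are hypothetical names introduced for this sketch and do not appear in the specification; the sketch assumes that the recoverable state consists solely of a memory image and a set of CPU registers.

```python
import copy


class Machine:
    """Toy machine whose recoverable state is memory plus CPU registers.

    All names here are hypothetical and serve only to illustrate rollback.
    """

    def __init__(self):
        self.memory = {}
        self.registers = {"pc": 0}
        self._checkpoint = None

    def take_checkpoint(self):
        # Save an independent copy of the consistent state.
        self._checkpoint = copy.deepcopy((self.memory, self.registers))

    def rollback(self):
        # Return to the most recently saved consistent state.
        if self._checkpoint is None:
            raise RuntimeError("no checkpoint has been taken")
        self.memory, self.registers = copy.deepcopy(self._checkpoint)


machine = Machine()
machine.memory["total"] = 10
machine.registers["pc"] = 3
machine.take_checkpoint()

# Processing continues past the checkpoint, then a fault is assumed to occur.
machine.memory["total"] = 99
machine.registers["pc"] = 7

# Backward recovery: restore the checkpoint so processing can restart from it.
machine.rollback()
```

After `rollback()` the memory and registers hold the values saved at the checkpoint, so the interrupted processing can be restarted from that point.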
In general, "the system's consistent state" is defined to include the states of the memory and the CPU registers, but not the states of the I/O devices (e.g., interface cards). The reason is that state transitions of the CPU and the memory are reversible, whereas state transitions of the I/O devices are irreversible. Therefore, if an I/O element that handles the I/O devices and a calculation element that executes calculations are integrated into one system, then once a fault has caused the states of the two elements to diverge, it is difficult to bring them back into agreement.
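The difficulty may be illustrated by the following minimal sketch, in which the checkpoint covers only the memory state and not the device. The names `OutputDevice` and `run_with_rollback` are hypothetical and are introduced solely for this illustration; the device is assumed to be a simple output device whose transmitted data cannot be recalled.

```python
import copy


class OutputDevice:
    """Stands in for an I/O device: data already sent cannot be recalled."""

    def __init__(self):
        self.sent = []

    def write(self, data):
        self.sent.append(data)


def run_with_rollback(device):
    state = {"next_packet": 0}
    # The checkpoint covers memory only; the device state is not included.
    checkpoint = copy.deepcopy(state)

    # Processing after the checkpoint sends two packets.
    for _ in range(2):
        device.write(f"packet-{state['next_packet']}")
        state["next_packet"] += 1

    # A fault is assumed here; memory is rolled back, the device is not.
    state = copy.deepcopy(checkpoint)

    # Restarted processing repeats the same I/O requests.
    for _ in range(2):
        device.write(f"packet-{state['next_packet']}")
        state["next_packet"] += 1
    return state


device = OutputDevice()
final_state = run_with_rollback(device)
# device.sent is now ["packet-0", "packet-1", "packet-0", "packet-1"]
```

The memory state ends as if each packet had been sent once, while the device has in fact transmitted each packet twice. The two states no longer match, which is the mismatch described above.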
For this reason, in a conventional system it is common practice to separate the I/O element from the calculation element so that faults occurring in the calculation element do not adversely affect the I/O element. Since the I/O element itself is not resilient to faults in such a conventional system, the I/O element is replicated to achieve fault resilience. In this case, the OS running on the I/O element is a dedicated, non-standard OS.
Now that UNIX, for example, has come to be used as a virtually standard OS, modifying the device drivers and separating the calculation element from the I/O element are serious drawbacks from the viewpoints of compatibility and cost performance. To avoid these drawbacks, it is desirable to realize a system capable of backward recovery without modifying the conventional device drivers and without separating the calculation element from the I/O element.