The present invention relates to an I/O control apparatus adapted to a computer system having a checkpoint recovery function.
In recent years, computer systems have improved considerably, and with this improvement reliability, such as the ability to cope with faults, has become mandatory. One method implemented in fault-tolerant computers is the checkpoint recovery scheme.
According to a method for implementing a checkpoint recovery scheme, the internal state of a CPU, namely, the contents of the registers and the cache memory of a CPU are periodically saved in a main memory to acquire a checkpoint on the main memory. When data processing cannot continue due to a fault in the computer system, the main memory is restored to the state of the most recent checkpoint, and data processing is restarted using the internal state of the CPU stored in the main memory.
A method for restoring the main memory to the state of the checkpoint is as follows. In an update operation of a main memory, the address and data to be updated are stored in a memory state recovery unit. Upon occurrence of a fault in the computer system, the main memory is written back with the data previously stored in the memory state recovery unit.
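The restore mechanism described above can be illustrated with a minimal sketch. It assumes that the memory state recovery unit records, for each update, the address and the value held at that address before the update (a before-image), and that restoration writes these values back newest-first. The class names `MemoryStateRecoveryUnit` and `MainMemory` are hypothetical, and the saving of the CPU internal state at a checkpoint is not modeled.

```python
class MemoryStateRecoveryUnit:
    """Records before-images of main-memory updates since the last checkpoint."""

    def __init__(self):
        self.log = []  # list of (address, previous_value)

    def record(self, address, previous_value):
        self.log.append((address, previous_value))

    def checkpoint(self):
        # Once a new checkpoint is acquired, older before-images are not needed.
        self.log.clear()

    def restore(self, cells):
        # Undo updates newest-first so each address ends at its checkpoint value.
        while self.log:
            address, previous_value = self.log.pop()
            cells[address] = previous_value


class MainMemory:
    """Toy main memory that reports every update to the recovery unit."""

    def __init__(self, size, recovery_unit):
        self.cells = [0] * size
        self.recovery_unit = recovery_unit

    def write(self, address, value):
        # Assumption: the value being overwritten is what gets logged.
        self.recovery_unit.record(address, self.cells[address])
        self.cells[address] = value


unit = MemoryStateRecoveryUnit()
memory = MainMemory(4, unit)

memory.write(0, 11)
unit.checkpoint()          # checkpoint state: [11, 0, 0, 0]

memory.write(0, 22)        # updates made after the checkpoint
memory.write(1, 33)
memory.write(0, 44)

unit.restore(memory.cells)  # a fault occurs: roll back to the checkpoint
print(memory.cells)
```

Writing the before-images back in reverse order matters when the same address is updated more than once after a checkpoint, as address 0 is here.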
In this checkpoint recovery scheme, upon occurrence of a fault in the computer system, the internal state of the main memory or CPU can be restored to the state of the most recent checkpoint by using the memory state recovery unit. An I/O device connected to the computer system, however, cannot be easily restored to the state of the most recent checkpoint.
This problem will be described below with reference to FIGS. 1 and 2.
As shown in FIG. 1, in this computer system, a CPU 51 requests a disk controller 52 to access a disk 53 to perform an I/O operation. FIG. 2 shows a timing diagram of the I/O processing of the computer system having the above arrangement.
As depicted in FIG. 2, the registers of the disk controller 52 are set to read data from a predetermined position of the disk 53 at times T0 to T1 ((1) in FIG. 2), and the disk controller 52 is started at time T1 ((2) in FIG. 2). The disk controller 52 and the disk 53 then execute a read operation at times T1 to T2 ((3) in FIG. 2). The read data are transferred to the main memory 54 by DMA transfer from the disk controller 52.
The CPU 51 receives a completion interrupt from the disk controller 52 at time T2 ((4) in FIG. 2) and performs completion interrupt processing for the disk controller 52 at times T2 to T3 ((5) and (6) in FIG. 2). Further post-processing for the read operation is performed at times T3 to T4 ((7) in FIG. 2).
The first problem in this case is that a checkpoint acquired at an arbitrary time is not always valid.
For example, assume that a checkpoint is acquired in the middle of setting the registers of the disk controller 52 (the setup sequence between times T0 and T1).
In this case, when a fault occurs in the computer, only the latter part of the setup sequence is re-performed from the most recent checkpoint. That is, only some of the registers of the disk controller 52 are set again. For this reason, the disk controller 52 does not always operate as desired.
Given the characteristics of the disk controller 52, the same holds not only at times T0 to T1 described above but throughout times T0 to T3. That is, whenever the CPU 51 acquires a checkpoint during a setup sequence for an I/O operation such as a read/write operation, the disk controller 52 does not always operate as desired when the latter part of the setup sequence is re-performed from the checkpoint after a fault occurs in the system.
One way to solve this problem is to prohibit checkpoint operations during a setup sequence of an I/O device. However, in a computer system incorporating many I/O devices, the CPU is almost always performing a setup sequence for some I/O operation. Prohibiting checkpoint operations during setup sequences may therefore lead to considerable performance degradation.
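The failure described above for a checkpoint acquired between times T0 and T1 can be sketched as follows. The controller model, its three register names, and the register values are hypothetical. The sketch also assumes that the device registers are back in an initial state when execution resumes from the checkpoint, which is only one way in which a partial replay can go wrong.

```python
class DiskController:
    """Toy controller: all three registers must be set before start()."""

    REGISTERS = ("sector", "count", "buffer_address")

    def __init__(self):
        self.reset()

    def reset(self):
        # Assumption: recovery leaves the device registers in an initial state.
        self.registers = {name: None for name in self.REGISTERS}

    def write_register(self, name, value):
        self.registers[name] = value

    def start(self):
        missing = [n for n in self.REGISTERS if self.registers[n] is None]
        if missing:
            raise RuntimeError("registers not set: " + ", ".join(missing))
        return "read sector {sector}, count {count}".format(**self.registers)


# Hypothetical setup sequence performed by the CPU between T0 and T1.
SETUP_SEQUENCE = [("sector", 120), ("count", 8), ("buffer_address", 0x4000)]


def run_setup(controller, first_step):
    """Perform the setup sequence from step index first_step, then start."""
    for name, value in SETUP_SEQUENCE[first_step:]:
        controller.write_register(name, value)
    return controller.start()


controller = DiskController()
normal_result = run_setup(controller, 0)   # complete sequence: works

# A checkpoint was acquired after the first register write. After a fault,
# only the latter part of the sequence (steps 1 onward) is re-performed.
controller.reset()
try:
    replay_result = run_setup(controller, 1)
except RuntimeError as error:
    replay_result = str(error)

print(normal_result)
print(replay_result)
```

The first register write happened before the checkpoint, so it is never repeated, and the controller is started with an incomplete register set.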
A second problem is as follows. Assume that a fault occurs in the system during a DMA transfer from the disk controller 52 to the main memory 54. In this case, the ongoing DMA transfer must be stopped before the main memory 54 is restored to the state of the most recent checkpoint.
In a conventional computer system, the I/O device must be initialized (reset) in order to stop an ongoing DMA transfer. Because initialization returns the I/O device to its initial state, a special process is required to restore the I/O device to the state of the most recent checkpoint.
As a scheme for solving the problem of I/O processing in the above checkpoint recovery scheme, the following two schemes are known.
The first scheme is disclosed in USP-4740969 "METHOD AND APPARATUS FOR RECOVERING FROM HARDWARE FAULTS". In normal data processing, the data of read/write operations of the registers of an I/O device, during an interrupt from the I/O device, are recorded in a log memory. When a register setup sequence is restarted from the most recent checkpoint after a fault occurs in the computer system, the read/write operations performed to the registers of the I/O device before the fault occurs are re-performed as follows. For a write operation, the data is discarded and not written to the registers of the I/O device. For a read operation, instead of reading out from the register of the I/O device, the data in the log memory is returned to the CPU. For an interrupt from the I/O device, the interrupt is generated and sent to the CPU at the same time as in the preceding execution.
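The replay behavior of this first scheme can be sketched as follows. This is a simplified software model, not the interface circuit of the cited patent. The names `IODevice`, `LoggingInterface` and `driver` are hypothetical, and the re-generation of interrupts at the same time as in the preceding execution is not modeled.

```python
class IODevice:
    """Toy device whose status register returns a new value on each read."""

    def __init__(self):
        self.writes = []
        self.status_values = iter([0x01, 0x80])

    def write(self, register, value):
        self.writes.append((register, value))

    def read(self, register):
        return next(self.status_values)


class LoggingInterface:
    """Sits between CPU and device, logs accesses, replays them after a fault."""

    def __init__(self, device):
        self.device = device
        self.log = []            # accesses since the most recent checkpoint
        self.replay_pos = None   # None means normal (non-replay) operation

    def checkpoint(self):
        self.log.clear()

    def begin_replay(self):
        self.replay_pos = 0

    def _replaying(self):
        return self.replay_pos is not None and self.replay_pos < len(self.log)

    def write(self, register, value):
        if self._replaying():
            self.replay_pos += 1   # discard: the device already saw this write
            return
        self.log.append(("write", register, value))
        self.device.write(register, value)

    def read(self, register):
        if self._replaying():
            _, _, value = self.log[self.replay_pos]
            self.replay_pos += 1
            return value           # logged data instead of a device access
        value = self.device.read(register)
        self.log.append(("read", register, value))
        return value


def driver(interface):
    """Hypothetical register sequence executed by the CPU."""
    interface.write("command", 0x10)
    status = interface.read("status")
    interface.write("command", 0x20)
    return status


device = IODevice()
interface = LoggingInterface(device)
interface.checkpoint()
first_run = driver(interface)    # normal execution before the fault
interface.begin_replay()
second_run = driver(interface)   # re-execution from the checkpoint
print(first_run, second_run, len(device.writes))
```

During re-execution the device sees no second copy of the writes, and the CPU observes the same status value as before, so the re-executed code follows the same path as the original execution.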
This scheme requires a special interface circuit which is not provided in an ordinary computer system. Moreover, it is difficult to apply this scheme to a multi-processor system.
The second scheme is disclosed in SEQUOIA: A Fault-tolerant Tightly Coupled Multiprocessor for Transaction Processing, IEEE Computer, February 1988. In this scheme, data processing in a computer system is divided into a data processing portion, which can be performed by using only a CPU and a main memory, and an I/O processing portion, which handles I/O devices. These portions are executed by different computers.
FIG. 3 shows the schematic arrangement of such a computer system. Data processing is divided into a portion performed by access to a main memory only and a portion including access to the I/O device. The former is executed by a computer 100 whose reliability is improved by the checkpoint recovery scheme, and the latter is executed by a computer 200 which does not use the checkpoint recovery scheme. At the logical interface between these portions, a request such as "read the designated amount of data at the designated position of the designated disk" is sent from the computer 100 to the computer 200. When the computer 200 has actually read the data, a termination code indicating whether or not the operation completed normally and the data read from the disk are returned from the computer 200 to the computer 100.
To improve the reliability of the computer 200, the constituent elements of the computer 200 are duplicated. Namely, the computer 200 consists of computer main bodies 210a and 210b and I/O devices 220a and 220b. In a normal state, the request is simultaneously processed on both sides, and the execution results are compared with each other to check whether the execution results are identical. If a fault occurs on one side, the requested operation is continuously performed on the remaining side.
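The duplicated execution in the computer 200 can be sketched as follows. The function and exception names are hypothetical, the two sides are invoked one after the other rather than simultaneously, and the handling of a result mismatch is an assumption, since the description above does not say how a mismatch is resolved.

```python
class SideFault(Exception):
    """Raised by a side that has failed (hypothetical fault signal)."""


def execute_duplicated(request, sides):
    """Run the request on every side and return the agreed result."""
    results = []
    for side in sides:
        try:
            results.append(side(request))
        except SideFault:
            continue  # the remaining side carries on alone
    if not results:
        raise SideFault("both sides failed")
    if any(result != results[0] for result in results):
        # Assumption: a mismatch is simply reported as an error here.
        raise RuntimeError("results differ between sides")
    return results[0]


def healthy_side(request):
    """Pretend to read from the disk and return (termination code, data)."""
    disk, position, length = request
    return ("ok", "{}:{}+{}".format(disk, position, length))


def faulty_side(request):
    raise SideFault("side is down")


# Request of the form (designated disk, designated position, designated amount).
request = ("disk0", 4096, 512)
both_ok = execute_duplicated(request, [healthy_side, healthy_side])
one_down = execute_duplicated(request, [healthy_side, faulty_side])
print(both_ok, one_down)
```

The computer 100 receives the same termination code and data whether both sides are healthy or only one is.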
This scheme has the following disadvantage. That is, since at least two types of computers must be prepared, the computer system is large and costly.
The second scheme suggests the following idea: the computer 100 and the computer 200 might be implemented on one computer by using virtual computer technology. However, this idea does not work well, for the following reason.
The scheme disclosed in SEQUOIA is based on the following assumption. Since the independent computers 100 and 200 are used, even if the data processing of the computer 100 is restarted from a checkpoint due to occurrence of a fault within the computer 100, the I/O processing of computer 200 is not influenced by the fault.
However, if the computer 100 and the computer 200 were implemented on one computer by using virtual computer technology, the computer 100 and the computer 200 would be simultaneously influenced by a fault occurring in the base computer system.
As described above, a checkpoint recovery computer system needs special treatment of the I/O processing portion. Either a special interface is arranged between the CPU and the I/O device, or a calculating portion and an I/O processing portion are performed separately on two independent computers. In either case, the cost increases considerably.