1. Field of the Invention
The present invention relates to a data processing apparatus and a data processing method which process the same data in parallel.
2. Description of the Related Art
One type of computer system that performs data processing is a fault-tolerant computer system with a redundant architecture designed using existing components, as disclosed in, for example, pages 5 to 7 and FIG. 1 of Unexamined Japanese Patent Application KOKAI Publication No. H9-128349. This computer system employs a lock-step system.
In the lock-step system, a plurality of processors in a redundant architecture first process the same data synchronously and in parallel. The outputs from the processors are then compared with one another to detect any error, and a detected error is corrected.
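The compare-and-correct step of a lock-step system can be illustrated with a minimal sketch. The correction policy shown here (majority voting among three replicas) is an assumption made for illustration; the cited publication may correct errors differently. The names `lockstep_step`, `healthy`, and `broken` are hypothetical.

```python
from collections import Counter


def lockstep_step(replicas, data):
    """Feed the same data to every replica and compare the outputs.

    Returns the majority output and the indices of replicas that disagreed.
    Majority voting is an assumed correction policy, not one from the source.
    """
    outputs = [replica(data) for replica in replicas]
    majority, votes = Counter(outputs).most_common(1)[0]
    if votes <= len(outputs) // 2:
        raise RuntimeError("no majority; the error cannot be corrected")
    faulty = [i for i, out in enumerate(outputs) if out != majority]
    return majority, faulty


def healthy(x):
    return x * 2


def broken(x):
    return x * 2 + 1  # simulated faulty processor


# Three redundant "processors" handle the same input; the third one is faulty.
result, faulty = lockstep_step([healthy, healthy, broken], 21)
```

In this sketch the disagreeing replica is identified by index, so a surrounding system could isolate or resynchronize it.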
Recent computer systems employ a fast serial link system, such as PCI-Express, Hyper-Transport (registered trademark) or InfiniBand (registered trademark), which can ensure fast data transmission and reception, to connect processors to I/O (Input/Output) systems.
While the use of such a fast data transmission and reception system in a computer system with a redundant architecture does make data transmission and reception faster, it also makes it harder to guarantee the identity of the data to be processed by the plural processors and makes communication errors more likely.
When communication errors are detected, for example, the individual interface sections that mediate data transmission and reception between the processors and the I/O systems each request resending of data at their own timing, which differs from one interface section to another. Accordingly, the timing and order of the processes executed by the individual processors deviate, so that the lock-step system cannot be maintained. This makes it difficult for the plural processors to process the same data synchronously.
When only some of the plural interface sections have detected communication errors, for example, the interface sections that have detected the errors do not share the error information with the other interface sections. Therefore, while the interface sections that have detected the communication errors request resending of data, the interface sections that have not detected them accept the data as it is. In this case, the timing of the subsequent processing of the received pieces of data deviates even though the data is the same, so that the identity of data in parallel processing cannot be guaranteed.
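The loss of synchronization caused by uncoordinated resend requests can be sketched as follows. The cycle counts (`transfer_cycles`, `resend_cycles`) and the function name `delivery_cycles` are hypothetical values chosen only for illustration.

```python
def delivery_cycles(error_detected, transfer_cycles=4, resend_cycles=6):
    """Return the cycle at which each interface section delivers the packet.

    error_detected holds one flag per interface section. A section that
    detects an error requests a resend on its own, without informing the
    other sections, and so delivers the packet later. The cycle counts are
    illustrative assumptions.
    """
    return [
        transfer_cycles + (resend_cycles if detected else 0)
        for detected in error_detected
    ]


# Section 0 receives the packet cleanly; section 1 detects an error.
cycles = delivery_cycles([False, True])
skew = max(cycles) - min(cycles)
in_lockstep = skew == 0
```

Because the error information is not shared, the two processors receive identical data at different cycles, and the nonzero `skew` means lock-step operation is lost.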
Further, such a computer system is likely to suffer a data delay originating from the lengths of the communication lines. When the data delay shifts the timing of processing by the plural processors, the plural processors have difficulty in synchronously processing the same data, as in the case mentioned previously. This requires that the line lengths be made strictly equal, which places considerable restrictions on the degree of freedom in the structure of the casing of the system, the design of the board, and the structure of the board.
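A rough calculation shows why unequal line lengths matter on a fast serial link. The propagation velocity of about 15 cm/ns (a typical order of magnitude for a printed circuit board trace), the 2.5 Gbit/s bit rate, and the 10 cm length difference are all assumptions for illustration and do not come from the source.

```python
def skew_ns(length_diff_cm, velocity_cm_per_ns=15.0):
    """Arrival-time difference caused by a difference in line length.

    velocity_cm_per_ns is an assumed signal propagation velocity.
    """
    return length_diff_cm / velocity_cm_per_ns


def skew_in_bit_times(length_diff_cm, bit_rate_gbps=2.5, velocity_cm_per_ns=15.0):
    """Express the same skew as a multiple of one bit time on the link."""
    bit_time_ns = 1.0 / bit_rate_gbps
    return skew_ns(length_diff_cm, velocity_cm_per_ns) / bit_time_ns


# Two lines whose lengths differ by 10 cm (assumed value).
delay = skew_ns(10.0)
bits = skew_in_bit_times(10.0)
```

Under these assumptions the skew exceeds one bit time, so data arriving over the longer line lags the data on the shorter line by more than a full bit.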