Various data transfer apparatuses and data transfer methods have been proposed to realize highly reliable and highly efficient communication over transmission paths in which errors, such as inversion of a bit between 0 and 1 in data transferred from a transmission end to a reception end, may occur due to a soft error. Here, a soft error is defined as an error caused by an alpha ray (a helium nucleus ray) from outer space. A typical example of a data transfer apparatus is one configured so that transmission data, to which error testing information such as a CRC (Cyclic Redundancy Check), which is an error-detecting code, has been added, is transmitted from a transmission end to a reception end.
The following data transfer method has been disclosed with the objects of reducing the data transfer time, which increases when error testing information is transmitted, and of providing a configuration that does not require an error testing circuit at the reception end. A transmission end stores communication data while transmitting the data to a reception end, and the reception end receives the data transmitted from the transmission end and returns the data to the transmission end. The transmission end compares the returned data with the stored data, and if the two do not match, the data is retransmitted.
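The echo-back scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names (`echo_back_transfer`, `make_faulty_channel`, `flip_bit`), the modelling of a transmission path as a Python callable, and the retry limit `max_attempts` are all hypothetical.

```python
from typing import Callable

Channel = Callable[[bytes], bytes]  # hypothetical model of a transmission path


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Model a bit inversion at the given bit position."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def make_faulty_channel(bad_sends: int, bit_index: int = 3) -> Channel:
    """Hypothetical path that corrupts only its first `bad_sends` transfers."""
    remaining = [bad_sends]

    def channel(data: bytes) -> bytes:
        if remaining[0] > 0:
            remaining[0] -= 1
            return flip_bit(data, bit_index)
        return data

    return channel


def echo_back_transfer(data: bytes, forward: Channel, backward: Channel,
                       max_attempts: int = 5) -> int:
    """Return the number of attempts needed for a matching echo."""
    stored = bytes(data)                  # transmission end keeps a copy
    for attempt in range(1, max_attempts + 1):
        received = forward(stored)        # reception end receives the data
        echoed = backward(received)       # reception end returns it unchanged
        if echoed == stored:              # transmission end compares
            return attempt
    raise RuntimeError("no matching echo within max_attempts")
```

Note that no error testing circuit appears on the reception side in this sketch; the comparison is carried out entirely at the transmission end.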
The following data transfer method has been disclosed to prevent the increase in transmission completion time that occurs when, after transmitting data, the transmission end cannot transmit the next data until an error detection result is returned from the reception end. In this method, data to which a correction code has been added is transmitted, and if the data received at the reception end has an error, the received data is corrected on the basis of the correction code. The corrected data and the correction code are returned to the transmission end, and the transmission end compares the originally transmitted data with the corrected data. If an error is detected, the transmission data is retransmitted.
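The correct-then-compare flow described above can be sketched as follows. The text does not name the correction code, so the Hamming(7,4) code used here is an assumed stand-in, and the return path is assumed to be error-free for simplicity. All function names are hypothetical.

```python
def hamming74_encode(d):
    """Encode 4 data bits as a 7-bit codeword laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(c):
    """Reception end: flip the bit indicated by the syndrome, if any."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4   # 1-based position of a single-bit error
    if syndrome:
        c[syndrome - 1] ^= 1
    return c


def data_bits(c):
    """Extract the 4 data bits from a 7-bit codeword."""
    return [c[2], c[4], c[5], c[6]]


def needs_retransmission(original, received_codeword):
    """Transmission end: compare the original data with the returned, corrected data."""
    corrected = hamming74_correct(received_codeword)  # done at the reception end
    returned = data_bits(corrected)                   # sent back (assumed error-free)
    return returned != list(original)
```

With this assumed code, a single inverted bit is corrected at the reception end and the comparison passes, whereas two inverted bits are mis-corrected, so the comparison at the transmission end fails and the data is retransmitted.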
Various data transfer apparatuses and methods as described above have been proposed. A common information processing apparatus that performs data transfer is explained in detail below.
FIG. 1 illustrates a configuration of an information processing apparatus. The information processing apparatus has a processor 101, a crossbar/system control 102, an I/O device 103, memory 104, and a system service processor 105. The processor 101, the crossbar/system control 102, and the I/O device 103 receive control signals from the system service processor 105. Data transfer is mainly performed between the processor 101 and the crossbar/system control 102, and between the crossbar/system control 102 and the I/O device 103. FIG. 2 is an enlarged view of the data transfer sections.
Each of the processor 101, the crossbar/system control 102 and the I/O device 103 has a transmission unit 111 and a reception unit 112 for data transfer.
The processor 101 includes a processor core 113, which serves as a processing unit, and also includes a memory controller 114 for controlling the memory 104. In another possible configuration, the processor 101 does not include the memory controller 114 but accesses the memory 104 via an external memory controller.
The crossbar/system control 102 includes a switch 116 for dynamically switching the path over which data is transferred. The I/O device 103 includes an I/O core 117 for control processing of the I/O device.
The data transferred between the processor 101 and the crossbar/system control 102 and between the crossbar/system control 102 and the I/O device 103 has a data width of several tens of bits or more (e.g., 80 bits). Because the transmission path for transferring such data consists of multiple-bit parallel wiring, the data transfer may vary from bit to bit, and errors such as bit inversion may occur in the data received at the reception end. Such errors are sometimes caused by electrical noise or a reflected wave generated in the transmission path when data is transmitted through it. A bit that is detected as an error in data received at the reception end is referred to as an error bit.
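To make the term "error bit" concrete, the sketch below lists the positions at which a received word differs from the transmitted word. It is an illustration only: the function name `error_bits` is hypothetical, the 80-bit width follows the example in the text, and a direct comparison of this kind is possible only where both the transmitted and the received word are available, for instance in a simulation or during analysis.

```python
from typing import List


def error_bits(sent: int, received: int, width: int = 80) -> List[int]:
    """Return the positions (0 = least significant) at which the two words differ."""
    diff = (sent ^ received) & ((1 << width) - 1)  # XOR marks inverted bits
    return [i for i in range(width) if (diff >> i) & 1]
```

For example, if bits 5 and 79 of an 80-bit word are inverted on the transmission path, `error_bits` returns `[5, 79]`.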
FIG. 3 depicts a further detailed configuration of the transmission unit 111 and the reception unit 112 in FIG. 2. FIG. 3 illustrates a pair of the transmission unit 111 and the reception unit 112, which are provided in each of the processor 101, the crossbar/system control 102, and the I/O device 103 and which perform error-detectable data transfer by using CRC. Note that although CRC is used in this example, other error testing information such as parity can also be used.
The transmission unit 111 has a packet generation unit 121 for generating packets from transmission data consisting of commands and data generated in an upper layer such as the processor core 113, the switch 116, or the I/O core 117, a transmission control unit 122 for executing transmission control of the generated packets, and a transmission circuit 123 for driving the transmission path. The packet generation unit 121 has a CRC adding circuit 125 for adding error testing information such as CRC to the transmission data. The transmission control unit 122 has a transmission packet recording unit 126 for recording all transmitted packets in preparation for a retransmission request from the reception end.
The reception unit 112 has a reception circuit 128 for converting the received voltage waveform into a digital signal (0 or 1) and a reception control unit 127 for checking the received packets. The reception control unit 127 has a packet decode unit 129 for decoding the received packets and a CRC testing circuit 130 for CRC testing of the received packets. When decoding in the packet decode unit 129 shows that the received packets are a retransmission request, the retransmission control unit 124 is notified of the request and performs control to retransmit the transmitted packets from the transmission packet recording unit 126. When the test in the CRC testing circuit 130 finds that the received packets have an error, the retransmission control unit 124 transmits a retransmission request.
Operations for data transmission/reception illustrated in FIG. 3 are explained below.
The transmission unit 111 at the transmission end generates transmission packets by adding CRC to the transmission data in the packet generation unit 121.
The reception unit 112 at the reception end decodes the received packets in the packet decode unit 129, and conducts CRC testing in the CRC testing circuit 130. The retransmission control unit 124, when notified by the CRC testing circuit 130 that the packets have an error, transmits a retransmission request. If correct packets cannot be received even after a predetermined number of retransmission requests have been made, the retransmission control unit 124 stops transmitting retransmission requests and notifies the upper layer such as the processor core 113, the switch 116, or the I/O core 117.
The reception unit 112 at the transmission end decodes the received packets in the packet decode unit 129, and conducts CRC testing in the CRC testing circuit 130. The retransmission control unit 124, when notified by the packet decode unit 129 that a retransmission request has been received, retransmits the previously transmitted packets recorded in the transmission packet recording unit 126. Afterwards, upon receiving from the reception end an acknowledgement indicating normal reception of the packets, the transmission end resumes normal packet transmission operations.
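The operations explained above can be sketched as follows. This is a simplified model under stated assumptions, not the circuit of FIG. 3: CRC-32 from Python's `zlib` module stands in for the unspecified CRC, a transmission path is modelled as a callable, only one packet is outstanding at a time, and retransmission requests themselves are assumed to arrive intact. The names `transfer`, `add_crc`, `crc_ok`, `make_faulty_channel`, and `TransferError` are hypothetical.

```python
import zlib
from typing import Callable, Tuple

CRC_BYTES = 4  # assumption: a 32-bit CRC; the text does not fix the width


class TransferError(Exception):
    """Raised to model notification of the upper layer after repeated failures."""


def add_crc(payload: bytes) -> bytes:
    """Models the CRC adding circuit: append the CRC to the transmission data."""
    return payload + zlib.crc32(payload).to_bytes(CRC_BYTES, "big")


def crc_ok(packet: bytes) -> bool:
    """Models the CRC testing circuit: pass/fail only, no error location."""
    payload, tail = packet[:-CRC_BYTES], packet[-CRC_BYTES:]
    return zlib.crc32(payload).to_bytes(CRC_BYTES, "big") == tail


def make_faulty_channel(bad_sends: int, bit_index: int = 3) -> Callable[[bytes], bytes]:
    """Hypothetical path that inverts one bit in its first `bad_sends` transfers."""
    remaining = [bad_sends]

    def channel(data: bytes) -> bytes:
        if remaining[0] > 0:
            remaining[0] -= 1
            buf = bytearray(data)
            buf[bit_index // 8] ^= 1 << (bit_index % 8)
            return bytes(buf)
        return data

    return channel


def transfer(payload: bytes, channel: Callable[[bytes], bytes],
             max_requests: int = 3) -> Tuple[bytes, int]:
    """Return (received payload, number of retransmission requests made)."""
    recorded = add_crc(payload)      # packet is generated and recorded
    packet = channel(recorded)       # first transmission
    requests = 0
    while not crc_ok(packet):        # reception end detects an error
        if requests == max_requests:
            raise TransferError("retransmission limit reached")
        requests += 1                # erroneous packet discarded, request sent
        packet = channel(recorded)   # transmission end resends the recorded packet
    return packet[:-CRC_BYTES], requests
```

In this sketch the erroneous packet is simply overwritten by the retransmitted one, which mirrors the point that the conventional flow keeps no record of which bit was in error.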
CRC does not have an error correction function and cannot identify the error bit. For that reason, in common communication systems using CRC, when an error is detected in received data, i.e., packets, the reception end conventionally discards the erroneous packets and requests the transmission end to retransmit the data.
Errors (mistakes) that occur in packets transmitted over a transmission path are generated when the reception circuit at the reception end fails to receive correct packets because of a subtle variation in the set values of the peak voltages of the voltage waveforms output from an analog circuit of the transmission circuit, or a subtle difference in the output timing of the voltage waveforms. In this specification, conditions that cause such errors are referred to as failure conditions.
Under the failure conditions, i.e., when data transfer temporarily fails, the conventional data transfer method discards the data upon detecting an error in it and requests retransmission of the data. As a result, the error bit, which is the failure location in the packets, cannot be identified. When data transfer operations temporarily fail, it is necessary to determine, after restarting the system, whether or not the failure is a permanent fault. However, the failure analysis is rendered difficult because the failure conditions may not remain the same after the system is restarted.
In view of the above, persons skilled in the art would appreciate the following. When the failure is a temporary failure, such as a transmission failure due to a non-permanent fault resulting from fluctuation in the adjustment state of an analog circuit in a transmission circuit or in a reception circuit, then from the viewpoint of communication recovery, the communication is expected to be recovered by readjusting the circuits through reinitialization (restart) of the transmission circuit and the reception circuit. Meanwhile, from the viewpoint of identifying the failed part, there is a problem in that the failure analysis becomes difficult since the failure state cannot be reproduced after the circuits are reinitialized.
[Patent Document 1] Japanese Laid-open Patent Publication No. 64-834
[Patent Document 2] Japanese Laid-open Patent Publication No. 63-318838