1. Field of the Invention
The present invention relates to an I/O bridge device that transmits commands between a memory controller, which executes processes with respect to a main storage device, and peripheral components, a response-reporting method, and a program.
2. Description of Related Art
Recently, Peripheral Component Interconnect Express (PCI Express, hereinafter PCIe) is increasingly being used as a high-speed serial interface technology. PCIe is a standard with features such as serial signal transmission at 2.5 Gbps, point-to-point signal connections with no branching bus structure, data communication using protocols, and software compatibility with PCI. PCIe has replaced PCI as the interface standard.
In addition to ordinary communication functions, PCIe includes error-processing functions that are expanded from those of PCI. In PCI, errors were classified into two types: parity errors and system errors. In contrast, in PCIe, errors are classified into three types: ‘recoverable errors’, ‘fatal unrecoverable errors’, and ‘non-fatal unrecoverable errors’. Errors of these types are reported in standardized error messages to a route complex (the PCIe root complex, which serves as an I/O bridge device). The route complex is a device that includes one or a plurality of PCIe ports, and relays communications between PCIe-standard peripheral components and a CPU and a memory (main storage device).
When using the basic PCIe error-processing function, the only information that can be obtained is the error type. However, by using PCIe advanced error reporting, when an error occurs, detailed information such as the cause of the error can be stored, and a detailed error analysis can be carried out.
Thus, error processing in PCIe is more advanced than in PCI.
PCIe defines three types of transactions to be handled: posted transactions, non-posted transactions, and completion transactions. A posted transaction is a transaction that does not require a response, such as a memory write transaction. A non-posted transaction is a transaction that requires a response, such as a memory read transaction. A completion transaction is a transaction in response to a non-posted transaction.
In the present application, the term ‘transaction’ denotes a protocol data unit (PDU) for transmitting data used in communications between PCIe standard peripheral components and the route complex.
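The three transaction types described above can be sketched as a small classification. This is an illustrative model only; the names `TransactionType` and `requires_response` are assumptions made for this sketch and are not taken from the PCIe specification or from the present application.

```python
from enum import Enum


class TransactionType(Enum):
    """Hypothetical labels for the three PCIe transaction types."""
    POSTED = "posted"          # e.g. memory write; no response is returned
    NON_POSTED = "non-posted"  # e.g. memory read; a response is required
    COMPLETION = "completion"  # the response to a non-posted transaction


def requires_response(kind: TransactionType) -> bool:
    """Only a non-posted transaction is answered with a completion transaction."""
    return kind is TransactionType.NON_POSTED
```

Under this model, a memory write maps to `TransactionType.POSTED`, so `requires_response` returns `False` for it, which is why its issuer receives nothing back.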
Japanese Unexamined Patent Application, First Publication No. 2005-208972 (hereinafter abbreviated as Patent Document 1) discloses a method of continuing operations even when the route complex has stopped due to a failure.
According to Patent Document 1, when a peripheral component or a CPU transmits a non-posted transaction to the route complex, and the route complex stops before the completion transaction that is the response arrives at the peripheral component or the CPU, an error-reply generation circuit provided at the route complex transmits, in place of the route complex, a completion transaction indicating an error reply.
By means of the PCIe error-processing function described above, the route complex can ascertain that an error has occurred, and obtain detailed information, such as the cause of the failure, at that time. However, when the transaction transmitted by the CPU or the peripheral component is a posted transaction, the device that is the transmission source of the transaction is unable to receive an error report.
That is, if the peripheral component issues a posted transaction, such as a memory write transaction, and an error occurs while processing that transaction, the peripheral component is unable to ascertain that the error occurred, and the operation continues.
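The difference in what the transmission source can observe may be illustrated with a toy model. This is a minimal sketch under stated assumptions: the `Bridge` class, its `fail_next` flag, and the status strings are hypothetical and stand in for the route complex only for the purpose of illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Bridge:
    """Hypothetical stand-in for the route complex."""
    fail_next: bool = False                        # force the next request to fail
    error_log: list = field(default_factory=list)  # errors seen only by the bridge

    def handle(self, kind: str, address: int) -> Optional[str]:
        """Return a completion status for non-posted requests, None for posted ones."""
        failed = self.fail_next
        self.fail_next = False
        if failed:
            # The bridge itself learns about the error.
            self.error_log.append((kind, address))
        if kind == "posted":
            # A posted transaction has no response, so the source learns nothing.
            return None
        return "error" if failed else "ok"


bridge = Bridge(fail_next=True)
write_result = bridge.handle("posted", 0x1000)     # failure is invisible to the source

bridge.fail_next = True
read_result = bridge.handle("non-posted", 0x1000)  # failure is reported in the response
```

In this sketch both failures are recorded in `bridge.error_log`, but only the non-posted request gives its issuer any indication that something went wrong.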
Even if the route complex, after receiving an error report, obtains and analyzes the failure information and reports the occurrence of the failure to the peripheral component concerned, there will be a time lag from the time when the error occurred. Consequently, the peripheral component cannot execute a failure process for the transaction in which the error has occurred.
While such error-processing problems exist in PCIe, ordinarily, processes such as writing data from a PCIe device to a memory are performed using posted transactions in order to prioritize performance. Hence, the peripheral component does not receive a response indicating the process result, and the source that issued the transaction cannot ascertain the process status of that transaction.
Therefore, when an error occurs while processing a memory write transaction issued by a PCIe device, it is processed as a ‘fatal unrecoverable error’, which carries the danger of a fatal failure such as a shutdown of the system.
In a comparatively small-scale computer system such as a personal computer, this sort of error process might not be problematic. However, in a large-scale computer system that is expected to continue operating as far as possible (e.g. a system that is normally run on a mainframe or a large-scale server), a system shutdown is catastrophic and must be avoided.
While the method disclosed in Patent Document 1 can obtain a response to a non-posted transaction, it cannot obtain a response to a posted transaction.