An information processing apparatus generally includes a central processing unit (hereinafter referred to as “CPU”). Some CPUs have a function for transitioning to a system management mode (hereinafter referred to as “SMM”), which is one of the operation modes of the CPU, in response to a kind of interrupt called a system management interrupt (hereinafter referred to as “SMI”). Examples of a CPU that transitions to the SMM include CPUs of the Intel x86 architecture and similar architectures.
When an SMI is received, the CPU transitions to the SMM. In the SMM, the CPU executes an SMI handler. The SMI handler is a program that processes the SMI in a system management random access memory (hereinafter referred to as “SMRAM”) space, which is an independent address space within the memory space and cannot be accessed from any other operation mode.
When the processing of the SMI is completed, the CPU returns to the mode in which it was operating before the transition to the SMM.
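The mode transition described above can be illustrated with a minimal sketch. It is a toy model in Python, not firmware: the names `Mode`, `Cpu`, `raise_smi` and `record_mode` are hypothetical, `Mode.PROTECTED` merely stands in for whichever mode the CPU was in before the SMI, and the isolation of the SMRAM space is not modeled.

```python
from enum import Enum
from typing import Callable, List


class Mode(Enum):
    PROTECTED = "protected"   # stands in for any non-SMM operation mode
    SMM = "smm"


class Cpu:
    """Toy model of a CPU that enters the SMM on an SMI (hypothetical names)."""

    def __init__(self, mode: Mode = Mode.PROTECTED) -> None:
        self.mode = mode
        self.trace: List[Mode] = []   # modes observed while handlers run

    def raise_smi(self, smi_handler: Callable[["Cpu"], None]) -> None:
        saved_mode = self.mode        # remember the mode before the transition
        self.mode = Mode.SMM          # receiving the SMI moves the CPU into the SMM
        try:
            smi_handler(self)         # the SMI handler runs only while in the SMM
        finally:
            self.mode = saved_mode    # return to the mode before the transition


def record_mode(cpu: Cpu) -> None:
    # Example handler: note which mode the CPU is in while the handler runs.
    cpu.trace.append(cpu.mode)


cpu = Cpu()
cpu.raise_smi(record_mode)
```

After the call, `cpu.trace` holds the mode seen by the handler and `cpu.mode` holds the restored mode.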
PCI Express (PCIe) is a standard for a serial I/O interface.
In a PCIe connection configuration, PCIe devices (ports) are connected to each other through PCIe links. In detail, a root port that exists in a chip set and functions as the start point of the connection configuration, a PCIe switch that routes packets between PCIe ports, and an endpoint such as a PCIe card positioned at a terminal end are connected to each other through PCIe links.
FIG. 12 schematically depicts an example of a PCIe connection. In the PCIe connection, as depicted in FIG. 12, an upstream device 101 nearer to the root port (or the CPU) and a downstream device 102 are connected to each other by a physical transmission path (hereinafter referred to as “transmission path”) 103 such as a cable, a connector or a wiring line.
When an error occurs in such a PCIe connection as described above and the PCIe link is disconnected (hereinafter referred to as “link down”), the process of specifying the location at which the error has occurred (hereinafter referred to as “suspect location”) is referred to as a “fault location process”.
Here, as depicted in FIG. 12, the suspect location is one of three locations: the upstream device 101, the downstream device 102 and the transmission path 103.
In the fault location process, the contents of status registers 104 and 105 provided in the devices 101 and 102, respectively, are analyzed to specify a suspect location when a fault occurs. However, if a link down occurs, then the contents of the status register 105 of the device 102 on the downstream side of the link down location cannot be acquired.
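The limitation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation of any real PCIe access path: the `Device` class, the `read_status_registers` function and the register values are hypothetical, and the bit layout of the status registers is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Device:
    name: str
    status_register: int   # illustrative raw value; bit meanings are not modeled


def read_status_registers(
    upstream: Device, downstream: Device, link_up: bool
) -> Dict[str, Optional[int]]:
    """Collect the status register contents that can actually be acquired.

    The upstream register stays readable. The downstream register sits
    behind the link, so it yields None when the link is down.
    """
    return {
        upstream.name: upstream.status_register,
        downstream.name: downstream.status_register if link_up else None,
    }


# Hypothetical values standing in for status registers 104 and 105.
upstream_101 = Device("upstream device 101", status_register=0x0001)
downstream_102 = Device("downstream device 102", status_register=0x0004)

before_fault = read_status_registers(upstream_101, downstream_102, link_up=True)
after_link_down = read_status_registers(upstream_101, downstream_102, link_up=False)
```

With the link down, only the upstream contents remain available for analysis, which is why the register contents alone are not enough to distinguish among the three suspect locations.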
Thus, a suspect location of the link down is specified, for example, by mounting a dedicated apparatus on the information processing apparatus and then causing the fault to be reproduced. Therefore, at the site at which the information processing apparatus operates, a suspect location of the link down cannot be specified immediately.
Here, since the downstream device 102 is a PCIe device or a PCIe card connected through a cable, the downstream device 102 can in most cases be exchanged more readily than the upstream device 101. For this reason, the conventional fault location process takes a countermeasure on the assumption that the suspect location upon occurrence of the link down is the downstream device 102. A procedure manual or the like is then used to make the technical staff aware that the suspect location may instead be the upstream device 101 or the transmission path 103.
However, since the technical staff who cope with the fault at the site at which the information processing apparatus is operating do not necessarily have sufficient expertise, or since the working time is limited, the possibility of some other suspect location may not be examined sufficiently.
Where the estimated suspect location is mistaken in such a situation as described above, the faulty member incorrectly determined to be “normal” is not exchanged. Therefore, even after the error is dealt with, the error is highly likely to occur again. Further, even if an investigation of the cause is attempted on a normal article that has been incorrectly determined to be faulty, it is difficult to specify a cause of the error because the article is in fact normal.
In this manner, the conventional technology has a problem in that much time and labor, as well as expertise, are required in order to specify and deal with a suspect location when a link down occurs.