A computer system generally comprises a number of input-output (I/O) controllers, such as a LAN controller and a SCSI controller, in addition to a control section (hereinafter referred to as a processor unit) that includes a memory and a processor playing the central role in the system. The processor unit is connected to other units via system buses. Techniques for detecting faults on the system bus have been disclosed in Japanese Patent Applications laid open No. HEI4-8147, laid open No. HEI7-168727, and laid open No. HEI8-263328.
A SCSI port is a standard interface for connecting peripheral equipment such as an HDD (Hard Disk Drive) to the processor unit. With the SCSI port, a SCSI controller is used as an I/O controller that acts as a host for communicating with a magnetic disk and the like. Techniques involving the SCSI controller have been disclosed, for example, in Japanese Patent Applications laid open No. HEI11-203239 and laid open No. HEI11-110138.
Generally, in a conventional system, I/O controllers such as the SCSI controller are installed in the system when the processor unit starts operating. When the processor unit restarts, or when an instruction from a system maintainer to install an I/O controller is executed, the installation processing is simple and uses only a part of the memory area related to the operation of the I/O controller. After the installation is complete, the necessary parts of the memory area are used selectively to execute the respective instructions each time an I/O access occurs during operation.
In the following, a description will be given of problems in the above-mentioned conventional techniques and systems.
The first problem is that an I/O bus access fault that occurs while an I/O controller is in use affects the whole processor system, thus causing a system failure. This is because the fault cannot be located when it occurs in an I/O bus access from the I/O controller to the memory.
The second problem is that the I/O controller that caused the failure can be reinstalled when the processor restarts. This is because only partial I/O bus accesses occur when the I/O controller is installed, so the controller appears to operate normally at the stage of installation processing.
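The first and second problems can be illustrated with a minimal sketch. It is only a toy model under stated assumptions, not the behavior of any actual system: the class `ToyIOController`, the functions `install` and `run_io`, and the numbered memory regions are all hypothetical names introduced here for illustration. The sketch assumes that installation touches a single memory region, while normal operation touches several.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIOController:
    """Hypothetical controller whose bus accesses fail for some memory regions."""
    name: str
    faulty_regions: set = field(default_factory=set)

    def bus_access(self, region: int) -> bool:
        # True when the I/O bus access to this memory region succeeds.
        return region not in self.faulty_regions


def install(controller: ToyIOController, install_regions: list) -> bool:
    # Installation exercises only the few regions it needs (assumption).
    return all(controller.bus_access(r) for r in install_regions)


def run_io(controller: ToyIOController, regions: list):
    # Return the first region whose access faults, or None if all succeed.
    for r in regions:
        if not controller.bus_access(r):
            return r
    return None


INSTALL_REGIONS = [0]             # partial access during installation
OPERATION_REGIONS = [0, 1, 2, 3]  # wider access during normal operation

scsi = ToyIOController("scsi0", faulty_regions={2})

installed = install(scsi, INSTALL_REGIONS)      # passes: region 2 is never touched
fault_region = run_io(scsi, OPERATION_REGIONS)  # the fault surfaces only in operation
reinstalled = install(scsi, INSTALL_REGIONS)    # passes again after a restart
```

In this model the faulty controller passes installation both before and after the fault, because the installation check never reaches the region in which the bus access fails.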
The third problem arises where an I/O controller having a master/slave relationship with plural slave devices is installed, for example, the SCSI controller with its disk storage units. When one of the slave devices fails, the failed slave device cannot be identified.
Besides, in a system adopting a disk array or the like, an additional disk storage unit may be installed in an active system in which the SCSI controller and a disk storage unit #A have already been installed. If a SCSI controller failure is detected on such an occasion and failure recovery is performed for the SCSI controller while the disk storage unit #A is in use, access to the disk storage unit #A is interrupted, which affects software running on the system.