A computer can be managed by a remote manager, which is an input/output device for remote management connected to the computer through an I/O bus such as a PCI bus. The remote manager has communication input/output devices such as a network adapter and a modem. The remote manager is connected to another computer by a LAN, a telephone line, or the like, so that the first-mentioned computer can be managed from the other computer at a remote location.
The remote manager acquires operating information of the computer to be managed via the I/O bus or via a private bus for transferring management information of that computer. The remote manager has registers and memories which a CPU in the computer to be managed can access via the I/O bus.
The remote manager may be configured as a computer (manager computer) having a CPU, a memory, and I/O devices including communication devices such as a network adapter and a modem, as described in JP-A-9-50386, JP-A-5-257914 and JP-A-5-250284. In this case, the CPU on the manager computer can execute a management program independently of the computer to be managed, that is, regardless of the operating state of the computer to be managed. Hence, the manager computer can execute the management program even before the operating system (OS) of the computer to be managed has started, or even when that computer has halted due to a fault and has hung up, that is, can no longer accept any operation from the outside.
When such a hang-up fault occurs in the computer to be managed, the background-art manager connected to an I/O bus restarts the computer by a method such as resetting the CPU or cutting off the power supply to the computer to be managed. The restart is achieved by connecting the manager to the computer to be managed by a private signal line and by making the manager either transmit a reset signal to the CPU of the computer to be managed via that signal line or transmit an interrupt which shifts control to firmware on the computer to be managed. The private signal line is required because the I/O bus has no signal line for transmitting an interrupt that forces the execution of the OS to stop.
To carry out this restarting method, a signal line other than the I/O bus must be provided between the manager and the computer to be managed. Hence, there is a problem that the computers which can be managed are limited to those which can be connected to the manager in this way. That is, unless the computer to be managed and the manager can be connected to each other through a private line, the computer to be managed cannot be restarted from the manager side when a fault occurs in the computer.
In the background art, the restart is performed by resetting the CPU. Accordingly, the OS has no opportunity to intervene. In addition, the contents of the main memory in the computer to be managed are lost when the OS is restarted. Hence, it becomes difficult to analyze the cause of a fault. There is also a problem that a fault which is not reproducible cannot be analyzed.
On the other hand, a general-purpose I/O bus such as a PCI bus is configured so that an interrupt for forcing the OS to execute fault processing cannot be transmitted from the manager to the computer to be managed. In some cases, however, such an I/O bus has a signal line for transferring additional information (such as a parity bit) which guarantees the accuracy of the address, command, data, etc. transferred via the I/O bus (PCI Hardware and Software Architecture Design, pp. 172–174, Annabooks, 1994). If an I/O bus can transfer such additional information, the computer to be managed or an input/output device of the computer can verify the accuracy of the data transferred via the I/O bus.
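By way of illustration only, the verification described above can be sketched as follows. The sketch assumes even parity computed over a 32-bit address/data word and a 4-bit command field, which is the convention of the conventional PCI bus; the function names `pci_parity_bit` and `transfer_is_consistent` are hypothetical and do not appear in the cited reference.

```python
def pci_parity_bit(ad: int, cbe: int) -> int:
    """Return the parity bit for a 32-bit address/data word and a 4-bit command field.

    Assumption: even parity, i.e. the parity bit is chosen so that the total
    number of 1 bits across the word, the command field and the parity bit
    itself is even.
    """
    if not 0 <= ad < (1 << 32):
        raise ValueError("ad must fit in 32 bits")
    if not 0 <= cbe < (1 << 4):
        raise ValueError("cbe must fit in 4 bits")
    ones = bin(ad).count("1") + bin(cbe).count("1")
    return ones & 1  # 1 only when the count of 1 bits is odd


def transfer_is_consistent(ad: int, cbe: int, par: int) -> bool:
    """Receiver-side check: recompute the parity and compare with the received bit."""
    return pci_parity_bit(ad, cbe) == par


if __name__ == "__main__":
    word, command = 0x12345678, 0x7
    par = pci_parity_bit(word, command)
    print(transfer_is_consistent(word, command, par))         # intact transfer
    print(transfer_is_consistent(word ^ 0x1, command, par))   # one bit corrupted
```

A single parity bit of this kind detects any odd number of flipped bits but cannot detect an even number of flipped bits, nor can it locate the bit in error.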
In addition, where an I/O bus having the aforementioned function is used, an I/O bus controller is available which has a signal line for informing the CPU of a fault when an incorrect signal is detected on the I/O bus on the basis of the additional information (Microprocessor Report, pp. 11–12, Vol. 12, Number 9, July 1998).
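The behavior of such an I/O bus controller can be modeled, purely as a sketch, as follows. The class `BusControllerModel`, its `receive` method and the `notify_cpu` callback standing in for the fault signal line are all hypothetical names introduced for illustration; even parity over a single data word is assumed, and the model does not represent any particular controller.

```python
from typing import Callable


class BusControllerModel:
    """Toy model of a bus controller that checks parity on each transfer."""

    def __init__(self, notify_cpu: Callable[[str], None]) -> None:
        # notify_cpu stands in for the signal line to the CPU (assumption).
        self._notify_cpu = notify_cpu
        self.faults = 0

    @staticmethod
    def _expected_parity(word: int) -> int:
        # Even parity: the bit is 1 only when the word has an odd number of 1 bits.
        return bin(word).count("1") & 1

    def receive(self, word: int, parity: int) -> bool:
        """Accept a transfer; on a parity mismatch, count it and inform the CPU."""
        if self._expected_parity(word) != parity:
            self.faults += 1
            self._notify_cpu(f"parity error on word {word:#010x}")
            return False
        return True


if __name__ == "__main__":
    controller = BusControllerModel(notify_cpu=print)
    controller.receive(0b1011, 1)  # consistent: three 1 bits, parity bit 1
    controller.receive(0b1011, 0)  # inconsistent: the fault is reported
```

The callback keeps the detection logic separate from whatever the CPU does on receiving the fault signal.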
When a fault occurs in the bus, the CPU in the computer to be managed may be unable to access memory. Hence, the CPU may become unable to operate. When the bus is locked in this manner, the execution of the CPU cannot be restarted even if an interrupt signal is transmitted to the CPU, because memory access is disabled by the bus fault and the interrupt handler therefore cannot be started.
As a measure against such a fault, there is a CPU which, when a fault signal concerning the bus is detected, reinitializes only the bus without resetting the CPU itself and then internally generates an interrupt to shift control to the interrupt handler (Microprocessor Report, pp. 1, 6–10, Vol. 12, Number 9, July 1998). With such a CPU, execution can be restarted so that the fault processing by the OS can be started, even when the bus is locked.
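The difference between a CPU which merely receives an interrupt while the bus is locked and a CPU which first reinitializes the bus can be illustrated by the following toy model. All names (`Bus`, `Cpu`, `BusLockedError`, `HANDLER_ADDRESS`, `reinit_on_bus_fault`) are hypothetical, and the model is a sketch of the sequence described above rather than of any actual processor.

```python
class BusLockedError(Exception):
    """Raised when memory is accessed while the bus is locked."""


HANDLER_ADDRESS = 0x1000  # illustrative address of the interrupt handler


class Bus:
    def __init__(self) -> None:
        self.locked = False

    def read(self, address: int) -> int:
        if self.locked:
            raise BusLockedError(f"cannot read {address:#x}: bus is locked")
        return address  # placeholder for the fetched data

    def reinitialize(self) -> None:
        self.locked = False


class Cpu:
    def __init__(self, bus: Bus, reinit_on_bus_fault: bool) -> None:
        self.bus = bus
        self.reinit_on_bus_fault = reinit_on_bus_fault
        self.log: list[str] = []

    def deliver_interrupt(self) -> bool:
        """Starting the handler requires fetching it from memory over the bus."""
        try:
            self.bus.read(HANDLER_ADDRESS)
        except BusLockedError:
            self.log.append("handler fetch failed")
            return False
        self.log.append("handler started")
        return True

    def signal_bus_fault(self) -> bool:
        """On a bus fault signal, optionally reinitialize the bus, then interrupt."""
        if self.reinit_on_bus_fault:
            self.bus.reinitialize()  # only the bus is reinitialized, not the CPU
            self.log.append("bus reinitialized")
        return self.deliver_interrupt()


if __name__ == "__main__":
    for reinit in (False, True):
        bus = Bus()
        bus.locked = True
        cpu = Cpu(bus, reinit_on_bus_fault=reinit)
        cpu.signal_bus_fault()
        print(reinit, cpu.log)
```

In the model, the handler can be fetched only after the bus lock has been cleared, which mirrors the reason an interrupt alone is insufficient when the bus is locked.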
In a background-art manager connected to a computer through an I/O bus, when a fault which disables the OS from executing fault processing occurs in the computer, the computer is restarted as a whole, either by resetting the CPU of the computer through a signal line other than the I/O bus or by resetting the CPU by means of firmware on the computer. In these methods, however, there is a problem that the OS cannot carry out the fault processing because the CPU is reset, so that fault information cannot be acquired.
Moreover, the background-art manager requires either a signal line other than the I/O bus, or a circuit or firmware provided on the computer to execute the process of resetting the CPU. Hence, there is a problem that the computers which can be connected to the manager are limited.
An object of the present invention is to provide a computer system in which a computer can acquire fault information even when a fault which disables the OS from executing fault processing occurs in the computer.
Another object of the present invention is to provide a computer system in which a bus used by a computer to be managed can be initialized through an I/O bus.