1. Field of the Invention
The present invention relates to a method of confirming the operation of a pool I/O device and to a computer system, and more particularly to, for example, fault management in a computer system capable of controlling the switching of I/O device connections.
2. Background Art
A computer system that handles mission-critical applications needs to provide services continuously, 24 hours a day, 365 days a year. Even if the system stops due to a fault in one of its constituent devices, the computer system is required to resume operation in as short a time as possible by replacing and reconfiguring the faulty portion. However, a manual recovery process involving device replacement and reconfiguration may take a long time, and human error may cause further delay. For that reason, the approach mainly employed is to provide a backup device, or pool device, for any device in which a fault is expected to occur. The pool device is connected to the computer system in advance, and when a fault occurs in an operating device, the faulty portion is automatically replaced with the pool device.
A pool device connected to the computer system is usually not operating, and thus a conventional operation confirmation program cannot be applied to it. Consequently, when the pool device is left connected to the computer system for a long time, its normal operation cannot be guaranteed. When a fault occurs in a current device, the current device is replaced with a pool device; but if a fault has also occurred in the pool device, the automatic switchover to the pool device fails, and restart therefore takes a long time. To avoid this problem and reduce system restart time, operation confirmation needs to be performed on the pool device without affecting the operating devices. It should be noted that performing operation confirmation on all the devices by stopping the entire system, or part of the system, interrupts the services being provided, and should therefore be avoided.
For example, Patent Document 1 discloses a method of detecting a fault in an I/O device in advance, in which a reliability test is performed when a new hot-plug I/O device is additionally connected to a computer system.
In addition, as a method of performing operation confirmation on a pool device connected to an operating computer system, Patent Document 2 discloses a method in which a normal BIOS activation mechanism is used to periodically perform operation confirmation on a CELL (a board on which a processor and main storage are installed) that has been connected to the computer system but is not activated.
[Patent Document 1] JP Patent Publication (Kokai) No. 2004-326809
[Patent Document 2] JP Patent Publication (Kokai) No. 2006-268521