1. Field of the Invention
The present invention relates to a data processing system provided with multiplexed boot devices storing an operating system, and more particularly to a data processing system linked with system software for multiplexing boot devices so that, when startup from a boot device of a master system is not possible, the system forcibly switches to a boot device of a slave system and startup from the boot device of the slave system is enabled in the next boot.
2. Description of the Related Art
In general, a computer or other data processing system is configured to start up the operating system from a hard disk or other boot device in response to a startup signal. With such a configuration, when the boot device is broken, the operating system can no longer be started. In particular, if such trouble occurs in a computer that is connected to other computers on a network, the problem arises that all service to the other computers will end up being stopped.
On the other hand, recently, as one technology for raising the fault tolerance of data processing systems, the practice has been to configure a plurality of hard disks as a redundant array of inexpensive disks (RAID) to achieve redundancy. This RAID configuration, however, is a technology for raising the reliability of read-out data by mirroring or striping and does not raise the reliability of operation for system startup.
Therefore, a data processing system has been developed with multiplexed, for example duplexed, disk devices as boot devices. When it is detected that any of the disk devices is broken, the system is switched to another boot device from among the plurality of disk devices provided in the system so as to start up the computer system. With such a data processing system, however, the switching is performed manually by a system manager etc. Further, the judgment as to whether the system has started up normally or has failed to start up is also made by a human operator.
With such a data processing system, when a disk device is broken, time is wasted in starting up the system again, which hinders system operations. Therefore, a data processing system has been proposed in which, when one of the duplexed disk devices is broken, the system automatically switches to the other disk device so as to sufficiently raise the reliability of operation in system startup.
The method of switching for duplexed disk devices in such a data processing system is for example disclosed in Japanese Unexamined Patent Publication (Kokai) No. 3-229331. When a control unit provided with the operating system of the data processing system detects that one of the disk devices has broken down, it automatically stores initial microprogram load (IMPL) disk data and initial program load (IPL) disk device data in a nonvolatile memory provided in the system. In this example, since the nonvolatile memory automatically stores the IMPL disk data and IPL disk device data, the control unit can determine the disk device data to be stored next and can automatically switch to another normal disk device.
Further, another example is disclosed in Japanese Unexamined Patent Publication (Kokai) No. 2000-81978. In the data processing system of this example, when failure occurs in one of the duplexed disk devices storing the operating systems, the time until recovery from the failure in the disk device is shortened. With this data processing system, when one disk device is broken, the system is shut down once and, before restart, switching is performed and the system is registered so that the other, unbroken disk device becomes the master while the broken disk device becomes the slave. Due to this, when the system is restarted, the operating system is started by the IPL of the unbroken master disk device and the system starts up.
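The registration swap described for this publication can be pictured with the following minimal Python sketch. The names BootRegistry, report_failure, ipl_device, disk0, and disk1 are hypothetical and are not taken from the publication; the sketch assumes only that the registration records which disk device is the master and which is the slave.

```python
from dataclasses import dataclass


@dataclass
class BootRegistry:
    """Hypothetical record of which disk device is master and which is slave."""
    master: str
    slave: str

    def report_failure(self, broken_disk: str) -> None:
        # If the broken disk is the current master, swap the roles so that
        # the unbroken disk becomes the master before the system is restarted.
        if broken_disk == self.master:
            self.master, self.slave = self.slave, self.master

    def ipl_device(self) -> str:
        # On restart, the operating system is started from the master disk.
        return self.master


registry = BootRegistry(master="disk0", slave="disk1")
registry.report_failure("disk0")  # disk0 breaks while it is the master
print(registry.ipl_device())
```

In this sketch, a failure of the slave leaves the registration unchanged, since the master is still usable for the next startup.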
Further, another example is disclosed in Japanese Unexamined Patent Publication (Kokai) No. 2002-259130. In the data processing system of this example, there are provided duplexed disk devices individually storing the operating system, a means for starting up the operating system, and a means for detecting completion of startup of the operating system. The time elapsed since generation of the startup signal for starting up the operating system is counted, and control is performed for switching the boot disk device for starting up the operating system based on whether completion of startup of the operating system has been detected within the predetermined elapsed time from the generation of the startup signal.
In this data processing system, completion of startup of the operating system is detected automatically, and it is determined whether that completion occurs within a predetermined elapsed time from the generation of the startup signal. The switching of the boot disk device for startup of the operating system is automatically controlled based on whether or not completion of startup of the operating system has been detected. Therefore, even without human intervention, the judgment of whether the operating system has started up normally and the restart of the operating system from the other boot disk device can both be performed automatically.
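The timeout-based switching described above can be sketched as follows in Python. This is an illustration under stated assumptions, not the implementation of the cited publication: the function boot_with_timeout, the probe startup_time, the disk names, and the 120-second limit are all hypothetical, and the elapsed time is simulated rather than measured by a hardware timer.

```python
from typing import Callable, Optional, Sequence


def boot_with_timeout(
    disks: Sequence[str],
    startup_time: Callable[[str], Optional[float]],
    timeout_s: float,
) -> Optional[str]:
    """Return the first disk whose startup completes within timeout_s.

    startup_time is a hypothetical probe: it returns the seconds a disk
    needed to complete startup, or None if startup never completed.
    """
    for disk in disks:
        elapsed = startup_time(disk)
        if elapsed is not None and elapsed <= timeout_s:
            return disk  # completion detected in time; keep this boot disk
        # No completion within the predetermined time: try the next disk.
    return None


# Simulated measurements: disk0 never completes startup, disk1 takes 42 s.
measured = {"disk0": None, "disk1": 42.0}
booted_from = boot_with_timeout(["disk0", "disk1"], measured.get, timeout_s=120.0)
print(booted_from)
```

The point of the sketch is that the switching decision depends only on whether completion was observed within the time limit, so no human judgment is required.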
On the other hand, each of the duplexed boot disk devices in such a data processing system stores the various programs for starting up the disk devices. Each of the disk devices stores, for example, in order from the start block of the disk drive, a boot block program, an operating system loader, an operating system, and system software, followed by data.
At the time of startup of the data processing system, the boot firmware stored in the nonvolatile memory in the system is loaded into the main memory. Next, a boot disk device is selected by the operation of this boot firmware. The boot block program stored in the selected disk device is read out and loaded into the main memory. This boot block program loads the operating system loader stored in the boot disk device into the main memory. Next, the operating system loader reads the operating system into the main memory, whereupon the operating system loads the system software stored in the boot disk device into the main memory. For control of duplexing, the system software checks whether it has been started up from the master system or the slave system of the duplexed disk devices. Once the system software has started up, switching of the duplexed disk devices is subsequently handled during normal operation of the operating system.
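The load order in the startup sequence above can be summarized with a short Python sketch. The names BOOT_CHAIN and load_sequence are hypothetical; the stage names and their sources follow the description, and the sketch only records what is loaded into main memory and from where.

```python
from typing import List, Tuple

# Each entry is (stage, source), in the order described above.
BOOT_CHAIN: List[Tuple[str, str]] = [
    ("boot firmware", "nonvolatile memory"),
    ("boot block program", "boot disk"),
    ("operating system loader", "boot disk"),
    ("operating system", "boot disk"),
    ("system software", "boot disk"),
]


def load_sequence(boot_disk: str) -> List[Tuple[str, str]]:
    """Return the (stage, origin) pairs loaded into main memory, in order."""
    main_memory = []
    for stage, source in BOOT_CHAIN:
        # Everything except the boot firmware comes from the selected disk.
        origin = boot_disk if source == "boot disk" else source
        main_memory.append((stage, origin))
    return main_memory


for stage, origin in load_sequence("master disk"):
    print(f"{stage} <- {origin}")
```

Note that the system software, which performs the control of duplexing, is the last item in the chain.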
Summarizing the problem to be solved by the invention, in the data processing system of the related art, the system hardware is duplexed so as to impart redundancy against hardware errors occurring when accessing the system hardware at the time of operation and thereby enhance reliability. In this data processing system, redundancy is realized by processing of the system software stored in the boot disk devices. Therefore, in the processing for starting up the system, the processing must first proceed as far as startup of the system software; in the subsequent processing, even if an error occurs in accessing the duplexed system disk devices, operation can continue using the data of the device not suffering from the error.
Further, when errors frequently occur in the system disk device of the master system among the duplexed systems, it is possible to cut off the disk device of the master system and then boot the system from the system disk device of the slave system. At the time of processing for starting up the system, however, even if the boot disk devices are duplexed, switching is triggered by detection of an abnormality in the disk device of the master system. Therefore, if an abnormality occurs in the system disk device of the master system before startup of the system software stored in the disk device of the master system has proceeded, it is not possible to operate the system software for control of duplexing. The problem therefore arises that, even if the disk devices are duplexed, if an abnormality occurs in the disk device of the master system, it is not possible to switch to the disk device of the slave system, so it is not possible to start up the system.
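The problem can be illustrated with a small Python simulation. It is a sketch under assumptions: the function related_art_startup and its return strings are hypothetical, and the model assumes that the duplexing control exists only once the system software has been loaded from the master disk.

```python
from typing import Optional

# Stages read from the master boot disk, in startup order.
STAGES = [
    "boot block program",
    "operating system loader",
    "operating system",
    "system software",
]


def related_art_startup(master_fails_at: Optional[str] = None) -> str:
    """Simulate startup from the master disk in the related art.

    master_fails_at names the stage at which reading the master disk
    fails, or is None if the master disk is healthy during startup.
    """
    for stage in STAGES:
        if stage == master_fails_at:
            # The system software that controls duplexing is not running
            # yet, so nothing can switch the boot to the slave disk.
            return "startup failed"
    # All stages loaded: the system software can now handle disk errors.
    return "running with duplexing control"


print(related_art_startup())
print(related_art_startup("operating system loader"))
```

In this model, a failure at any stage up to and including the loading of the system software prevents startup, even though a healthy slave disk is present.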