1. Field of the Invention
The invention relates to a high reliability system in which apparatus pairs are constructed with respect to a plurality of apparatuses connected through a network to thereby give redundancy, a redundant construction control method, and a program. More particularly, the invention relates to a high reliability system, a redundant construction control method, and a program in which a synchronizing process of the apparatus pairs is provided as a fixed function, service processes peculiar to the apparatus are provided as a variable function according to an exchange of software, and each apparatus can autonomously recover the redundancy in the case of an apparatus failure.
2. Description of the Related Arts
Hitherto, to reduce unexpected down-time in a processing system using a computer, it has been necessary either to raise the reliability of the hardware itself or to shorten the time required for a recovery process. A redundant system is one method of improving the reliability of the hardware. In a redundant system, redundancy is given by multiplexing the hardware so that a plurality of apparatuses each have a portion which executes the same processing function, thereby enabling a process to be continued even if a certain portion fails. When such a redundant system is seen as a whole, the reliability of the original hardware can theoretically be improved by the power of the multiplexing number; that is, the probability that the whole system fails is the failure probability of a single apparatus raised to the power of the multiplexing number.
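The theoretical improvement described above can be illustrated with a minimal sketch. It assumes that all multiplexed apparatuses are identical and fail independently of one another, which the text does not state explicitly; the function name and the numerical values are hypothetical and chosen only for illustration.

```python
def system_failure_probability(unit_failure_probability: float,
                               multiplexing_number: int) -> float:
    """Probability that every multiplexed apparatus has failed.

    Assumption (not from the source): the apparatuses are identical and
    fail independently, so the individual probabilities multiply.
    """
    if not 0.0 <= unit_failure_probability <= 1.0:
        raise ValueError("unit_failure_probability must be in [0, 1]")
    if multiplexing_number < 1:
        raise ValueError("multiplexing_number must be at least 1")
    # The system as a whole stops only when all apparatuses have failed.
    return unit_failure_probability ** multiplexing_number


if __name__ == "__main__":
    # Hypothetical figure: one apparatus is down with probability 0.01.
    for n in (1, 2, 3):
        print(n, system_failure_probability(0.01, n))
```

Under these assumptions, each additional apparatus multiplies the failure probability of the whole system by the failure probability of a single apparatus. Real systems with common-cause faults would see a smaller improvement.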
When the redundant system fails, recovering the system means recovering the redundancy of the apparatus. This recovery always includes the following periods that require human intervention, so there is a limit to how far the recovery time can be reduced.
    (1) Time from the occurrence of the fault until the watcher recognizes it
    (2) Time from the recognition by the watcher until he liaises with a maintenance center
    (3) Time from the liaison with the maintenance center until a maintenance clerk arrives at the site
    (4) Time from the start of the maintenance operation until the completion of the recovery operation
At present, for example, the following two methods are used to solve the problems in the system recovery mentioned above.
    (1) Apparatus monitoring by the user
    (2) 24-hour monitoring service by a remote monitoring apparatus, an automatic notifying apparatus, and a provider
However, those methods shorten only the time for the watcher to recognize the occurrence of the fault and to liaise with the maintenance center, and the following problems still exist. That is, costs rise, mainly the personnel costs of the user himself and of the system provider; it is indispensable to secure a communication path for automatically liaising with the maintenance center; and further, the travel time necessary for the maintenance clerk to arrive depends on geographical factors such as the location of the maintenance center.
In the conventional system, the redundancy realizing method of duplexing the apparatuses and the method of automatically assembling spare apparatuses are independent of each other. For example, according to the conventional redundancy realizing method, non-stop operation of the service processes is realized even when a failure occurs, by the redundancy of the service processes by software in addition to the redundancy of the hardware (refer to JP-A-2001-523855). However, no consideration is given to automating the recovery of the redundancy after the failed apparatus is exchanged. After the exchange, manual operations are needed to write the software for providing the same service processes as those of the failed apparatus into the exchanged apparatus and to start a synchronizing process. There is thus a problem that the recovery of the redundancy is troublesome and time-consuming.
According to conventional methods of automatically assembling spare apparatuses, redundancy of hardware and automatic assembly of the spare apparatuses are realized (refer to JP-A-07-121395 and JP-A-2000-148709). However, there is a problem that redundancy of the service processes by software is not realized. If an apparatus fails, the software for realizing the service processes of the failed apparatus has to be written back, from a dedicated apparatus which collectively holds the software for services, into the spare apparatus which has automatically been assembled, so that the service processes which had been provided by the failed apparatus have to be stopped in the meantime.