The present invention relates to a multi-computer system constituted by a plurality of computers and, more particularly, to a multi-computer system suitable for defining the system configuration and control contents, and performing an operation in accordance with the defined contents.
Conventionally, the definitions of the configuration and control contents of a multi-computer system constituted by a plurality of computers are generally described as an executable script (program), written by the programmer either directly or via a simple macro for each computer. In detail, this executable script contains a command that only registers the connection relationship between the respective computers, a command that extracts the state of each computer, and a command string to be executed as an exceptional procedure upon detecting a fault in a CPU, a disk, a network, or the like. The exceptional procedure includes handoff of a server process to another computer, starting of a standby process, take-over of data, a file, or a network address by another computer, and the operation of a specific I/O device for an alarm output or the like.
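For illustration only, the three kinds of script content named above (connection registration, state extraction, and an exceptional procedure) can be modeled as in the following minimal Python sketch. The function names, host names, and command strings are all hypothetical and are not taken from any actual conventional system; the sketch only shows the kind of per-computer description the programmer has to write by hand.

```python
# Hypothetical model of one computer's hand-written control script.
# Every name and command string here is an assumption for illustration.

connections = set()  # registered connection relationships


def register_connection(a, b):
    """Registration-only command: record that computers a and b are connected."""
    connections.add(frozenset((a, b)))


def extract_state(status_table, computer):
    """Extraction command: return the recorded state of one computer."""
    return status_table.get(computer, "unknown")


def on_fault(component, failed, standby):
    """Exceptional procedure: command string to run when a fault is detected."""
    steps = {
        "cpu": [
            f"handoff server_process {failed} -> {standby}",
            f"start standby_process on {standby}",
        ],
        "disk": [f"takeover data {failed} -> {standby}"],
        "network": [
            f"takeover network_address {failed} -> {standby}",
            "alarm output",
        ],
    }
    # A fault type that the programmer forgot to describe yields no procedure.
    return steps.get(component, [])


register_connection("host_a", "host_b")
state = extract_state({"host_a": "running"}, "host_b")
disk_steps = on_fault("disk", "host_a", "host_b")
```

In this sketch an undescribed fault type silently produces an empty command string, which mirrors how an omission in a hand-written script can go unnoticed.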
As described above, the definitions of the configuration and control contents of the multi-computer system are conventionally described as an executable script by the programmer directly or via a simple macro for each computer.
The configuration and control contents, however, must be defined in consideration of all conceivable situations, and the scripts are therefore generally difficult to write.
Description errors in the scripts, such as inconsistencies between the respective computers, occur frequently.
If an exceptional procedure contains an omission or a mistake (semantic inconsistency), it is difficult to find. It is highly probable that such an error or omission remains uncorrected until the system is put into operation.
As the number of computers constituting the multi-computer system increases (to, e.g., three or more), the number of assumed fault patterns increases, so an omission easily occurs in describing a script for the procedure upon occurrence of a fault. This may lead to the worst case in which, upon occurrence of a fault, the correct process for circumventing the fault is not executed and the system stops abnormally.
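The growth in fault patterns, and the ease with which one is left undescribed, can be made concrete with a small Python sketch. It assumes, purely for illustration, that each fault pattern is a single (computer, component) pair drawn from the three fault sources mentioned above; the host names and handler descriptions are hypothetical, and combinations of simultaneous faults, which would enlarge the count further, are not modeled.

```python
from itertools import product

# Fault sources named in the text; treating each as one pattern per
# computer is a simplifying assumption of this sketch.
COMPONENTS = ("cpu", "disk", "network")


def single_fault_patterns(computers):
    """Every (computer, component) pair that a script would need to cover."""
    return set(product(computers, COMPONENTS))


def uncovered_patterns(computers, handlers):
    """Fault patterns for which no exceptional procedure was described."""
    return sorted(single_fault_patterns(computers) - set(handlers))


computers = ["host_a", "host_b", "host_c"]  # hypothetical three-computer system

# Hypothetical hand-written procedures; several patterns are missing.
handlers = {
    ("host_a", "cpu"): "hand off server process to host_b",
    ("host_a", "disk"): "take over data on host_b",
    ("host_b", "cpu"): "hand off server process to host_c",
}

missing = uncovered_patterns(computers, handlers)
print(len(single_fault_patterns(computers)), "patterns,", len(missing), "uncovered")
```

Under these assumptions the pattern count grows linearly with the number of computers, and a mechanical comparison of this kind exposes the omissions that a manual review of per-computer scripts tends to overlook.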