Computer systems have already been realized that are provided with a normal system (a working system) and a standby system, so that the functions of the computer system are maintained even in the event of a disaster. For example, EMC Corporation has developed a system that performs mirroring using a normal storage unit system and a standby storage unit system. Information concerning this system is disclosed on the Internet at the URL “http://www.emc2.co.jp/local/ja/JP/products/product_pdfs/srdf/srdf.pdf”.
Japanese Patent Kokai Publication JP-P2000-305856A discloses a system that duplicates data between a main center (normal system) and a remote center (standby system).
In general, in the normal system, a storage unit and a host computer that uses the storage unit are interconnected. The same applies to the standby system. The normal system and the standby system are interconnected over a leased line or a communication network, such as the Internet.
In the system disclosed in “http://www.emc2.co.jp/local/ja/JP/products/product_pdfs/srdf/srdf.pdf”, and in the system described in Japanese Patent Kokai Publication JP-P2000-305856A, data is transmitted directly from the normal system to the standby system. However, there are also cases where the normal system sends data to a relaying device, and the relaying device then sends the data to the standby system. In general, the relaying device sends the data received from the normal system to the standby system and, on receipt of a notice of completion of reception from the standby system, notifies the normal system that the data transfer to the standby system has been completed. Having transmitted data to the relaying device, the normal system commences its next operation only after receiving from the relaying device the notice that the data transfer has ended.
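The notification chain described above can be sketched as follows. This is a minimal illustration only: the class names `Standby`, `Relay` and `NormalSystem`, the method names and the notice strings are hypothetical and are not taken from any of the cited systems, and network communication is replaced by direct method calls.

```python
class Standby:
    """Hypothetical standby system that stores whatever it receives."""

    def __init__(self):
        self.stored = []

    def receive(self, data):
        self.stored.append(data)
        return "reception-complete"  # notice returned to the relaying device


class Relay:
    """Hypothetical relaying device placed between the two systems."""

    def __init__(self, standby):
        self.standby = standby

    def forward(self, data):
        notice = self.standby.receive(data)
        if notice != "reception-complete":
            raise RuntimeError("standby system did not confirm reception")
        # Only after the standby system has confirmed reception is the
        # normal system told that the transfer has ended.
        return "transfer-complete"


class NormalSystem:
    """Hypothetical normal system that waits for the relay's notice."""

    def __init__(self, relay):
        self.relay = relay
        self.log = []

    def write(self, data):
        self.log.append("send")
        notice = self.relay.forward(data)  # blocks until the notice arrives
        self.log.append(notice)
        self.log.append("next-operation")  # started only after the notice


standby = Standby()
normal = NormalSystem(Relay(standby))
normal.write(b"block-0")
```

In this sketch the normal system remains blocked for the full round trip through the relaying device to the standby system before it proceeds to its next operation.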
In general, when data is transmitted and received over a communication network, the data is divided and handled in data transmission units. The data transmission unit differs from one communication protocol to another. A data transmission unit, such as a packet, contains not only the data being transmitted but also an error detection code for detecting data errors (garbled data) produced in the course of transmission. Examples of the error detection code include checksum data and CRC (cyclic redundancy check) data. If a device that has received a packet detects a data error by means of the error detection code, the device discards the packet.
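The detect-and-discard behaviour can be sketched as follows, assuming a hypothetical packet layout in which a 4-byte CRC-32 trailer follows the payload. The function names `make_packet` and `receive_packet` are illustrative and do not correspond to any particular protocol; the CRC is computed with `zlib.crc32` from the Python standard library.

```python
import zlib


def make_packet(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload (hypothetical packet layout)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def receive_packet(packet: bytes):
    """Return the payload, or None if the packet must be discarded."""
    payload, trailer = packet[:-4], packet[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != trailer:
        return None  # error detected: the receiving device discards the packet
    return payload


packet = make_packet(b"hello")
# Flip one bit of the first byte to simulate garbling in transit.
garbled = bytes([packet[0] ^ 0x01]) + packet[1:]
```

Here `receive_packet(packet)` returns the original payload, whereas `receive_packet(garbled)` returns `None`, because the CRC detects the error but cannot correct it.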
Even if no data error has occurred, packets are discarded on the communication network when congestion (a state of high communication load) arises. If a packet is discarded and no response is obtained from the destination, the source transmits the packet again. If congestion has occurred on the communication network and packet discarding has commenced, re-transmission from the source may further increase the communication load, or the rate of network utilization may be lowered. In order to avoid this problem, a scheme has come into use in which a device forming part of the communication network discards packets at random once the communication load has exceeded a preset threshold value. This scheme is termed RED (Random Early Detection).
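A simplified sketch of the RED drop decision is given below. It assumes the commonly described form of RED, in which the drop probability rises linearly between a minimum and a maximum threshold on the average queue length. The parameter names `min_th`, `max_th` and `max_p` and their default values are illustrative assumptions, and the moving-average calculation of the queue length used by full RED implementations is omitted.

```python
import random


def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Drop probability for a given average queue length (simplified RED)."""
    if avg_queue < min_th:
        return 0.0  # load below the threshold: never discard
    if avg_queue >= max_th:
        return 1.0  # load far above the threshold: always discard
    # Between the thresholds the probability rises linearly up to max_p.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


def should_drop(avg_queue, min_th=5.0, max_th=15.0, max_p=0.1):
    """Randomly decide whether an arriving packet is discarded."""
    p = red_drop_probability(avg_queue, min_th, max_th, max_p)
    return random.random() < p
```

Because packets are discarded at random before the queue is full, individual sources are caused to slow down at different times, which is intended to avoid the burst of simultaneous re-transmissions described above.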
A data transfer system has also been proposed in which the device that has received data is able to correct data errors. For example, Japanese Patent Kokai Publication JP-A-57-138237 discloses a data transfer system in which the data to be transferred and parity bits generated from the data are transmitted separately, and the device that has received both the data and the parity bits executes error correction.
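The idea of correcting errors at the receiving side from separately transmitted parity bits can be illustrated as follows. The coding scheme actually used in the cited publication is not described here, so the sketch substitutes a Hamming(7,4) code as a stand-in: four data bits and three parity bits are handled as two separate lists, and the receiver corrects any single flipped data bit. The function names are hypothetical.

```python
def parity_bits(data):
    """Compute three Hamming(7,4) parity bits from four data bits."""
    d1, d2, d3, d4 = data
    return [d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d2 ^ d3 ^ d4]


def correct(data, parity):
    """Return the data bits with any single-bit data error corrected."""
    d1, d2, d3, d4 = data
    p1, p2, p3 = parity
    s1 = p1 ^ d1 ^ d2 ^ d4
    s2 = p2 ^ d1 ^ d3 ^ d4
    s3 = p3 ^ d2 ^ d3 ^ d4
    # The syndrome is the codeword position (1..7) of the flipped bit.
    position = s1 + 2 * s2 + 4 * s3
    # Codeword positions 3, 5, 6 and 7 hold the data bits d1..d4;
    # positions 1, 2 and 4 hold parity bits, so the data is left as is.
    position_to_index = {3: 0, 5: 1, 6: 2, 7: 3}
    fixed = list(data)
    if position in position_to_index:
        fixed[position_to_index[position]] ^= 1
    return fixed


sent_data = [1, 0, 1, 1]
sent_parity = parity_bits(sent_data)   # transmitted separately from the data
received = [1, 1, 1, 1]                # second data bit garbled in transit
restored = correct(received, sent_parity)
```

Unlike an error detection code, which only permits the packet to be discarded, this kind of code lets the receiving device recover the original data without re-transmission, provided that no more than one bit has been garbled.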
In the following, mirroring and backup are distinguished from each other as follows. “Mirroring” is defined as writing the same data to two or more storage units, one of which is the storage unit to which a host has output a write command, with the output of the write command serving as the trigger for writing the same data to the two or more storage units. Replication is also treated as a kind of mirroring. On the other hand, “backup” is defined as writing the contents of a given storage unit to another storage unit at an arbitrary timing; that is, a write command for the given storage unit is not used as the trigger for the backup.
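The distinction between the two definitions can be sketched as follows. The names `StorageUnit`, `MirroredStorage` and `backup` are hypothetical, and the storage units are modelled as in-memory dictionaries purely for illustration.

```python
class StorageUnit:
    """Hypothetical storage unit modelled as an address-to-data mapping."""

    def __init__(self):
        self.blocks = {}

    def write(self, address, data):
        self.blocks[address] = data


class MirroredStorage:
    """Mirroring: the host's write command itself triggers both writes."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def write(self, address, data):
        self.primary.write(address, data)
        self.secondary.write(address, data)


def backup(source, target):
    """Backup: copy the contents at an arbitrary time, not per write."""
    target.blocks = dict(source.blocks)


primary, mirror, archive = StorageUnit(), StorageUnit(), StorageUnit()
storage = MirroredStorage(primary, mirror)
storage.write(0, b"a")      # mirror is updated by this write command
backup(primary, archive)    # archive is updated only when backup() is called
storage.write(1, b"b")      # mirror follows; archive does not
```

After these calls the mirror holds both blocks, while the archive holds only the block that existed when the backup was taken.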
In computer systems, a technique has also been implemented in which a program being executed is suspended at an arbitrary point in time and restarted later. For example, in the supercomputers manufactured and sold by the present assignee under the name SX series, the state of execution of a process at an arbitrary point in time, such as the state of the memories and registers, is saved, after which the program is suspended and restarted later. This function is sometimes termed a checkpoint restart function.
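The principle of saving the state of execution and resuming from it can be sketched as follows. This is not the mechanism of the SX series, which saves the state of memories and registers at the system level; the sketch assumes a hypothetical computation whose whole state fits in one dictionary and uses `pickle` from the Python standard library to serialize it.

```python
import pickle


def run(state, steps):
    """Advance a hypothetical computation: accumulate successive integers."""
    for _ in range(steps):
        state["total"] += state["i"]
        state["i"] += 1
    return state


def checkpoint(state) -> bytes:
    """Save the state of execution so that the program can be suspended."""
    return pickle.dumps(state)


def restart(saved: bytes):
    """Restore the saved state so that execution can continue from it."""
    return pickle.loads(saved)


# Uninterrupted run of ten steps.
full = run({"i": 0, "total": 0}, 10)

# Same work, suspended after four steps and restarted later.
saved = checkpoint(run({"i": 0, "total": 0}, 4))
resumed = run(restart(saved), 6)
```

The resumed run reaches the same final state as the uninterrupted run, which is the property a checkpoint restart function is expected to provide.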