1) Field of the Invention
The present invention relates to a storage device that has a function of backing up data in the event of a power failure. The present invention also relates to an information processing system, such as a parallel computing system, in which one or more processing units share the storage device.
2) Description of the Related Art
FIG. 3 is a block diagram illustrating the configuration of a parallel computing system as a typical information processing system. Referring to FIG. 3, the parallel computing system includes plural processing units (processor elements) 1 and a single shared storage device 2 shared by the processing units 1. The processing units 1 are connected to the shared storage device 2 via a shared bus 3. The shared storage device 2 has the configuration shown in FIG. 4 (to be described later) and receives write requests and read requests from each processing unit 1 via the shared bus 3.
A storage medium 6 such as a magnetic tape unit is connected to the shared bus 3 via an adaptor 4. A storage medium 7 such as a magnetic disk unit is connected to the shared bus 3 via an adaptor 5. Each processing unit 1 can access the storage medium 6 via the shared bus 3 and the adaptor 4. The shared storage device 2 can access the storage medium 7 via the shared bus 3 and the adaptor 5. The adaptor 4 controls access to the storage medium 6 according to a write request or read request received via the shared bus 3. The adaptor 5 likewise controls access to the storage medium 7 according to a write request or read request received via the shared bus 3.
A data storage area for backing up the data of the shared storage device 2, as well as data storage areas for the OS and application software, for example, are allocated on the storage medium 6. A data storage area for user data, for example, is allocated on the storage medium 7. The storage media 6 and 7 are mounted either in the cabinet housing the parallel computing system or in a separate cabinet and, in either case, are arranged separately from the shared storage device 2.
A power unit 8 is connected to each processing unit 1, the shared storage device 2, the adaptors 4 and 5, and the storage media 6 and 7 via a power supply line 9 to supply power. An auxiliary power unit 10 is provided in addition to the power unit 8. The auxiliary power unit 10 supplies auxiliary power for the backup operation to the shared storage device 2, the adaptor 4, and the storage medium 6 via the power supply line 9. In FIG. 3, the region to which the auxiliary power unit 10 supplies auxiliary power is enclosed by an alternate long and short dash line.
As shown in FIGS. 4 and 5, the shared storage device 2 shown in FIG. 3 is formed of a shared memory unit 2a; a shared bus control unit 2b connected to the shared bus 3 for controlling transmission and reception via the shared bus 3; and a memory control unit 2c that controls access to the shared memory unit 2a based on a write request or read request from each processing unit 1 received via the shared bus 3 and the shared bus control unit 2b.
As shown in FIG. 5, the adaptor 4 shown in FIG. 3 is formed of a shared bus control unit 4a connected to the shared bus 3 for controlling transmission and reception via the shared bus 3; and a storage medium control unit 4b that controls access to the storage medium 6 based on a write request or read request from each processing unit 1 or the shared storage device 2 received via the shared bus 3 and the shared bus control unit 4a. The adaptor 5 is configured similarly to the adaptor 4.
The data backup operation and restoring operation in the computing system will be explained below by referring to FIG. 5.
When a power failure occurs, the auxiliary power unit 10 supplies auxiliary power. At the same time, as shown by the thick solid arrows in FIG. 5 (corresponding to a backup path), the memory control unit 2c in the shared storage device 2 reads out the data stored in the shared memory unit 2a and sends it to the shared bus control unit 2b. The shared bus control unit 2b sends the data via the shared bus 3 to the adaptor 4, which controls access to the backup storage medium 6. In the adaptor 4, when the shared bus control unit 4a receives the data to be backed up, the storage medium control unit 4b receives the data and writes it into the storage medium 6. In this way, the data of the shared storage device 2 is backed up.
On the other hand, after recovery from the power failure, as shown by the dotted arrows in FIG. 5 (corresponding to a restoring path), the storage medium control unit 4b in the adaptor 4 reads the data out of the backup storage medium 6 and sends it to the shared bus control unit 4a. The shared bus control unit 4a sends the data to the shared storage device 2 via the shared bus 3. In the shared storage device 2, on receiving the data to be restored, the shared bus control unit 2b sends it to the memory control unit 2c. The memory control unit 2c then writes the data into the shared memory unit 2a. In this way, the data is restored to the shared storage device 2.
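The backup path and restoring path described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the hop names follow the reference numerals of FIG. 5, but the class and function names (StorageMedium, transfer, backup, restore) are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, field

# Hops a block of data passes through on the backup path of FIG. 5.
# The names mirror the reference numerals; the list itself is illustrative.
BACKUP_PATH = [
    "memory control unit 2c",
    "shared bus control unit 2b",
    "shared bus 3",
    "shared bus control unit 4a",
    "storage medium control unit 4b",
]
# The restoring path visits the same hops in the opposite order.
RESTORE_PATH = list(reversed(BACKUP_PATH))


@dataclass
class StorageMedium:
    """Hypothetical stand-in for the backup storage medium 6."""
    blocks: list = field(default_factory=list)


def transfer(block, path, trace):
    """Pass one block along every hop of the path, recording each hop."""
    for hop in path:
        trace.append(hop)
    return bytes(block)


def backup(shared_memory, medium, trace):
    """Copy every block of shared memory unit 2a to the storage medium."""
    medium.blocks = [transfer(b, BACKUP_PATH, trace) for b in shared_memory]


def restore(medium, trace):
    """Return the blocks to be written back into shared memory unit 2a."""
    return [transfer(b, RESTORE_PATH, trace) for b in medium.blocks]
```

Every block crosses all five hops in each direction, which is why the whole path must be powered during backup and operable during restoration.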
There are the following problems in the prior art described above.
(1) The storage medium 6 that backs up the data of the shared storage device 2 is arranged separately from the shared storage device 2. During the backup operation at a power failure, the auxiliary power unit 10 must continuously supply power to the entire backup path (the region enclosed by the alternate long and short dash line in FIG. 3) until the data has been completely backed up. Hence the capacity of the auxiliary power unit 10 must be made extremely large.
(2) Since the data backup operation is performed by way of the shared bus 3 and the adaptor 4, as described above with reference to FIG. 5, the data transfer path is long, which prolongs the backup time. The prolonged backup time in turn requires a larger capacity of the auxiliary power unit 10.
(3) The increased capacity of the auxiliary power unit 10 described above increases its physical size. As a result, the cabinet for the computing system becomes larger.
(4) An auxiliary power unit 10 of increased capacity takes a long time to recharge once it has been discharged for a backup operation at a power failure. For that reason, if power failures occur repeatedly at short intervals, the auxiliary power unit 10 may not be fully charged. In this case, the insufficient power supplied by the auxiliary power unit 10 may cause the data backup operation to fail.
(5) When the data in the shared memory unit 2a is to be restored at system restart, the data cannot be restored by transferring the backup data from the storage medium 6 to the shared memory unit 2a unless the entire restoring path shown in FIG. 5 is in an operable state. Hence the restoring operation can begin only after the entire restoring path has started up and become operable. This lengthens the time from recovery from the power failure until the data has been completely restored.
(6) Where a fault in any part of the backup path or restoring path shown in FIG. 5 makes that path unusable, the backup operation or restoring operation described above cannot be performed.
(7) Since a backup data storage area (backup area) must be reserved on the storage medium 6, reserving the area takes much time. Where the capacity of the shared memory unit 2a in the shared storage device 2 is increased, for example by a system expansion, an additional area must be reserved for backing up the added data.
(8) Where the backup area is reserved on the storage medium 6 in advance to allow for the increase in memory capacity due to the expansion described above, the portion of the backup area corresponding to the increase remains unused until the capacity of the shared memory unit 2a is actually increased. This means that the storage medium 6 is used wastefully.
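Problems (1), (2), and (4) are linked by simple arithmetic, which the following sketch makes explicit. It is an illustration only: the function names, the linear recharge model, and all numerical values (memory size, path bandwidth, power draws, charger power) are hypothetical assumptions and are not given in the specification.

```python
def backup_time_s(data_bytes, path_bandwidth_bytes_per_s):
    """Time to move all shared memory data along the backup path."""
    return data_bytes / path_bandwidth_bytes_per_s


def required_energy_j(powered_loads_w, backup_seconds):
    """Energy the auxiliary power unit must hold for one backup.

    Every component on the backup path stays powered for the whole time.
    """
    return sum(powered_loads_w.values()) * backup_seconds


def recharge_time_s(energy_j, charger_power_w):
    """Time to recharge from empty, assuming a constant charging power."""
    return energy_j / charger_power_w


def can_back_up(stored_energy_j, needed_energy_j):
    """True if the stored charge covers one complete backup."""
    return stored_energy_j >= needed_energy_j


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
MIB = 2 ** 20
LOADS_W = {
    "shared storage device 2": 50,
    "adaptor 4": 20,
    "storage medium 6": 30,
}
seconds = backup_time_s(512 * MIB, 16 * MIB)   # long path, low bandwidth
needed = required_energy_j(LOADS_W, seconds)   # grows with time and load
recharge = recharge_time_s(needed, 10)         # larger capacity, longer wait
# A second power failure halfway through recharging finds half the charge.
second_failure_ok = can_back_up(needed / 2, needed)
```

Under these assumed figures the required energy scales with both the number of powered components and the backup time, and a second power failure that arrives before recharging completes leaves too little charge for a full backup.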