1. Field of the Invention
The present invention relates to a multi-cluster system including a plurality of computers, that is, clusters, and more specifically to a multi-cluster system capable of continuing operations, when one or more clusters forming part of a system become inoperable for any reason, by another cluster in the system replacing the inoperable cluster(s).
2. Description of the Related Art
A multi-cluster system includes a plurality of computers, that is, clusters, which process a database or the like using, for example, a shared disk device.
In contrast, some computer systems, for example fault-tolerant systems, realize 24-hour operation by duplexing or triplexing CPUs (central processing units) or input/output devices so that the system never stops. In the multi-cluster system according to the present invention, when some clusters become inoperable for any reason, the processes being performed by those clusters are taken over by other clusters in the system. The processes of the inoperable clusters may therefore stop for a short time, but the computer system as a whole continues to operate.
If a cluster becomes faulty while such a multi-cluster system is performing a process using a shared external storage device, passing that process to another cluster requires passing detailed information, including the address in the external storage device that the faulty cluster was using. The problem is that the process being performed by a faulty cluster can therefore be inherited only through complicated steps.
When the shared external storage device is formed by, for example, a plurality of hard disk devices, detailed information specifically identifying the hard disk device, partition, portion, etc. that contains the exact data storage position must be passed on before the process being performed by the faulty cluster can be inherited.
The present invention aims at enabling a process being executed by a faulty cluster to be inherited by another cluster. The plurality of computers forming a multi-cluster system make an identification symbol in the network, for example, the MAC (Media Access Control) address in a local area network, correspond one to one to each cluster and to each data storage area in an external storage device. The inheriting cluster can then take over the process by receiving a notification of the MAC address indicating the data storage area.
To solve the above described problems, the present invention includes, in an apparatus having a storage device, a storage area request unit for notifying the storage device side of the requested size of a storage area in the storage device and the identification information of the apparatus, for example, the MAC address, and a storage device access unit for accessing an area assigned to the apparatus in the storage device and specified by the identification information of the area.
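The relationship between the storage area request unit and the storage device access unit can be illustrated by the following minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the class names `StorageDevice` and `Apparatus`, the method names, and the in-memory `bytearray` areas are all hypothetical, and the MAC address used in the usage note is taken from a range reserved for documentation.

```python
from dataclasses import dataclass, field


@dataclass
class StorageDevice:
    """Hypothetical storage device side: one area per apparatus, keyed by MAC address."""
    capacity: int
    areas: dict = field(default_factory=dict)  # MAC address -> bytearray

    def assign_area(self, mac: str, size: int) -> None:
        """Assign an area of the requested size to the apparatus identified by mac."""
        used = sum(len(area) for area in self.areas.values())
        if mac in self.areas:
            raise ValueError(f"an area is already assigned to {mac}")
        if size <= 0 or used + size > self.capacity:
            raise ValueError("requested size cannot be satisfied")
        self.areas[mac] = bytearray(size)

    def write(self, mac: str, offset: int, data: bytes) -> None:
        area = self.areas[mac]
        if offset < 0 or offset + len(data) > len(area):
            raise IndexError("write outside the assigned area")
        area[offset:offset + len(data)] = data

    def read(self, mac: str, offset: int, length: int) -> bytes:
        area = self.areas[mac]
        if offset < 0 or length < 0 or offset + length > len(area):
            raise IndexError("read outside the assigned area")
        return bytes(area[offset:offset + length])


@dataclass
class Apparatus:
    """Hypothetical apparatus combining the request unit and the access unit."""
    mac: str
    storage: StorageDevice

    def request_storage_area(self, size: int) -> None:
        # Storage area request unit: notify the requested size and own identification.
        self.storage.assign_area(self.mac, size)

    def write(self, offset: int, data: bytes) -> None:
        # Storage device access unit: the area is selected by the identification only.
        self.storage.write(self.mac, offset, data)

    def read(self, offset: int, length: int) -> bytes:
        return self.storage.read(self.mac, offset, length)
```

In this sketch, an apparatus created as `Apparatus("00:00:5e:00:53:01", StorageDevice(capacity=1024))` would call `request_storage_area(64)` once and thereafter read and write without handling any disk, partition, or address information itself.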
In another aspect of the present invention, the apparatus is used as a computer which forms a part of a multi-cluster system having a plurality of such apparatuses.
The apparatus which forms a part of the system notifies the storage device side of the requested size of a storage area in the storage device and the identification information of the apparatus. The apparatus accesses the area assigned to the apparatus in the storage device using the identification information of the apparatus. When any apparatus among the plurality of apparatuses becomes inoperable, the storage device access unit of another apparatus which inherits the process being performed by the inoperable apparatus uses the identification symbol of the inoperable apparatus to access the area assigned to the inoperable apparatus, according to the identification symbol of that area. Then, the process can be easily inherited by reading the final status of the job, etc. stored in the area assigned to the inoperable apparatus.
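The inheriting step described above can be sketched as follows. This is a hedged illustration only: the names `SharedStorage`, `ClusterNode`, `checkpoint`, and `take_over`, the JSON encoding of the job status, and the example MAC addresses are assumptions introduced for the example and do not appear in the specification.

```python
import json


class SharedStorage:
    """Hypothetical storage device: one area per apparatus, keyed by MAC address."""

    def __init__(self):
        self._areas = {}

    def assign(self, mac: str) -> None:
        # Create an empty area for the apparatus if it does not have one yet.
        self._areas.setdefault(mac, b"")

    def write(self, mac: str, payload: bytes) -> None:
        if mac not in self._areas:
            raise KeyError(f"no area assigned to {mac}")
        self._areas[mac] = payload

    def read(self, mac: str) -> bytes:
        return self._areas[mac]


class ClusterNode:
    """Hypothetical apparatus forming part of the multi-cluster system."""

    def __init__(self, mac: str, storage: SharedStorage):
        self.mac = mac
        self.storage = storage
        self.inherited = {}  # MAC of failed apparatus -> recovered job status
        storage.assign(mac)

    def checkpoint(self, job_status: dict) -> None:
        # Store the latest job status in the area assigned to this apparatus.
        self.storage.write(self.mac, json.dumps(job_status).encode("utf-8"))

    def take_over(self, failed_mac: str) -> dict:
        # Only the MAC address of the inoperable apparatus is needed to find its
        # area; no disk, partition, or address details are passed between nodes.
        status = json.loads(self.storage.read(failed_mac).decode("utf-8"))
        self.inherited[failed_mac] = status
        return status
```

Under these assumptions, the only information passed to the inheriting apparatus is the identification symbol of the inoperable apparatus; the final status of the job is then read from the corresponding area.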