In modern society, computer systems have become indispensable to the infrastructure that supports daily life. Such systems are required to provide services 24 hours a day without shutting down. The database operations at the core of a bank's online systems are a good example. Such databases may be updated around the clock, and a complete shutdown cannot be tolerated.
A computer system that requires high reliability and cannot tolerate a complete shutdown is usually configured with an active computer, which executes processes, and a standby computer, which takes over those processes when a failure occurs in the active computer. A cluster program provides the procedures that cover every stage from monitoring the active computer for failures to takeover of the processes by the standby computer.
For the standby computer to take over processes when a failure occurs in the active computer, two things are mandatory: selecting which of the clustered computers will act as the standby computer, and taking over the data used by the applications or the operating system (OS) on the active computer. In addition, the failure monitoring procedures in the cluster program are structured so that takeover by the standby computer is initiated only after those procedures have been executed repeatedly. This prevents a takeover caused by a temporary failure or a false detection.
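The repeated-check behavior described above can be sketched as follows. This is a minimal illustration, not the cluster program's actual logic: the function name `should_fail_over`, the `threshold` parameter, and the choice to count consecutive failures are all assumptions made for the example.

```python
from typing import Iterable


def should_fail_over(check_results: Iterable[bool], threshold: int = 3) -> bool:
    """Return True once `threshold` consecutive health checks have failed.

    `check_results` yields True for a healthy check and False for a failed one.
    `threshold` is a hypothetical parameter, not a value taken from the source.
    """
    consecutive_failures = 0
    for healthy in check_results:
        if healthy:
            # A successful check means the earlier failure was temporary.
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= threshold:
                return True
    return False
```

With a threshold of 3, a sequence containing isolated failed checks does not trigger a takeover, whereas three failed checks in a row do.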
A method of selecting the standby computer that will take over processes in a clustered configuration is described, for example, in Japanese Patent Laid-open No. 2000-47894. That publication describes a technology in which, when a failure occurs in an active computer, a standby computer is determined based on the CPU load and available memory of each computer, and failover procedures are then executed.
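A selection of this kind might look like the sketch below. The cited publication is only summarized here, so the concrete rule is an assumption: candidates with too little free memory are excluded, and the remaining computer with the lowest CPU load is chosen, with ties broken by free memory. The names `NodeStatus`, `select_standby`, and `required_memory_mb` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class NodeStatus:
    name: str
    cpu_load: float       # fraction of CPU in use, 0.0 to 1.0
    free_memory_mb: int   # memory currently available


def select_standby(
    nodes: Sequence[NodeStatus], failed: str, required_memory_mb: int
) -> Optional[NodeStatus]:
    """Pick a standby node after `failed` has gone down (assumed rule)."""
    # Exclude the failed node and any node that cannot hold the workload.
    candidates = [
        n for n in nodes
        if n.name != failed and n.free_memory_mb >= required_memory_mb
    ]
    if not candidates:
        return None
    # Lowest CPU load wins; more free memory breaks ties.
    return min(candidates, key=lambda n: (n.cpu_load, -n.free_memory_mb))
```

Note that the selection runs only after the failure has occurred, which is the property the later discussion depends on.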
One procedure for taking over processes is for the standby computer to start the application program after a failure has occurred in the active computer. This method is called “cold standby.” In contrast to cold standby, the hot-standby method is a technology for speeding up takeover of processes. For example, Japanese Patent Laid-open No. 8-221287 describes a technology in which the standby computer prefetches the application program to be taken over before a failure occurs in the active computer, which reduces the failover time needed for the standby computer to take over processes once a failure has occurred.
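The difference between the two methods can be illustrated with a toy model. It assumes, purely for illustration, that takeover time splits into a program load phase and a start phase; the class `StandbyNode` and its parameters are hypothetical and the durations are not measurements.

```python
class StandbyNode:
    """Toy model of a standby computer (all timings are assumed values)."""

    def __init__(self, load_seconds: float, start_seconds: float) -> None:
        self.load_seconds = load_seconds    # time to read the program
        self.start_seconds = start_seconds  # time to begin serving
        self.program_loaded = False

    def prefetch(self) -> None:
        """Hot standby: read the program before any failure occurs."""
        self.program_loaded = True

    def failover_time(self) -> float:
        """Time spent after the failure until the program is running."""
        elapsed = 0.0
        if not self.program_loaded:
            # Cold standby pays the load cost during the outage.
            elapsed += self.load_seconds
            self.program_loaded = True
        return elapsed + self.start_seconds
```

A node that calls `prefetch()` in advance pays only the start cost at failover, while a cold node pays both the load and the start cost.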
According to Japanese Patent Laid-open No. 2000-47894, in a clustered computer system, the standby computer that is to take over the processes of an active computer is determined only after a failure has occurred in the active computer and execution of the failover procedures has been decided. Japanese Patent Laid-open No. 8-221287, on the other hand, speeds up takeover by having the standby computer prefetch the program to be taken over before a failure occurs in the active computer. In other words, the standby computer must read the program to be taken over before the failover procedures are executed.
Consequently, to apply the hot-standby technology in a clustered computer system and achieve high-speed failover, all computers are required to prefetch all programs executed by the respective computers, because the standby computer is not determined until a failure occurs. This consumes computer resources and causes the problem that the applications being processed by those computers run more slowly.
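The resource cost can be made concrete with a small calculation. The sketch assumes that each computer runs one program and that prefetching a program occupies memory equal to its size; the function name `prefetch_memory_mb` and the sizes used with it are hypothetical.

```python
def prefetch_memory_mb(program_sizes_mb: dict[str, int]) -> dict[str, int]:
    """Memory each computer spends holding the other computers' programs.

    `program_sizes_mb` maps a computer name to the size of the program it
    runs. Every computer prefetches every program except its own.
    """
    total = sum(program_sizes_mb.values())
    return {name: total - own for name, own in program_sizes_mb.items()}
```

Under these assumptions, a cluster of n computers holds n - 1 standby copies of every program, so the total prefetched memory is (n - 1) times the combined program size.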