In a high performance computing (HPC) system that executes recent advanced scientific computing, the demand for calculation processing performance of the whole system has increased year by year, so that a larger number of calculation servers than in a technology in a related art is managed and operated in parallel.
In an HPC system that includes such a large number of servers, it is desirable that the halting time of the system is reduced and the running time is increased. Thus, a system with high availability is employed in which a main server such as a file server has a redundant structure, and, when an abnormality occurs, switching from an operation system to a standby system is performed so that operation is continued. The switching from the operation system to the standby system is called failover.
On the other hand, in order to increase the performance of the HPC system, the number of calculation servers that execute calculation processing is also increased proportionally, so that reduction of the communication load on the network within the system is desired. An HPC system may include, for example, 80000 calculation servers.
Therefore, in the related art, system monitoring is performed using a layer structure in order to monitor the state of a server group that includes a calculation server and a file server of a large-scale HPC system.
For example, as illustrated in FIG. 1, a monitoring master server that monitors the whole system is provided on the top layer of a layer structure such as a tree structure, a plurality of monitoring sub-master servers that are management repeaters are provided on the second layer, and a plurality of monitored servers, that is, file servers and calculation servers in the example of FIG. 1, are provided on the lowest layer. That is, the monitoring master server monitors the plurality of monitoring sub-master servers, and each of the monitoring sub-master servers monitors the calculation servers and the file servers that are monitored servers under its control. In the example of FIG. 1, a file server A and a file server B form a failover pair.
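The layer structure described above can be sketched, purely for illustration, as a small tree in Python. The class name `Node`, the helper `path_to_root`, the role labels, and the placement of the file server A and the file server B under different monitoring sub-master servers are assumptions made for this sketch and are not taken from FIG. 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """One server in the monitoring tree (hypothetical representation)."""
    name: str
    role: str  # "master", "sub-master", or "monitored" (labels are assumptions)
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach a child one layer below this node and return the child."""
        child.parent = self
        self.children.append(child)
        return child


def path_to_root(node: Optional[Node]) -> List[str]:
    """Names of the servers a notification passes on its way to the top layer."""
    names = []
    while node is not None:
        names.append(node.name)
        node = node.parent
    return names


# Three layers: master, sub-masters, monitored servers.
master = Node("monitoring master", "master")
sub1 = master.add(Node("sub-master 1", "sub-master"))
sub2 = master.add(Node("sub-master 2", "sub-master"))
file_a = sub1.add(Node("file server A", "monitored"))
calc_1 = sub1.add(Node("calculation server 1", "monitored"))
file_b = sub2.add(Node("file server B", "monitored"))  # assumed placement

print(path_to_root(file_a))
```

In such a tree, a notification from the file server A reaches the file server B only by travelling up to the monitoring master server and down again, which is why the hold time at each layer accumulates.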
In the example of FIG. 1, each of the calculation servers and the file servers that are monitored servers includes a service monitoring daemon that monitors a service in the server, for example, a service for job operation, at certain intervals, for example, at 60-second intervals. For example, when an abnormality occurs in the file server A, the file server A transmits, at the next monitoring timing, a state change notification that notifies the monitoring sub-master server that the state of the file server A has changed to a down state due to the occurrence of the abnormality (FIG. 2: 1000). The monitoring sub-master server does not immediately transfer the state change notification to the monitoring master server, but holds the state change notification for a certain time period, for example, for 30 seconds (FIG. 2: 1010). Holding the state change notification for the certain time period is called “cache”, and such a cache is also called a “state change notification cache”.
The state change notification cache is a technology that reduces the network load by caching the state change notification for the certain time period. The network load arises because packets for the state change notification are transmitted and received between a server in an upper layer and a server in a lower layer of the layer structure, such as the tree structure, of the large-scale HPC system, in particular when a large number of servers in the system are started up at the same time or shut down at the same time.
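A minimal sketch of such a state change notification cache is given below, for illustration only. The class name `StateChangeCache` and its methods are hypothetical. The sketch also assumes that the hold period starts when the first notification arrives and that all held notifications are forwarded together when the period elapses; the description above does not specify these details.

```python
class StateChangeCache:
    """Holds notifications for hold_seconds, then releases them together."""

    def __init__(self, hold_seconds: float = 30.0):
        self.hold_seconds = hold_seconds
        self._pending = []     # notifications waiting to be forwarded
        self._deadline = None  # time at which the current batch is released

    def receive(self, notification, now: float) -> None:
        """Store a notification; the first one of a batch starts the hold period."""
        if self._deadline is None:
            self._deadline = now + self.hold_seconds
        self._pending.append(notification)

    def flush_if_due(self, now: float) -> list:
        """Return the held notifications once the hold period has elapsed."""
        if self._deadline is None or now < self._deadline:
            return []
        batch, self._pending = self._pending, []
        self._deadline = None
        return batch


# Times are passed in explicitly (in seconds) so the behaviour is deterministic.
cache = StateChangeCache(hold_seconds=30)
cache.receive(("file server A", "down"), now=60)
cache.receive(("calculation server 1", "down"), now=70)
early = cache.flush_if_due(now=80)  # hold period not yet elapsed
due = cache.flush_if_due(now=90)    # both notifications released together
```

Under these assumptions, two notifications that arrive within the same 30-second window are forwarded to the upper layer in one batch rather than as two separate transmissions, at the cost of delaying the first one by the full hold period.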
After the certain time period elapses, the monitoring sub-master server transmits the cached state change notification, which is used to notify the other servers of the state change of the file server A, to the monitoring master server (FIG. 2: 1020). The monitoring master server does not immediately execute processing even after receiving the state change notification, and caches the state change notification for a certain time period, for example, for 30 seconds (FIG. 2: 1030). After the certain time period elapses, the monitoring master server transmits the state change notification to the two monitoring sub-master servers (FIG. 2: 1040).
Each monitoring sub-master server likewise does not immediately execute processing even after receiving the state change notification, and caches the state change notification for a certain time period, for example, for 30 seconds (FIG. 2: 1050). After the certain time period elapses, the monitoring sub-master server transmits the state change notification to the monitored servers other than the file server A (FIG. 2: 1060 and 1070).
In the example of FIG. 2, the file server B that is paired with the file server A as the failover pair receives the state change notification and starts failover at the point of 150 seconds. However, the file server B detects that its own state has changed to “failover” only at the next monitoring timing, that is, after 30 seconds. After that, the same time is taken again to propagate the state change notification of “failover”, and then to propagate the state change notification from “failover” to “double” (service biased state). That is, it takes about 390 seconds to complete switching of the file server.
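The figures of 150 seconds and about 390 seconds can be reproduced with a short calculation, sketched below. The decomposition is one reading that is consistent with the numbers above and is not stated explicitly in this description: it assumes that the first notification leaves the file server A at 60 seconds, that each of the three cache stages (monitoring sub-master server, monitoring master server, monitoring sub-master server) holds the notification for 30 seconds, that network transfer time is negligible, and that the change to “double” is also detected after 30 seconds. All constant and function names are hypothetical.

```python
MONITOR_INTERVAL = 60  # seconds until file server A sends "down" (assumed)
DETECT_DELAY = 30      # seconds until file server B's daemon sees its new state
CACHE_HOLD = 30        # seconds each cache stage holds a notification
CACHE_STAGES = 3       # sub-master -> master -> sub-master


def propagation_time(hold: int = CACHE_HOLD, stages: int = CACHE_STAGES) -> int:
    """Time for one notification to cross all cache stages."""
    return hold * stages


def switching_timeline() -> dict:
    """Cumulative times (seconds) of the three phases under the stated assumptions."""
    down_received = MONITOR_INTERVAL + propagation_time()
    failover_propagated = down_received + DETECT_DELAY + propagation_time()
    double_propagated = failover_propagated + DETECT_DELAY + propagation_time()
    return {
        "failover starts": down_received,
        "failover propagated": failover_propagated,
        "double propagated": double_propagated,
    }


timeline = switching_timeline()
for phase, seconds in timeline.items():
    print(f"{phase}: {seconds} s")
```

Under these assumptions, 270 of the 390 seconds are spent in the cache stages and the remainder in waiting for monitoring timings, which illustrates why both factors are identified as causes of the long switching time.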
As described above, the switching processing of the server takes a long time due to the monitoring time interval in each of the monitored servers and the state change notification cache, so that, as a result, the operation halting time is increased even though the state change is an important one.
There is a technology in which the time interval at which a monitoring device monitors a monitored device is dynamically changed, with the monitoring device instructing the monitored device to change the monitoring time interval; however, a large management load is applied to the monitoring device.
The technologies in the related art are discussed in Japanese Laid-open Patent Publication No. 61-221542 and Japanese Laid-open Patent Publication No. 9-83641.