In a typical distributed processing system, in which a plurality of processing nodes execute tasks distributed to them via a network, a management node that detects an abnormal state of one of the processing nodes reallocates the tasks allocated to that node to a normally operating processing node. This reallocation improves the availability of the entire system.
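The reallocation described above can be illustrated with a minimal sketch. The function name `reallocate`, the dictionary layout, and the least-loaded selection rule are all assumptions made for illustration; the text only states that tasks are moved to a normal processing node and does not say how that node is chosen.

```python
def reallocate(assignments, abnormal_node, normal_nodes):
    """Return a new node-to-tasks mapping with the abnormal node's tasks moved.

    assignments:   dict mapping node name -> list of task names (not modified)
    abnormal_node: name of the node detected as abnormal
    normal_nodes:  names of nodes currently considered normal
    """
    if not normal_nodes:
        raise ValueError("no normal processing node is available")

    # Work on a copy so the caller's mapping is left untouched.
    result = {node: list(tasks) for node, tasks in assignments.items()}
    orphaned = result.pop(abnormal_node, [])

    for task in orphaned:
        # Assumed policy: pick the normal node holding the fewest tasks.
        # On a tie, min() returns the first such node in normal_nodes.
        target = min(normal_nodes, key=lambda n: len(result.setdefault(n, [])))
        result[target].append(task)
    return result


before = {"node-a": ["t1"], "node-b": ["t2", "t3"], "node-c": []}
after = reallocate(before, "node-b", ["node-a", "node-c"])
print(after)
```

In this example the two tasks held by the hypothetical `node-b` are spread over `node-a` and `node-c`, so no task is lost when `node-b` is judged abnormal.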
Well-known methods for detecting such an abnormal state of a processing node include a polling method, in which a management node queries each processing node as to whether it is operating normally and detects an abnormal state when the reply is not normal, and a heartbeat method, in which each processing node notifies a management node of its own state at predetermined constant intervals. With these methods, a processing node is likely to be erroneously detected as abnormal even though its allocated tasks are being executed normally, because replies or notifications may be delayed by network delay and/or by task processing that consumes a large amount of the processing node's resources.
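The erroneous detection described above can be reproduced with a small heartbeat monitor. This is a sketch under stated assumptions: the class `HeartbeatMonitor`, the `grace` tolerance, the node names, and the timestamps are all hypothetical, and time is passed in explicitly so that the example is deterministic.

```python
class HeartbeatMonitor:
    """Management-node side of a heartbeat method (illustrative only)."""

    def __init__(self, interval, grace):
        self.interval = interval  # expected seconds between heartbeats
        self.grace = grace        # assumed extra tolerance before flagging a node
        self.last_seen = {}       # node name -> arrival time of latest heartbeat

    def record(self, node, arrival_time):
        """Store the time at which a heartbeat from `node` arrived."""
        self.last_seen[node] = arrival_time

    def abnormal_nodes(self, now):
        """Return nodes whose latest heartbeat is older than interval + grace."""
        limit = self.interval + self.grace
        return sorted(n for n, t in self.last_seen.items() if now - t > limit)


monitor = HeartbeatMonitor(interval=1.0, grace=0.5)
monitor.record("node-a", 0.0)
monitor.record("node-b", 0.0)
monitor.record("node-a", 1.0)

# node-b is executing its tasks normally, but its second heartbeat is
# delayed (for example by network delay or heavy task processing).
flagged_early = monitor.abnormal_nodes(now=1.8)

# The delayed heartbeat finally arrives at t = 2.0.
monitor.record("node-b", 2.0)
flagged_late = monitor.abnormal_nodes(now=2.1)

print(flagged_early, flagged_late)
```

At `now=1.8` the monitor reports `node-b` as abnormal even though it is healthy, and the report disappears once the delayed heartbeat arrives. Enlarging `grace` reduces such false detections but lengthens the time needed to notice a node that has actually failed.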
To reduce the occurrence of such erroneous detection, one conceivable method is to have each processing node reserve sufficient resources for the processing that replies to polling or the processing that sends heartbeat notifications. In this case, however, the resources that can be allocated to task processing decrease by the amount reserved for such reply processing or notification processing.
In addition, as a related technology, Japanese Patent Application Laid-Open Publication No. 2001-086572 discloses a technology that confirms the operational state of each of a plurality of devices by collecting the power consumption amount of each device. If this technology is applied to the above detection of an abnormal state of a processing node, the energized state of the processing node can be detected, but it is difficult to detect whether the allocated tasks are being executed normally in the processing node.