The present invention relates to a computer system and to a method for detecting a sign of a failure of the computer system. In particular, the present invention relates to a computer system, and a detecting method therefor, capable of detecting a sign of failure of an application (AP), an operating system (OS), and hardware (HW) in its own system.
In general, an AP or an OS in a computer system sometimes fails and stops for various reasons such as a defect included in the AP or OS itself or a failure in a device used by the OS.
Where a function provided by an AP must continue even when a failure as described above has occurred, a known conventional technique for detecting a failure in the AP is a technique called heart beat, in which a watchdog timer is used to monitor the time required until processing is finished and to decide whether exchange of communication data has been completed within a predetermined time. As another conventional technique, a technique of monitoring a log issued periodically by a system and thereby detecting occurrence of a failure is known. As a conventional technique concerning the heart beat now in use in typical HA clusters, for example, the technique disclosed at http://www.atmarkit.co.jp/flinux/rensai/cluster02/cluster02.html is known.
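By way of illustration only, the conventional heart beat decision described above may be sketched as follows. This is a minimal sketch, not the implementation of any particular HA cluster product. The names `HeartbeatWatchdog`, `FakeClock`, `timeout_s`, `beat` and `expired` are hypothetical and are introduced here solely for explanation. The sketch assumes that the monitored party calls `beat` each time an exchange of communication data completes, and that the monitoring party calls `expired` to decide whether the predetermined time has elapsed without such a completion.

```python
import time


class HeartbeatWatchdog:
    """Hypothetical watchdog timer: reports a failure when no heart beat
    has been received within a predetermined time."""

    def __init__(self, timeout_s, clock=time.monotonic):
        # timeout_s is the predetermined time (assumed to be in seconds).
        self.timeout_s = timeout_s
        self._clock = clock
        # The timer starts counting from construction.
        self._last_beat = clock()

    def beat(self):
        """Called by the monitored party when an exchange completes;
        restarts the watchdog timer."""
        self._last_beat = self._clock()

    def expired(self):
        """Return True if more than the predetermined time has elapsed
        since the last heart beat."""
        return self._clock() - self._last_beat > self.timeout_s


class FakeClock:
    """Hypothetical manually advanced clock, used so that the sketch can
    be exercised without actually waiting."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds
```

In this sketch, a heart beat received before the predetermined time elapses restarts the timer, and `expired` returns True only after the timer has run past `timeout_s` without a further heart beat. It should be noted that such a decision reports a failure only after the failure has occurred.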