The present invention relates to mechanisms for supporting detection of a failure event while maintaining a high degree of failure detection accuracy without unduly increasing the amount of symptom storage.
With the rapid development of computer technology, computer systems are now routinely incorporated into the backbone systems that make up social infrastructure. Keeping such infrastructure operating normally at all times incurs considerable operation costs. Autonomic computing has attracted attention as a technique for reducing these operation costs as much as possible while increasing system stability.
Autonomic computing is a generic term covering the major areas of technology used to construct a system-scale, self-managing environment, that is, an entire system that detects a problem or failure arising within it and autonomously eliminates that problem or failure. Various methods for detecting a problem or failure that arises in a system have been disclosed.
For example, a method exists for root cause identification. In this method, the part of a dependency model that relates a subject structural element to the other structural elements upon which it depends is scanned, and a condition status is determined for each of those structural elements, in order to identify the root cause of a condition of the subject structural element, including a failure. Further, a dependency management method exists for managing dependency information among the various components of a computing environment, in particular runtime dependencies.
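The scanning approach described above can be illustrated with a minimal sketch. It is not the disclosed method itself: the function name `find_root_causes`, the dictionary-based dependency model, and the example components are all hypothetical, and the sketch assumes that a root cause is a failed element none of whose direct dependencies has failed.

```python
from typing import Dict, List, Set

# Hypothetical dependency model: each element maps to the elements it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "web": ["app"],
    "app": ["db", "cache"],
    "db": ["disk"],
    "cache": [],
    "disk": [],
}

# Hypothetical condition status: the set of elements currently reporting a failure.
FAILED: Set[str] = {"web", "app", "db", "disk"}


def find_root_causes(subject: str,
                     depends_on: Dict[str, List[str]],
                     failed: Set[str]) -> List[str]:
    """Scan every element reachable from `subject` and return the failed
    elements whose direct dependencies are all healthy (assumed root causes)."""
    roots: List[str] = []
    visited: Set[str] = set()
    stack = [subject]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        deps = depends_on.get(node, [])
        # A failed element with no failed dependency is treated as a root cause.
        if node in failed and not any(d in failed for d in deps):
            roots.append(node)
        # The whole reachable part of the model is scanned, healthy or not.
        stack.extend(deps)
    return sorted(roots)


print(find_root_causes("web", DEPENDS_ON, FAILED))
```

Because every reachable element is visited regardless of its status, the cost of the scan in this sketch grows with the size of the dependency model.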
However, although the method for root cause identification can detect a root cause with a high degree of accuracy by scanning the entire dependency model from upstream to downstream, the scan itself requires significant time when the dependency model is complicated. Moreover, because the order in which the dependency model is scanned is not specified, performance and usability may be reduced.
Further, the dependency model is often constructed in the form of a logical formula with event parameters. For example, although the dependency information among components is managed, that dependency information does not include system configuration information. Therefore, even if the dependency models to be scanned are narrowed down using the logical formula, a wrong root cause may be detected, which makes it difficult to improve detection accuracy.