Systems may have numerous sources of faults, ranging from equipment failures to computer hardware failures to software failures to operator errors. In complex systems, there are many dependencies between interconnected components. Mechanisms for monitoring systems may themselves be subject to failure. Because of these dependencies, the failure of one component may cause another component to indicate a fault condition and/or symptom. Cascading faults may lead to a large number of alerts, making the task of determining a root cause fault quite difficult. As referred to herein, these extra alerts are “symptoms” of the root cause fault.
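By way of illustration only, the cascading of alerts through dependencies may be sketched as follows. The component names, the `DEPENDS_ON` map, and the `alerts_for` helper are hypothetical and are not drawn from any particular system. The sketch also makes a simplifying assumption that every component raises an alert whenever something it depends on has failed.

```python
from collections import deque

# Hypothetical dependency map: each component lists what it depends on.
DEPENDS_ON = {
    "power_supply": [],
    "cooling_pump": ["power_supply"],
    "controller": ["power_supply"],
    "compressor": ["cooling_pump", "controller"],
    "monitoring_agent": ["controller"],
}


def alerts_for(root_fault, depends_on):
    """Return every component that alerts when root_fault fails.

    Simplifying assumption: a component alerts whenever any component
    it depends on (directly or indirectly) has failed.
    """
    # Invert the map so we can walk from a failed component to its dependents.
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(name)

    # Breadth-first walk over the dependents of the failed component.
    alerted = {root_fault}
    queue = deque([root_fault])
    while queue:
        failed = queue.popleft()
        for child in dependents[failed]:
            if child not in alerted:
                alerted.add(child)
                queue.append(child)
    return alerted


alerts = alerts_for("power_supply", DEPENDS_ON)
# Every alert other than the root cause fault is a symptom.
symptoms = sorted(alerts - {"power_supply"})
print(symptoms)
```

In this hypothetical example, a single root cause fault in `power_supply` produces alerts from four other components, so an operator sees five alerts of which only one identifies the root cause.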
Prior art approaches to automated root cause analysis have tried to find root causes by looking for statistical correlation between faults, assuming that a strongly correlated fault is the root cause. However, correlation may not indicate causation. Another related statistical approach is to use machine learning techniques to “recognize” different failure scenarios. However, the reliability of this approach is low unless a very large collection of labelled training sets is available, which may be expensive and/or impractical.