The invention relates generally to the field of fault detection and localization in complex systems. More specifically, the invention is related to a method for context-aware anomaly detection in multivariate time series.
In previous U.S. Pat. Nos. 7,590,513 and 8,019,584, by inventors of this patent application, there was provided a system invariant analysis technology that discovers invariants from monitoring data of large-scale distributed systems and further uses these invariants for fault detection and isolation. Each invariant profiles a constant relationship between two monitoring metrics, and the invariant network consists of these monitoring metrics as nodes and their invariants as edges. With this approach, when a fault occurs inside a large system, many invariants will break due to the dependencies among its components. Given the set of broken invariants at a time point, the key question is how to rank the anomaly of the monitoring metrics so that system operators can follow the ranking to investigate the root cause during problem troubleshooting.
Accordingly, there is a need for a method for metric ranking in invariant networks in distributed systems.