Anomaly detection involves systems and processes for identifying behavior that does not conform to expectations. On enterprise and cloud computing platforms, for instance, anomaly detection may provide warnings if unusual behavior is exhibited by metric data collected from system hardware and software resources. If left unaddressed, anomalous behavior may compromise system security and performance. Anomaly detection systems attempt to mitigate such performance degradation by detecting and treating anomalies as efficiently as possible.
Anomaly detection is complicated by the significant variance in behavior from one system to the next. For example, resource usage in one datacenter environment may follow a different pattern of highs and lows than resource usage in another datacenter environment. As a result, behavior that is anomalous in one computing environment may not be anomalous in another environment.
Threshold-based alerting is an example approach to anomaly detection. According to this approach, a user defines the acceptable range of values, and an alarm is triggered if a monitored value falls outside the user-defined range. The user may define the thresholds based on specific domain knowledge of the system such that the thresholds are tailored for specific behavior exhibited by the system. This approach thus allows the user to inject domain knowledge into the system to supervise the anomaly detection process. However, selecting the appropriate thresholds on which to trigger alerts may be complicated by the large number of anomalous events that may occur in large-scale systems. Adding to the complexity, system resources may exhibit trends, seasonal fluctuations, and other behaviors that evolve over time. A user may be unaware of and/or unable to keep up with normal behavioral patterns. As a result, users may be prone to selecting sub-optimal thresholds, which may result in false positive alerts that flag normal system behavior and/or in missed alerts for other system behavior that is anomalous.
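By way of illustration only, threshold-based alerting of the kind described above may be sketched as follows. The sketch is a minimal example under stated assumptions and does not represent any particular system: the names ThresholdRule and find_alerts are hypothetical, and the sample values and the 0 to 80 range are invented for the illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ThresholdRule:
    """User-defined acceptable range for one monitored metric (hypothetical type)."""
    lower: float
    upper: float

    def is_anomalous(self, value: float) -> bool:
        # A value is flagged only when it falls outside the inclusive range.
        return value < self.lower or value > self.upper


def find_alerts(values: Sequence[float], rule: ThresholdRule) -> List[int]:
    """Return the indices of the samples that fall outside the rule's range."""
    return [i for i, v in enumerate(values) if rule.is_anomalous(v)]


# Invented CPU utilization samples (percent): low off-peak usage, a routine
# daily peak at indices 4-6, and an unusual off-peak spike at index 9.
cpu_percent = [20, 22, 21, 60, 85, 88, 84, 30, 21, 65, 22]

# Assumed static range chosen by a user.
rule = ThresholdRule(lower=0.0, upper=80.0)

alerts = find_alerts(cpu_percent, rule)
print(alerts)  # [4, 5, 6]
```

In this invented data, the static range flags the routine daily peak at indices 4 through 6, while the off-peak value of 65 at index 9 stays inside the range and raises no alert, even though it is roughly three times the usual off-peak level. A single fixed range may therefore produce both kinds of error described above when the monitored behavior varies over time.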
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.