Embodiments of the invention are directed to techniques which may be used as part of a predictive modeling analysis. More specifically, embodiments of the invention provide methods and systems for evaluating performance metrics of a computing system using a multiple modeling paradigm.
In large-scale computing deployments, one common resiliency problem is what is referred to as a “soft failure,” where a computing system does not crash, but simply stops working correctly or slows down to the point of being effectively non-functional. Predictive analysis is a technique used to identify when a current set of sampled metrics for a computing system indicates that a future event is likely to occur (e.g., to predict when a soft failure is likely to occur). Predictive analysis tools rely on historical data to derive a model of expected system behavior.
An important aspect of such tools is the capability to avoid false positives. A false positive occurs when the predictive analysis tool detects a problem and warns a user, but the behavior is actually normal system behavior. False positives can significantly reduce a user's confidence in the predictive analysis tool. In large computer systems, many tasks or jobs may be running whose behavior is “spiky,” meaning the activity rate may vary drastically depending on workload, time of day, day of week, etc. Predictive analysis tools analyze historical data collected on a system and use machine learning algorithms to identify abnormal behavior on the system. However, regular periodic processing (weekly, bi-weekly, monthly, etc.) can cause normal spikes in activity that may be erroneously identified as abnormal behavior by the predictive analysis tools. Jobs or processes which exhibit “spiky” behavior tend to generate false positives, because the spikes tend to exceed consumption thresholds set using average consumption rates. Further, the timing of a spike may not follow a pattern that is detectable by pattern recognition algorithms, because of the varying number of days in a month, weekends, holidays, etc.
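The effect of an average-based threshold on a spiky job can be illustrated with a minimal sketch. The sketch is not drawn from any particular predictive analysis tool: the function names, the 1.5x multiplier, and the consumption figures are all hypothetical and were chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def average_based_threshold(history, factor=1.5):
    """Hypothetical rule: the threshold is `factor` times the historical average."""
    return factor * mean(history)


def flag_abnormal(samples, threshold):
    """Return the indices of samples that exceed the threshold."""
    return [i for i, value in enumerate(samples) if value > threshold]


# Four weeks of invented daily consumption: a steady baseline of 10 units,
# with a scheduled weekly batch job raising the last day of each week to 50.
history = [50.0 if day % 7 == 6 else 10.0 for day in range(28)]

threshold = average_based_threshold(history)  # 1.5 * (440 / 28), about 23.6
flagged = flag_abnormal(history, threshold)

print(flagged)  # [6, 13, 20, 27]
```

Under these assumptions, every flagged day is one of the scheduled weekly spikes, so each warning would be a false positive: the spikes are normal behavior, but they pull the average up only slightly and therefore always exceed a threshold derived from it.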