1. Field of the Invention
The present invention relates to techniques for proactively detecting impending problems in computer systems. More specifically, the present invention relates to a method and an apparatus for quantitatively determining the severity of degradation in a signal in a computer system.
2. Related Art
Modern computer server systems are typically equipped with a significant number of sensors which monitor signals during the operation of the computer systems. Results from this monitoring process can be used to generate time series data for these signals which can subsequently be analyzed to determine how well a computer system is operating. One particularly useful application of this analysis process is for “proactive fault-monitoring,” to identify leading indicators of component or system failures before the failures actually occur.
Unfortunately, all existing proactive fault-monitoring systems have a serious limitation: they can indicate only that anomalies are present in the monitored signals, and provide no information on the degree or severity of the degradation. For example, existing proactive fault-monitoring systems can flag a component of a system as either at risk or not at risk, but cannot determine the level of the risk.
However, knowing the degree or severity of degradation in the monitored systems is of tremendous interest to service engineers. A quantitative indicator of the amount of degradation allows the service engineer to make appropriate decisions with high confidence, based on the actual health of the system. For example, if a system is scheduled for shutdown for preventive maintenance on Saturday night and a warning flag is generated on Friday afternoon, it would be extremely beneficial for the service engineer to know whether the detected degradation is of extremely low severity, so that the system can be allowed to operate safely until the scheduled outage time. On the other hand, if there is no scheduled shutdown in the near future and a warning flag is generated, the service engineer may desire to shut down the system immediately if he/she knows that the severity of the detected degradation is extremely high.
Hence, what is needed is a method and an apparatus for quantitatively determining the severity of degradation in a signal when the degradation is detected.