1. Field of the Invention
The present invention relates to techniques for proactively detecting impending problems in computer systems. More specifically, the present invention relates to a method and an apparatus for removing quantization effects in a quantized signal which can be subsequently used to detect impending problems in a computer system.
2. Related Art
Modern server computer systems are typically equipped with a significant number of sensors which monitor signals during the operation of the computer systems. Results from this monitoring process can be used to generate time series data for these signals which can subsequently be analyzed to determine how a computer system is operating. One particularly desirable application of this time series data is for purposes of “proactive fault monitoring” to identify leading indicators of component or system failures before the failures actually occur.
Unfortunately, many of these computer systems use low-resolution eight-bit analog-to-digital (A/D) converters in all of their physical sensors to sample the signals. This causes readings of physical variables such as voltage, current, and temperature to be highly quantized. Hence, the sampled signal values from these sensors can only assume discrete values, and no readings can be reported between these discrete values. For example, voltages for system board components may be quantized to the nearest 10 mV, e.g., 1.60 V, 1.61 V, 1.62 V, etc. Hence, if the true voltage value is 1.6035 V, it can only be reported as one of the quantized values, 1.60 V or 1.61 V.
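As a rough illustration of the quantization described above, the following Python sketch snaps a reading to a 10 mV grid. The function name quantize_mv and the choice to work in millivolts are assumptions made for this example only; they do not describe any particular sensor firmware, and rounding to the nearest level is just one way a converter might select the reported value.

```python
STEP_MV = 10  # quantization step from the example above, in millivolts

def quantize_mv(true_mv, step_mv=STEP_MV):
    """Snap a reading to the nearest multiple of step_mv.

    Exact ties follow Python's round-half-to-even rule; a real A/D
    converter may resolve ties differently.
    """
    return step_mv * round(true_mv / step_mv)

# A true value of 1.6035 V (1603.5 mV) cannot be reported as such;
# only a value on the 10 mV grid is available.
reported_mv = quantize_mv(1603.5)
error_mv = 1603.5 - reported_mv
```

In this sketch the 3.5 mV difference between the true and reported values is lost, which is the kind of information a drift detector would need.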
Note that the above-described quantization effects present a serious problem for proactive fault monitoring. Normally, one can apply statistical pattern recognition techniques to continuous signal values to detect if the signals start to drift away from steady-state values at a very early stage of system degradation. However, with significant quantization, conventional statistical pattern recognition techniques cannot be used effectively to detect the onset of subtle anomalies that might precede component or system failures.
To overcome the drawbacks of the low-resolution quantized signals, researchers have used a technique called “burst sampling.” Essentially, this technique restores high-resolution signals from low-resolution A/D converter outputs by removing the quantization effects. Specifically, a large “burst” of samples (typically hundreds of samples) is retrieved from low-level hardware registers of the server computer system being monitored. These samples are then collected through telemetry channels at the highest data rate that the hardware channels can support (typically at kHz rates). Next, the samples in the “burst” are averaged to obtain values that approximate signals sampled with high-resolution data-acquisition capability.
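The averaging step of burst sampling can be sketched in Python as follows. This is a minimal simulation, not the implementation used in any actual system: the names quantize and burst_average are hypothetical, and the sketch assumes that the underlying signal carries small random noise (here Gaussian with a 5 mV standard deviation), so that individual samples in a burst land on different quantization levels. Without such variation, every sample would be identical and averaging would recover nothing.

```python
import random
import statistics

STEP_V = 0.01  # 10 mV quantization step, as in the voltage example

def quantize(value_v, step_v=STEP_V):
    """Snap a reading to the nearest quantization level."""
    return round(value_v / step_v) * step_v

def burst_average(true_v, n_samples=500, noise_sd_v=0.005, seed=0):
    """Average one burst of quantized samples of a noisy signal.

    n_samples, noise_sd_v, and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    burst = [quantize(true_v + rng.gauss(0.0, noise_sd_v))
             for _ in range(n_samples)]
    return statistics.fmean(burst)

single_reading = quantize(1.6035)   # one quantized sample, no noise
estimate = burst_average(1.6035)    # mean of a burst of hundreds of samples
```

Under these assumptions the burst mean falls between the quantization levels and lies much closer to the true value than any single quantized reading can.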
Unfortunately, this burst sampling technique can be used only for a small subset of the signals of interest in a large system. This is because burst sampling creates a large burst demand for the bandwidth that is available for delivering telemetry samples via the system bus. In some large systems, over 1000 telemetry signals are monitored concurrently. However, the burst sampling technique can consume the entire system bus bandwidth while delivering only a few tens of these signals.
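The bandwidth limitation can be made concrete with a back-of-the-envelope calculation in Python. The bus budget below is a purely hypothetical figure chosen so that the result matches the "few tens of signals" mentioned above; only the kHz sampling rate and the count of roughly 1000 monitored signals come from the text.

```python
def signals_supported(bus_budget_samples_per_s, burst_rate_hz):
    """Number of signals that can be burst-sampled within the bus budget."""
    return bus_budget_samples_per_s // burst_rate_hz

ASSUMED_BUS_BUDGET = 50_000    # samples per second; hypothetical value
ASSUMED_BURST_RATE_HZ = 1_000  # kHz-rate sampling, per the text
MONITORED_SIGNALS = 1_000      # signals monitored concurrently, per the text

supported = signals_supported(ASSUMED_BUS_BUDGET, ASSUMED_BURST_RATE_HZ)
shortfall = MONITORED_SIGNALS - supported  # signals left without burst sampling
```

With these assumed numbers, the large majority of monitored signals would have to be left at their raw quantized resolution.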
What is needed is a method and an apparatus that removes the quantization effects from low-resolution quantized signals without the above-described problems.