In many current electronic devices, such as high-performance computer servers, air cooling remains the predominant method for removing heat and keeping temperatures within acceptable limits. Natural convection cannot provide the necessary heat transfer for high-power CPUs (central processing units) and GPUs (graphics processing units) due to the large heat sink area that would be required, so forced convection using fans remains the dominant cooling solution for such systems.
Cooling fans, being rotating electromechanical devices that typically have an electric motor and fan blades, are prone to failure. When they do fail, they can take down entire systems by allowing the systems to overheat, thereby damaging or destroying heat-sensitive components. Furthermore, their performance can degrade over time, causing a slow decrease in system performance and resulting in increased temperatures. Also, dust and dirt buildup on the fan blades can slow their rotation rate, reduce air flow from the fan, and create fan blade imbalances that increase bearing wear. Bearing lubricants can dry up, resulting in increased friction and power consumption, and eventually motor failure.
Given their potential for failure and diminished performance over time, it can be important, from a system reliability standpoint, to monitor fan performance and predict the potential failure of these devices. Prior art systems have generally relied upon the monitoring of fan rotation rates (RPM), indicating a fault when the rotation rate falls below a fixed RPM threshold.
FIG. 1 is a graph 100 of fan speed versus time for a prior art system for detecting fan failures. Curve 102 illustrates how fan speed may decrease over time, finally dropping below fixed threshold 104 at point 106. The suitability of this system may be limited to fans that run continuously at full power or maximum rotational rate.
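By way of illustration only, the fixed-threshold check of FIG. 1 may be sketched as follows. The function name `first_fault_index`, the sample RPM readings, and the 4000 RPM threshold are hypothetical values chosen for this example and are not taken from any particular prior art system.

```python
from typing import Optional, Sequence


def first_fault_index(rpm_samples: Sequence[float],
                      threshold_rpm: float) -> Optional[int]:
    """Return the index of the first sample below the fixed threshold.

    Returns None if no sample falls below the threshold.
    """
    for i, rpm in enumerate(rpm_samples):
        if rpm < threshold_rpm:
            return i
    return None


# Hypothetical readings for a fan run at full power that slowly degrades
# over time (analogous to curve 102).
samples = [5000, 4950, 4800, 4500, 4100, 3600, 2900]

# Assumed fixed threshold (analogous to threshold 104).
THRESHOLD_RPM = 4000

# Index at which the speed first drops below the threshold
# (analogous to point 106).
fault_at = first_fault_index(samples, THRESHOLD_RPM)
```

In this sketch the comparison takes no account of the speed at which the fan has been commanded to run, so any reading below the threshold is treated as a fault regardless of its cause.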
Most modern systems run fans at varying rotation speeds, depending on the heat generated by the load being cooled and moderated by concerns for noise. To reduce the noise produced by the fans, they may be throttled back when the systems being cooled are idle or are at low power consumption levels. A fixed threshold for determining potential fan failure may be impractical or unsuitable for such systems. What is desirable is a fan monitoring system that can determine whether a fan is operating within specified limits throughout its operating range, with the potential to predict fan failures before they occur so that preventative action can be taken.
These and other limitations of the prior art will become apparent to those of skill in the art upon a reading of the following descriptions and a study of the several figures of the drawing.