Computer systems generally include a number of components that are electrically connected to one another. These components include one or more processors, memory devices, input/output (I/O) devices, and controllers for the memory and I/O devices. One or more power supplies in a computer system typically provide power to the components in the system. The power is generally provided to components using a constant, direct current (DC) voltage at a particular voltage level, e.g., 5.0 volts (V).
To help ensure the reliability of a component, manufacturers often test components of a computer system over a range of voltages near the nominal operating voltage of the component, a practice known as voltage margining. For example, a manufacturer may test a component over a range of +/−10% of its operating voltage. By testing components at different voltage levels, manufacturers may identify components that fail at the voltage margins, i.e., near the edges of the tested range. Because components that fail at the voltage margins will likely eventually fail at the operating voltage, a manufacturer may label such components as defective.
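As an illustration of the margin test described above, the following minimal sketch computes test voltages across a +/−10% window around a 5.0 V nominal level and flags a component that fails at any of them. The function names, the five-step sweep, and the `weak_component` pass/fail model are assumptions made for this example only; they are not taken from any particular manufacturer's test procedure.

```python
def margin_levels(nominal_v, margin=0.10, steps=5):
    """Return evenly spaced test voltages from nominal*(1-margin) to nominal*(1+margin)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    low = nominal_v * (1 - margin)
    high = nominal_v * (1 + margin)
    step = (high - low) / (steps - 1)
    # Round to millivolts to avoid floating-point noise in the reported levels.
    return [round(low + i * step, 3) for i in range(steps)]


def margin_test(nominal_v, passes_at, margin=0.10, steps=5):
    """Return the test voltages at which the component failed."""
    return [v for v in margin_levels(nominal_v, margin, steps) if not passes_at(v)]


def weak_component(voltage):
    """Hypothetical component that only operates between 4.6 V and 5.6 V."""
    return 4.6 <= voltage <= 5.6


levels = margin_levels(5.0)                  # sweep around the 5.0 V nominal level
failures = margin_test(5.0, weak_component)  # voltages where the component failed
defective = bool(failures)                   # any margin failure marks it defective
```

In this sketch the hypothetical component works at the nominal 5.0 V but fails at the low margin, which is the situation in which a manufacturer might label it defective.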
In actual use in a computer system, the range of voltages over which a component operates without failing may gradually narrow over time. In addition, the voltage level provided to a component by a power supply may vary with temperature or other environmental factors. Under certain circumstances, the voltage level provided to a component may fall outside the operable voltage range of the component, and the component may fail. Furthermore, components can weaken over time due to latent defects. As noted above, such defects can be detected early by testing components at the voltage margins. Computer systems, however, typically do not include mechanisms for testing components over a range of voltages during normal operation. As a result, component failures may not be detected until they cause undesirable results such as a crash of the computer system.
Accordingly, it would be desirable to predict component failures in a computer system in a planned manner, before the failures cause undesirable results during operation of the system.