1. Field of the Invention
The present invention relates to techniques for detecting degradation of components within a computer system. More specifically, the present invention relates to a method and apparatus for detecting the onset of degradation and for estimating the remaining useful life of interconnects within a computer system.
2. Related Art
An increasing number of businesses are using computer systems for mission-critical applications. In such applications, a component failure can have a devastating effect on the business. For example, the airline industry depends on computer systems that manage flight reservations, and would essentially cease to function if these systems failed. Hence, it is critically important to measure the reliability of components to ensure that they meet or exceed the reliability requirements of the computer system.
Unfortunately, determining the reliability of interconnects in high-end computer systems is a challenging task. Interconnects, which are commonly found in memory modules, surface-mount components, and integrated-circuit (IC) component sockets, are typically very high in density, which means that a given component often contains hundreds to thousands of interconnects. When interconnects degrade or fail during the lifetime of the computer system, the failure can be difficult to troubleshoot and to isolate. Moreover, correcting issues related to interconnect failures can result in long equipment downtime, which can severely impact the end user.
One solution to this problem is to monitor the computer system for interconnect faults. Unfortunately, present monitoring and surveillance techniques for interconnects are “reactive” in nature: they provide a warning or actuate an alarm only after an interconnect failure has occurred. Presently, there are no techniques that allow “proactive” fault monitoring (i.e., providing an early warning of degradation) for interconnects within a computer system while the computer system is operating.
Hence, what is needed is a method and an apparatus for detecting the onset of degradation and for estimating the remaining useful life of interconnects within a computer system without the problems described above.