1. Field
The present disclosure relates to techniques for estimating solder-joint longevity. More specifically, the present disclosure relates to techniques for estimating the reliability of solder joints in a ball grid array (BGA) in a computer system.
2. Related Art
Solder joints are widely used in computer systems as interconnects between components, such as: microprocessors, integrated circuits, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), and/or printed circuits. However, because of varying thermal and vibration conditions in computer systems, these solder-joint interconnects can fail during operation of these components. For example, because of phenomena such as fatigue and mismatched coefficients of thermal expansion, varying thermal and vibration conditions can result in interconnect failures in BGAs, such as: interfacial joint fractures, bulk solder-joint fractures, metal pad lift, etc.
Interconnect failures can cause intermittent or permanent system faults and failures, thereby interrupting system operation. Furthermore, intermittent system faults are among the main causes of no-trouble-found (NTF) events, in which a customer returns a computer for repair but the manufacturer is unable to determine the source of the problem. NTF events are expensive, are difficult to trace to a root cause, and strain relationships between manufacturers and customers.
Consequently, dynamic estimates (as a function of time) of the longevity of solder joints in individual systems, as well as in groups of systems, are a fundamental aspect of reliability prediction. To facilitate such longevity estimates, dedicated canary devices (i.e., solder joints allocated for use as canary devices) are often included in BGAs. However, additional circuits in ASICs or FPGAs are often needed to monitor such canary devices. Furthermore, in order to continuously monitor the integrity of a BGA, the canary devices can consume a sizeable fraction of the solder joints.
In addition, false and missed alarms can occur if the solder-joint longevity estimate is determined based solely on information obtained from canary devices. This is because of the stochastic nature of the signals monitored. In particular, even though canary solder joints are supposed to be under more stress and fail earlier than the non-canary solder joints, there are typically fewer canary solder joints than non-canary solder joints. For example, the first failure of a canary solder joint may represent a 1% cumulative failure rate, while the first failure of a non-canary solder joint may represent a 0.1% or less cumulative failure rate, because the number of non-canary solder joints is typically more than ten times the number of canary solder joints. Therefore, it is likely that at least one of the non-canary solder joints that is not being monitored will fail before a canary solder joint.
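The arithmetic behind this example can be illustrated with a short Python sketch. The population sizes (100 canary and 1000 non-canary solder joints) are illustrative values chosen to reproduce the 1% and 0.1% figures above. The function names, the acceleration parameter, and the assumption of independent, exponentially distributed joint lifetimes are hypothetical simplifications introduced only because they give a closed-form result; they are not part of the disclosure, and solder-joint wear-out is commonly modeled with other distributions.

```python
def first_failure_fraction(population: int) -> float:
    """Cumulative failure fraction represented by the first failure
    in a population of identical joints (one failure out of `population`)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1.0 / population


def prob_non_canary_fails_first(n_canary: int,
                                n_non_canary: int,
                                acceleration: float = 1.0) -> float:
    """Probability that the first joint to fail is a non-canary joint.

    ASSUMPTION (not from the disclosure): lifetimes are independent and
    exponentially distributed, and each canary joint has a failure rate
    `acceleration` times that of a non-canary joint. Under that assumption
    the first failure belongs to a group with probability proportional to
    the group's total failure rate.
    """
    if n_canary <= 0 or n_non_canary <= 0 or acceleration <= 0:
        raise ValueError("counts and acceleration must be positive")
    canary_rate = acceleration * n_canary   # total rate of the canary group
    non_canary_rate = float(n_non_canary)   # total rate of the non-canary group
    return non_canary_rate / (non_canary_rate + canary_rate)


if __name__ == "__main__":
    n_canary, n_non_canary = 100, 1000  # illustrative counts only
    print(first_failure_fraction(n_canary))      # fraction for first canary failure
    print(first_failure_fraction(n_non_canary))  # fraction for first non-canary failure
    for accel in (1.0, 5.0, 10.0):
        print(accel, prob_non_canary_fails_first(n_canary, n_non_canary, accel))
```

Under these assumptions, even if each canary solder joint fails ten times faster than a non-canary solder joint, a tenfold larger non-canary population leaves an even chance that the first failure occurs in an unmonitored joint.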
Hence, what is needed is a technique for monitoring and estimating the longevity of solder joints without the above-described problems.