The present invention relates to an apparatus and method for predicting the future performance of semiconductor power modules, power devices, or high power integrated circuits so that repair or replacement can be planned before actual breakdown. These power modules, power devices, or high power integrated circuits include power metal oxide silicon field effect transistors (MOSFETs), power insulated gate bipolar transistors (IGBTs), insulated gate thyristors (IGTHs), thyristors, microprocessors (MICROS), application specific integrated circuits (ASICS), and any other semiconductor devices which dissipate large amounts of energy, and are hereinafter referred to as chips.
The increasing use of semiconductor chips creates a need for chips having reliable and predictable performance characteristics. Semiconductor chips play an important role in modern life by providing component parts for computerized networks for business and banking, computerized medical equipment, integrated manufacturing plants, automobiles, communication systems, etc. In these applications, the chips are the basic building blocks which provide the means for performing various functions. The computerized network in a bank, for example, relies upon system hardware comprising semiconductor chips. If a critical semiconductor chip within this network fails, the entire network shuts down and ceases all electronic transactions. As such, it is important that semiconductor chips perform reliably and predictably when they are integrated into an apparatus having such an impact on everyday life.
However, today's chips are inherently limited by their physical size, lead frame design, die attach process, and bonding wires. As technology progresses, the internal device dimensions of a typical chip become smaller, but the physical or actual size of the typical chip becomes larger. The larger chip with smaller internal dimensions creates an even more powerful integrated circuit device. These larger chips include, for example, power devices, microprocessors, gate arrays, and the like. The larger chips may have lengths of about 1/2 inch and greater, while their predecessors typically had lengths of about 200 mils. As for their operating features, these chips may perform at high currents, high voltages, or high operating speeds. At these operating conditions, larger chips generate more heat than their smaller predecessors and therefore run at higher operating temperatures, because heat is more difficult to dissipate through the larger chip than through the smaller predecessor chips. As the operating temperature increases, the chip's efficiency, predictability, and life expectancy decrease.
As for the lead frame, the chips are typically mounted or die attached onto a copper lead frame instead of the Alloy 42 industry standard. The industry shifted towards this copper lead frame because copper provides a higher thermal conductivity, lower cost, and smaller geometry for packaging via surface mount technology (e.g., Plastic Leaded Chip Carrier, also called PLCC). However, in contrast to Alloy 42, the copper lead frame possesses a substantially larger coefficient of thermal expansion. At high temperatures, the die island underlying the chip expands at a much faster and greater rate than the overlying silicon die, which stresses the die, cracks it, and ultimately causes it to fail.
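The size of this expansion mismatch can be illustrated with a short calculation. The following is a minimal sketch only: the coefficients of thermal expansion are approximate handbook values assumed for illustration and are not taken from this specification, and the function names `expansion_um` and `mismatch_um` are hypothetical. The die length of 1/2 inch and the temperature rise from ambient to about 150 °C follow the figures mentioned in this background.

```python
# Approximate linear coefficients of thermal expansion in ppm per kelvin.
# These are assumed, rounded handbook values used only for illustration.
CTE_PPM_PER_K = {
    "silicon": 2.6,
    "alloy42": 4.5,
    "copper": 17.0,
}


def expansion_um(material: str, length_mm: float, delta_t_k: float) -> float:
    """Free linear expansion of a part, in micrometers."""
    length_um = length_mm * 1000.0
    return CTE_PPM_PER_K[material] * 1e-6 * length_um * delta_t_k


def mismatch_um(island_material: str, length_mm: float, delta_t_k: float) -> float:
    """Difference between die island expansion and silicon die expansion."""
    island = expansion_um(island_material, length_mm, delta_t_k)
    die = expansion_um("silicon", length_mm, delta_t_k)
    return island - die


if __name__ == "__main__":
    die_length_mm = 12.7   # 1/2 inch die
    delta_t_k = 125.0      # ambient (about 25 C) to about 150 C
    for island in ("alloy42", "copper"):
        gap = mismatch_um(island, die_length_mm, delta_t_k)
        print(f"{island}: mismatch of {gap:.2f} um across the die")
```

Under these assumed values, the copper die island outgrows the silicon die by roughly 23 micrometers across the die length, against roughly 3 micrometers for Alloy 42, which is the strain the die attach interface must absorb on every temperature cycle.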
As the industry shifted away from the conventional gold/silicon eutectic die attach on the Alloy 42 lead frame and to the solder/silicon eutectic or silver paste epoxied die attach on a copper lead frame, the problems described herein occurred more frequently. For example, a power device's operating temperature increases from ambient to about 150° C. after being turned on. As the temperature increases, the die island comprising copper expands at a faster and greater rate than the overlying silicon die. This process gradually degrades the somewhat flexible interface comprising a eutectic solder/silicon (or silver paste epoxy) between the die and its island. As the interface degrades over time, the thermal resistance between the die and its island gradually increases. This increases the voltage drop as well as the heat being dissipated, thereby decreasing the current flow. After repeated cold/hot heat cycling, the average current flowing through the device decreases. As this occurs, the device begins to run outside its operating limits, and eventually, the device fails. In contrast, the interface comprising a gold/silicon eutectic die attach provides a rigid bond between the die and its island. This rigid bond becomes problematic when the die island expands at a faster and greater rate than the overlying die during extreme increases in temperature. During these conditions, the die typically breaks and suddenly fails. Presently, since most chips rely on either the solder/silicon eutectic or silver paste die attach, the aforementioned gradual degradation problems are more prevalent.
Repeated temperature cycling also expands and compresses the bonding wires during device turn-on and turn-off. This process gradually causes the wires to become brittle, and eventually, the wires may even break. With today's power devices operating at high temperatures, and with larger high pin count devices having hundreds of individually bonded wires, this problem occurs frequently.
Traditional approaches for detecting these problems and continuously characterizing the future performance of a chip have been few to non-existent. One method for predicting the future performance of a chip includes discrete measurements of operating currents and temperatures of the chip by an operator, technician, or engineer. After taking a measurement and comparing it to the manufacturer's specification, if the device is not outside of its specified limits, this person typically decides by "gut" instinct whether to replace the chip. Most often, the chip is only replaced when it fails. This procedure, also called break maintenance, is expensive and creates a variety of problems if the particular chip is integral to a critical apparatus.
Alternatively, if routine preventive maintenance is performed, an operator, technician, or engineer typically replaces the chip before any signs of failure exist. In particular, technicians often replace every high power switching device within a particular circuit in an attempt to prevent a chronic failure in the future. Since good chips are being replaced as well as bad chips, this method becomes expensive, inefficient, and time consuming.
There have been attempts to place a mixture of complex sensors, transducers, and measuring devices with a critical chip to measure its performance. Typically, this approach switches the device off when a predetermined state is reached for the variable being tracked. This predetermined state is often based upon the device specification as suggested by the manufacturer. This approach may also rely upon the temperature sensing technique for a power MOS device as disclosed in U.S. Pat. No. 5,063,307, issued Nov. 5, 1991, or the use of the current mirror technique for providing voltage, current, power, resistance, and temperature sensing capability as disclosed in U.S. Pat. No. 4,931,844, issued Jun. 5, 1990, which are both hereby incorporated by reference.
However, these attempts only monitor the snap-shot value of the current state of a single device parameter, and are generally ineffective for continuously tracking the high number of variables required for reliably predicting the chip's behavior. For a MOSFET, these variables include a thermal resistance R_th, on-resistance R_DSON, power supply current I_DD and voltage V_DD, and the like. Alternatively, an IGBT requires the tracking of R_th, I_DD, V_DD as well as collector-to-emitter saturation voltage V_CESAT, and the like. Since this approach is limited to measuring the snap-shot value of a single variable, it becomes too complicated to be realizable or practical with the high number of measurements needed for reliably predicting the future performance of the chip.
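The limitation of a snap-shot check can be illustrated with a short sketch. It is not the method of the present invention or of the cited patents; it merely contrasts a single threshold test against a simple trend extrapolation. The thermal resistance is computed from the standard junction-to-case definition (temperature difference divided by dissipated power) and the on-resistance as drain-to-source voltage over drain current. The `Sample` record, the function names, the measurement values, and the limit of 1.5 K/W are all hypothetical and assumed for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    """One hypothetical measurement of a MOSFET in its on-state."""
    t_hours: float
    t_junction_c: float
    t_case_c: float
    v_ds: float   # drain-to-source voltage, volts
    i_d: float    # drain current, amperes


def thermal_resistance(s: Sample) -> float:
    """Junction-to-case R_th in K/W: temperature rise over dissipated power."""
    return (s.t_junction_c - s.t_case_c) / (s.v_ds * s.i_d)


def on_resistance(s: Sample) -> float:
    """On-resistance in ohms: V_DS / I_D."""
    return s.v_ds / s.i_d


def snapshot_ok(s: Sample, rth_limit: float) -> bool:
    """Snap-shot check: looks at one value of one parameter only."""
    return thermal_resistance(s) < rth_limit


def hours_until_limit(ts: List[float], ys: List[float], limit: float) -> Optional[float]:
    """Fit a least-squares line to (ts, ys) and extrapolate to the limit.

    Returns the hours remaining after the last sample, or None when the
    fitted trend is flat or falling.
    """
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_t
    return (limit - intercept) / slope - ts[-1]


# Hypothetical history: case held at 50 C, 10 W dissipated, junction creeping up.
history = [
    Sample(0.0, 60.0, 50.0, 0.5, 20.0),
    Sample(100.0, 61.0, 50.0, 0.5, 20.0),
    Sample(200.0, 62.0, 50.0, 0.5, 20.0),
    Sample(300.0, 63.0, 50.0, 0.5, 20.0),
]
RTH_LIMIT = 1.5  # assumed limit in K/W

if __name__ == "__main__":
    times = [s.t_hours for s in history]
    rths = [thermal_resistance(s) for s in history]
    print("latest snap-shot within limit:", snapshot_ok(history[-1], RTH_LIMIT))
    print("hours until limit on current trend:", hours_until_limit(times, rths, RTH_LIMIT))
```

In this assumed history every individual reading passes the threshold test, while the fitted trend indicates that the limit would be reached in about 200 further hours. A practical system would need to repeat this for each tracked variable (R_th, R_DSON, I_DD, V_DD, and so on), which is the bookkeeping a single-parameter snap-shot approach does not provide.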