1. Field of the Invention
The present invention pertains generally to an electronic component test process to screen for infant mortality.
2. Description of the Background
Components are commonly subjected to stress testing. Most manufacturers stress test the component parts of their product in order to determine the life of the product in the customer environment, the mean time between failures (MTBF), and the suitability of the product for its intended use.
Without stress testing, weak components would sort themselves out by early failure during use by the customer. The product manufacturer may still be liable for warranty and such a failure would require field service by the product manufacturer. Diagnosis and repair at a customer site can be extremely expensive. Even if the failure occurs beyond the warranty period of the product, if the product does not meet customer expectations then the customer becomes dissatisfied. This often results in loss of future sales for the product manufacturer. Dissatisfied customers communicate with other customers and potential customers. The experience of others will often induce a reluctance to purchase the products, resulting in loss of market share.
Electronic products have become complex with many components. The reliability of the resulting product is a function of the reliability of the components. In fact, when the product fails if any one component fails and the component failures are independent, the reliability is the mathematical product of the reliability of each of the components. For instance, if each component is 99% reliable for the first 1000 hours of operation and there are 100 components, then the product reliability is 0.99 raised to the 100th power, or about 36.6%. This would mean that about two thirds of the final products would fail within 1000 hours of operation. Many electronic products have thousands of components, so to achieve the reliability required by customers, each component must have a very high reliability for a long period of time.
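The arithmetic in the preceding paragraph can be checked with a short sketch. The function names are hypothetical, and the model assumes that component failures are independent and that any single component failure causes the product to fail.

```python
import math

def series_reliability(component_reliabilities):
    """Reliability of a product that fails if any one component fails
    (assumes independent component failures)."""
    return math.prod(component_reliabilities)

def required_component_reliability(target, n_components):
    """Per-component reliability needed so that n identical components
    together reach the target product reliability."""
    return target ** (1.0 / n_components)

# 100 components, each 99% reliable over the first 1000 hours
product = series_reliability([0.99] * 100)
print(f"{product:.1%}")  # 36.6%

# Illustrative only: 1000 components and a 99% product target
print(required_component_reliability(0.99, 1000))
```

The second function inverts the relationship and shows why a product with thousands of components needs each component to be far more than 99% reliable.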
The reliability of a component or product can be depicted by the familiar bathtub curve, a plot of failure rate versus time. It has three regions: the early failures known as infant mortalities, the normal life failures where the reliability is usually the highest (lowest level of failures), and the wearout failures where reliability decreases and consequently failures increase. The goal is to remove all infant mortalities and deliver to the customer product which is in the normal life region of the curve. The problem is that many infant mortalities occur after the component has passed a test sort. The component passes an initial test, but fails within a time span relatively soon afterward compared to the expected life of the product. Many electronic components like integrated circuits, semiconductor devices, fineline resistor networks, and thin film and thick film hybrids have expected operational lifetimes of 20 years or more with very high normal life reliability. This means that failures which occur within the first year are generally considered infant mortalities. Consequently, sorting out potential infant mortalities at normal operating conditions would require testing every part (not just a sample) for a long period of time, such as one year. This is not practical and definitely not cost-effective for most electronic components.
Stress testing, also called burn-in or accelerated life test, was developed in order to shorten the length of time required to test components and sort out the potential early failures. Stress testing is a method to accelerate the life of the component. The theory is to expose the component to all the stresses it would experience in a shorter time. For example, a component that would operate for 1,000 hours before failing under normal conditions may fail in 20 hours if the amount of stress of the 1,000 hour period could be condensed into the 20 hours. There are various ways to condense the stress into a shorter period of time so that a component does not have to be tested for 1,000 hours to see if it meets the reliability requirements. There are also many theories about how much these different ways affect certain failure mechanisms so an acceleration factor can be derived either empirically or by calculation. BURN-IN by Finn Jensen and Niels Erik Petersen, published by John Wiley & Sons Ltd. 1982 discusses many of these theories and calculations.
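The 1,000 hour and 20 hour figures above imply an acceleration factor, which can be sketched as a simple ratio. The function names are hypothetical, and a single constant factor is an assumption; in practice the factor depends on the failure mechanism and the stress applied.

```python
def acceleration_factor(normal_hours, stress_hours):
    """Ratio of time-to-failure under normal conditions to
    time-to-failure under stress conditions."""
    if stress_hours <= 0:
        raise ValueError("stress_hours must be positive")
    return normal_hours / stress_hours

def equivalent_field_hours(stress_hours, factor):
    """Normal-use hours represented by a given time at stress,
    assuming one constant acceleration factor."""
    return stress_hours * factor

af = acceleration_factor(1000, 20)
print(af)  # 50.0

# Illustrative only: a 24 hour stress test at the same factor
print(equivalent_field_hours(24, af))  # 1200.0
```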
Some of the ways to condense the stress are to exercise the component more frequently, raise or lower the operating temperature during operation, raise or lower the humidity, vibrate or shock the component, and exercise the component with abnormal operating voltages and currents. These and many other adverse conditions, alone or in combination, are intended to increase the stress in a short period of time compared to normal operating conditions, so as to mimic normal operating conditions for a long period of time. There are limits to the stress conditions. For every condition there is a point beyond which the stress ceases to mimic normal operating conditions for a long period of time and becomes a short-term destructive episode. For example, if the temperature is elevated to the point of combustion, the plastic package starts to burn. A good part will meet the required reliability expectations under normal conditions, but even a good part will catch fire when exposed to the combustion temperature of one of its sub-components, even if it would have lasted a very long time at temperatures just below the combustion temperature. Phase transitions and reaching activation energies for certain chemical processes change the model of life acceleration.
Also, manufacturers use the results of stress testing to calculate total ownership cost of the product to their customers, expected downtime, expected spare part inventory required in quantity and type, and expected economic lifetime of the product. At the development stage, stress testing is used to evaluate different components and make decisions between alternative designs and components.
After a design is determined, certain types of stress testing, namely nondestructive testing, are often used on components which will ultimately be incorporated into products going to customers. The purpose of performing stress tests at this step in the manufacturing process is to sort out the components which will fail before or soon after the product reaches the customer, that is, components which do not meet the expected life or mean time between failures. In this case, it is important that the stress test does not weaken the component, but accurately sorts out the bad ones from the good ones. Tests which sort out all the bad ones but also reject some good ones (Type I error) increase the manufacturing cost of the product. Tests which sort out most of the bad ones but leave some bad ones classified as good (Type II error) increase the rework rate and reduce the reliability of the product.
In reality both Type I and Type II errors are present; most tests will sort out most of the bad components, leave some bad components classified as good, and classify some good components as bad, thus increasing manufacturing costs from the ideal and reducing the reliability of the product from the ideal. This is not a perfect world; however, stress testing is one of the processes used to approximate a perfect world.
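The two error types can be made concrete with a small sketch. The class name and all counts are hypothetical and are chosen only to illustrate the definitions used above: a Type I error is a good component rejected, and a Type II error is a bad component passed.

```python
from dataclasses import dataclass

@dataclass
class ScreenResult:
    good_passed: int
    good_rejected: int  # Type I errors: good parts classified as bad
    bad_passed: int     # Type II errors: bad parts classified as good
    bad_rejected: int

    def type_i_rate(self):
        """Fraction of good components that the test rejects."""
        return self.good_rejected / (self.good_passed + self.good_rejected)

    def type_ii_rate(self):
        """Fraction of bad components that the test passes."""
        return self.bad_passed / (self.bad_passed + self.bad_rejected)

    def shipped_defect_fraction(self):
        """Fraction of passed (shipped) components that are actually bad."""
        return self.bad_passed / (self.good_passed + self.bad_passed)

# Hypothetical lot of 10,000 components, 200 of which are bad
result = ScreenResult(good_passed=9700, good_rejected=100,
                      bad_passed=20, bad_rejected=180)
print(result.type_i_rate(), result.type_ii_rate(),
      result.shipped_defect_fraction())
```

The Type I rate drives the added manufacturing cost, while the Type II rate determines how many weak components still reach the product.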
For electronic components, stress testing is usually done at two different times in the life cycle of a component. The first time, the development test, is when the component is first designed and the manufacturer of the component wants to determine whether the component meets expectations of design and manufacturability and to get an estimate of the potential yield of the component. The second time, the qualification test, is after the manufacturer of the product receives the component from the component manufacturer when the product is being developed. The product manufacturer will do an incoming stress test on the component or, more typically, do a stress test on the product or a major portion of the product, which will involve stressing the component along with other different components to determine whether the component suffices for the product's intended purpose. Next, the component manufacturer will perform production tests, sending only good parts to the product manufacturer. The product manufacturer may then implement screening tests for particular characteristics to ensure high reliability.
Product manufacturers have been looking for less costly and less time-consuming stress tests for use at the qualification or screening test. Product manufacturers would like to eliminate stress testing, but still ship product with a low failure rate to their customers. One method has been to try to educate the component manufacturer to deliver a component which does not need to be stress tested. Many improvements in component manufacture have happened. However, as component reliability increases so do customer expectations, so stress testing is not eliminated; instead the acceptance criteria are changed to reflect the demand for higher quality parts. Moving the stress test from the product manufacturer to the end of the component manufacturer's process can be a slight improvement since it reduces the communication channels and also places the stress test where the expertise is located, with the component manufacturer. Many component manufacturers now perform a stress test as a standard processing step after their production tests and before shipping the component to the product manufacturer. However, since the stress test is still in the process, the cost and delay remain virtually the same. The component manufacturer's cost increases are typically passed on to the product manufacturer in the prices charged for the component. The additional step of stress testing adds to the cost and fabrication time of the component. Up to now, package production test has not had the capability of finding as many reliability risks as stress testing finds, so this stress test is an additional test added to the production test of the component manufacturer.
One reason stress testing is costly and time-consuming is that stress tests generally last from 24 to 1,000 hours or more. Component stress testing is sometimes called burn-in. There are various burn-in methods. The oldest and simplest is static burn-in, which consists of a prescreening test, a period at elevated temperature while the power supply pins of the component are connected to a system controller so voltage can be applied, and a post-test to eliminate failures due to the stress.
Next, dynamic burn-in was designed. The reason was to catch more bad parts than static burn-in and also increase the reliability of all parts. As products became more complex, using more and more components, a component which merely passed a static burn-in was no longer reliable enough. A more discriminating test like dynamic burn-in was required. Dynamic burn-in is similar to static burn-in except that during the elevated-temperature step, along with applying voltage to the power supply pins, signals and voltages which exceed normal conditions are applied to the other pins of the component. One problem with dynamic burn-in is that it is difficult to determine when a component fails. The burn-in must be interrupted and the components must be functionally tested in order to determine whether there are any failures and which components failed. Even testing at interrupt points only gives an approximation of the time of failure. Failure time is critical because it is used in constructing statistical models of reliability modes and rates, such as Weibull plots.
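The point about interrupt testing can be illustrated with a sketch: a periodic functional test only brackets the failure time between the last readout the component passed and the first readout it failed. The function name and the readout schedule are hypothetical.

```python
from bisect import bisect_left

def failure_interval(readout_hours, true_failure_hour):
    """Return the (last_pass, first_fail) readout times that bracket a
    failure, or None if the part survives every readout.

    readout_hours must be sorted ascending. The true failure time is
    unknown in practice; it is an input here only to show what
    information the readouts preserve."""
    i = bisect_left(readout_hours, true_failure_hour)
    if i == len(readout_hours):
        return None  # still functional at the final readout
    last_pass = readout_hours[i - 1] if i > 0 else 0
    return (last_pass, readout_hours[i])

readouts = [24, 48, 96, 168]  # hypothetical interrupt points, in hours
print(failure_interval(readouts, 60))   # (48, 96)
print(failure_interval(readouts, 200))  # None
```

A failure at hour 60 is known only to lie between hours 48 and 96, which is the approximation described above and the reason such data are less precise when fitted to a Weibull model.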
Another limitation of the dynamic burn-in is the upper limit of test frequency. High frequencies generate spurious noise pulses. These noise pulses may invalidate applied signal levels. The high frequency input to one component may cause a spurious signal to another component. Actual test conditions are difficult to control at high frequencies.
A variation of the dynamic burn-in is to do a monitored burn-in. During dynamic burn-in, the inputs of the component under test (sometimes called device under test or DUT) are measured and compared to expected values in order to confirm the voltages and signals are at the pin and within specifications of the test. Sometimes this will be an indication of when a component fails, when an input cannot be held within range and the power supply to that pin limits out. However, even if the inputs can be held within predetermined limits, the component may not be functional.
Test during burn-in (TDBI) monitors the outputs and even performs a DC functional test during burn-in. This provides further information on when a component fails during burn-in. Doing a TDBI also eliminates the need to interrupt the stress conditions to perform periodic tests. This avoids the need of multiple ramping up and down of the environmental stress conditions. Since stress testing tends to be very long, anything which shortens the cycle time is useful.
Even with the evolution of stress testing from burn-in to TDBI, stress testing generally involves a batch process with a lengthy elevated-temperature time and operation of the component. Process times are reduced by reducing the number of times necessary to interrupt the stress test, increasing the batch size, increasing the number of components being monitored and tested, and refining life acceleration models so as to increase the stress and reduce time at stress.
To further improve the burn-in process, a solution was tried that screens components in less than a minute of test time per component, at the qualification test performed at the product manufacturer's site. This process is not a batch process, but a one-component-at-a-time process. This screen test is a D.C. parameter and D.C. functional test for electronic components, intended to reduce infant mortality rates and provide a cost-effective and timely, but limited, procedure to sort out bad components and pass good components. It comprises the steps of:
establishing go/no go criteria using test limits established by trial and error from characterizing a particular group of components which fall within pass limits at normal operating temperature;
stabilizing the component at a temperature below test operating temperature and above normal operating temperature;
placing the component in a test means for testing the component;
heating the component package by an airflow and by conduction at a ramp rate of 100 degrees centigrade per minute to the test operating temperature;
performing multiple tests using normal operating voltages and putting bias voltages on the component for heating the component and to provide for test data measurements during the temperature ramp;
testing the component until the package temperature reaches a predetermined package thermal time constant away from temperature equilibrium;
ending the test after the component reaches thermal equilibrium at the test operating temperature;
determining whether the component is a good component using the go/no go criteria;
sorting the component so that a good component goes into a pass sort group and a bad component goes into a fail sort group.
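The go/no go sorting in the steps above can be sketched as follows. This is a minimal illustration, not the tester's actual logic: the parameter names, limits, measurements, and temperatures are all hypothetical, and only the ramp rate of 100 degrees centigrade per minute comes from the text.

```python
RAMP_RATE_C_PER_MIN = 100.0  # ramp rate stated in the steps above

def ramp_seconds(start_c, test_c, rate=RAMP_RATE_C_PER_MIN):
    """Time in seconds to ramp from the stabilized temperature to the
    test operating temperature at a constant rate."""
    return (test_c - start_c) * 60.0 / rate

def passes(measurements, limits):
    """Go/no go: every measurement taken during the ramp must fall
    inside the (low, high) limits for every parameter."""
    return all(low <= m[name] <= high
               for m in measurements
               for name, (low, high) in limits.items())

def sort_components(components, limits):
    """Place each component id into a pass or fail sort group."""
    groups = {"pass": [], "fail": []}
    for comp_id, measurements in components.items():
        groups["pass" if passes(measurements, limits) else "fail"].append(comp_id)
    return groups

# Hypothetical limits and D.C. readings taken at two points in the ramp
limits = {"icc_ma": (5.0, 20.0), "voh_v": (2.4, 5.5)}
components = {
    "A": [{"icc_ma": 10.0, "voh_v": 3.1}, {"icc_ma": 12.0, "voh_v": 3.0}],
    "B": [{"icc_ma": 10.0, "voh_v": 3.1}, {"icc_ma": 25.0, "voh_v": 2.9}],
}
print(sort_components(components, limits))  # {'pass': ['A'], 'fail': ['B']}
print(ramp_seconds(70.0, 125.0))            # 33.0
```

With the assumed 70 and 125 degree temperatures the ramp takes 33 seconds, which is consistent with a screen of less than a minute per component.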
This qualification test only provides a quick D.C. parameter and D.C. functional screen of components at high temperature.
Components which have low reliability due to A.C. parametrics, A.C. functional, and high frequency effects may not be discovered with this test. This test would not be a substitute for the component manufacturer's package test. In order to sort out all the bad components, a complete package test needs to be done before this qualification test. This test is limited to the product manufacturer's site and is inadequate for a complete screen after component manufacture. Also, this test is inadequate for CMOS technology since D.C. does not heat the component enough to provide for sufficient induced stress. Since this test has broad acceptance limits, many reliability risks could be passed. A component's value may still be within the predetermined tolerance band, but the overall performance of the part with respect to the temperature curve may be abnormal, thus indicating a reliability problem.
Also, this test used only temperature as a stress. Because the activation energy in the oxide defect acceleration model is about 0.3 eV, which is low, temperature does not accelerate oxide related defects very much. Temperature accelerates oxide defects only about two to three times. This test is therefore poor at screening oxide defects. Also, since this test was only an above normal operating temperature stress test, hot electron effects would not be observed. The high temperature "cures" hot electron effects.
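The weak temperature acceleration of oxide defects can be illustrated with an Arrhenius calculation. This sketch assumes that the acceleration model referred to is the Arrhenius model, that the activation energy of about 0.3 is expressed in electron-volts, and that the use and stress temperatures are 85 and 125 degrees centigrade; the temperatures and the comparison value of 0.7 eV are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def arrhenius_acceleration(ea_ev, use_temp_c, stress_temp_c):
    """Arrhenius acceleration factor between a use temperature and a
    stress temperature for a given activation energy in eV."""
    t_use = use_temp_c + 273.15
    t_stress = stress_temp_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use - 1.0 / t_stress))

# Assumed temperatures; 0.3 eV is the oxide defect value from the text
oxide = arrhenius_acceleration(0.3, 85.0, 125.0)
print(round(oxide, 1))  # roughly 2.7 under these assumed temperatures

# Hypothetical higher activation energy, for comparison only
print(arrhenius_acceleration(0.7, 85.0, 125.0))
```

Under these assumptions the factor for 0.3 eV falls in the range of two to three, while a mechanism with a higher activation energy is accelerated far more by the same temperature rise.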
This test was only a temperature ramp stress above normal operating temperature. The effects of only one ramp at high temperature are not representative of the component's thermal-mechanical-physical-electronic performance.
Burn-in has evolved and can provide much information on the reliability of the component, but through all the changes it has remained a screening to be done at the end of component manufacturing and at the beginning of product manufacturing. Burn-in has been an additional test to be performed on components which have passed prior tests. Burn-in has been so expensive and time-consuming that only ostensibly good parts are put into the process. At least two testing procedures have been required to avoid the cost and delay of processing as many bad parts as possible through burn-in.
The reason why stress testing has been limited to these two times has been time and cost. Generally, stress testing is long and costly, involving very specialized equipment to control and modify the surrounding environment along with special test and diagnostic equipment.
What is needed is a test method which can combine test, burn-in, and post burn-in test, thus eliminating multiple steps. However, a single process test must have a high throughput so that bad parts can be sorted out economically at the same time that potentially good parts are screened for infant mortality.