One of the most important steps in the manufacturing of semiconductor devices is to place the device under test to make sure it is free of manufacturing defects and that it functions correctly. This usually involves hooking up a test instrument to the pins of the device under test (DUT) and stimulating the device with electrical signals (stimulus or test pattern) and capturing the response from the device via the pins. To streamline this process, batches of test patterns are automatically generated and applied while captured responses are automatically analyzed in an Automatic Test Equipment (ATE) platform.
As technology advances and semiconductor devices increase in size, speed and complexity, the demands placed upon ATE have also increased. The amount of data that has to be generated and analyzed during the course of testing has increased exponentially. ATEs have to accommodate ever-increasing clock speeds as well as an ever-increasing number of asynchronous clock domains. Semiconductor devices also incorporate more and more blocks of diverse characteristics. These developments make it impossible to test a semiconductor device in a simple, uniform manner.
Few modern semiconductor devices, with all of their diverse functionalities, can be tested with a single instrument. The sensible approach is therefore to test a DUT with multiple instruments in an ATE. This type of ATE architecture comprises one user computer, a user bus and a plurality of instruments, such as described in Frish (U.S. Pat. No. 4,707,834). Communication between the plurality of instruments and the user computer passes through a communication board or a “master controller” on a test head.
No matter how many instruments or what types of instruments are employed, this type of ATE architecture invariably requires the user computer to (1) generate patterns for each instrument, (2) transfer the generated patterns to each instrument, (3) poll each instrument for status such as test completion, (4) transfer the captured response from the DUT back to the user computer, then (5) process and analyze the captured response into useful data for test engineers.
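The five steps above can be sketched as a host-driven loop. This is only a minimal illustration: the `Instrument` class, its methods, the toy "inverted bits" response and the instrument names are all hypothetical stand-ins, not a real instrument driver or the architecture of any cited patent. The point it shows is that the user computer walks through every step for one instrument before it touches the next.

```python
from dataclasses import dataclass, field


@dataclass
class Instrument:
    """Hypothetical stand-in for one test instrument (not a real driver API)."""
    name: str
    pattern: list = field(default_factory=list)
    response: list = field(default_factory=list)
    done: bool = False

    def load(self, pattern):
        # Receive a pattern from the user computer.
        self.pattern = list(pattern)
        self.done = False

    def run(self):
        # Toy DUT behaviour: the response is each stimulus bit inverted.
        self.response = [bit ^ 1 for bit in self.pattern]
        self.done = True

    def is_done(self):
        return self.done

    def read_response(self):
        return list(self.response)


def host_test_cycle(instruments, log):
    """Run steps (1)-(5) for each instrument, strictly one after another."""
    results = {}
    for inst in instruments:
        pattern = [i % 2 for i in range(8)]      # (1) generate pattern
        log.append(("generate", inst.name))
        inst.load(pattern)                       # (2) transfer to instrument
        log.append(("transfer", inst.name))
        inst.run()
        while not inst.is_done():                # (3) poll for completion
            pass
        log.append(("poll", inst.name))
        response = inst.read_response()          # (4) read captured response
        log.append(("readback", inst.name))
        results[inst.name] = sum(response)       # (5) analyze (count of 1s)
        log.append(("analyze", inst.name))
    return results


log = []
results = host_test_cycle([Instrument("digital"), Instrument("analog")], log)
print(results)
print(log)
```

In this sketch the second instrument sits idle until all five steps have finished for the first one, because every step is issued by the same loop on the user computer.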
The communication efficiency between the user computer and the instruments is critical for an optimal total test time because it directly affects the time spent transferring pattern data to instrument boards and reading back capture vectors from instruments. Especially when multiple DUTs are to be tested concurrently on a single ATE platform, the ‘parallel efficiency of test operation’ is determined by the communication efficiency and the computing resources. Parallel efficiency is the total time to test a single DUT divided by the total time to test multiple DUTs concurrently on a single platform.
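As a worked illustration of the ratio just defined, the sketch below computes parallel efficiency from two measured times. The function name and the example timings (10 s for one DUT, 16 s for four DUTs tested concurrently) are assumptions chosen only for illustration; they are not figures from any real tester.

```python
def parallel_efficiency(single_dut_time, concurrent_time):
    """Time to test one DUT divided by time to test several DUTs concurrently.

    A value of 1.0 means the concurrent run took no longer than a single-DUT
    run; lower values mean the concurrent run was slowed down.
    """
    if single_dut_time <= 0 or concurrent_time <= 0:
        raise ValueError("test times must be positive")
    return single_dut_time / concurrent_time


# Hypothetical timings: one DUT alone takes 10 s, four DUTs together take 16 s.
efficiency = parallel_efficiency(10.0, 16.0)
print(efficiency)  # 0.625
```

Under these assumed numbers the platform reaches 62.5% parallel efficiency; the remaining time is overhead of the kind attributed above to communication and computing resources.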
To support ATE of increasingly complex DUTs, instrument manufacturers have made instruments with greater computing power and storage capacity. Unfortunately, this has only shifted the performance bottleneck to the user computer and the user bus. Regardless of the user bus structure, the user computer can only read from one instrument at a time; and if the instruments are not identical to each other in function, the user computer cannot broadcast but can only write to one instrument at a time. Furthermore, the user computer cannot multitask, no matter how powerful its CPU is. This is because a DUT is almost always a state-of-the-art device capable of crunching through data at a speed comparable to, or even greater than, that of the CPU on the user computer. A user computer therefore must generate patterns, process captured data, poll status and transfer data for each instrument in a sequential manner, even though logically many of these tasks need not wait for each other. This bottleneck results in a logjam at the user computer while many expensive instruments sit idle and wait. The ATE is thus unable to achieve parallel efficiency despite the investment in multiple test instruments. Worse yet, if the ATE application is driven by a test program in which the generation of test data for the DUT depends on data previously captured from the DUT, the compilation and the analysis must both be performed on the user computer, resulting in further degradation of efficiency.
The same bottleneck also limits the architecture's scalability. The communication board, which routes all traffic between the instruments and the user computer, limits the number of instruments that can be added to the system. This in turn limits the number of DUTs that can be simultaneously tested by the ATE. Even if one were to abandon the communication board and instead use a high-throughput local area network (e.g., Gigabit Ethernet) to connect the user computer with the instruments, the already mentioned performance limitation of the user computer CPU still limits the number of instruments that can be efficiently added to the system.
One approach to increasing parallel efficiency is to buy instruments with as much capability packed in as possible. The promise of this approach is that the increased computational and storage capacity in the instrument alleviates the performance load on the user computer and the user bus. Unfortunately, this is not always feasible. Unlike an off-the-shelf computer, an instrument is an expensive capital investment which few companies can afford to constantly upgrade. And because a DUT is almost always a state-of-the-art device while an instrument is almost always hardware built with yesterday's technology, the instrument can never keep up with the DUT. Relying on the best and newest instrument for parallel efficiency is thus ultimately futile.
Another approach is to use BIST (built-in self-test) or vector compression to reduce traffic on the host bus during ATE. BIST is suitable for highly structured circuits such as memory, but unsuitable elsewhere. Designing a BIST for a dense logic cloud is ineffective and impractical because it requires too much engineering effort. Worse yet, BIST is a hardware solution requiring a built-in circuit within the DUT; one cannot use BIST to promote parallel efficiency during ATE if no BIST circuit was designed in. BIST is thus not a realistic general solution to the problem of parallel efficiency. Compression is an even less suitable solution, because little bus traffic can be saved if the generated data and the captured responses traveling on the bus are highly random, incompressible data.
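The point about incompressible data can be checked with a small experiment. The sketch below uses Python's standard `zlib` module as a stand-in compressor and compares pseudo-random bytes with a highly regular byte pattern; the buffer size, the fixed seed and the variable names are assumptions for illustration, and a real vector-compression scheme in an ATE may behave differently.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means better compression)."""
    return len(zlib.compress(data, 9)) / len(data)


size = 64 * 1024
rng = random.Random(0)  # fixed seed so the experiment is repeatable

# Stand-in for highly random pattern or capture data.
random_vectors = bytes(rng.getrandbits(8) for _ in range(size))
# Stand-in for highly regular data, e.g. a repeating two-byte pattern.
structured_vectors = b"\x00\xff" * (size // 2)

print(f"random:     {compression_ratio(random_vectors):.3f}")
print(f"structured: {compression_ratio(structured_vectors):.3f}")
```

The regular pattern shrinks to a tiny fraction of its size, while the random buffer stays essentially as large as it was, so compressing it would save almost no bus traffic.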
Yet another approach is to arrange an array of instruments in a SIMD (single instruction multiple device) or MIMD (multiple instruction multiple device) parallel configuration such as disclosed by Rockoff (U.S. Pat. No. 6,018,814). The promise of this approach is that instruments in the SIMD/MIMD array communicate with each other directly, thus reducing traffic on the “global instruction network”. This is an expensive solution requiring instruments to possess the hardware to support the SIMD/MIMD topology. Each instrument in the SIMD/MIMD array must either be custom designed or custom modified from an existing instrument because the hardware to support SIMD/MIMD is not commercially available. Using or building such a system requires substantial up-front engineering cost for custom hardware and software. As semiconductor technology advances, these customizations are often of such questionable reusability as to make the initial investment unjustifiable.
Therefore, there is still a need for an ATE platform built upon affordable, interchangeable, and easily reconfigurable machines for which state-of-the-art performance is readily available commercially, and where the topology of the system eliminates the logjam at the user computer to achieve both parallel efficiency and scalability. Unlike instruments, user computers are affordable, interchangeable, and easily reconfigurable machines for which state-of-the-art performance is readily available commercially. It is thus advantageous to move as many computation tasks as possible from instruments to user computers.