The present invention relates to scan chain test architectures for integrated circuits, and in particular to optimization of the test architecture in dependence upon the circuit design.
Larger and more complex logic designs in integrated circuits (ICs) lead to demands for more sophisticated testing to ensure fault-free performance of those ICs. This testing can represent a significant portion of the design, manufacture, and service cost of ICs. In a simple model, testing of an IC can include applying multiple test patterns to the inputs of a circuit and monitoring its outputs to detect the occurrence of faults. Fault coverage indicates the efficacy of the test pattern in detecting each fault in a universe of potential faults. Thus, if a set of patterns is able to detect substantially every potential fault, then fault coverage approaching 100% has been achieved.
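By way of illustration only, the fault coverage metric described above can be sketched as the fraction of a fault universe that is detected by at least one test pattern. The fault names, the pattern names, and the detection table below are hypothetical; in practice such data would come from fault simulation.

```python
def fault_coverage(fault_universe, detected_by_pattern):
    """Return the fraction of faults detected by at least one pattern."""
    universe = set(fault_universe)
    detected = set()
    for faults in detected_by_pattern.values():
        # Count only faults that belong to the declared universe.
        detected |= set(faults) & universe
    return len(detected) / len(universe)


# Hypothetical fault universe and per-pattern detection results.
faults = {"f0", "f1", "f2", "f3"}
detections = {"p0": {"f0", "f1"}, "p1": {"f1", "f2"}}
coverage = fault_coverage(faults, detections)
print(f"fault coverage: {coverage:.0%}")
```

In this hypothetical case fault f3 is detected by neither pattern, so an additional pattern would be needed to bring coverage closer to 100%.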
To facilitate better fault coverage and minimize test cost, DFT (design-for-test) has been used. In one DFT technique, structures in the logic design can be used. Specifically, a logic design implemented in the IC generally includes a plurality of state registers, e.g. sequential storage elements like flip-flops or latches. These state registers can be connected into scan chains of computed lengths, which vary based on the design. In one embodiment, all state registers in the design are scannable, i.e. each state register is in a scan chain. The state registers in the scan chains are typically called scan cells. In DFT, each scan chain includes a scan-input pin (also called a scan input herein) and a scan-output pin, which serve as control and observation nodes, respectively, during scan mode.
The scan chains are loaded with the test pattern by clocking in predetermined logic signals through the scan cells. Thus, if each scan chain includes 500 scan cells, then 500 clock cycles are used to complete the loading process. Note that, for simplicity, some embodiments described herein have scan chains of equal length. In actual embodiments, DFT attempts to create scan chains of equal length, but infrequently achieves this goal. Thus, in actual embodiments, software can compensate for the different scan chain lengths, thereby ensuring that outputs from each test pattern are recognized and analyzed accordingly. This methodology is known to those skilled in the art and therefore is not explained in detail herein.
Typically, the more complex the design, the more flip-flops are included in the design. Unfortunately, with relatively few inputs and outputs of the design that can be used as terminals for the scan chains, the number of flip-flops per scan chain has increased dramatically. As a result, the time required to operate the scan chains, called herein the test application time, has dramatically increased.
FIG. 1 illustrates pertinent portions of a typical logic design for a sequential circuit. It includes combinational logic 110 and a number of state registers 112-0, 112-1, 112-2, and 112-3 (collectively 112). As used herein, the term “combinational logic” includes direct connections, so the logic paths through combinational logic 110 may include some that are mere wires, without any intervening alteration of the logic signals they carry. Only four state registers are shown in FIG. 1, but many designs have thousands or millions of state registers. A number of primary logic inputs PI0, PI1 and PI2 are provided to the combinational logic 110, as are a number of state register outputs Q0, Q1, Q2 and Q3. The outputs of combinational logic 110 include primary outputs PO0, PO1 and PO2, as well as next-state inputs D0, D1, D2 and D3 being provided to the state registers 112. While the illustration of FIG. 1 is not usually indicative of the physical positioning of components on an integrated circuit chip, all synchronous circuit designs can be drawn as shown.
The illustration of FIG. 1 also organizes the state registers 112 into two scan chains 114-0 and 114-1 (collectively 114). Scan chain 114-0 includes state registers 112-0 and 112-1, whereas scan chain 114-1 includes state registers 112-2 and 112-3. It can be seen that in scan chain 114-0, state register 112-0 has a separate scan input connected to a scan input SI0 of scan chain 114-0, and state register 112-1 has a separate scan input connected to the output Q0 of state register 112-0. The output Q1 of state register 112-1, in addition to being connected to combinational logic 110, is also provided to a scan output SO0 of scan chain 114-0. Similarly, it can be seen that in scan chain 114-1, state register 112-2 has a separate scan input connected to a scan input SI1 of scan chain 114-1, and state register 112-3 has a separate scan input connected to the output Q2 of state register 112-2. The output Q3 of state register 112-3, in addition to being connected to combinational logic 110, is also provided to a scan output SO1 of scan chain 114-1. Typically many more than two state registers are included in each scan chain, but for simplicity of illustration only two are shown in each scan chain in FIG. 1.
The device is designed to operate selectably in either of two modes, sometimes referred to herein as operating mode and scan mode. In the operating mode, the next-state data for the state registers 112 are taken from the outputs D0-D3 of combinational logic 110. In this mode the scan chains are inactive. In scan mode, the next-state data for state registers 112 are taken from the scan input of the respective state register.
FIG. 2 is another view of a portion of the design of FIG. 1, including two of the state registers 112-0 and 112-1, and portions of combinational logic 110. It can be seen that multiplexers 222 at the D input to flip-flops 223 within each of the state registers 112 are inserted to select between the respective D input from combinational logic 110 and the scan input from SI0 or from the previous state register 112 in the scan chain. The connections from register outputs of previous elements in the scan chain to the multiplexer 222 inputs, as well as the connections to scan input and output pins such as SI0 and SO0, are collectively designated 224 in FIG. 2. Using a scan_mode (i.e. a control) signal, multiplexers 222 can be configured to allow scan-in values to be shifted into flip-flops 223 without going through combinational logic 110. A pulse applied to the clock (CLK) terminals of flip-flops 223 will either capture values output from combinational logic 110, if the device is in operating mode, or will shift values from scan input SI0 into the scan chain, if the device is in scan mode. At the same time that the pulse shifts values into the scan chain from scan input SI0, it also shifts values presently in the state registers 112 out via scan output SO0. Part of the process of “implementing” the scan chains involves replacing registers such as D flip-flops 223 in the circuit design with register/multiplexer combinations like state registers 112, and adding the scan chain interconnects 224.
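The multiplexer selection described above can be modeled behaviorally. The following is a minimal sketch, assuming a single scan chain held as a list of register outputs; the function name `clock_pulse` and its parameters are hypothetical and do not appear in the figures.

```python
def clock_pulse(q, d, scan_in, scan_mode):
    """Model one clock pulse on a scan chain; return (new_q, scan_out).

    q         -- current register outputs, q[0] nearest the scan input
    d         -- next-state values from the combinational logic
    scan_in   -- value on the chain's scan input (e.g. SI0)
    scan_mode -- True selects the scan path, False selects the D inputs
    """
    scan_out = q[-1]  # the last register output drives the scan output
    if scan_mode:
        # Each register takes the value of its predecessor in the chain.
        new_q = [scan_in] + q[:-1]
    else:
        # Each register captures its own next-state value.
        new_q = list(d)
    return new_q, scan_out


q = [0, 0]
q, so = clock_pulse(q, d=[1, 1], scan_in=1, scan_mode=True)   # shift in a 1
q, so = clock_pulse(q, d=[1, 1], scan_in=0, scan_mode=True)   # shift in a 0
q, so = clock_pulse(q, d=[1, 1], scan_in=0, scan_mode=False)  # capture D values
print(q, so)
```

The `scan_mode` flag plays the role of the select input of multiplexers 222: in scan mode the `d` values are ignored, and in operating mode the `scan_in` value is ignored.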
FIG. 3 illustrates a standard flow 300 for processing a single scan test pattern for a particular device under test. In flow 300, step 301 sets the device in scan mode. Step 302 shifts the scan-in values into the active scan chains. Step 303 exits scan mode, returning the device to operating mode. Step 304 applies additional stimulus to the test circuit inputs PI0-PI2. As used herein, the stimulus includes both the values applied to the primary inputs PI0-PI2 and those shifted into the scan chains. The stimulus for a particular test iteration is also sometimes referred to herein as a test pattern or test vector. Step 305 pulses the clocks to capture the response of the device under test in the state registers 112. Step 306 sets the device again into scan mode, and step 307 shifts the scan-out values from the active scan chains. Step 308 again sets the device into operating mode. The response of the device to the test stimulus, which is processed by external equipment in order to detect faults in the device under test, can include values that were scanned out in step 307 as well as values monitored on the primary outputs PO0-PO2.
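The steps of flow 300 can be sketched in software for a single scan chain. This is an illustrative model only: `run_scan_pattern` and `toy_logic` are hypothetical names, and the toy combinational function is an assumption chosen for the example rather than the logic of FIG. 1.

```python
def run_scan_pattern(scan_in_bits, primary_inputs, comb_logic, chain_len):
    """Process one test pattern on one chain; return (primary_outputs, scan_out)."""
    q = [0] * chain_len

    # Steps 301-302: enter scan mode and shift in the scan-in values.
    for bit in scan_in_bits:
        q = [bit] + q[:-1]

    # Steps 303-304: return to operating mode and apply the primary inputs.
    primary_outputs, d = comb_logic(primary_inputs, q)

    # Step 305: pulse the clock to capture the response in the registers.
    q = list(d)

    # Steps 306-307: enter scan mode and shift out the captured values.
    # Zeros are shifted in here; a tester could shift in the next pattern.
    scan_out = []
    for _ in range(chain_len):
        scan_out.append(q[-1])
        q = [0] + q[:-1]

    # Step 308: return to operating mode (no state change in this model).
    return primary_outputs, scan_out


def toy_logic(pi, q):
    """Hypothetical two-register logic: returns (primary outputs, next state)."""
    return [q[0] ^ q[1]], [pi[0] ^ q[1], q[0] | pi[0]]


po, so = run_scan_pattern([1, 0], [1], toy_logic, chain_len=2)
print(po, so)
```

The external equipment would compare both returned lists against the expected response for the pattern.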
Notably, steps 301, 303-306, and 308 each take only one clock period on the tester. However, each shift operation, e.g. steps 302 and 307, takes as many clock periods as there are scan cells in the longest scan chain. In a complex design, upwards of a million flip-flops may be included. Assuming that only 10 scan chains can be provided, each scan chain would then have 100,000 (1,000,000/10) flip-flops, thereby requiring 100,000 clock cycles to process a single scan test pattern. Therefore, irrespective of any optimization achieved by overlapping scan operations of adjacent test patterns, test application time is dominated by the scan operation.
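The arithmetic above can be restated as a short sketch. The function names are hypothetical, and the model assumes only what the text states: a shift operation takes as many clock periods as the longest scan chain, and an ideally balanced design divides the flip-flops evenly among the chains.

```python
def balanced_chain_length(num_flops, num_chains):
    """Length of the longest chain when flops are divided as evenly as possible."""
    return (num_flops + num_chains - 1) // num_chains  # integer ceiling


def shift_cycles(chain_lengths):
    """Clock periods for one shift operation: set by the longest chain."""
    return max(chain_lengths)


# The example from the text: 1,000,000 flip-flops in 10 scan chains.
print(balanced_chain_length(1_000_000, 10))

# Unequal chains: the longest chain governs the shift time.
print(shift_cycles([500, 480, 510]))
```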
Deterministic automatic test pattern generation (ATPG) can be used to generate a set of test patterns for use in testing devices made according to a particular circuit design. ATPG operates generally by analyzing the circuit design and identifying a complete set of potential “faults”, and then attempting to generate a minimum set of test patterns needed to test for a maximum set of the potential faults. Ideally, fault coverage is close to 100%, but for a complex circuit design, this can require significant storage area in the test-application equipment for the large number of patterns to be applied as stimulus as well as for the expected response values for each test pattern. ATPG software often can combine testing for multiple faults using a single test pattern, but the number of test patterns required can still be very large.
Some conventional test architectures take advantage of the observation that to detect any particular fault, typically only a limited number of positions in the test pattern need be set. For typical test patterns, only 2% of the stimulus values are needed. For the remainder of the test pattern, the values applied make no difference to the process of detecting that fault. In notational shorthand when designing the test patterns, positions in the test pattern that play no part are referred to as “don't care” positions, and are often represented with a logic X rather than a 0 or a 1.
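The don't-care notation lends itself to a simple compatibility check: two partially specified patterns can share one test pattern if no position requires a 0 in one and a 1 in the other. The sketch below illustrates that idea only; `merge_cubes` is a hypothetical helper and is not presented as the algorithm of any particular ATPG tool.

```python
def merge_cubes(a, b):
    """Merge two patterns over '0', '1', 'X'; return None on a conflict."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    merged = []
    for x, y in zip(a, b):
        if x == "X":
            merged.append(y)      # a does not care; take b's value
        elif y == "X" or x == y:
            merged.append(x)      # b does not care, or both agree
        else:
            return None           # one requires 0, the other requires 1
    return "".join(merged)


print(merge_cubes("1XX0", "X1X0"))  # compatible patterns
print(merge_cubes("1X", "0X"))      # conflicting patterns
```

Positions that remain X after merging are still free to be assigned any value.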
In certain newer test architectures, each device scan input is connected to a number of scan chain inputs. In a design having N scan chains and m device scan inputs, each scan-in value is provided to N/m scan chains. The shared scan-in values therefore allow for many shorter scan chains compared to conventional scan architectures. In this way, the state registers in the device can be organized into a much larger number of parallel scan chains than the number of device inputs available for use as device scan inputs. With a 4-way share, for example, test time per device could be reduced by nearly a factor of 4.
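The effect of sharing on shift time can be sketched numerically, reusing the figure of 1,000,000 flip-flops and 10 device scan inputs from the earlier example. The function name and the `share` parameter (the number of scan chains fed by each device scan input, i.e. N/m) are hypothetical, and the sketch counts shift cycles only, so it yields the idealized ratio.

```python
def shift_cycles(num_flops, num_scan_inputs, share=1):
    """Shift cycles per pattern for balanced chains with share-way sharing."""
    num_chains = num_scan_inputs * share            # N = m * (N/m)
    return (num_flops + num_chains - 1) // num_chains  # integer ceiling


base = shift_cycles(1_000_000, 10)             # one chain per scan input
shared = shift_cycles(1_000_000, 10, share=4)  # 4-way share
ratio = base / shared
print(base, shared, ratio)
```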
When using such a scan architecture, however, test vectors must be chosen carefully to avoid conflicts. A conflict occurs when the test vector prescribes one value to be applied to a state register at one position in one of the scan chains, and the opposite value to be applied to the state register at the same position in a different one of the scan chains that share the same device scan input. Such a conflict often can be avoided by known methods such as by re-designing the test patterns (so that at each particular position of a scan chain, either a common value or a don't care appears in all the scan chains that share the same device input), or by changing the assignment of scan chains to device scan inputs, or by changing the sequence of state registers within a scan chain, or by changing the assignment of state registers to scan chains. But if none of these options are available, then either a different scan architecture is required or less than full fault coverage must be accepted.
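The conflict condition defined above can be checked mechanically. In this minimal sketch, each string holds the values required at successive positions of one scan chain, all of the chains share one device scan input, and 'X' marks a don't care; `find_conflicts` is a hypothetical name.

```python
def find_conflicts(chain_patterns):
    """Return the positions at which chains sharing an input need opposite values."""
    conflicts = []
    # zip(*...) walks the chains position by position.
    for pos, column in enumerate(zip(*chain_patterns)):
        if "0" in column and "1" in column:
            conflicts.append(pos)
    return conflicts


# Three chains sharing one device scan input.
print(find_conflicts(["1X0", "X10", "0X0"]))
```

An empty result means the shared input can supply a common value at every position.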
In Kapur et al. U.S. Pre-grant Patent Publication No. 2005/0268190 (“Kapur et al.”), incorporated herein by reference, a technique is described in which the scan-in test architecture is dynamically reconfigurable as needed for each shift of each test pattern. A “decompressor” is inserted between the device scan inputs and the scan chains, which is operable in a number of different modes for delivering device scan input values (or values derived therefrom) to the scan chains. For each shift of the scan chain within a test pattern, the tester sets the decompressor into the proper mode required for that particular position of the test vector.
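The idea of selecting a decompressor mode for each shift can be illustrated with a simplified model. The sketch below assumes that each mode is merely a fixed mapping from scan chains to device scan inputs and that the first conflict-free mode is chosen; both assumptions, and the name `pick_mode`, are hypothetical and are not taken from the cited publication.

```python
def pick_mode(modes, required):
    """Return the index of the first mode with no conflict, or None.

    modes    -- modes[k][chain] is the device scan input feeding `chain`
                when the decompressor is in mode k
    required -- value needed in each chain for this shift: '0', '1' or 'X'
    """
    for k, mapping in enumerate(modes):
        value_on_input = {}
        ok = True
        for chain, bit in enumerate(required):
            if bit == "X":
                continue  # a don't care never causes a conflict
            pin = mapping[chain]
            # Record the first value needed on this pin; later ones must match.
            if value_on_input.setdefault(pin, bit) != bit:
                ok = False
                break
        if ok:
            return k
    return None


# Four chains, two device scan inputs, two hypothetical modes.
modes = [[0, 0, 1, 1], [0, 1, 0, 1]]
print(pick_mode(modes, "10X0"))
```

In this model a result of None corresponds to a shift position that no available mode can satisfy.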
Logic added to interface device scan inputs with the internal scan chains is referred to as a decompressor, because it takes only a few input values and supplies them to a much larger set of receiving scan chains. Logic added to interface the internal scan chain outputs to the device scan outputs is referred to as a compressor, as it takes many values from the scan chains and funnels them to a much smaller set of device scan outputs. Sometimes test vectors produce unpredictable logic values in certain scan-out positions. These unpredictable logic values could come from un-initialized memory elements, or from bus contention or unpredictable timing-related issues. These scan-out positions will have an unknown value whether or not the relevant faults exist. These unknowns, like the “don't cares” in the test vectors themselves, are sometimes notated as logic X's. They can have a negative impact on the observability of good responses that are combined with them in the compressor.
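The effect of an unknown on a compressor output can be sketched with three-valued logic. The example assumes a compressor output formed as a single XOR of several scan chain outputs, which is one common arrangement and not necessarily the one used in any given design; `xor3` is a hypothetical name.

```python
def xor3(values):
    """XOR over the symbols '0', '1' and 'X'; any X makes the result X."""
    if "X" in values:
        return "X"
    return str(sum(int(v) for v in values) % 2)


print(xor3("101"))  # all inputs known: the output is a known value
print(xor3("1X1"))  # one unknown input: the output is unknown
```

In the second case the known values on the other two chains can no longer be checked through this output.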
The X's generated during response capture can be proactively blocked from reaching the scan cells by identifying the X-sources and then removing them, or by inserting additional DFT logic, such as additional test points, to fix the X-sources. Another known way to block the X's from reaching the scan cells is careful test pattern generation, in which the don't-care bits in the scan-in vector are set to controlling values that block the X's from reaching the scan cells. In another solution, error masking and/or X-masking can be used. Error masking involves carefully designing the compressor so that multiple errors cancel each other out, and X-masking involves inserting masking logic between the scan chain outputs and the compressor in order to prevent an X from propagating to the compressor output. FIG. 4 shows an example of masking logic for a compressor that has redundancy in the XORs of the compressor. In this example, the masking logic ensures that, within any group of scan chains that are observed, the logic X's in the response captured in any of the scan cells do not interfere with the observability of the scan cells in other scan chains. Masking can introduce its own problems, however, since it reduces observability in the design. The test pattern count therefore tends to increase for the same fault coverage, thereby partially countering the savings achieved by using an output compressor.
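X-masking and its cost in observability can be sketched as follows. The model assumes a mask that forces a blocked scan chain output to 0 ahead of a single XOR output; it is a simplified illustration with hypothetical names and is not the masking logic of FIG. 4.

```python
def compress_with_mask(chain_outputs, mask):
    """XOR the chain outputs after gating; mask[i] True blocks chain i."""
    gated = ["0" if blocked else v for v, blocked in zip(chain_outputs, mask)]
    if "X" in gated:
        return "X"  # an unmasked unknown still corrupts the output
    return str(sum(int(v) for v in gated) % 2)


no_mask = [False, False, False]
mask_chain_1 = [False, True, False]

print(compress_with_mask("1X1", no_mask))       # X reaches the output
print(compress_with_mask("1X1", mask_chain_1))  # X is blocked
# While chain 1 is masked, a change in its value is not observable.
print(compress_with_mask("101", mask_chain_1) ==
      compress_with_mask("111", mask_chain_1))
```

The last comparison shows the loss of observability noted above: any fault effect captured in the masked chain during that shift goes undetected.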
A large variety of solutions have also been developed for the interfacing logic on the input side of the scan chains. A survey of some of them is set forth in N. A. Touba, “Survey of Test Vector Compression Techniques”, IEEE Design and Test of Computers, July-August 2006, pp. 294-303, incorporated by reference herein. They include code-based schemes, linear-decompressor-based schemes and broadcast-scan-based schemes. These solutions can be categorized as either combinational or sequential. The combinational solutions can be as simple as direct (but shared) connections of device scan inputs to the internal scan chains, or as complex as decoding logic that unravels scan data into a sequence of 1's and 0's. The more common solutions use XORs or MUXes on the input to distribute values from the device scan inputs to the receiving scan chains. The sequential solutions include some which are mutations of the Logic BIST structure tailored for scan compression. With seeds streaming in at intervals, or on every shift, the stimulus requirements for fault detection and observation (masking) are encoded to provide significant gains in test data volume and test application time. The sequential solutions also include those that use shift registers to temporarily store multiple values that get applied with various spreading logic.
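As one illustration of the combinational XOR-based approach mentioned above, the sketch below drives each scan chain with the XOR of a subset of the device scan inputs for a single shift. The tap assignment is hypothetical and chosen only for the example; it does not reproduce any scheme from the cited survey.

```python
def xor_decompress(inputs, taps):
    """Return one value per scan chain for a single shift.

    inputs -- values (0 or 1) on the device scan inputs
    taps   -- taps[chain] lists the input indices XORed to feed that chain
    """
    return [sum(inputs[i] for i in chain_taps) % 2 for chain_taps in taps]


# Hypothetical wiring: three device scan inputs feed six scan chains.
taps = [(0,), (1,), (2,), (0, 1), (1, 2), (0, 2)]
print(xor_decompress([1, 0, 1], taps))
```

A direct shared connection is the special case in which every entry of `taps` names a single input.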
Most modern integrated circuit design processes make use of Electronic Design Automation (EDA) tools. Various EDA vendors provide their own solutions by automatically (1) organizing the state registers of a provided circuit design into scan chains, and (2) inserting their own flavor of predetermined compression and decompression logic before and after the scan chains. The organization of state registers into scan chains and the compression and decompression logic, together with certain configuration settings provided to the ATPG system for the development of test vectors, are sometimes referred to collectively herein as the “test design” of a circuit design. EDA software allows the user some choices in this process, such as the allocation of device I/O pins between scan input and scan output functions, the number of modes to be implemented in the decompressor, whether output masking should be implemented, and so on. However, since users are typically ill-equipped to answer these questions in any meaningful way, users typically merely accept the software's default settings, which usually represent the optimal answer for some “average” circuit design. As a result, since no real-world circuit design is “average”, the test designs implemented for many circuit designs are sub-optimal, either in data volume or fault coverage or both.