1. Technical Field
This disclosure generally relates to circuit testing. More specifically, this disclosure relates to circuitry for test stimulus decompression.
2. Related Art
Electronic design automation (EDA) is used by the semiconductor industry for virtually all integrated circuit (IC) design projects. More specifically, after developing a product idea, EDA tools are used to define a specific implementation. The implementation defined using the EDA tools is then used to create mask data, which is subsequently used for producing masks in the production of the finished chips, in a process referred to as “tape-out.” The physical masks are then created and used with fabrication equipment to manufacture IC wafers. Testing is typically performed on the IC wafers to identify defective wafers. Next, diagnosis is applied to the defective wafers to identify root causes of systematic defects, wherein the identified root causes are used for mask correction in order to improve manufacturing yield. Finally, the wafers are diced, packaged and assembled to produce IC chips for distribution.
An IC design flow using EDA tools typically begins with an overall system design using architecture defining tools that describe the functionality of the product to be implemented by the IC. Next, logic design tools are applied to the overall system description to create a high-level description based on description languages such as Verilog or VHDL, and functional verification tools are applied on the high-level description in an iterative process to ensure that the high-level description accomplishes the design objectives. Next, synthesis and design-for-test tools are used to translate the high-level description to a netlist, optimize the netlist for target technology, and design and implement tests that permit checking of the finished chip against the netlist.
The typical design flow might next include a design planning stage, wherein an overall floor plan for the chip is constructed and analyzed to ensure that timing parameters for the netlist can be achieved at a high level. Next, the netlist may be rigorously checked for compliance with timing constraints and with the functional definitions defined at the high level using VHDL or Verilog. After an iterative process which settles on a netlist and maps the netlist to a cell library for the final design, a physical implementation tool is used for placement and routing. Specifically, the physical implementation tool includes a placement tool for positioning circuit elements on the layout, and a routing tool for defining interconnects for the circuit elements.
The components defined after placement and routing are typically analyzed at the transistor level using an extraction tool, and verified to ensure that the circuit function is achieved and timing constraints are met. The placement and routing process can be revisited as needed in an iterative manner. Next, the design is subjected to physical verification procedures, such as design rule checking (DRC), layout rule checking (LRC) and layout versus schematic (LVS) checking, that analyze manufacturability, electrical performance, lithographic parameters and circuit correctness.
After settling on an acceptable design by iteration through design and verification steps, such as those described above, the resulting design can be subjected to resolution enhancement techniques that provide geometric manipulations of the layout to improve manufacturability. Finally, the mask data is prepared and taped-out for use in producing finished products.
An IC generated from the above-described design flow typically includes circuitry that allows the finished product to be tested. Note that efficient testing of ICs often uses structured design for testability (DFT) techniques. In particular, these techniques may be based on the general concept of making all or some state variables (e.g., memory elements such as flip-flops and latches in the circuit) directly controllable and observable. One of the well-known DFT techniques is based on scan chains. This technique assumes that during testing all (or substantially all) memory elements are coupled together to form one or more shift registers. As a result, a logic circuit in an IC design can have two or more modes of operation, including a normal mode and a test (or scan) mode. In the normal mode, the memory elements perform their regular design functions. In the scan mode, the memory elements become scan cells that are coupled to form the one or more shift registers which are often referred to as “scan chains.” During the scan mode, these scan chains are used to shift the test stimulus into a circuit under test (CUT) and shift out test responses. More specifically, the scan mode involves applying a test pattern to the scan chains, which further includes scanning in the test stimulus, applying one or more functional clocks, and then scanning out the captured test response. The test responses are then compared with fault-free test responses to determine whether the CUT works properly.
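The scan-mode sequence described above (scan in the test stimulus, apply a functional clock, scan out the captured response, and compare with the fault-free response) can be sketched as follows. This is a minimal illustration only: the single four-cell chain, the function names, and the toy capture logic are assumptions made for the example and are not taken from any particular design.

```python
def run_scan_test(stimulus, capture, expected):
    """Shift a stimulus into one scan chain, capture, shift out, and compare."""
    chain = [0] * len(stimulus)
    # Scan in: one bit per shift clock. The first bit shifted in ends up
    # in the last cell of the chain.
    for bit in stimulus:
        chain = [bit] + chain[:-1]
    # Capture: one functional clock loads the logic response into the cells.
    chain = capture(chain)
    # Scan out: the last cell of the chain is observed first.
    response = []
    for _ in range(len(chain)):
        response.append(chain[-1])
        chain = [0] + chain[:-1]
    return response, response == expected


def toy_logic(cells):
    """Hypothetical fault-free logic: each cell captures itself XOR its left neighbor."""
    return [cells[i] ^ cells[i - 1] for i in range(len(cells))]


def faulty_logic(cells):
    """Same logic with an assumed stuck-at-0 fault on the input of cell 2."""
    out = toy_logic(cells)
    out[2] = 0
    return out


stimulus = [1, 0, 1, 1]
golden, _ = run_scan_test(stimulus, toy_logic, expected=None)
print(run_scan_test(stimulus, toy_logic, golden))     # passes
print(run_scan_test(stimulus, faulty_logic, golden))  # fails
```

In this sketch the fault is detected because the response observed at the tester differs from the fault-free response in at least one bit position.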
Scan-based design techniques have been widely used to simplify the testing and diagnosis of ICs. From the viewpoint of automatic test pattern generation (ATPG), a scan circuit can be treated as a combinational or partially combinational circuit. Currently, ATPG tools are capable of generating a complete set of test patterns based on different fault models, including stuck-at, transition, path delay, and bridging faults. Typically, when a particular fault in a CUT is targeted by an ATPG tool, only a small number of scan cells need to be specified and one scan cell needs to be observed in order to detect this particular fault.
Note that in order to reduce test data volume and test application time, scan-based design techniques typically generate a compacted test stimulus and compacted test response rather than loading the entire test stimulus and unloading the entire test response. FIG. 1 presents a block diagram illustrating an IC 100 having an on-chip test compression capability. As is illustrated in FIG. 1, a tester 102 is coupled to IC 100 which comprises a CUT 104 which further includes a set of M scan chains, a decompressor 106, and a compressor 108. Decompressor 106 is configured to receive the compacted test stimulus from tester 102 and expand the compacted test stimulus to fill the M scan chains in CUT 104. Compressor 108 is configured to compress the test responses from the M scan chains and send the compacted test responses to tester 102.
FIG. 2 illustrates a number of conventional linear or nonlinear decompressor schemes. Generally, decompressor schemes can be classified as either combinational or sequential. A combinational decompressor, for example decompressor 202, comprises a combinational block 204 typically including XOR, NXOR, and MUX gates such that the loaded test stimuli of each scan chain are derived as a logic function of tester channels. This design scheme uses simple hardware and control logic. However, the drawback of this scheme is that combinational decompressors have to encode all specified care bits in the test stimulus in one shift cycle using only test data bits (or variables) supplied from the tester for this shift cycle (typically comprising one test data bit for each tester channel). This drawback can seriously limit the achievable compression ratio for the most highly specified shift cycles because the number of tester channels needs to be sufficiently large to encode the most highly specified shift cycles.
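The per-cycle limitation of a combinational decompressor can be illustrated with a small sketch. The XOR network below (two tester channels driving three scan chains) is a hypothetical example chosen for brevity, and the brute-force search stands in for whatever encoding procedure a real tool would use.

```python
from itertools import product

# Assumed toy XOR network: scan chain i is driven by the XOR of the
# tester channels listed in CHAIN_TAPS[i].
CHAIN_TAPS = [(0,), (1,), (0, 1)]
NUM_CHANNELS = 2


def expand(channels):
    """Compute the bit loaded into each scan chain for one shift cycle."""
    return [sum(channels[t] for t in taps) % 2 for taps in CHAIN_TAPS]


def encode_cycle(care_bits):
    """Find channel values that produce the care bits of one shift cycle.

    care_bits maps a chain index to its required value. Returns a list of
    channel values, or None if the cycle cannot be encoded.
    """
    for channels in product((0, 1), repeat=NUM_CHANNELS):
        loaded = expand(channels)
        if all(loaded[i] == v for i, v in care_bits.items()):
            return list(channels)
    return None


print(encode_cycle({0: 1, 2: 0}))        # two care bits, two channels
print(encode_cycle({0: 1, 1: 1, 2: 1}))  # three care bits, two channels
```

With only two variables available in a shift cycle, the second request (three care bits that contradict the XOR relation between the chains) has no solution, which is the situation that forces the channel count up for highly specified shift cycles.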
Sequential decompressors are based on linear finite state machines such as shift registers, linear feedback shift registers (LFSRs), cellular automata, or ring generators. For example, FIG. 2 illustrates decompressor 206, which comprises a shift register 208 and a combinational block 210. Sequential decompressors allow variables from earlier shift cycles to be used for encoding care bits in the current shift cycle. This property gives sequential decompressors much higher encoding flexibility than combinational decompressors, and also helps to avoid the problem of the most highly specified shift cycles associated with combinational decompressors. More recently, sequential linear decompressor designs often include a phase shifter placed between the scan chains and the LFSR or the ring generator to further improve encoding efficiency. One such example, decompressor 212 comprising an LFSR 214 and a phase shifter 216, is illustrated in FIG. 2.
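To see how a sequential decompressor lets variables from earlier shift cycles reach the current cycle, one can simulate a small design symbolically. The sketch below assumes a toy structure (a three-bit shift register fed by one tester channel, with chain k loaded from the XOR of register cells k and k+1, wrapping around); this structure and the function name are illustrative assumptions, not the circuit of FIG. 2.

```python
def symbolic_shift_register(num_cycles, length=3):
    """Return, for each shift cycle, the variables each chain bit depends on.

    Each register cell is tracked as a frozenset of variable indices whose
    XOR it holds (the empty set is the constant 0). XOR of two cells is the
    symmetric difference of their sets.
    """
    state = [frozenset() for _ in range(length)]
    deps = []
    for t in range(num_cycles):
        # One fresh tester variable v_t enters the register per cycle.
        state = [frozenset({t})] + state[:-1]
        # Assumed XOR block: chain k gets cell k XOR cell (k + 1) mod length.
        deps.append([state[k] ^ state[(k + 1) % length] for k in range(length)])
    return deps


for cycle, chains in enumerate(symbolic_shift_register(4)):
    print(cycle, [sorted(d) for d in chains])
```

In cycle 3, for instance, the three chain bits depend on the variable pairs (v2, v3), (v1, v2), and (v1, v3), so care bits in that cycle can be satisfied using variables supplied in earlier cycles as well as the current one.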
Typically, a decompressor (either combinational or sequential) receives test data bits supplied by the tester represented by a set of variables {v0, v1, . . . , vn−1} and attempts to generate a test sequence C comprising a set of specified care bits {c0, c1, . . . , cm−1}, which is also referred to as a “test cube.” This process is often referred to as “encoding” a test cube. A decompressor can generate the test cube C if and only if there exists a solution to a system of linear equations AV=C, wherein A is an m×n characteristic matrix specifying the decompressor (one row for each care bit and one column for each variable), and V is the set of variables {v0, v1, . . . , vn−1}. (The characteristic matrix for a decompressor is typically derived by symbolic simulation of the decompressor such that each symbol represents one variable.) Hence, encoding a test cube using a decompressor requires solving a system of linear equations in the set of variables, which is composed of one linear equation for each care bit. If no solution exists, then the test cube is considered “unencodable.” Note that it is difficult to encode a test cube that has more care bits than the number of available variables (or test data bits). However, if the number of variables is sufficiently larger than the number of care bits in the test cube, the probability of not being able to encode the test cube becomes negligibly small. For an LFSR with a primitive polynomial, if the number of variables is 20 more than the number of specified care bits, then the probability of not finding a solution (or an encoding conflict) is often less than 10^−6.
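Because the equations are linear over GF(2), encoding a test cube amounts to Gaussian elimination with XOR as addition. The following is a minimal sketch under stated assumptions: each equation (one per care bit) is stored as an integer bitmask over the variables, the function name is hypothetical, and free variables are arbitrarily set to 0. It is not meant to represent how any specific ATPG tool implements encoding.

```python
def solve_gf2(rows, rhs, num_vars):
    """Solve A V = C over GF(2).

    rows: one bitmask per care bit; bit j is set if variable v_j appears
          in that care bit's equation.
    rhs:  the required care-bit values (0 or 1), one per row.
    Returns a list of num_vars bits, or None if the cube is unencodable.
    """
    mask = (1 << num_vars) - 1
    # Store the right-hand side in bit position num_vars of each row.
    augmented = [r | (c << num_vars) for r, c in zip(rows, rhs)]
    pivots = []  # (pivot column, fully reduced row)
    for row in augmented:
        # Remove every existing pivot variable from the new row.
        for col, prow in pivots:
            if (row >> col) & 1:
                row ^= prow
        coeffs = row & mask
        if coeffs == 0:
            if row >> num_vars:
                return None  # equation reduced to 0 = 1: encoding conflict
            continue         # redundant equation
        col = coeffs.bit_length() - 1
        # Keep earlier pivot rows free of the new pivot variable.
        pivots = [(c, p ^ row if (p >> col) & 1 else p) for c, p in pivots]
        pivots.append((col, row))
    solution = [0] * num_vars  # free variables default to 0
    for col, prow in pivots:
        solution[col] = prow >> num_vars
    return solution


# Hypothetical cube: c0 = v0^v1, c1 = v1^v2, c2 = v0^v2 over four variables.
equations = [0b0011, 0b0110, 0b0101]
print(solve_gf2(equations, [1, 0, 1], 4))  # consistent care bits
print(solve_gf2(equations, [1, 0, 0], 4))  # conflicting care bits -> None
```

The second call returns None because the three equations are linearly dependent and the requested care bits violate that dependency, which is the "unencodable" case described above.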
On the other hand, the conventional sequential linear decompressor based on LFSRs or ring generators can imply very complex dependencies because each scan cell in the CUT can depend on the XOR of a large number of variables. Incorporating such complex dependencies in the ATPG implication process can greatly increase the computational complexity of the ATPG. For example, consider a scan cell whose state depends on q variables. In order to justify a particular state at this scan cell, q variables need to be assigned, and the number of possible ways to assign each variable with a value of 0 or 1 would be 2^(q−1). As q increases, this computational complexity grows exponentially. For this reason, the conventional sequential linear decompressors based on LFSRs or ring generators typically do not attempt to directly include the dependencies in the ATPG implication process. Because of this limitation, the conventional sequential linear decompressors do not fully utilize the degree of freedom in the ATPG.
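The count of justifying assignments can be checked by enumeration: of the 2^q possible assignments to q variables, exactly half have the XOR (parity) required by the scan cell, giving 2^(q−1). The short sketch below is illustrative only; the function name is an assumption.

```python
from itertools import product


def justifying_assignments(q, target):
    """List all 0/1 assignments to q variables whose XOR equals target."""
    return [bits for bits in product((0, 1), repeat=q) if sum(bits) % 2 == target]


for q in (1, 2, 4, 8, 16):
    count = len(justifying_assignments(q, target=1))
    print(q, count, count == 2 ** (q - 1))
```

Each additional variable in the XOR doubles the number of candidate assignments that an implication procedure would have to consider for a single scan cell.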
Hence, it is desirable to design a decompressor which has the following properties: 1) a very high encoding efficiency; 2) a flexible mechanism to receive as many variables as needed; 3) a computationally efficient encoding process that can be directly incorporated into the ATPG implication process; and 4) an ability of the encoding process to extract as many as possible (or all) necessary state assignments due to dependency in the decompressor scheme.