As integrated circuits are produced with greater and greater levels of circuit density, efficient testing schemes that guarantee very high fault coverage while minimizing test costs and chip area overhead have become essential. However, as the complexity of circuits continues to increase, high fault coverage of several types of fault models becomes more difficult to achieve with traditional testing paradigms. This difficulty arises for several reasons. First, larger integrated circuits have a very high and still increasing logic-to-pin ratio that creates a test data transfer bottleneck at the chip pins. Second, larger circuits require a prohibitively large volume of test data that must then be stored in external testing equipment. Third, applying the test data to a large circuit requires an increasingly long test application time. And fourth, present external testing equipment is unable to test such larger circuits at their speed of operation.
Integrated circuits are presently tested using a number of structured design for testability (DFT) techniques. These techniques rest on the general concept of making all or some state variables (memory elements such as flip-flops and latches) directly controllable and observable. If this can be arranged, a circuit can be treated, as far as testing of combinational faults is concerned, as a combinational or a nearly combinational network. The most-often used DFT methodology is based on scan chains. It assumes that during testing all (or almost all) memory elements are connected into one or more shift registers, as shown in U.S. Pat. No. 4,503,537. A circuit that has been designed for test has two modes of operation: a normal mode and a test, or scan, mode. In the normal mode, the memory elements perform their regular functions. In the scan mode, the memory elements become scan cells that are connected to form a number of shift registers called scan chains. These scan chains are used to shift a set of test patterns into the circuit and to shift out circuit, or test, responses to the test patterns. The test responses are then compared to fault-free responses to determine if the circuit under test (CUT) works properly.
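The shift-in, capture, shift-out sequence described above can be sketched in a few lines of Python. This is only an illustrative model, not part of any cited design: the function names `scan_test`, `good_logic`, and `faulty_logic`, and the three-cell toy circuit, are assumptions made for the example.

```python
def scan_test(next_state, pattern):
    """Model one scan test: shift a pattern in, capture, shift the response out."""
    n = len(pattern)
    chain = [0] * n
    # Scan mode: shift the pattern in, one bit per clock, through cell 0.
    for bit in reversed(pattern):
        chain = [bit] + chain[:-1]
    # Normal mode: one functional clock captures the circuit response.
    chain = list(next_state(chain))
    # Scan mode again: shift the response out through the last cell.
    response = []
    for _ in range(n):
        response.append(chain[-1])
        chain = [0] + chain[:-1]
    return response[::-1]  # restore cell order


def good_logic(state):
    """Hypothetical fault-free combinational logic between the scan cells."""
    a, b, c = state
    return [a ^ b, b & c, a | c]


def faulty_logic(state):
    """Same logic with the first output stuck at 0 (illustrative fault)."""
    a, b, c = state
    return [0, b & c, a | c]


pattern = [1, 0, 1]
reference = scan_test(good_logic, pattern)   # fault-free response
observed = scan_test(faulty_logic, pattern)  # response of the faulty circuit
fault_detected = observed != reference
```

Comparing `observed` with `reference` plays the role of the comparison against the fault-free response.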
Scan design methodology has gained widespread adoption by virtue of its simple automatic test pattern generation (ATPG) and silicon debugging capabilities. Today, ATPG software tools are so efficient that it is possible to generate test sets (collections of test patterns) that guarantee almost complete fault coverage of several types of fault models, including stuck-at, transition, path delay, and bridging faults. Typically, when a particular potential fault in a circuit is targeted by an ATPG tool, only a small number of scan cells, e.g., 2–5%, must be specified to detect the particular fault (deterministically specified cells). The remaining scan cells in the scan chains are filled with random binary values (randomly specified cells). This way the pattern is fully specified, more likely to detect some additional faults, and can be stored on a tester.
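The random fill step can be illustrated with a minimal sketch. The representation of a partially specified pattern as a list in which `None` marks an unspecified cell, and the name `fill_pattern`, are assumptions made here for illustration; real ATPG tools use their own formats.

```python
import random


def fill_pattern(cube, rng):
    """Keep deterministically specified bits; fill the rest with random bits."""
    return [rng.randint(0, 1) if bit is None else bit for bit in cube]


# A 100-cell pattern in which only 3 cells (3%) are specified by the ATPG tool.
cube = [None] * 100
cube[3], cube[40], cube[77] = 1, 0, 1

rng = random.Random(0)  # fixed seed so the filled pattern is reproducible
pattern = fill_pattern(cube, rng)
```

After the fill, every cell holds a 0 or a 1, so the pattern can be stored on a tester as an ordinary bit string.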
FIG. 1 is a block diagram of a conventional system 10 for testing digital circuits with scan chains. External automatic testing equipment (ATE), or tester, 12 applies a set of fully specified test patterns 14 one by one to a CUT 16 in scan mode via scan chains 18 within the circuit. The circuit is then run in normal mode using the test pattern as input, and the test response to the test pattern is stored in the scan chains. With the circuit again in scan mode, the response is then routed to the tester 12, which compares the response with a fault-free reference response 20, also one by one. For large circuits, this approach becomes infeasible because of large test set sizes and long test application times. It has been reported that the volume of test data can exceed one kilobit per single logic gate in a large design. The significant limitation of this approach is that it requires an expensive, memory-intensive tester and a long test time to test a complex circuit.
These limitations of time and storage can be overcome to some extent by adopting a built-in self-test (BIST) framework as shown in FIG. 2. In BIST, additional on-chip circuitry is included to generate test patterns, evaluate test responses, and control the test. For example, a pseudo-random pattern generator 21 is used to generate the test patterns, instead of having deterministic test patterns. Additionally, a multiple input signature register (MISR) 22 is used to generate and store a resulting signature from test responses. In conventional logic BIST, where pseudo-random patterns are used as test patterns, 95–96% coverage of stuck-at faults can be achieved provided that test points are employed to address random-pattern resistant faults. On average, one to two test points may be required for every 1000 gates. In BIST, all responses propagating to observable outputs and the signature register have to be known. Unknown values corrupt the signature and therefore must be bounded by additional test logic. Even though pseudo-random test patterns appear to cover a significant percentage of stuck-at faults, these patterns must be supplemented by deterministic patterns that target the remaining, random pattern resistant faults. Very often the tester memory required to store the supplemental patterns in BIST exceeds 50% of the memory required in the deterministic approach described above. Another limitation of BIST is that other types of faults, such as transition or path delay faults, are not handled efficiently by pseudo-random patterns. Because of the complexity of the circuits and the limitations inherent in BIST, it is extremely difficult, if not impossible, to provide a set of test patterns that fully covers hard-to-test faults.
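A rough model of signature compaction in a MISR is sketched below. The 4-bit width, the tap position (corresponding to the polynomial x^4 + x + 1 under the indexing assumed here), and the name `misr_signature` are illustrative assumptions, not details taken from FIG. 2.

```python
def misr_signature(responses, width=4, taps=(1,)):
    """Compact a list of `width`-bit response vectors into one signature."""
    state = [0] * width
    for response in responses:
        feedback = state[-1]
        shifted = [feedback] + state[:-1]  # last stage wraps around to stage 0
        for t in taps:
            shifted[t] ^= feedback         # internal XOR feedback taps
        # Each stage also XORs in one bit of the parallel test response.
        state = [s ^ r for s, r in zip(shifted, response)]
    return state


good = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
bad = [list(r) for r in good]
bad[1][2] ^= 1  # a single erroneous response bit

signatures_differ = misr_signature(good) != misr_signature(bad)
```

Because the register is linear and its state update is invertible, a single erroneous bit always changes the final signature in this model. The same property explains why one unknown response value makes the whole signature unpredictable.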
The pseudo-random pattern generator typically is a simple hardware structure called a linear feedback shift register (LFSR). An LFSR comprises a sequence of chained data memory elements forming a shift register. A given LFSR of length n can be represented by its characteristic polynomial h_n x^n + . . . + h_1 x + h_0, where the term h_i x^i refers to the ith flip-flop of the register, such that, if h_i = 1, then there is a feedback tap taken from this flip-flop. When the proper tap connections are established in accordance with the given polynomial, the combined (added modulo 2) output of each stage is fed back to the first stage of the LFSR. Such an implementation is called a type I LFSR or Fibonacci generator. An alternative implementation uses a shift register with XOR gates placed between the LFSR cells. It is called a type II LFSR or Galois true divisor. A distinct feature of this configuration is that the output of the last stage of the LFSR is fed back to those stages that are indicated by the characteristic polynomial employed. A polynomial that causes an n-bit LFSR to go through all possible 2^n − 1 nonzero states is called a primitive characteristic polynomial. A corresponding LFSR is often referred to as a maximum-length LFSR, while the resultant output sequence is termed a maximum-length sequence or m-sequence.
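As a small illustration of the two implementations, the sketch below steps a 4-bit type I (Fibonacci) and a 4-bit type II (Galois) register for the primitive polynomial x^4 + x + 1 and counts the states visited. The bit ordering, the tap indexing, and the function names are assumptions chosen for this example; other texts number the stages differently.

```python
def fibonacci_step(state, taps):
    """Type I: XOR the tapped stages and feed the result into the register end."""
    feedback = 0
    for t in taps:
        feedback ^= state[t]
    return state[1:] + [feedback]


def galois_step(state, taps):
    """Type II: the last stage is fed back into the stages named by `taps`."""
    out = state[-1]
    nxt = [out] + state[:-1]  # shift; the last stage wraps around to stage 0
    for t in taps:
        nxt[t] ^= out         # XOR gates placed between the cells
    return nxt


def period(step, seed, taps):
    """Number of clocks until the register returns to its seed state."""
    state, count = step(list(seed), taps), 1
    while state != list(seed):
        state = step(state, taps)
        count += 1
    return count


seed = [1, 0, 0, 0]
fib_period = period(fibonacci_step, seed, (0, 1))  # a[t+4] = a[t] ^ a[t+1]
gal_period = period(galois_step, seed, (1,))       # x^4 = x + 1
```

Both registers pass through all 2^4 − 1 = 15 nonzero states before repeating, which is the maximum-length behavior described above.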
FIG. 3 shows an LFSR 24 used as a test generator to feed multiple scan chains 26 in parallel. A problem with this design is that the resultant fault coverage is often unsatisfactory due to structural dependencies introduced by the LFSR. Indeed, if the scan paths are fed directly from adjacent bits of the LFSR, then this very close proximity causes neighboring scan chains to contain test patterns that are highly correlated. This phenomenon can adversely affect fault coverage, as the patterns seen by the circuit under test (CUT) will not be pseudo-random.
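The correlation between neighboring scan chains can be seen in a small simulation. The sketch assumes a 4-bit type I LFSR with the recurrence a[t+4] = a[t] ^ a[t+1] and one scan chain per LFSR stage; the function name `lfsr_streams` and the register size are illustrative assumptions.

```python
def lfsr_streams(seed, taps, clocks):
    """Record the bit stream that each LFSR stage would feed to its scan chain."""
    state = list(seed)
    streams = [[] for _ in seed]
    for _ in range(clocks):
        for i, bit in enumerate(state):
            streams[i].append(bit)
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = state[1:] + [feedback]
    return streams


streams = lfsr_streams([1, 0, 0, 0], (0, 1), clocks=15)

# The stream of stage 0 is the stream of stage 1 delayed by one clock.
adjacent_are_shifted_copies = streams[0][1:] == streams[1][:-1]
```

In this model every pair of adjacent stages differs by exactly one clock of delay, so the contents of neighboring scan chains are the same data offset by a single cell.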
To reduce correlation between scan chains, a phase shifter 28 is inserted between the LFSR 24 and the scan chains 26 (see FIG. 4). A typical phase shifter 28 consists of an exclusive-or (XOR) network employed to avoid shifted versions of the same data in various scan paths. Every scan chain is then driven by circuitry that corresponds to a linear combination of LFSR stage outputs. This circuitry generates a test sequence with the desired separation from other sequences by employing the “shift-and-add” property of m-sequences, according to which the sum of any m-sequence and a cyclic shift of itself is another cyclic shift of this m-sequence. In practice, phase shift circuits are designed according to different principles. In constant-phase shifters, an interchannel displacement for each scan path is specified prior to the actual synthesis process. The latter employs the transition matrix that describes the LFSR behavior to determine LFSR outputs realizing successive shifts. The basic deficiencies of this approach are the necessity to perform matrix operations and the complexity of the resultant phase shifter, which may feature, even after a decomposition and factorization process, an unnecessarily large number of XOR gates and large propagation delays. The large propagation delays are evident from the large number of XOR gates coupled to a single tap 30 on the LFSR. Such excess loading increases capacitance, resulting in slow signal propagation. Notably, other taps on the LFSR, such as tap 32, have only a single XOR gate coupled to their outputs. The discrepancy between loads on the taps of the LFSR increases linear dependency between the patterns stored in the scan chains.
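The shift-and-add property can be checked numerically on a short m-sequence. The sketch below assumes a 4-bit type I LFSR with the recurrence a[t+4] = a[t] ^ a[t+1] (polynomial x^4 + x + 1); the helper names are hypothetical and the example is not taken from FIG. 4.

```python
def m_sequence(seed, taps, length):
    """Output of stage 0 of a type I LFSR over `length` clocks."""
    state, out = list(seed), []
    for _ in range(length):
        out.append(state[0])
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = state[1:] + [feedback]
    return out


def cyclic_shift(seq, k):
    """Sequence advanced by k positions, wrapping around at the end."""
    k %= len(seq)
    return seq[k:] + seq[:k]


def phase_of_sum(seq, k):
    """Shift s such that seq XOR (seq advanced by k) equals seq advanced by s."""
    summed = [a ^ b for a, b in zip(seq, cyclic_shift(seq, k))]
    for s in range(len(seq)):
        if cyclic_shift(seq, s) == summed:
            return s
    return None


seq = m_sequence([1, 0, 0, 0], (0, 1), length=15)  # one full period
shift = phase_of_sum(seq, 1)
```

For this register, XORing the sequence with its one-clock advance yields the sequence advanced by four clocks, since a[t] ^ a[t+1] = a[t+4]. A single 2-input XOR gate on two adjacent stages therefore acts as a phase shifter output displaced by four clocks.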
In order to control the amount of hardware involved in the phase shifter design process, an alternative technique is given in U.S. Pat. No. 4,959,832. Rather than seeking linear combinations for each channel, it starts with a pre-specified phase shifter structure and subsequently determines the resultant channel phase shifts. Consequently, the separations between channels become variable, and complex calculations may be required to determine their actual values. In addition, the solutions presented in that patent limit the number of output channels to the number of LFSR stages. Unfortunately, the method used to design the variable-phase shifters is inherently ad hoc and becomes impractical for circuits with a large number of scan chains.
Recently, a new technique was presented in two papers: “Design of phase shifters for BIST applications,” Proc. VLSI Test Symposium, 1998, and “Automated Synthesis of Large Phase Shifters for Built-in Self-Test,” Proc. ITC, 1998. These papers disclose a concept of LFSR duality. In LFSR duality, given a type I LFSR, its dual LFSR (which is always of type II) can be obtained by reversing the direction of all feedback taps except the rightmost one. Similarly, given an LFSR of type II, a dual LFSR of type I can be derived by reversing all the feedback taps except the rightmost one. This method relates the logical states of dual LFSRs and the architecture of a desired phase shifter as follows. After an appropriate initialization of the dual LFSR, its logic simulation is performed for k consecutive steps. Subsequently, the resulting content of the dual LFSR, i.e., the locations of 1s, points out the positions that should be included in a phase shifter structure to obtain a sequence shifted by k bits. It is shown that it is possible to synthesize, in a time-efficient manner, very large and fast phase shifters with guaranteed minimum phase shifts between scan chains, and with very low delay and an area of virtually one 2-way XOR gate per output channel. Unfortunately, the techniques described in these papers also have problems with load balancing. Specifically, a discrepancy exists between loads on the LFSR taps that increases propagation delays and the linear dependency between the patterns stored in the scan chains.
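The relationship between a dual LFSR and the phase shifter taps can be illustrated on a 4-bit example. The sketch assumes a type I LFSR with the recurrence a[t+4] = a[t] ^ a[t+1], a type II register for the same polynomial x^4 + x + 1 as its dual, and the initialization [1, 0, 0, 0]. The stage numbering, the initialization, and the function names are assumptions made for this example and may differ from the conventions in the cited papers.

```python
def fibonacci_step(state):
    """Type I LFSR: a[t+4] = a[t] ^ a[t+1]."""
    return state[1:] + [state[0] ^ state[1]]


def galois_step(state):
    """Type II LFSR for x^4 + x + 1: the last stage feeds stages 0 and 1."""
    out = state[-1]
    nxt = [out] + state[:-1]
    nxt[1] ^= out
    return nxt


def phase_shifter_taps(k):
    """Simulate the dual register for k steps; the 1s mark the stages to XOR."""
    state = [1, 0, 0, 0]
    for _ in range(k):
        state = galois_step(state)
    return [i for i, bit in enumerate(state) if bit]


def taps_give_shift(k, clocks=30):
    """Check that XORing the chosen stages equals stage 0 advanced by k clocks."""
    taps = phase_shifter_taps(k)
    state, history = [1, 0, 0, 0], []
    for _ in range(clocks + k):
        history.append(state)
        state = fibonacci_step(state)
    for t in range(clocks):
        combined = 0
        for i in taps:
            combined ^= history[t][i]
        if combined != history[t + k][0]:
            return False
    return True


all_shifts_ok = all(taps_give_shift(k) for k in range(15))
```

In this small case the dual register state after k steps lists exactly the type I stages whose XOR reproduces the output sequence advanced by k clocks. For k = 4 the stages are 0 and 1, a single 2-way XOR gate.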
The continuous trend toward higher integration densities and more flexible BIST automation tools creates an increasing need for more time-efficient phase shifter synthesis procedures and corresponding very fast logic synthesis routines. These techniques should be able to handle a wide variety of large LFSRs feeding a large number of scan chains and, at the same time, provide a cost-effective implementation of a given phase shifter network. It is not uncommon for current designs to contain in excess of one million gates. The number of flip-flops in such designs is in the tens of thousands. Assuming that there are about 50,000 flip-flops in a million-gate design and limiting the number of flip-flops per scan chain to 250 in order to reduce the test application time, one obtains a circuit with 200 scan chains. A 64-bit wide LFSR would be sufficient to drive these 200 scan chains only if a carefully designed phase shifter were employed to remove structural and linear dependencies. In order to ensure high fault coverage and reasonable test application time, it is imperative to eliminate such dependencies between the scan chains. From this example, it is clear that proper design of phase shifter circuits plays a key role in the success of a pseudo-random BIST scheme.
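The arithmetic in this example is simple enough to restate as a short calculation. The function name `scan_chain_count` is a hypothetical helper introduced only for illustration.

```python
def scan_chain_count(flip_flops, max_cells_per_chain):
    """Smallest number of scan chains that keeps every chain within the limit."""
    return -(-flip_flops // max_cells_per_chain)  # ceiling division on integers


chains = scan_chain_count(50_000, 250)  # 200 scan chains
lfsr_width = 64
# More chains than LFSR stages, so stages cannot each drive one chain directly.
needs_phase_shifter = chains > lfsr_width
```

With 200 chains and only 64 LFSR stages, most chains must be driven by linear combinations of stages, which is the role of the phase shifter.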