The present invention relates generally to testing digital logic, and more particularly to a logical and consistent method and mechanism for using scan test techniques to test edge-triggered logic in a manner that resolves timing problems.
Recent advances in integrated circuit fabrication for digital systems have resulted in significant increases in circuit density. Techniques for testing such high-density integrated circuitry have also advanced in order to be able to provide a credible assessment of the operability of an integrated circuit and systems incorporating integrated circuits. These techniques strive to produce a test methodology that can produce a reliable assessment in a minimum amount of time.
One such test technique enjoying relatively popular use today, sometimes termed “scan testing,” involves implanting a digital test state in the circuit under test and allowing it to operate normally before extracting and examining the resultant state. (“Circuit” is used herein to refer to combinations of logic as may be found on an integrated circuit, or on a printed circuit board, or the like.) To do this, the circuit to be tested is constructed to selectively operate in one of two modes: (1) a normal mode in which various flip-flops (“flops”) of the circuit (including those flops that are used to construct counters, registers, and the like) function to execute the design of the circuit, or (2) a test mode in which the flops, responsive to a test signal, are interconnected to form one or more long shift registers or “scan chains” for receiving a test pattern, which may be either a pseudo-random pattern or a known test pattern.
Pseudo-random scan testing typically involves placing the circuit in a test mode to form the one or more scan chains referred to above. Then, the scan chain is injected with a pseudo-random test pattern, and the circuit is temporarily returned to its normal (non-test) state to allow it to execute at least one normal cycle. The circuit is then returned to the test mode, and the resultant state is extracted and combined with other extracted states to form a “test signature.” The test signature is compared to a “golden” signature developed, e.g., from running a simulation of a good circuit. The comparison provides the GO/NO-GO indication of the operability of the circuit. Examples of this technique can be seen in U.S. Pat. Nos. 4,718,065, 5,694,452, and 6,029,263.
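By way of illustration only, the flow just described (inject a pseudo-random pattern, run one normal cycle, extract the state, fold it into a signature, compare against a golden signature) might be sketched in Python as follows. Everything named here is an assumption made for the sketch and is not taken from the referenced patents: the 8-flop chain, the `good_logic` and `faulty_logic` stand-ins for the circuit under test, the LFSR taps, and the feedback polynomial. The signature register is deliberately made as wide as the total response stream so that, in this small sketch, no two differing response streams can share a signature.

```python
CHAIN_LEN = 8          # flops in the hypothetical scan chain
NUM_PATTERNS = 4       # 4 patterns x 8 bits = 32 response bits in total
SIG_WIDTH = 32         # signature register width (assumption)
SIG_POLY = 0x04C11DB7  # illustrative feedback polynomial (assumption)


def lfsr_patterns(seed, chain_len, count):
    """Generate `count` pseudo-random patterns from an 8-bit shift register."""
    state = seed
    patterns = []
    for _ in range(count):
        bits = []
        for _ in range(chain_len):
            bits.append(state & 1)
            # Feedback taps chosen for illustration only.
            fb = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
            state = (state >> 1) | (fb << 7)
        patterns.append(bits)
    return patterns


def good_logic(state):
    """Hypothetical circuit: each flop captures itself XOR its neighbor."""
    return [state[i] ^ state[i - 1] for i in range(len(state))]


def faulty_logic(state):
    """Same circuit with the input of flop 2 stuck at 0."""
    out = good_logic(state)
    out[2] = 0
    return out


def compact(signature, response):
    """Fold one extracted state into the running signature, bit by bit."""
    mask = (1 << SIG_WIDTH) - 1
    for bit in response:
        feedback = ((signature >> (SIG_WIDTH - 1)) & 1) ^ bit
        signature = (signature << 1) & mask
        if feedback:
            signature ^= SIG_POLY
    return signature


def run_scan_test(logic, seed=0xA5):
    signature = 0
    for pattern in lfsr_patterns(seed, CHAIN_LEN, NUM_PATTERNS):
        state = list(pattern)                  # test mode: pattern injected
        state = logic(state)                   # normal mode: one cycle
        signature = compact(signature, state)  # test mode: state extracted
    return signature


golden = run_scan_test(good_logic)  # stands in for simulating a good circuit
print("GO" if run_scan_test(good_logic) == golden else "NO-GO")    # GO
print("GO" if run_scan_test(faulty_logic) == golden else "NO-GO")  # NO-GO
```

In a practical arrangement the response stream is far longer than the signature register, so a small probability of a faulty circuit producing the golden signature remains.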
At times, a pseudo-random test will not adequately test some portion of the circuit under test. In this case, a data pattern may be crafted to exercise the untested logic.
Another form of scan testing, albeit less robust than full (and potentially at-speed) pseudo-random scan tests, is that described by IEEE Standard 1149.1, promulgated by the Joint Test Action Group (JTAG), a collaborative organization composed of major semiconductor users in Europe and North America. According to this Standard, the architecture will provide for tests that, among other things, can sample various inputs and outputs of the unit under test (external tests), as well as test certain of the internal circuits of the unit under test (internal tests).
Now, it should be evident to those skilled in this art that a hallmark of scan testing is that the test be deterministic and repeatable. This means that a proper operating circuit will produce the same result when tested, regardless of the operating conditions, as long as those conditions are within predetermined parameters. These operating conditions include the manufacturing process variations of the circuit, the test voltage, and the test temperature, all of which influence the logic delay of the circuit under test. Without this feature of determinism, scan testing cannot be relied upon.
An integrated circuit typically includes a number of state machines, each having a number of flops, and each forming a clock domain that is normally asynchronously operated relative to the other clock domains. In order to be able to test the entire integrated circuit at once (i.e., without resorting to sequentially testing the circuits within each clock domain), appropriate timing must be established at the interfaces between the clock domains to ensure deterministic operation for testing. Each clock domain will have its own clock distribution network, which meets functional insertion delay and skew requirements according to the application. Skew, in this context, is the difference in arrival times at clocked circuits of what is logically a single clock edge. For example, referring to FIG. 1, there is illustrated a representative integrated circuit, designated generally with the reference numeral 10, with three distinct clock domains 1, 2, and 3. Each clock domain includes at least one edge-triggered flop. Thus, clock domain 1 has at least the flop FF1, clock domain 2 includes at least the flops FF2, FF21, and FF22, and clock domain 3 includes at least the flop FF3. Each clock domain (1, 2, and 3) may also include combinatorial logic C (C1, C2, and C3, respectively). Data is received by each of the clock domains 1, 2, 3 at PORT 1, PORT 2, and PORT 3 inputs, respectively, while separate and different (and not necessarily synchronous) clocking signals are received at the CLOCK 1, CLOCK 2, and CLOCK 3 inputs. Clock domain 2 (i.e., flop FF2) receives, as inputs on signal lines 12 and 14, data outputs from flops FF1 and FF3 of clock domains 1 and 3, respectively, as well as self-synchronous inputs. Flops FF21 and FF22 are employed between the asynchronous domains to bound the arrival time of signals from domains 1 and 3 into domain 2. The logic design of domain 2 must incorporate provision for the metastable behavior of these flops, if the behavior is to be reliable.
It should be noted that domains clocked from the same clock source, but with different clock gating terms, are at best quasi-synchronous. A common logical clock edge arrives at an unpredictable (but bounded) time in each domain. For example, a gated and a non-gated domain, operating from a common source, will usually have clock trees (circuits which replicate the clock signal(s) to achieve higher drive capability than the original circuit is capable of) with substantially different delays (unless the gating is done at a leaf node of a common distribution tree). This is an invitation to functional short-path problems (timing races) for signals crossing between the domains. This problem is exacerbated when multiple clock inputs, such as CLK1, . . . CLK3 of FIG. 1, are driven by a circuit tester (Tester), which introduces additional uncertainty.
Circuit 10 is controllable only from its primary input ports (PORT 1, PORT 2, PORT 3) and observable only from its primary output ports (not shown). Test coverage is likely to be poor because of low controllability and observability, unless a major investment is made in functional test vector development.
FIG. 2 illustrates modifications to the design of the integrated circuit 10 of FIG. 1 for scan testing. The elements shown in FIG. 2 that are also shown in FIG. 1 use the same reference numerals assigned in FIG. 1. As FIG. 2 shows, the modified circuit (designated with the reference numeral 10′) has added a scan data input (SDI), a scan data output (SDO), and two-input (1, 2) multiplexers M. The multiplexers M provide the circuit 10′ with a selectable scan path 20, shown in part by the dotted line, from SDI, through one input (1) of the various multiplexers M, flops FF1-FF3 and combinatorial logic C, to SDO. When this circuit is operated on the circuit tester, clocks CLK1, . . . , CLK3 are synchronously driven by the tester. A test (TEST) input receives a test signal that controls the selections made by the multiplexers M. When the test signal applied to the TEST input is in a first (non-test) state, the multiplexers will set the integrated circuit 10′ in its normal, functional state. However, when the TEST input receives a test signal in a second (test) state, the multiplexers operate to reconfigure the data paths to form a scan path that incorporates the flops (FF1, FF2, FF21, FF22, and FF3, in that order) of the integrated circuit 10′, forming the scan chain to receive a test state vector that is shifted into the scan chain from SDI. The test signal is then switched to configure the logic to its functional topology, one or more periods of the clock signals CLK1, . . . , CLK3 are applied, and the multiplexers are again switched by the test signal to return to the scan configuration for removing the resulting state, which is shifted out (while the next test state is shifted in).
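As a rough behavioral sketch of the multiplexed scan arrangement of FIG. 2, the following Python model treats each flop as fed by a two-input multiplexer controlled by TEST. The chain order (FF1, FF2, FF21, FF22, FF3) follows the description above. The functional data paths are assumptions made only for illustration: FF1 feeds FF21, FF3 feeds FF22, the combinatorial logic C2 is taken to be an XOR of FF21 and FF22 feeding FF2, and FF1 and FF3 simply hold their values. All clocks are assumed ideal, with no skew.

```python
CHAIN = ["FF1", "FF2", "FF21", "FF22", "FF3"]  # scan order, SDI to SDO


def functional_next(s):
    """Hypothetical normal-mode data paths (assumption, not from FIG. 2)."""
    return {
        "FF1": s["FF1"],               # held; stands in for PORT 1 data
        "FF3": s["FF3"],               # held; stands in for PORT 3 data
        "FF21": s["FF1"],              # domain 1 -> domain 2 crossing
        "FF22": s["FF3"],              # domain 3 -> domain 2 crossing
        "FF2": s["FF21"] ^ s["FF22"],  # assumed combinatorial logic C2
    }


class ScanCircuit:
    def __init__(self):
        self.state = {name: 0 for name in CHAIN}

    def clock(self, test, sdi=0):
        """Apply one ideal clock edge; return SDO as seen before the edge."""
        sdo = self.state[CHAIN[-1]]
        if test:
            # Scan path selected: each flop takes the previous flop's output.
            new_state = {}
            prev = sdi
            for name in CHAIN:
                new_state[name] = prev
                prev = self.state[name]
            self.state = new_state
        else:
            # Functional path selected.
            self.state = functional_next(self.state)
        return sdo


def shift(circuit, vector):
    """Shift `vector` (in chain order) in while shifting the old state out."""
    out = []
    for bit in reversed(vector):   # the bit destined for FF3 enters first
        out.append(circuit.clock(test=True, sdi=bit))
    out.reverse()                  # restore chain order
    return out


circuit = ScanCircuit()
shift(circuit, [1, 0, 0, 1, 0])            # TEST asserted: load test state
circuit.clock(test=False)                  # TEST released: one functional clock
captured = shift(circuit, [0, 0, 0, 0, 0]) # TEST asserted: unload, load next
print(captured)                            # [1, 1, 1, 0, 0]
```

The model captures only the logical behavior. It says nothing about what happens when the clock edges do not arrive at all flops simultaneously.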
Such scan testing requires that the circuitry of the individual clock domains 1, 2, 3, each of which will contain at least one state machine, operate as a single synchronous unit in order that the scan test results be predictable and repeatable. The functionally disjoint clock trees of each clock domain are fed from a common test clock source when in scan test mode. The interfaces between the clock domains, which were formerly asynchronous, may now have short-path problems which will cause unreliable or unrepeatable scan test results. This problem arises because, as is conventional, edge-triggered devices will accept and latch data applied to the input on one edge of the applied clock. Thus, it is not entirely certain, for example, whether the prior output of a flop (e.g., FF1) or its new output is transferred to a downstream flop (e.g., FF21). The result depends on the order in which Clock 1 is received at FF1 relative to when Clock 2 is received at FF21. This order depends on the summation of delays from a common timing reference point somewhere in the tester to the respective flops. Some of the summation terms (hence the clock arrival times) are not knowable a priori. This condition is not indicative of unreliable functional operation, and is relatively easy to avoid in latch-based designs, such as those used by IBM, which follow a two-phase, non-overlapped clock design discipline. A description of a latch-based system may be found in an IBM ASIC Product Applications Note entitled “ASIC Design Methodology Primer”, document number SA14-2314-00, 1998.
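The dependence on clock arrival order can be illustrated with a simplified timing sketch in Python. The function name and all delay values (clock-to-output delay, path delay, setup and hold times, in picoseconds) are hypothetical numbers chosen for illustration, and the model reduces the behavior of FF21 to three outcomes: it latches the prior output of FF1, it latches the new output, or its input changes inside the setup/hold window and the result is indeterminate.

```python
def value_latched_by_ff21(ff1_old, ff1_new, t_clk1_ps, t_clk2_ps,
                          clk_to_q_ps=200, path_delay_ps=100,
                          setup_ps=50, hold_ps=50):
    """Return what FF21 latches on a nominally common clock edge.

    t_clk1_ps / t_clk2_ps are the actual arrival times of the edge at
    FF1 and FF21. Returns None when the result is indeterminate.
    """
    # FF1's output changes after its clock-to-output delay and reaches
    # FF21's input after the path delay.
    data_changes_at = t_clk1_ps + clk_to_q_ps + path_delay_ps
    if data_changes_at >= t_clk2_ps + hold_ps:
        return ff1_old   # input still stable through hold: prior output
    if data_changes_at <= t_clk2_ps - setup_ps:
        return ff1_new   # short path: new output races through
    return None          # change falls inside the setup/hold window


# Edge arrives at both flops together: the prior output is transferred.
print(value_latched_by_ff21(0, 1, t_clk1_ps=0, t_clk2_ps=0))    # 0
# Clock 2 arrives 500 ps late: the new output is transferred instead.
print(value_latched_by_ff21(0, 1, t_clk1_ps=0, t_clk2_ps=500))  # 1
# Clock 2 arrives 280 ps late: the outcome is indeterminate.
print(value_latched_by_ff21(0, 1, t_clk1_ps=0, t_clk2_ps=280))  # None
```

The same circuit thus yields three different captured values for three arrival-time differences, which is precisely what makes the scan result unrepeatable.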
However, a latch-based design is not the technology of choice for some integrated circuit vendors or some high-speed applications. The vendor preference is often for designs incorporating edge-triggered devices (flops). Designs using edge-triggered devices traditionally require tedious and somewhat unreliable tuning of the clock distribution system if scan testing is to be used. Tuning (adjusting circuit path delays) is tedious because it cannot be done until the clock tree delays are all known. This does not happen until late in the design cycle and the tuning process adds time to the design release process. It is unreliable because it involves attempts to match the largest tree delay (with its large RC component) with a silicon delay, which delays will not track well over process, voltage, and temperature variations.
The model discussed (FIG. 2) is a bit oversimplified, in that it ignores the clock skew problem. This is the problem that the above-mentioned tuning operation attempts to address. The scan path 20 will be well-behaved only if clock skew is less than some maximum value, or if the effects of clock skew are designed out of the path. The clock skew encountered in the scan test context can result from the tester itself not delivering nominally simultaneous edges at the same instant. Or, the circuitry of the integrated circuit 10′ itself may not delay all clock signals by an equal amount, for any of a number of reasons. Clock domains can vary in size from a few dozen flops to many thousands. The clock trees necessary to support those different domain sizes inevitably have widely different delays.
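The maximum-skew condition can be made concrete with a short Python sketch that checks each hop of the scan path against a tolerable skew. The clock arrival times below are hypothetical, chosen only to mimic a small clock tree in domains 1 and 3 and a much larger one in domain 2. The sketch also assumes that the scan path between adjacent flops adds no delay beyond the clock-to-output delay (any multiplexer delay is taken to be folded into that figure).

```python
SCAN_ORDER = ["FF1", "FF2", "FF21", "FF22", "FF3"]

# Hypothetical clock arrival times (ps) at each flop for one test clock edge.
ARRIVAL_PS = {
    "FF1": 300,                            # domain 1: small clock tree
    "FF2": 900, "FF21": 900, "FF22": 900,  # domain 2: large clock tree
    "FF3": 350,                            # domain 3: small clock tree
}


def unsafe_hops(order, arrival_ps, clk_to_q_ps=200, hold_ps=50):
    """List scan hops (source, destination, skew) that exceed the bound."""
    max_skew_ps = clk_to_q_ps - hold_ps  # tolerable skew for a direct hop
    bad = []
    for src, dst in zip(order, order[1:]):
        # Positive skew: the destination flop is clocked later than the
        # source, so the source's new output may arrive too early.
        skew_ps = arrival_ps[dst] - arrival_ps[src]
        if skew_ps > max_skew_ps:
            bad.append((src, dst, skew_ps))
    return bad


print(unsafe_hops(SCAN_ORDER, ARRIVAL_PS))  # [('FF1', 'FF2', 600)]
```

With these assumed numbers only the hop from domain 1 into domain 2 is flagged. The hop from FF22 to FF3 has a large negative skew, which does not create a short-path hazard.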
Even within a single clock domain, there may be significant skew due to imbalances in the loading of the clock tree itself. This is true regardless of the attention to detail in the balancing, because propagation delay time varies across the die even for nominally identical circuits.
All this points to a need to be able to reliably and repeatably scan test digital circuitry that uses edge-triggered devices incorporated in multiple clock domains.