As technology nodes shrink, more defects are found in the scan chain circuitry added for the purpose of test. This is due to a number of reasons as described in this document. To bring new integrated circuits to market, and ramp yield to acceptable levels, identifying these defects and learning trends is critical, but can be costly without new approaches.
Scan Basics
Scan chains are essential for test and yield bring-up. When they break, they relegate the chip under test to the fail bin (adding to the yield loss problem), and they may mask failures originating in combinational, sequential, power-distribution, or clock logic, preventing the timely and accurate evaluation of those failures. Because scan chains are this critical, some background on established scan techniques in semiconductor test is provided first. The scan methodology replaces all flip-flops in a design with scan flip-flops. Scan flip-flops provide two paths into each flip-flop: one for the mission of the design, and a second to facilitate test.
Scan Flip-Flops
The two most common implementations today are:
MUXD—This scan flip-flop approach places a multiplexer (mux) on the front end of the D-input. The selector to the mux, known as the scan enable, determines whether to use the mission mode input or the scan test input.
LSSD—Another common scan flip-flop approach is to use multiple non-overlapping clocks: one pair operates the separated Master and Slave latches for mission data; the other pair operates the separated Master and Slave latches to produce the scan shift operation. The total scan shift and sample operation may be conducted with just one pair of clocks or with a combination of all of the clocks.
Scan Chains—By stitching all of the scan flip-flops, or scan cells, together into one or more shift registers called scan chains, each flip-flop can be preset or observed. This allows for test patterns to be constructed that will concentrate on finding faults in mini sub-circuits.
The remaining description focuses on MUXD-type scan, since it is simpler to describe. Each scan flip-flop has two input paths, selected by a mux on the input. When the scan enable “SE” is asserted, the scan chain operates as a shift register. This allows each flip-flop to be set to a specific state. It also allows the state of each flip-flop to be observed as the values are shifted out of the device onto the scan output “SO”. Level Sensitive Scan Design (LSSD) is disclosed by Eichelberger et al. in 14th Design Automation Conference Proceedings June 1977, pp 492-494 and in U.S. Pat. Nos. 3,783,254 and 4,293,919.
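The MUXD behavior described above can be sketched as a small behavioral model. This is only an illustration under stated assumptions, not a gate-level description: the names MuxDScanFlop and clock_chain, and the convention that index 0 is the cell nearest the scan input, are hypothetical choices made for this example.

```python
class MuxDScanFlop:
    """Behavioral model of a mux-D scan flip-flop (hypothetical name)."""

    def __init__(self, q=0):
        self.q = q

    def next_state(self, d, si, se):
        # Mux in front of the D input: SE selects scan data, otherwise mission data.
        return si if se else d


def clock_chain(flops, mission_inputs, scan_in, se):
    """Apply one clock edge to every flop. flops[0] is nearest the scan input.

    Returns the value present on the scan output (SO) before the edge.
    """
    so = flops[-1].q
    # Sample every input first so that all flops update on the same edge.
    scan_inputs = [scan_in] + [f.q for f in flops[:-1]]
    new_q = [f.next_state(d, si, se)
             for f, d, si in zip(flops, mission_inputs, scan_inputs)]
    for f, q in zip(flops, new_q):
        f.q = q
    return so


chain = [MuxDScanFlop() for _ in range(4)]
pattern = [1, 0, 1, 1]
idle = [0, 0, 0, 0]  # mission inputs are ignored while SE is asserted

# Shift mode (SE asserted): load the pattern serially.
for bit in pattern:
    clock_chain(chain, idle, bit, se=1)
loaded = [f.q for f in chain]

# Keep shifting to observe the loaded values on SO.
observed = [clock_chain(chain, idle, 0, se=1) for _ in range(4)]
print(loaded, observed)
```

In this sketch the first bit shifted in ends up in the cell nearest SO, so the observed stream equals the pattern in the order it was applied.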
Current Defect Models for Scan Chains: Blocked, Bridging, Hold-Time.
Defects in scan chains are becoming more common as technology nodes shrink and as the number of flip-flops per design increases. Problems often arise in scan chains because scan interconnects are routed later, to avoid interfering with the mission-critical routing.
There are several generally accepted models for defects in scan chains: blocked chains, bridging, and hold time.
Blocked Chains—This condition is determined by observing the scan outputs while in scan mode. If the output is at a fixed level regardless of the data shifted into the chain, the chain is blocked at one or more points; the block nearest to the scan data output dominates what is observed from that chain. The fault model is generally that the output of the scan chain is either stuck-at-0 or stuck-at-1 from the sequential element located at the point of the break.
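The blocked-chain observation can be illustrated with a short simulation. This is a minimal sketch assuming the simplest reading of the fault model (a blocked cell always holds its stuck value); the function names and the choice of cells[0] as the cell nearest the scan output are assumptions for this example.

```python
def shift_through(scan_in_bits, length, blocks=None):
    """Shift bits through a chain; cells[0] is nearest the scan output.

    blocks maps a cell index to the value that cell is stuck at
    (assumed fault model for this sketch).
    """
    blocks = blocks or {}
    cells = [0] * length
    observed = []
    for bit in scan_in_bits:
        for index, stuck_value in blocks.items():
            cells[index] = stuck_value
        observed.append(cells[0])
        cells = cells[1:] + [bit]
    return observed


def classify(observed, flush):
    """Label a chain from its scan-out stream, ignoring the first `flush` bits."""
    levels = set(observed[flush:])
    if levels == {0}:
        return "stuck-at-0"
    if levels == {1}:
        return "stuck-at-1"
    return "toggling"


LENGTH = 6
stimulus = [0, 0, 1, 1] * 6  # toggling data, so a fixed output is meaningful

print(classify(shift_through(stimulus, LENGTH), LENGTH))
print(classify(shift_through(stimulus, LENGTH, {3: 0}), LENGTH))
# Two blocks: the one nearest the scan output (index 2) dominates.
print(classify(shift_through(stimulus, LENGTH, {4: 0, 2: 1}), LENGTH))
```

The stimulus must toggle for the classification to mean anything; an all-zero stimulus on a healthy chain would look identical to a stuck-at-0 chain.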
Bridging—Bridging faults are a condition of data dependency when data passing through one scan chain can modify data in another scan chain or in a different location in the same scan chain. The suspected mechanism is an “aggressor-victim” short or bridge that is exercised when the two signals involved are at opposite values.
Hold-Time—Hold-Time faults are a condition that allows the data from one flip-flop to race forward in the chain. Hold time faults are attributed to a number of factors including long wire routes as compared to Clock to Q times of flip-flops or clock skew. This condition is suspected when data produced on the output is still toggling but seems skewed (correct response but shifted over in time) or when bits are missing (data smearing or bit skipping). Overall, Hold-Time Violations can be viewed as “accidental encryption”. If the number of bits applied into the scan input does not match the data on the scan output, it is likely that a hold-time problem exists. In some cases hold time violations make the scan chain appear to have fewer flip flops than it actually has.
Hold time is a data communication fault between two adjacent cells in a scan chain. The cell closest to scan in is the aggressor cell and the cell closest to scan out is the victim cell. When a hold time violation exists, the victim cell's data is replaced with the aggressor cell's data value. The resultant data stream shows the aggressor's data twice and the victim's data is lost. There are three common types of hold time violation:
A) Standard hold time—both data states are improperly communicated, resulting in a fail signature that simulates a missing flip flop in the chain.
B) Data One hold time—when the aggressor cell's data is a one, its data is pulled forward one location, overwriting the victim's data. The resultant fail signature has too many ones.
C) Data Zero hold time—when the aggressor cell's data is a zero, its data is pulled forward one location, overwriting the victim's data. The resultant fail signature has too many zeros.
Testing the Scan Chains—Typically, to ensure that the scan chain test logic is operational, tests will be performed on it prior to the functional logic (Scan Chain Integrity Tests). The most common approach is to send a series of 1's and 0's at the Scan Inputs (SI). With the Scan Enable (SE) asserted, the scan chain is essentially a big shift register. With the continued assertion of the Scan Enable (SE), the functional logic is removed from the test. After ‘n’ clock cycles, where ‘n’ equals the number of scan cells in the chain, the input stream should be observed on the Scan Output (SO).
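The three hold time types described above can be compared with a small shift-register simulation. This is a sketch under assumptions, not a circuit-accurate model: the violation is modeled by copying the aggressor's data into the victim just before each shift, and the function name unload and the mode labels "standard", "one", and "zero" are hypothetical.

```python
def unload(scan_in_bits, length, victim=None, mode="standard"):
    """Shift bits through a chain; cells[0] is nearest the scan output.

    victim: index of the victim cell (its aggressor is cell victim + 1), or None.
    mode:   "standard", "one", or "zero", standing for types A, B, and C.
    """
    cells = [0] * length
    observed = []
    for bit in scan_in_bits:
        if victim is not None:
            aggressor_data = cells[victim + 1]
            races = (mode == "standard"
                     or (mode == "one" and aggressor_data == 1)
                     or (mode == "zero" and aggressor_data == 0))
            if races:
                # The victim's data is lost; the aggressor's data appears twice.
                cells[victim] = aggressor_data
        observed.append(cells[0])
        cells = cells[1:] + [bit]
    return observed


LENGTH = 8
pattern = [0, 0, 1, 1] * 6

healthy = unload(pattern, LENGTH)
standard = unload(pattern, LENGTH, victim=3, mode="standard")
data_one = unload(pattern, LENGTH, victim=3, mode="one")
data_zero = unload(pattern, LENGTH, victim=3, mode="zero")

# Count ones in the window where a healthy chain returns the applied data.
for name, stream in [("healthy", healthy), ("data one", data_one),
                     ("data zero", data_zero)]:
    print(name, sum(stream[LENGTH:]))

# A standard violation returns the data one clock early, as if a flop were missing.
print(standard[LENGTH - 1:] == pattern[:len(pattern) - LENGTH + 1])
```

Under this model the healthy chain returns the applied data after LENGTH clocks, the data-one case returns more ones than were applied, and the data-zero case returns fewer.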
The Problem: Data Transfer in the Scan Chain (Hold Time)
If the scan chain does not transfer data from the input to the output reliably, the entire scan methodology is lost. Input data to load the chain and output data to unload the chain are both disrupted. This typically manifests itself as a scan chain integrity failure in which the scan chain appears to be shorter than it actually is by at least one flop. The underlying clock skew can be caused by design issues such as timing closure or by manufacturing defects such as faulty vias or weak clock-tree buffers. In nanometer geometries, it is often a combination of the two that causes yield loss due to hold time issues.
Conventional chain integrity patterns produced by ATPG tools today are implemented as a replicated stream of a ‘0-0-1-1-0-0-1-1’ sequence. This sequence has data changing on every other vector. Therefore a device with a standard hold time violation appears to be shifted by one bit and the last bit is indeterminate. The last bit that is shifted out is not part of the pattern; it is the state that was on the scan in pin when the scan out sequence was applied.
Note that the fail signature with a single standard hold time violation alternates bit by bit: pass, fail, pass, fail.
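The alternating signature follows directly from comparing the ‘0-0-1-1’ sequence with a copy of itself that arrives one bit early. The sketch below assumes that a single standard hold time violation is equivalent to the stream arriving one position early, with the final bit filled by whatever is on the scan in pin; the function names and the fill parameter are hypothetical.

```python
def arrive_early(expected, shift, fill=0):
    """Stream as seen on scan out when it arrives `shift` bits early.

    fill stands in for the scan in pin state that supplies the trailing bits.
    """
    return expected[shift:] + [fill] * shift


def signature(expected, observed):
    """Per-bit comparison, as a tester strobing every cycle would report it."""
    return ["pass" if e == o else "fail" for e, o in zip(expected, observed)]


expected = [0, 0, 1, 1] * 4
observed = arrive_early(expected, shift=1)

# The last entry depends on the fill bit and is therefore indeterminate.
print(signature(expected, observed)[:-1])
```

Every bit that precedes a transition in the expected data fails, and every bit that precedes a repeated value passes, which yields the alternation.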
Often these failures are timing sensitive. Since the rise time and fall time of the Q outputs are not symmetrical, a chain may be able to transfer one data state but not the other. Such a failure could be caused by the Q-output having a slower rise time than fall time.

The examples above assume a single hold time violation with 0 being the aggressor state and 1 being the victim state. When devices have multiple hold time violations on a single chain the data shifts one cycle for each violation. It is quite common for smaller geometry devices to have multiple hold time violations on a single chain. With the standard 00110011 pattern it is possible for a chain to shift the data 4 positions and at the end of the pattern actually be passing the scan chain test. More exhaustive patterns with a consistent background pattern are required to diagnose and localize hold time failure mechanisms. ATPG tools alone may not be able to localize these problems because they are dependent upon integrity in the scan chains to perform diagnosis.
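The aliasing of the 00110011 pattern can be checked with a short simulation. This is a sketch under assumptions: each standard violation is modeled by copying the aggressor's data into the victim before every shift, the tester is assumed to compare only the window in which a healthy chain returns the applied data, and the single-1 marker on a background of zeros is just one hypothetical example of a pattern with a consistent background.

```python
def unload(scan_in_bits, length, victims=()):
    """Shift bits through a chain; cells[0] is nearest the scan output.

    victims lists victim cell indices; each aggressor is the cell at index + 1.
    """
    cells = [0] * length
    observed = []
    for bit in scan_in_bits:
        # Start nearest the scan input so data can race through several victims.
        for v in sorted(victims, reverse=True):
            cells[v] = cells[v + 1]
        observed.append(cells[0])
        cells = cells[1:] + [bit]
    return observed


def chain_passes(pattern, idle, length, victims=()):
    """Apply pattern, then idle bits to flush, and compare the expected window."""
    observed = unload(pattern + idle, length, victims)
    return observed[length:length + len(pattern)] == pattern


LENGTH = 12
p0011 = [0, 0, 1, 1] * 4
keep_toggling = [0, 0, 1, 1] * 3          # scan in keeps driving the sequence
marker = [0] * 5 + [1] + [0] * 10          # single 1 on a background of zeros
zeros = [0] * LENGTH

print(chain_passes(p0011, keep_toggling, LENGTH, victims=(3,)))
print(chain_passes(p0011, keep_toggling, LENGTH, victims=(1, 3, 5, 7)))
print(chain_passes(marker, zeros, LENGTH, victims=(1, 3, 5, 7)))
```

With one violation the 0011 check fails, with four it passes because the shift equals the period of the sequence, and the marker pattern still exposes the four-bit shift.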
Functional Tester Background—Historically, testers apply a set of pre-determined and simulated stimuli, and validate that the response on the device outputs matches the results expected from the simulation. Functional testers are designed to report in a go/no-go fashion whether or not all of the outputs matched the expected results for all checked strobe points. Functional testers are not architected to understand design criteria of the device under test such as the scan structures. Thus, while testers can identify which output signals contained failures, each output signal can represent tens of thousands of internal scan cells.
Voltage Sensitivity of hold time—In some cases changing the core voltage on the device can change the internal timings enough to create or eliminate a hold time issue. Experimentation by the inventors has shown that raising the voltage will often allow the scan chain to pass. In this document the voltage that makes a scan chain pass is referred to as the safe voltage. The targeted voltage is referred to as the spec voltage. Temperature can also affect the internal timings and contribute to the safe operation.
A Problem with Scan Chains:
In design for test methodology, flip flops or registers have a dual functionality. During normal or functional mode, they latch data states in the circuit and store values to be transmitted to the next cloud of logic in the design. During the test mode, the registers are used to provide test stimulus to the combinational logic, and capture the results of the logic operation. To transfer the test patterns into and out of the device under test, the registers are reconfigured as several serial shift registers.
A design problem known as a setup violation can occur if the amount of logic between two banks of registers is so great that the data does not propagate through the logic and become stable at the input to a register with sufficient setup time before the register is clocked. The result clocked into the register may be invalid. This is solved by design methodologies and tools associated with the term timing analysis.
After the desired state is loaded into the scan chains, scan enable is de-asserted and the logic is clocked one or more times in mission mode. The result of the logic operation is captured in the flip flops. It is desirable to bring this result out of the device under test so that it can be examined by the tester. After putting the registers back into serial shift mode, enough clocks are applied to shift every bit in the scan chain out of the scan out port. Most designs have some scan chains longer than others. It is important to shift the data enough times for even the longest scan chain to be fully unloaded. Shorter chains are over-shifted and therefore get padded with X (don't care) states. The same technique is applied when data is shifted in, where shorter chains are typically pre-padded with dummy 0 data before the actual data stream.
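The padding of unequal chains can be sketched as follows. The function names, the chain names, and the representation of patterns as lists are assumptions for this example; real pattern formats differ.

```python
def pad_for_load(load_data, pad=0):
    """Pre-pad shorter chains so every chain receives the same number of shifts.

    The dummy bits go first: they are shifted through and out of the shorter
    chain before the real data settles into place.
    """
    longest = max(len(bits) for bits in load_data.values())
    return {name: [pad] * (longest - len(bits)) + list(bits)
            for name, bits in load_data.items()}


def pad_for_unload(expected, dont_care="X"):
    """Post-pad shorter chains: over-shifted cycles carry no meaningful data."""
    longest = max(len(bits) for bits in expected.values())
    return {name: list(bits) + [dont_care] * (longest - len(bits))
            for name, bits in expected.items()}


load_data = {"chain_a": [1, 0, 1, 1, 0], "chain_b": [1, 1, 0]}
expected = {"chain_a": [0, 1, 1, 0, 1], "chain_b": [0, 0, 1]}

print(pad_for_load(load_data))
print(pad_for_unload(expected))
```

The asymmetry is deliberate: on load the padding leads the data, while on unload it trails the data.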
At this time it is common to shift a new test pattern in through the scan in port.
Several defects can frustrate this operation and must be detected, analyzed, and reported. If the scan path is blocked at some point, not all the test pattern will reach its intended registers to stimulate the logic, and of the data that is captured into registers, not all of the output pattern will be emitted from the scan out port.
In conventional implementations of scan registers, charge flow or current is required to establish a clock event as well as state change. A defect in the conductive medium may lower the rate of current so that a state change or a clock event may be delayed from its desired time. Furthermore clock signals must be distributed throughout a chip and require buffers to boost current. Any two clock signals may have a difference in arrival at their register which is called clock skew and which can be managed by adjusting the buffer size and the routing of the wires carrying the clock signal.
Hold time faults model some defect that causes a clock to be delayed to the point that the register latches the same value as the register which precedes it. In some cases the input is transitioning to a new state, in which case the data value captured is invalid. Those skilled in the art of testing have observed that a hold time fault is more likely to involve a change of state (one to zero or zero to one), and that the two transitions are not equally likely to be affected.
As an example consider a scan chain of length 8 with bit zero closest to the scan out port and bit 7 closest to the scan in port.
After a functional clock, the state of the logic is entirely captured within the 8 registers:
V(0) V(1) V(2) V(3) V(4) V(5) V(6) V(7)
In a correct shift register, each of the 8 bits is serially shifted out and a new test pattern beginning with I(0) is shifted in.
But imagine that one of the flip flops exhibits a hold time defect:
On the first shift clock V(0) is emitted from the scan out port.
A defect on bit 6 in the chain causes it to capture V(7) rather than V(6).
Continuing this shift the last flop might capture the state of the preceding flop.
At the end of shifting 8 clocks the scan out will have received
V(0) V(1) V(2) V(3) V(4) V(5) V(7) I(0)
Meanwhile the input test pattern will also have been corrupted:
I(0) I(1) I(2) I(3) I(4) I(5) I(7) I(N)
It may be appreciated that a scan chain may conventionally comprise 10,000 registers and that, even if fewer than 1 percent exhibit a hold time fault, hundreds of bits may be invalidated. It can be further appreciated that hold time faults may operate statistically rather than ideally and that the probability of a fault affecting a zero-one transition may differ from the probability of a fault affecting a one-zero transition.
To an observer at the scan out port it may appear that the scan chain has been shortened by one bit for each defect and that one bit is simply missing.
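The scan out stream of the 8-bit example can be reproduced with a symbolic simulation. This is only a sketch: it assumes the defect behaves as a standard hold time violation in which bit 6 is the victim and bit 7 is the aggressor, it uses the hypothetical function name shift_out, and it models only the stream seen at the scan out port, not the bookkeeping of the corrupted load.

```python
def shift_out(state, scan_in, victim=None):
    """Unload `state` while loading `scan_in`; state[0] is nearest scan out.

    victim is the index of the cell whose data is overwritten by the cell
    one position closer to scan in (assumed standard hold time model).
    """
    cells = list(state)
    emitted = []
    for value in scan_in:
        if victim is not None:
            cells[victim] = cells[victim + 1]
        emitted.append(cells[0])
        cells = cells[1:] + [value]
    return emitted


captured = [f"V({i})" for i in range(8)]
next_pattern = [f"I({i})" for i in range(8)]

print(" ".join(shift_out(captured, next_pattern)))
print(" ".join(shift_out(captured, next_pattern, victim=6)))
```

In the defective case V(6) never reaches the scan out port and I(0) arrives one clock early, which is why the chain looks one bit shorter to an observer.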
Thus it can be appreciated that what is needed is a method to determine whether a scan register exhibits hold time defect behavior, to determine the number of potential hold time defects in a scan register, and, if possible, to locate the hold time defects within a scan register.