1. Technical Field
The present invention relates in general to the field of computers, and in particular to movement of data in state holding elements. Still more particularly, the present invention relates to a method and system for moving scan data through a data buffer using a reduced number of latches.
2. Description of the Related Art
Computing processor logic is typically made up of multiple clusters of processing logic and data latches that manipulate data according to machine instructions executed by the processing logic, or self-directed logic such as a programmable logic array (PLA) or a field programmable gate array (FPGA). A typical collection of logic and latches is shown in FIG. 1 as logic/latch array 100.
Logic/latch array 100 is made up of multiple state holding elements 102 (typically latches) and logics 104. Data bits are input into the top state holding elements 102, where the data bits are latched, and at a subsequent clock cycle are loaded into one or more logics 104. The results of the operations of the logics 104 are then output to one or more state holding elements 102, and so on until the final results are output at the bottom of the logic/latch array 100. A chip is composed of many such blocks of logic and latches.
When a chip is manufactured, it is commonly desirable to test whether any defects arose in the manufacturing process that may cause the chip to function differently than it would with defect-free manufacturing. A test program of data bits (a set of test vectors) input into the top of logic/latch array 100 will produce known, predicted results (output or result vectors) at the bottom of the logic/latch array 100 after a known number of clock cycles if the logic/latch array 100 is working properly. For a given block of logic, however, a prohibitively large number of vectors may be required to determine whether the logic/latch block is suitably free from defects. This large number of vectors can result from logic that responds only to a very specific set of inputs, often called random resistant logic.
One solution is to carefully choose the vectors so as to get high coverage. Another solution is to independently check smaller portions of the function. This can be accomplished by setting the state of the latches, clocking the system, and reading the results from the latches. The subfunctions between the latches should be less random resistant, and it should be easier to determine vectors for them that cover a given percentage of the faults.
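By way of illustration only, the random resistance described above can be sketched in a few lines of Python. The eight-input AND block, the stuck-at-0 output defect, and all function and variable names below are assumptions chosen for the sketch; they do not appear in the figures.

```python
from itertools import product

def good_block(bits):
    """Hypothetical defect-free block: logical AND of all inputs."""
    return int(all(bits))

def faulty_block(bits):
    """Same block with its output stuck at 0 (an assumed manufacturing defect)."""
    return 0

def detecting_vectors(n_inputs):
    """Return every input vector on which the good and faulty blocks disagree."""
    return [v for v in product((0, 1), repeat=n_inputs)
            if good_block(v) != faulty_block(v)]

N_INPUTS = 8                                # assumed block width
detecting = detecting_vectors(N_INPUTS)     # vectors that expose the defect
total_vectors = 2 ** N_INPUTS               # exhaustive test of the whole block

# If a latch splits the block into two 4-input subfunctions that can be set
# and read independently, exhaustively testing both halves needs far fewer vectors.
split_vectors = 2 * 2 ** (N_INPUTS // 2)

print(len(detecting), total_vectors, split_vectors)
```

In this sketch only the all-ones vector exposes the defect, so randomly chosen vectors are unlikely to find it, while splitting the function at an assumed intermediate latch shrinks the exhaustive vector count from 256 to 32.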
Checking such intermediate calculations utilizes techniques such as level-sensitive scan design (LSSD), generalized scan design (GSD) test techniques, or simple scan design test techniques that enable testing at all levels of VLSI circuit packaging. The principles of the LSSD technique are described, for example, in U.S. Pat. No. 3,783,254, No. 3,784,907 and No. 3,961,252, all to Eichelberger and incorporated in their entirety by reference.
FIG. 2a illustrates latch pairs 202, analogous to the state holding elements 102 shown in FIG. 1, that are used for scanning data out of a latch array 200 that holds intermediate results of operations performed by logics 104 as described above. (For purposes of clarity, note that FIG. 2a omits representations of logics 104 shown and described in FIG. 1.) To facilitate trustworthy scans, each latch pair 202 illustrated in FIG. 2a includes a master latch M202 and a slave latch S202. The slave latches S202 are necessary to ensure that data is not lost through timing mishaps that could occur if data bits were to be passed directly from a first master latch to a second master latch. During a scan-out process, a data bit in a first master latch is first scanned (latched) into a first slave latch, which then scans the data bit to a second master latch, which then passes the data bit to a second slave latch, and so on until the data bit safely scans (passes) through the entire latch array 200. As depicted in FIG. 2a, the latch array 200 of master latches M202 and slave latches S202 is under the clocking control of a first clock (A_clk) for the master latches M202 and a second clock (B_clk) for the slave latches S202. Thus, when a scan-out operation is performed, the data bits are scanned out in a serial serpentine manner as depicted, wherein the data bit in master latch M202-1 scans to slave latch S202-1, which scans the data bit to master latch M202-2, which scans the data bit to slave latch S202-2, and so on until the data bit is finally scanned out of latch array 200 through slave latch S202-x.
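The timing mishap that the slave latches guard against can be sketched with a deliberately simplified Python model. The model assumes the worst case, in which a clock pulse on a chain of transparent master latches stays high long enough for a value to ripple through every stage; the function names are hypothetical and the code is not a circuit-accurate simulation.

```python
def pulse_without_slaves(masters, scan_in):
    """One pulse on a chain of transparent master latches with no slaves.

    Assumed worst case: every latch is transparent at once, so the value at
    the scan input races through all stages and overwrites their contents.
    """
    scanned_out = masters[-1]
    return [scan_in] * len(masters), scanned_out

def pulse_with_slaves(masters, scan_in):
    """One B_clk pulse followed by one A_clk pulse on master/slave pairs."""
    # B_clk: each slave captures the bit held by its own master.
    slaves = list(masters)
    # A_clk: each master captures the bit held by the preceding slave,
    # and the first master captures the scan input.
    new_masters = [scan_in] + slaves[:-1]
    scanned_out = slaves[-1]
    return new_masters, scanned_out

chain = ["A", "B", "C", "D"]
raced, _ = pulse_without_slaves(chain, "z")    # bits A, B and C are lost
shifted, out = pulse_with_slaves(chain, "z")   # every bit advances one stage
print(raced, shifted, out)
```

Because the two clocks are never active together in this model, each bit advances exactly one latch pair per A_clk/B_clk cycle, which is the behavior the serpentine scan path relies on.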
Referring now to FIG. 2b, there is depicted a block diagram of four master/slave latch pairs being scanned out. Assume in FIG. 2b that instead of twenty master/slave latch pairs M202/S202, as depicted in FIG. 2a, there are only four master/slave latch pairs M202-1/S202-1 through M202-4/S202-4 in a First-In First-Out (FIFO) 206, as depicted. At initial time “T1”, input queue 208 holds data elements “w, x, y, z,” each master latch M202 holds a significant data bit (“A” through “D”, such as a result of an intermediate operation performed by some piece of logic), each slave latch S202 is empty or in a “don't care” state, and the output queue 210 is empty (or in a “don't care” state). At time “T2”, all the data bits are shifted into the available slave latches. Thus, data bit “A” scans from master latch M202-1 to slave latch S202-1, data bit “B” scans from master latch M202-2 to slave latch S202-2, data bit “C” scans from master latch M202-3 to slave latch S202-3, and data bit “D” scans from master latch M202-4 to slave latch S202-4.
Moving on to time “T3”, the data bits are shifted into the master latches either from slave latches or from the input queue. In addition, a data bit is shifted to the output queue. So, data bit “z” from input queue 208 shifts into master latch M202-1, data bit “A” scans from slave latch S202-1 into master latch M202-2, data bit “B” scans from slave latch S202-2 into master latch M202-3, data bit “C” scans from slave latch S202-3 into master latch M202-4, and data bit “D” scans from slave latch S202-4 into output queue 210. (Note that input queue 208 and output queue 210 may also have master/slave latch pairs (not shown) as depicted for FIFO 206.)
Continuing along the time line in FIG. 2b, significant data bits continue to be scanned out of FIFO 206 until time “T9”, at which time all of the leading data bits (w, x, y, z) originally in input queue 208 have been scanned into FIFO 206, and all of the significant data bits (A, B, C, D) have been scanned out of FIFO 206 into output queue 210.
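The time line from “T1” to “T9” can be reproduced with a short Python sketch. The sketch assumes that even-numbered times correspond to a B_clk pulse and odd-numbered times after “T1” to an A_clk pulse, and that the rightmost element of the input queue enters the FIFO first, as the description of “T3” suggests; the function name and the snapshot layout are hypothetical.

```python
def scan_timeline(masters, input_queue):
    """Return snapshots keyed by time label ("T1", "T2", ...).

    Each snapshot is (input queue, masters, slaves, output queue).
    None models an empty or "don't care" latch. A latch keeps its old value
    after passing it on; such stale copies are "don't care" as well.
    """
    masters = list(masters)
    pending = list(input_queue)        # rightmost element is scanned in first
    slaves = [None] * len(masters)
    output = []
    step = 1

    def snapshot():
        return tuple(pending), tuple(masters), tuple(slaves), tuple(output)

    history = {f"T{step}": snapshot()}
    for _ in range(len(masters)):
        # B_clk: every master bit is copied into its own slave latch.
        slaves = list(masters)
        step += 1
        history[f"T{step}"] = snapshot()
        # A_clk: the last slave feeds the output queue, every other slave
        # feeds the next master, and the input queue feeds the first master.
        output.append(slaves[-1])
        masters = [pending.pop()] + slaves[:-1]
        step += 1
        history[f"T{step}"] = snapshot()
    return history

history = scan_timeline("ABCD", "wxyz")
for label, state in history.items():
    print(label, state)
```

With four latch pairs the sketch needs four B_clk/A_clk cycles, so the final snapshot falls at “T9”, with w, x, y and z in the master latches and D, C, B and A in the output queue in order of arrival.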
The main purpose of the slave latches S202 depicted in FIGS. 2a and 2b is to ensure that data is properly passed and scanned from master latch M202 to subsequent master latch M202 without being lost. However, as pipelines get finer and the number of latches increases, the use of pulse latches and merged logic latches, which do not have an already available slave latch, may become more common. Adding a dedicated slave latch to each master latch that does not already have one becomes very costly in terms of chip space and power consumption. Therefore, there is a need for a method and system of data scanning that do not require slave latches dedicated to each master latch in a logic/latch matrix.