1. Technical Field
The present application relates generally to an improved data processing system and method. More specifically, the present application is directed to a system and method for reducing test time for loading and executing an architecture verification program for a system-on-a-chip (SoC).
2. Description of Related Art
When a microprocessor or system-on-a-chip (SoC) is designed, it is important that the resulting semiconductor integrated circuit chip (IC chip) be tested to ensure proper functioning. Testing the IC chip requires applying test patterns to the IC chip and examining the results of the chip logic operating on those patterns. To introduce patterns to logic fed by memory elements (e.g., latches or flip-flops), scan techniques (such as Level-Sensitive Scan Design (LSSD), boundary scan, etc.) are often used, wherein the memory elements on the chip are connected to each other in one or more scan chains, such that test patterns can be loaded via the scan chain and applied to the logic under test. Similarly, the scan chain can be used to read out the results of logic that feeds the memory elements.
As the scale of semiconductor integrated circuit integration keeps increasing, devising methodologies and circuits for testing IC chips becomes more and more challenging. A methodology that is presently in wide use for testing IC chips is the LSSD methodology, which utilizes boundary scan shift register latches (SRLs) to scan test data into the circuitry under test and to scan out the output of that circuitry. The scanned output is then compared to a set of expected data outputs to determine whether or not the circuitry is functioning properly.
FIG. 1 illustrates a conventional LSSD methodology 20 that utilizes a scan chain 24 of SRLs 28 and three LSSD-dedicated clock trees, namely an A-clock tree 32, a B-clock tree 36, and a C-clock tree 40, for scanning test data into combinational logic or other circuitry (not shown). Each SRL 28 generally includes a master latch 44 and a slave latch 48. Each master latch 44 can be, for example, a two-port latch having one data port D1 and one scan-in port SI. Conventionally, C-clock tree 40 is for a C-clock (not shown), or data clock, that activates data ports D1; A-clock tree 32 is for an A-clock (not shown), or shift clock, that activates scan-in ports SI of master latches 44; and B-clock tree 36 is for a B-clock (not shown), or slave latch clock, that activates slave latches 48 after master latches 44 have latched the corresponding shift values.
During LSSD testing, the A-clock and B-clock are non-overlapping and enable the proper shifting of scan data into master latch 44 of each SRL 28 and out of data output port DO of each slave latch 48. During the test's system cycle phase, the B-clock launches the test data from slave latch 48. A subsequent C-clock pulse captures the test response in all of SRLs 28.
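The shifting behavior described above can be illustrated with a minimal bit-level sketch. It is a simplified model only, under stated assumptions: the names SRL and scan_shift are hypothetical and do not appear in the figures, data port D1 and the C-clock capture are omitted, and each shift is modeled as one A-clock pulse followed by one non-overlapping B-clock pulse.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SRL:
    """Hypothetical model of one shift register latch (master plus slave)."""
    master: int = 0
    slave: int = 0


def scan_shift(chain: List[SRL], scan_in_bits: List[int]) -> List[int]:
    """Shift bits into the chain and return the bits seen at scan-out.

    Assumption: one A-clock pulse followed by one B-clock pulse per bit.
    """
    scan_out = []
    for bit in scan_in_bits:
        # The scan-out pin shows the last slave latch before this shift.
        scan_out.append(chain[-1].slave)

        # A-clock: every master latches its scan-in port, which is fed by
        # the previous SRL's slave (or by the external scan-in pin).
        source = bit
        for srl in chain:
            srl.master = source
            source = srl.slave  # slaves do not change during the A-clock

        # B-clock: every slave latches the value held by its master.
        for srl in chain:
            srl.slave = srl.master
    return scan_out


# Usage: load a 4-bit pattern, then shift it back out.
chain = [SRL() for _ in range(4)]
pattern = [1, 0, 1, 1]
scan_shift(chain, pattern)
unloaded = scan_shift(chain, [0, 0, 0, 0])
```

In this model, a chain of length L needs L shift cycles to be fully loaded or fully unloaded, which is the per-scan cost that matters when the scan chain is used to load memory arrays.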
In addition to LSSD clock trees 32, 36, and 40, a functional clock tree 52 is present for providing SRLs 28 with a clock for functional operation, as opposed to test operation, of the SRLs. Clock trees 36, 40, and 52 are typically connected to SRLs 28 via one or more clock splitters 56.
Such scan chains may be used, with CPU-based SoCs, to load the array data of architecture verification programs (AVPs) into a cache memory of the IC chip when the chip is on a tester. The CPU instructions of the AVP may then be executed from the on-chip cache memory. The pass/fail criteria of the AVP determine whether the chip is good or bad.
A significant amount of tester time is consumed in loading the AVP into the cache memory and unloading the pass/fail data from the cache memory. The number of clock cycles for which the AVP actually executes is extremely small in comparison to the number of clock cycles needed to load and unload the cache memory. This is because the AVP data must be scanned into the cache memory array, and programmers typically do not concern themselves with the order in which such AVP data is scanned into the cache memory array or with the scan chains used to scan in this AVP data. Because each memory location of the cache memory that needs to be modified to load the AVP is loaded by a complete scan of the scan chain, a large number of clock cycles may be needed to scan the AVP data into the cache memory.
For example, assume that the scan chain has a length L. Also assume that there are M memory array locations that need to be modified to load the AVP. Conventionally, the total number of scan clocks needed to load the AVP into the M locations of the cache memory would be M*L. For a typical AVP, M is approximately 1200 and L is approximately 3000. As a result, the total number of clock cycles required to scan in the AVP is approximately 3,600,000 cycles. A similarly large number of cycles may be required to unload pass/fail data from the cache memory for similar reasons, i.e., each memory location in the cache memory that contains the pass/fail data must be individually unloaded with a full scan of the scan chain.
The amount of time an IC chip spends on the tester contributes directly to the cost of developing the IC chip and adds to the end cost of the chip. Thus, a reduction in the amount of time an IC chip spends on the tester can provide significant cost savings in the development and production of IC chips.
These problems with loading/unloading of the cache memory are made even more severe when there are multiple memory arrays on the chip that need to be loaded with an AVP, such as in multicore IC chips. Thus, it would be beneficial to reduce the number of clock cycles required to load/unload AVP data in cache memory arrays and thereby reduce the amount of time a chip spends on the tester.