In recent years, System-on-Chip (SoC) designs have evolved to include multiple cores to achieve several goals. This trend has been increasingly observed in graphics and gaming processors as well as in conventional microprocessor designs. Multiple processor cores on a single chip provide performance benefits, additional multitasking capabilities, and dynamic repair opportunities, which have the additional benefit of increased yield. They also present an opportunity to greatly reduce the test data volume and test time required to test the chip. This is done by using the embedded core testing methodology, which completely isolates the cores via wrapper chains, and then using a broadcast-based Test Access Mechanism (TAM) to test all the cores in parallel. FIG. 1 illustrates an example SoC that has several identical cores, each having two internal chains. In this example, the test input stimuli (which are the same for each core, since the cores are identical) can be fed into the chip using two chip level input pins 110 and then broadcast internally to all the cores.
On the output side, the test response from each core is expected to be the same if there is no defect, since the cores are identical and isolated. The mask data, which indicates which channels and cycles should not be observed due to simulation unknowns, and the expected test response data can be fed into the chip using chip level input pins 120 and 130, respectively. This data can then be internally broadcast to each core for masking (using the AND gates 160) and for on-chip comparison (using the XOR gates 150) with the test response of the circuit under test (CUT). The comparison results from the same channel of each core can be logically ORed together (using the OR gates 170) to produce one pass/fail signal per core level channel, and these signals are brought out through two chip level output pins 140. The output pins 140 are observed during each shift cycle, with a logic value “1” indicating a failure in a particular cycle and a particular core level channel.
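The output-side logic described above can be sketched in software for a single shift cycle. This is only an illustrative model, not the circuit of FIG. 1 itself: the function name `tam_output_pins` is hypothetical, the mask polarity (1 = observe, 0 = masked) is an assumption chosen to match AND-gate masking, and the mask is assumed to be applied to the comparison result.

```python
from typing import List


def tam_output_pins(core_responses: List[List[int]],
                    expected: List[int],
                    mask: List[int]) -> List[int]:
    """Model one shift cycle of the broadcast compare-and-OR output logic.

    core_responses[c][ch] is the response bit of core c on channel ch.
    expected[ch] is the broadcast expected bit for channel ch.
    mask[ch] is assumed to be 1 to observe the channel and 0 to mask it.
    Returns one pass/fail bit per core level channel (1 = failure).
    """
    pins = []
    for ch in range(len(expected)):
        # XOR compares each core's bit with the expected bit,
        # AND suppresses positions that must not be observed.
        per_core_fail = [(resp[ch] ^ expected[ch]) & mask[ch]
                         for resp in core_responses]
        # OR across cores collapses the results into one chip level pin.
        pins.append(int(any(per_core_fail)))
    return pins


expected = [1, 0]
all_good = [[1, 0] for _ in range(4)]            # four identical, defect-free cores
one_bad = [[1, 0], [1, 0], [1, 1], [1, 0]]       # third core fails on channel 1

pins_good = tam_output_pins(all_good, expected, [1, 1])
pins_bad = tam_output_pins(one_bad, expected, [1, 1])
pins_masked = tam_output_pins(one_bad, expected, [1, 0])
```

In this model a mismatch in any core raises the pin for its channel, while masking channel 1 hides the same mismatch.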
For this TAM, the number of chip level pins (and hence tester channels) is a constant that does not scale with the number of cores. Furthermore, the test data volume and test time are the same as those for a single core. Finally, since the test data consists of translated core level patterns, automatic test pattern generation (ATPG) need only be run at the core level.
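The scaling argument can be made concrete with a small calculation. The sketch below rests on assumptions that are not stated in the text: each of the four pin groups (stimulus 110, mask 120, expected response 130, pass/fail 140) is assumed to use one pin per core level channel, control pins such as clocks and scan enable are ignored, and the function name `broadcast_tam_cost` is hypothetical.

```python
def broadcast_tam_cost(num_cores: int,
                       channels_per_core: int,
                       shift_cycles: int) -> dict:
    """Estimate tester resources for the broadcast TAM under the stated assumptions."""
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")

    # Assumption: stimulus, mask, expected and pass/fail groups
    # each need one chip level pin per core level channel.
    pin_groups = 4
    chip_pins = pin_groups * channels_per_core

    # Stimulus, mask and expected data are each one bit per channel per cycle.
    test_data_bits = 3 * channels_per_core * shift_cycles

    # num_cores does not enter either expression: every core receives the
    # same broadcast data and all cores shift in parallel.
    return {
        "chip_pins": chip_pins,
        "shift_cycles": shift_cycles,
        "test_data_bits": test_data_bits,
    }


one_core = broadcast_tam_cost(num_cores=1, channels_per_core=2, shift_cycles=1000)
eight_cores = broadcast_tam_cost(num_cores=8, channels_per_core=2, shift_cycles=1000)
```

Under these assumptions the result for eight cores is identical to the result for one core, which is the property claimed for this TAM.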
While the test methodology illustrated in FIG. 1 is sufficient for pass/fail testing of chips, it presents a significant challenge for failure diagnosis. The shortcoming lies on the output side of the TAM. Observations on the output pins can be used to determine which cycle and which core level channel have a failure. It is, however, not possible to determine the failing core from this information, because the per-core comparison results are ORed together before they reach the output pins. Furthermore, since the cores are exercised with the exact same patterns, the same defect location will behave identically in every core. These two aspects together imply that diagnosis will not be able to distinguish between defects in different cores, thereby resulting in poor diagnosis resolution.
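The loss of core identity can be illustrated with a minimal simulation of one shift cycle. This is a sketch under assumptions, not a model of any particular design: the helper names `observe_pins` and `inject_fault` are hypothetical, a defect is modeled as a single flipped response bit, and the mask polarity (1 = observe) is assumed.

```python
from typing import List


def observe_pins(core_responses: List[List[int]],
                 expected: List[int],
                 mask: List[int]) -> List[int]:
    """Compare, mask, and OR across cores; returns one bit per channel (1 = failure)."""
    return [
        int(any((resp[ch] ^ expected[ch]) & mask[ch] for resp in core_responses))
        for ch in range(len(expected))
    ]


def inject_fault(num_cores: int, faulty_core: int,
                 good_response: List[int], faulty_channel: int) -> List[List[int]]:
    """Build per-core responses in which exactly one core has one flipped bit."""
    responses = [list(good_response) for _ in range(num_cores)]
    responses[faulty_core][faulty_channel] ^= 1
    return responses


good = [1, 0]   # defect-free response, also used as the expected data
mask = [1, 1]   # observe both channels

# The same defect location placed in two different cores.
pins_if_core0 = observe_pins(inject_fault(4, 0, good, 1), good, mask)
pins_if_core3 = observe_pins(inject_fault(4, 3, good, 1), good, mask)

# The tester sees identical values in both cases.
indistinguishable = pins_if_core0 == pins_if_core3
```

Both cases flag channel 1 in the same cycle, so the output pins alone carry no information about which core contains the defect.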
It would therefore be desirable to find TAM solutions that have one or more of the following properties: 1) the number of observation pins required does not scale with the number of cores and can practically be kept constant; 2) the hardware overhead is minimal; 3) the test-time overhead is also minimal with only a few additional shift cycles required at the end of the test session; 4) there is no addition to the test data volume; 5) the TAM is independent of core DFT architecture; and 6) the diagnostic resolution can easily be increased by adding more chip output pins if so desired.