It has become very difficult to diagnose failures in and to measure the performance of state-of-the-art very large scale integration (VLSI) chips. This is because modern VLSI chips often not only run at very high clock speeds, but many of them also execute instructions in parallel, out of program order and speculatively. Moreover, visibility of the VLSI chip's inner state has become increasingly limited due to the complexity of the VLSI chips and to practical constraints on the number of external pads that can be provided on the chip package.
In the past, the traditional failure diagnosis and performance measurement tools have been external logic analyzers and in-circuit emulators. Logic analyzers are capable of monitoring signals on the chip pads and other externally-accessible system signals, capturing the state of these signals and generating triggers based on their states. Unfortunately, logic analyzers must rely solely on externally-accessible signals to accomplish this, not on signals that are internal to the chip itself. In-circuit emulators, on the other hand, are used to mimic the functional characteristics of a new VLSI chip in a system environment and to add visibility to certain data values within the VLSI chip. But such devices only emulate the functionality of the VLSI chip. By their very nature, they cannot give an accurate representation of the performance characteristics of an actual silicon device. Therefore, they are primarily useful only for developing and debugging system software.
Thus, as an alternative or supplement to system emulation, confirmation of operation of an integrated circuit, such as a microprocessor, application specific integrated circuit (ASIC) or similar device, is accomplished using the actual, fabricated device, i.e., the device as produced in “silicon.” By applying test signals to the actual device and monitoring its operation, a developer or manufacturer can confirm both logic and electrical functions. Likewise, any problems identified must be debugged and remedied. Visibility inside a chip therefore becomes of paramount importance to address, debug, and correct functional, logical and/or electrical problems.
A certain level of visibility within the chip is provided by external interfaces of the chip. External interfaces can come in several different types. Debug information can be fed out of the chip on the bus interface on unused cycles, or in unused fields on a given cycle. Additionally, prior solutions have provided dedicated pins on the chip to act as a debug port to give visibility into the chip. The pin can be directly attached to a point within the chip for a reading, or the pin can provide the ability to mux out important internal information on the dedicated pin. When internal chip information is accessed from either the bus interface or a dedicated pin, a logic analyzer or other monitoring device is required to access the information. The use of a logic analyzer or other monitoring device brings with it several disadvantages. First, extra pins dedicated to debug functionality are required in the chip package. These extra pins increase overall costs, and the use of these pins for debug functionality competes with chip functionality, resulting in less functionality being included within the chip. Second, designing and verifying the software to be used with the logic analyzer or other monitoring device to read and interpret the information obtained from the chip is very costly. Finally, attaching the logic analyzer or other monitoring device to the chip to acquire the information competes directly with the proper placement of the chip within the computer system. These difficulties result in longer chip debug schedules.
The number of pins dedicated to debug operations may be reduced by allowing the logic analyzer or other monitoring device to inform the chip as to what debug information is of interest. This can be accomplished by providing elaborate trigger mechanisms within the chip which collapse down to a single trigger out signal to the external logic analyzer. In addition to reducing the number of pins dedicated to debug operations, this solution may also provide earlier insight into the chip logic by allowing the logic analyzer to inform the chip of the area of interest. While this solution does reduce the number of pins dedicated to debug operations, several debug port pins are still required in the design, and verification of the logic analyzer software is still costly. Additionally, the trigger solution limits the internal trigger to the specific set of internal source nodes that have been predefined and implemented.
Another method of providing information for debugging operations consists of the use of shadow registers. Shadow registers allow an internal trigger or an external trigger fed into the chip to permit the capture of a limited set of information into the shadow registers. The information contained within the shadow registers then can be accessed through the IEEE 1149.1 port (or other scanport or software port) without impacting the normal operations of the chip. Shadow registers are implemented within the chip with the addition of shadow flops. The location of the shadow flops must be predetermined during the chip design phase. Area constraints limit the number of shadow flops that can be placed in the design, and the shadow flops tend to be expensive. Debug operations through the use of shadow registers are therefore limited by the number of shadow flops that can be included in the design and by the difficulty of selecting, in advance, the places where shadow flops are most likely to be needed to provide visibility into the functionality of the chip. This problem is further exacerbated if multiple cycles of information are required for a node or a bus in order to perform debug operations. The storing of data from multiple registers would require additional flops to be included in the circuit design. However, if a problem were repeatable, multiple cycles of information could be obtained from the chip by successive iterations in which debug information for sequential cycles is obtained. Data obtained in this manner can be interpreted as a virtual logic analyzer trace of the shadow flop locations. While all shadow flops within the chip can be viewed in this manner, if the problem being debugged is not repeatable, this process cannot be used.
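By way of illustration only, the iterative capture described above may be modeled in software. The following Python sketch assumes a perfectly repeatable run, represented by a hypothetical list of node values indexed by cycle. The function names run_and_capture and virtual_trace are illustrative assumptions and do not correspond to any particular tool or prior system.

```python
# Illustrative model only: a repeatable run is represented as a list of node
# values indexed by clock cycle. All names here are hypothetical.

def run_and_capture(node_values, capture_cycle):
    """Model one run of the chip: the shadow flop captures a single cycle."""
    return node_values[capture_cycle]

def virtual_trace(node_values, start_cycle, num_cycles):
    """Rebuild a multi-cycle trace by rerunning once per cycle of interest."""
    return [
        run_and_capture(node_values, start_cycle + offset)
        for offset in range(num_cycles)
    ]

# Hypothetical node behavior over eight cycles of a repeatable failure.
node_values = [0, 1, 1, 0, 1, 0, 0, 1]
window = virtual_trace(node_values, start_cycle=2, num_cycles=4)
print(window)
```

Each entry of the resulting list costs one complete run, which is why the approach depends on the failure recurring identically on every iteration.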
Alternatively, a trigger can be used to halt the clock of the chip and thereby “freeze” all of the information within the chip components. All the flops on the normal internal scan chain can then be scanned out and the debug information acquired in this manner. Typically, a large number of scannable flops are included in the chip design, and this inclusion results in only a small increase in the area required. However, in order to perform debug operations in this manner, the clock must be repeatedly halted, which interrupts normal operation of the chip during the debug operation. Additionally, repeatability of the system is still required to debug problems which involve multiple cycles of information.
Prior methods and devices have attempted to address testing of very large scale integrated (VLSI) circuits by incorporating testing circuitry into the chip. For example, the IEEE 1149.1 standard specifies a four or five wire serial test bus requiring one pin each for test data in, test data out, test mode select, a test pulse or clock signal and an optional test reset. Because it is serial, this interface is typically limited to providing one bit of test data out for every clock cycle of the unit under test. Thus, capture of test information is limited.
Another system is described in U.S. Pat. No. 5,867,644, issued Feb. 2, 1999, to Ranson, et al., incorporated herein in its entirety by reference, which discloses user-configurable diagnostic hardware contained on-chip with a microprocessor for debugging and monitoring the performance of the microprocessor. A programmable state machine is coupled to on-chip and off-chip input sources. The state machine may be programmed to look for signal patterns presented by the input sources, and to respond to the occurrence of a defined pattern (or sequence of defined patterns) by driving certain control information onto a state machine output bus. On-chip devices coupled to the output bus take user-definable actions as dictated by the bus. The input sources include user-configurable comparators located within the functional blocks of the microprocessor. The comparators are coupled to storage elements within the microprocessor, and are configured to monitor nodes to determine whether the state of the nodes matches the data contained in the storage elements. By changing data in the storage elements, the programmer may change the information against which the state of the nodes is compared and also the method by which the comparison is made. The output devices include counters having outputs that may be used as state machine inputs, so one event may be defined as a function of a different event having occurred a certain number of times. The output devices also include circuitry for generating internal and external triggers. User-configurable multiplexer circuitry may be used to route user-selectable signals from within the microprocessor to the chip's output pads, and to select various internal signals to be used as state machine inputs.
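By way of illustration only, the interaction of a comparator and a counter of the kind described above may be sketched in Python. The masked-equality comparison, the function names comparator and trigger_cycle, and the sample values are assumptions made for this sketch; they are not taken from the referenced patent.

```python
# Illustrative model only: a comparator checks a node value against a stored
# reference, and a counter fires a trigger after a set number of matches.
# The masked-equality comparison and all names are assumptions.

def comparator(node_value, reference, mask):
    """Return True when the masked node value equals the masked reference."""
    return (node_value & mask) == (reference & mask)

def trigger_cycle(samples, reference, mask, required_matches):
    """Return the cycle at which the match count reaches required_matches."""
    count = 0
    for cycle, value in enumerate(samples):
        if comparator(value, reference, mask):
            count += 1
            if count == required_matches:
                return cycle
    return None  # the pattern never occurred often enough

# Hypothetical node samples; match on the low four bits equal to 0x5.
samples = [0x12, 0xA5, 0x35, 0xA5, 0xF5]
print(trigger_cycle(samples, reference=0x05, mask=0x0F, required_matches=3))
```

Changing the stored reference or mask changes what is matched without altering the monitored node, which mirrors the role of the storage elements described above.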
Another solution to chip testing is presented in U.S. Pat. No. 6,003,107, issued Dec. 14, 1999, to Ranson, et al., and incorporated herein in its entirety by reference. This patent describes circuitry for providing external access to signals that are internal to an integrated circuit chip package. The circuitry includes N:1 multiplexers distributed throughout the integrated circuit die. Each of the multiplexers has its N inputs coupled to a nearby set of N nodes within the integrated circuit, and each of the multiplexers is coupled to a source of select information operable to select one node from the set of N nodes for external access. The multiplexer outputs are coupled to an externally-accessible chip pad. The integrated circuit is typically a microprocessor, and the source of select information may include a storage element of the microprocessor. If so, additional circuitry may be provided for writing data from a register of the microprocessor to the storage element using one or more microprocessor instructions. Each multiplexer may be coupled to a different source of select information, or all multiplexers may be coupled to the same select information. A fixed set of interconnect traces may be provided to couple a fixed set of nodes to an additional set of externally-accessible chip pads. One or more M:1 multiplexers may also be provided, having their M inputs coupled to M different outputs of the N:1 multiplexers. Each of the M:1 multiplexers may be coupled to a second source of select information. Preferably, the outputs of the M:1 multiplexers will be coupled to circuitry for facilitating debug and performance monitoring of the integrated circuit.
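By way of illustration only, the two-level selection described above may be modeled in Python. The sketch assumes three hypothetical groups of four nodes each, one N:1 multiplexer per group, and a single M:1 multiplexer over their outputs. The names mux and observe and the node values are assumptions made for this sketch.

```python
# Illustrative model only: N:1 multiplexers each pick one node from a nearby
# group, and an M:1 multiplexer picks one of their outputs. Names and values
# are hypothetical.

def mux(inputs, select):
    """Return the input chosen by the select value."""
    return inputs[select]

def observe(node_groups, n_selects, m_select):
    """Route one internal node to the observation point through two levels."""
    first_level = [
        mux(group, select) for group, select in zip(node_groups, n_selects)
    ]
    return mux(first_level, m_select)

# Three groups of four nodes (N = 4, M = 3).
node_groups = [
    [0, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
print(observe(node_groups, n_selects=[1, 2, 3], m_select=1))
```

Only one node per group, and one group per M:1 multiplexer, is visible at a time; observing a different node requires rewriting the select information.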
However, these systems require some compromise between on-chip storage capabilities, pins available for providing signal and data samples, and multiplexing required to provide plural outputs on each available pin. For example, a large on-chip storage capability means additional chip area dedicated to functions that may not be used or even made available after debugging is complete and circuit operation is verified. Even when chip area can be spared, the stored test results must be accessed by providing some combination of pins and clock cycles to multiplex the results out onto the pins. As the number of pins available is often substantially less than the number of parameters, signals, and data bits to be provided, the onboard test memory must act as a buffer. However, as the circuit under test continues to operate and test results are generated more quickly than the available pins can output them, the memory will continue to fill until, eventually, an overflow condition is reached, data is lost, and circuit operation is compromised.
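By way of illustration only, the overflow condition described above may be estimated with simple arithmetic. The Python sketch below assumes constant generation and drain rates expressed in bits per clock cycle. The function name cycles_until_overflow and the example figures are hypothetical and are not measurements of any actual device.

```python
# Illustrative arithmetic only: constant rates are assumed, and the figures
# used in the example are hypothetical.

def cycles_until_overflow(capacity_bits, generated_per_cycle, drained_per_cycle):
    """Return the first cycle at which buffered data exceeds capacity.

    Returns None when the pins drain data at least as fast as it is produced.
    """
    net_fill = generated_per_cycle - drained_per_cycle
    if net_fill <= 0:
        return None
    # Occupancy after k cycles is k * net_fill; find the first k over capacity.
    return capacity_bits // net_fill + 1

# A 4096-bit buffer, 64 bits of test data per cycle, one serial output pin.
print(cycles_until_overflow(4096, generated_per_cycle=64, drained_per_cycle=1))
```

Under these assumed figures the buffer overflows within a few dozen cycles, and enlarging the memory only delays the overflow in proportion to the added capacity.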
Accordingly, a need exists for a way to collect test and debug data from an integrated circuit without requiring an inordinate number of test output pins or on-chip test memory size.