Integrated circuits, including devices incorporating embedded processors, require substantial testing (“debugging”) in order to assure proper functioning. Tracing is an often-used embedded processor debugging technique that involves capturing and analyzing data and/or program (“trace”) information generated within the processor core, and then transmitting the trace information through selected pins of the embedded processor device to a test (debug or “emulator”) system using a special interface (e.g., a special printed circuit board (PCB) having a socket). Trace operations are generally characterized as either static (post-process) trace operations, or dynamic (real-time) trace operations. Static tracing typically includes writing the trace information into a special on-chip memory while the program is being executed, and then off-loading the trace information after execution is completed. Real-time tracing involves temporarily storing trace information in a relatively small output buffer (e.g., a First-In, First-Out (FIFO) memory structure), and transmitting the trace information from the output buffer through associated device pins to an external debug system (e.g., a computer or workstation running appropriate debug software) while a program is being executed.
Although both real-time and post-process trace operations have beneficial aspects, the main advantage of real-time tracing over post-process tracing is that it facilitates smaller device size. Unlike static trace operations, which require a special on-chip memory, real-time trace operations transmit trace data off of the embedded processor device while the program is being executed, so that no such memory is needed. Further, unlike static tracing, where the size of the special on-chip memory limits the amount of trace information that can be generated during a trace operation, the amount of trace information generated during real-time trace operations is theoretically unlimited. With static tracing, the only way to increase the amount of post-process trace information is to enlarge the special on-chip memory, which further increases chip size.
Despite the advantages of real-time trace operations over static trace operations, practical limitations exist that constrain the use of real-time tracing in some modern embedded processor devices. One such limitation is a possible mismatch between the rate at which trace information is generated by the processor core, and the rate at which the trace information is transmitted from the embedded processor to an external debug system. That is, modern embedded processors have internal clocking speeds of 400 MHz or more, which is often two, four, or more times faster than the transmission/processing speed of an external debug system. When a burst of trace information is too large and is generated faster than it can be off-loaded to the external debug system, the output buffer fills and a buffer “over-run” error occurs, after which subsequently generated trace information is unusable.
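The rate mismatch described above can be illustrated with a minimal sketch. It is not taken from any particular device: the function name simulate_trace_fifo, its parameters, and the simplifying assumptions (the core generates exactly one trace word per core cycle during a burst, and the external port removes one word every few core cycles) are hypothetical and chosen only for illustration.

```python
def simulate_trace_fifo(burst_words, fifo_depth, core_cycles_per_output):
    """Return the core cycle at which the output FIFO over-runs, or None.

    Illustrative assumptions (not from any specific device):
      - the core produces one trace word per core cycle during the burst;
      - the debug port removes one word every `core_cycles_per_output`
        core cycles (e.g. 4 for a port running at one quarter core speed).
    """
    occupancy = 0  # number of words currently held in the FIFO
    for cycle in range(burst_words):
        # Drain side: the slower external port takes a word only on its own clock.
        if occupancy > 0 and cycle % core_cycles_per_output == 0:
            occupancy -= 1
        # Fill side: a new word arrives; if the FIFO is full, this is an over-run.
        if occupancy == fifo_depth:
            return cycle
        occupancy += 1
    return None  # the whole burst fit without over-running


if __name__ == "__main__":
    # A 64-word burst, a 16-word FIFO, and a port four times slower than the core.
    print(simulate_trace_fifo(64, 16, 4))
    # The same FIFO with a burst short enough to be absorbed.
    print(simulate_trace_fifo(16, 16, 4))
```

Under these assumptions the FIFO gains roughly three words for every four core cycles of a burst, so a sufficiently long burst over-runs a FIFO of any fixed depth, while a short burst or a port running at core speed does not.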
Two practical solutions to the buffer over-run problem associated with conventional embedded processor devices are to increase the size of the output buffer, and to increase the output rate from the output buffer by off-loading multiple trace information “words” in parallel. However, increasing the size of the output buffer undesirably increases chip size/cost, and only partially addresses the buffer over-run problem in that the output buffer can still be overwhelmed if large amounts of trace data are generated in a relatively short burst. In addition, increasing the output rate from the output buffer requires increasing the number of device pins dedicated to trace operations, which may not be possible in some embedded processor devices. That is, unlike static trace operations in which stored trace information can be transmitted serially, for example, through standard JTAG pins, real-time trace operations typically require a relatively large number of dedicated device pins to transmit trace information to an external debug system at or near the processor core frequency. With the recent trend toward 64-bit (or more) embedded processors having processor core frequencies of 400 MHz or more, an embedded processor designer must make a difficult choice between using device pins for debug operations and using them for “normal operations”, and in some cases may not have sufficient pins to transmit real-time trace information. Although compression techniques such as those associated with IEEE-ISTO 5001™-1999 (the “Nexus 5001 Forum™ Standard”) have been used to reduce the demand for dedicated pins by reducing the amount of off-loaded trace information, these conventional compression techniques provide insufficient control over trace operations in many embedded processor applications, thereby leading to buffer over-runs that produce unusable trace information.
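The pin-count pressure described above can be made concrete with a rough calculation. The sketch below is an assumption-laden estimate, not a description of any real trace port: the helper name trace_pins_needed, the idea that each pin carries one bit per port clock, and the words_per_core_cycle parameter (a stand-in for how much trace is generated, or how far it has been reduced by compression) are all hypothetical.

```python
import math


def trace_pins_needed(word_bits, core_mhz, port_mhz, words_per_core_cycle=1.0):
    """Estimate the pins needed to off-load trace data as fast as it is generated.

    Illustrative assumptions (hypothetical, for a back-of-envelope estimate):
      - each trace pin carries one bit per port clock;
      - trace is generated at a sustained `words_per_core_cycle` rate;
      - no protocol overhead is added on the port.
    """
    # Trace generated by the core, in megabits per second.
    generated_mbit_per_s = word_bits * words_per_core_cycle * core_mhz
    # Each pin moves `port_mhz` megabits per second under the assumptions above.
    return math.ceil(generated_mbit_per_s / port_mhz)


if __name__ == "__main__":
    # 64-bit words at a 400 MHz core, off-loaded through a 100 MHz port.
    print(trace_pins_needed(64, 400, 100))
    # The same core when only one word is produced every four cycles on average.
    print(trace_pins_needed(64, 400, 100, words_per_core_cycle=0.25))
```

Under these assumptions a 64-bit, 400 MHz core feeding a port four times slower would need several hundred pins to keep up with uncompressed trace, and reducing the trace volume to a quarter still leaves a 64-pin requirement, which suggests why neither wider ports nor data reduction alone is an easy remedy.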
What is needed is a configurable trace port for embedded processors that avoids the buffer over-run problems associated with conventional real-time trace circuits. What is also needed is a configurable trace port that supports a wide range of embedded processor devices and debug systems.