1. Field of the Invention
The present invention is directed in general to data processing systems. In one aspect, the present invention relates to a tracing mechanism and methodology for debugging data processing systems.
2. Description of the Related Art
Debugging processes often use tracing techniques to capture and analyze data and/or program information (referred to as “trace” information) for purposes of understanding the program flow and its memory operations. The trace information is typically obtained from a data processing system with an external test (debug or “emulator”) system which uses a debug communication protocol to communicate trace information from the data processing system through selected pins of the data processing system to the external test system using a special interface (e.g., a special printed circuit board (PCB) having a socket). Providing debug information in real-time, without intrusion on the normal operation of the data processing system, is highly desirable in order for the actual debug operations to remain transparent to operation of the system.
One example of a debug communications protocol is the IEEE ISTO-5001 NEXUS debug standard which is used by a debugger operably coupled to the data processor undergoing debug. The NEXUS debug standard defines a number of debug capabilities to monitor program execution by providing visibility into program flow and data flow. This visibility consists of a sequence of information messages provided over a dedicated multi-bit or multi-terminal serial interface or auxiliary port to an external development system. Program flow messages are then combined with a static image of the program to reconstruct the actual instruction execution sequence of the data processor under test. Data flow messages track processor reads and writes to pre-defined address ranges.
In a conventionally designed processor, data trace information is obtained by snooping the system bus for qualified memory transactions. For processors with a cache memory hierarchy, data trace with visibility beyond the cache is required to provide a correct representation of the memory operations in the instruction flow. Typically, the transactions between the processor and cache memory management unit are observed, and qualified data accesses are traced. In both scenarios, the data trace can be correlated with the instruction trace by providing program correlation information at the event of the data trace. To this end, the NEXUS debug standard provides a Program Correlation Message (PCM) which identifies a qualified data trace access by inserting into the instruction trace the corresponding instruction count between the last branch instruction and the qualified data trace access, thereby enabling the instruction trace and data trace to be correlated.
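By way of illustration only, the correlation described above can be sketched as follows. The record types `BranchMessage` and `CorrelationMessage`, the field `icnt`, and the fixed 4-byte instruction width are assumptions made for this sketch; they are not the message formats defined by the NEXUS standard. Here `icnt` is taken to be the number of instructions executed sequentially after the last branch target before the instruction that performs the qualified data access.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

INSTR_SIZE = 4  # assumption: fixed-width 4-byte instructions


@dataclass
class BranchMessage:
    """Hypothetical instruction-trace record: target address of a taken branch."""
    target: int


@dataclass
class CorrelationMessage:
    """Hypothetical stand-in for a program correlation record."""
    icnt: int          # instructions executed since the last branch target
    data_address: int  # address of the qualified data access


TraceMessage = Union[BranchMessage, CorrelationMessage]


def correlate(trace: List[TraceMessage]) -> List[Tuple[int, int]]:
    """Pair each qualified data access with the address of its instruction."""
    pairs = []
    last_target = None
    for msg in trace:
        if isinstance(msg, BranchMessage):
            # A new branch resets the reference point for instruction counts.
            last_target = msg.target
        elif last_target is not None:
            # Walk forward icnt sequential instructions from the branch target.
            instr_address = last_target + msg.icnt * INSTR_SIZE
            pairs.append((instr_address, msg.data_address))
    return pairs
```

In this sketch, a correlation record that arrives before any branch record is skipped, since there is no reference point from which to count.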
For high performance data processing systems, practical limitations exist that constrain the use of real-time tracing. One such limitation occurs with superscalar out-of-order embedded processor designs where data traces and instruction traces are not properly associated. For example, if a storage buffer or a load store unit (LSU) reservation station for outstanding cache accesses is used to handle speculative data accesses arising from out-of-order execution, the data accesses observed at the cache memory management unit may not correlate with the precise boundary of the instruction flow at completion, particularly when instruction trace information is compressed to reflect only branch instructions. Uncorrelated instruction trace and data trace information can cripple the effectiveness of the real-time trace data for the external debugger. Furthermore, the cache design may be non-blocking, so that a subsequent cache access can bypass an earlier cache access if there is no data dependency and the earlier cache access is stalled by a long latency event such as a cache miss. This seriously reduces the usefulness of the data trace for a high performance out-of-order processor.
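The reordering effect of a non-blocking cache can be illustrated with a simplified model. The function `observed_order` and its one-access-per-cycle issue model are assumptions made for this sketch; an actual load store unit is considerably more involved.

```python
from typing import List, Tuple


def observed_order(accesses: List[Tuple[int, int, int]]) -> List[int]:
    """Return program indices in the order their accesses complete.

    Each access is (program_index, address, latency_cycles). Accesses are
    assumed to issue one per cycle in program order, to be independent of
    one another, and to complete after their own latency without blocking
    later accesses.
    """
    completions = []
    for issue_cycle, (index, _address, latency) in enumerate(accesses):
        completions.append((issue_cycle + latency, index))
    completions.sort()  # the cache-side observer sees completion order
    return [index for _cycle, index in completions]
```

For example, if access 0 misses the cache with a 20-cycle latency while accesses 1 and 2 hit with a 1-cycle latency, the observed order is 1, 2, 0 rather than the program order 0, 1, 2.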
Another limitation with real-time tracing is a possible mismatch between the rate at which trace information is generated by the data processor and the rate at which the trace information is transmitted from the data processor to an external debug system. For example, current embedded processors have internal clock speeds of 400 MHz or more, which are many times faster than the transmission/processing speed of an external debug system. When a burst of trace information is too large and is generated faster than it can be off-loaded to the external debug system, a buffer “over-run” error occurs in which subsequently generated trace information is unusable. Accordingly, there is a need for an improved system and methodology for efficiently tracing and correlating data trace and instruction trace information which overcomes the problems in the art, such as outlined above. Further limitations and disadvantages of conventional processes and technologies will become apparent to one of skill in the art after reviewing the remainder of the present application with reference to the drawings and detailed description which follow.
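The rate mismatch behind a buffer over-run, as described above, can be illustrated with a simple occupancy model. The function `first_overrun_cycle`, its parameters, and the per-cycle accounting are assumptions made for this sketch and do not describe any particular trace port.

```python
from typing import List, Optional


def first_overrun_cycle(
    generated_per_cycle: List[int],
    drained_per_cycle: int,
    capacity: int,
) -> Optional[int]:
    """Return the first cycle at which the trace buffer over-runs, else None.

    generated_per_cycle: trace words produced by the processor in each cycle.
    drained_per_cycle:   trace words off-loaded to the debugger in each cycle.
    capacity:            trace words the on-chip buffer can hold.
    """
    occupancy = 0
    for cycle, produced in enumerate(generated_per_cycle):
        occupancy += produced
        if occupancy > capacity:
            return cycle  # trace generated from here on cannot be stored
        occupancy = max(0, occupancy - drained_per_cycle)
    return None
```

With an 8-word buffer drained at 1 word per cycle, a burst producing 4 words per cycle over-runs at cycle 2, whereas a steady 1 word per cycle never does.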