Processing systems such as microprocessors are configured to fetch programs or instructions from memory and execute them. An embedded trace macrocell (ETM) is a hardware unit commonly included in microprocessors to trace code execution sequences. The ETM is configured to compress the execution sequence and transmit the information in packets, such that the execution sequence can be reconstructed.
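By way of illustration only, the following minimal sketch shows how an execution sequence might be compressed and later reconstructed. It assumes a toy program image and a scheme that records a single taken/not-taken bit per conditional branch. The names `PROGRAM`, `compress`, and `reconstruct` are hypothetical, and the sketch does not represent the actual packet format of any ETM.

```python
# Toy program image (an assumption for illustration): address -> (kind, branch target).
PROGRAM = {
    0: ("seq", None),
    1: ("branch", 4),   # conditional branch to address 4
    2: ("seq", None),
    3: ("seq", None),
    4: ("branch", 0),   # conditional branch back to address 0
    5: ("halt", None),
}

def compress(executed):
    """Keep one taken/not-taken bit for each conditional branch executed."""
    bits = []
    for addr, nxt in zip(executed, executed[1:]):
        kind, target = PROGRAM[addr]
        if kind == "branch":
            bits.append(nxt == target)
    return bits

def reconstruct(start, bits):
    """Replay the program image, consuming one recorded bit at each branch."""
    outcomes = iter(bits)
    addr, path = start, [start]
    while PROGRAM[addr][0] != "halt":
        kind, target = PROGRAM[addr]
        if kind == "branch" and next(outcomes):
            addr = target      # branch taken
        else:
            addr += 1          # sequential instruction or branch not taken
        path.append(addr)
    return path

executed = [0, 1, 4, 0, 1, 2, 3, 4, 5]
bits = compress(executed)
assert reconstruct(0, bits) == executed
```

In this sketch, nine executed addresses are represented by four branch-outcome bits, because the sequential instructions can be inferred from the program image during reconstruction.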
Information pertaining to the execution sequence is valuable for software debug and development. Additionally, information pertaining to the internal hardware state or microarchitecture state of the microprocessor is desirable for debugging unexpected hardware behavior. Conventionally, the information pertaining to code execution sequences as generated by an ETM is transmitted through input/output pins of a chip and analyzed by debugger software. The internal microarchitecture state can be observed through tools such as an oscilloscope. The microarchitecture state can also be transmitted through the input/output pins of a packaged chip, or through bonding pads in an unpackaged chip.
However, the above conventional approach suffers from several deficiencies. Firstly, the trace of the code execution sequence (hereinafter, also referred to as “instruction trace” or “architecture state”) is not correlated with the microarchitecture state. In other words, the microarchitecture state and the architecture state cannot be efficiently collated or juxtaposed such that an architecture state and its corresponding microarchitecture state may be observed together. Therefore, debugging unexpected behavior is a challenging task.
Secondly, the operational speeds that can be supported at the input/output pins of the chip are extremely limited, and are especially low in comparison to the normal operating speeds of the microprocessor. Accordingly, the microarchitecture state signals transmitted through the input/output pins to tools such as oscilloscopes may not correspond to real-time operating speeds. Moreover, it may not be possible or practical to observe fast-changing signals, such as signals changing at the operating speed of the microprocessor, through an oscilloscope.
A third deficiency of the conventional approach is the difficulty of controlling timing and skew among the large number of signals transmitted through the input/output pins of the chip. The maximum number of input/output pins a chip can have is limited by packaging constraints. Moreover, mapping the data observed through the pins so that it accurately corresponds to a timeline of the microprocessor's code execution sequence is also very difficult. At best, it may be possible to capture information pertaining to the internal state within a limited time window, because the memory capacity of devices such as oscilloscopes is very limited.
Another major drawback of the conventional approach is that it is very difficult to precisely control the start and stop points of the limited time window in which information pertaining to the internal state is captured. Accordingly, it becomes difficult to synchronize the architecture and microarchitecture states.
In order to mitigate the above-mentioned problems, there is a need in the art for methods and apparatus for efficiently correlating architecture and microarchitecture traces in microprocessors.