1. Field of the Invention
The present invention is concerned with monitoring the activities of a data processing unit, in particular generating items of trace data indicative of those processing activities of the data processing unit.
2. Description of the Prior Art
The complexity of modern data processing apparatuses (such as microprocessors) means that programming and debugging the operation of such data processing apparatuses is a complicated and time-consuming task. It is therefore extremely useful for a programmer seeking to correctly configure a data processing apparatus to be able to monitor its operation as it carries out its processing activities, in order to verify that those operations are being carried out as desired and to troubleshoot problems as they occur.
This desire to monitor the processing activities of a data processing apparatus must be balanced against the knowledge that contemporary data processing apparatuses are typically configured as very small-scale devices, such as a System on Chip (SoC). It is well known that space constraints in such devices are an extremely important factor in their construction, and hence the opportunities for adding monitoring components to such devices are very limited. Similarly, the pins on the periphery of a SoC are at a premium, constraining the amount of diagnostic data that may be exported from the SoC for external analysis.
For these reasons, it is known to provide a trace unit in association with such data processing apparatuses, the trace unit being configured to monitor the processing activities of the data processing apparatus and to generate a sequence of trace data items indicative of those processing activities. In particular, in order to reduce the bandwidth of data which must be transferred, it is known to provide the items of trace data in a highly compressed form, omitting any information that is redundant and only including data which is strictly necessary for the current analysis purpose. U.S. Pat. No. 7,707,394 sets out some techniques for reducing the size of a data stream produced during instruction tracing.
The difficulties associated with tracing the activity of a data processing apparatus are accentuated if the data processing apparatus is capable of speculative instruction execution. It is known to provide speculative instruction execution because of the opportunities this technique provides for faster operation, for example by avoiding pipeline stages being idle. However, speculative instruction execution presents a trace unit with a difficulty, since until the speculation is resolved (i.e. it is known whether a given instruction was actually committed), the trace unit is unable to provide a stream of trace data which definitively indicates the operation of the data processing apparatus. One possibility is for the trace unit to buffer the trace data it generates until speculation is resolved, but the buffer space that this technique requires can become undesirably large if the speculation depth of the processor is significant. An alternative technique is to generate the trace data speculatively, and then to cancel certain items of trace data if it is subsequently found that the instructions to which they correspond were mis-speculated. For example, the Nexus protocol (“The Nexus 5001 Forum—Standard for a Global Embedded Processor Debug Interface”, IEEE-ISTO 5001-2003, 23 Dec. 2003) supports the cancelling of a specified number of trace data items.
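By way of illustration only, the speculative approach with cancellation can be sketched from the point of view of a downstream analysis unit. The sketch below is a simplified assumption and does not reproduce the Nexus encoding: the item kinds "trace" and "cancel" and the function name resolve_stream are hypothetical names introduced for this example. A "cancel" item carrying a count N removes the N most recently received trace items.

```python
from typing import List, Tuple, Union

# Hypothetical item format: ("trace", payload) or ("cancel", count).
TraceItem = Tuple[str, Union[str, int]]


def resolve_stream(stream: List[TraceItem]) -> List[str]:
    """Apply cancel items to a speculatively generated trace stream.

    Returns the payloads of the trace items that survive cancellation.
    """
    surviving: List[str] = []
    for kind, value in stream:
        if kind == "trace":
            surviving.append(str(value))
        elif kind == "cancel":
            count = int(value)
            if count > len(surviving):
                raise ValueError("cancel count exceeds items received")
            if count:  # guard: del surviving[-0:] would clear the list
                del surviving[-count:]
        else:
            raise ValueError(f"unknown item kind: {kind!r}")
    return surviving


# Items B and C were generated for mis-speculated instructions.
example = [("trace", "A"), ("trace", "B"), ("trace", "C"),
           ("cancel", 2), ("trace", "D")]
print(resolve_stream(example))  # ['A', 'D']
```

Note that this sketch assumes the trace unit already knows how many trace items to cancel; determining that number is the source of the difficulty discussed next.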
However, even if the data processing apparatus specifically indicates to the trace unit which instructions (or more typically groups of instructions) should be cancelled, identifying the items of trace data which correspond to those cancelled instructions is not straightforward. Speculative execution typically operates on groups of instructions because only some instructions can result in a change in program flow; hence groups of instructions can be identified such that, if the group is executed at all, the whole group will be executed.
A particular issue arises with the tracing technique of generating trace data only for selected instructions, since this makes it difficult to generate a corresponding cancelling item of trace data. If (for example) the data processing apparatus indicates that the most recent two groups of instructions should be cancelled, the trace unit cannot simply indicate to the downstream analysis unit that two items of trace data should be cancelled, since there is no direct correlation between the number of groups of instructions and the number of items of trace data generated.
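The lack of correlation between groups and trace items can be illustrated with a small numerical sketch. The instruction names, the choice of which instruction kinds are traced, and the helper functions below are all assumptions made for this example only; they are not taken from any particular trace protocol.

```python
from typing import List, Set


def items_per_group(groups: List[List[str]], traced_kinds: Set[str]) -> List[int]:
    """Count how many trace items each group of instructions generates
    when only instructions of the selected kinds are traced."""
    return [sum(1 for insn in group if insn in traced_kinds) for group in groups]


def items_to_cancel(counts: List[int], n_groups: int) -> int:
    """Number of trace items belonging to the most recent n_groups groups."""
    return sum(counts[-n_groups:]) if n_groups else 0


# Three speculatively executed groups; only loads, stores and branches are traced.
groups = [["add", "load", "branch"], ["mul", "add"], ["load", "store", "branch"]]
counts = items_per_group(groups, {"load", "store", "branch"})

print(counts)                      # [2, 0, 3]
print(items_to_cancel(counts, 2))  # 3
```

In this example, cancelling the most recent two groups corresponds to cancelling three items of trace data, not two, and one of the two groups produced no trace data at all.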
Another problem associated with the difficulty in later identifying particular items of trace data arises in the context of data transfers. A load or store operation can take many cycles to complete. When (for example) a load instruction is executed, a corresponding item of trace data is generated; by the time the requested data value has been retrieved from memory, it is difficult to identify that item of trace data, generated many cycles earlier, with which the data value should be associated. If the processor can execute instructions or perform data transfers out of program order, then there may be no way for the trace unit to identify which data values belong to which data addresses.
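The out-of-order case can be illustrated with the following sketch, which assumes a naive analysis scheme that pairs each returned data value with the oldest outstanding address. The addresses, values and function name are hypothetical and serve only to show how such a scheme produces incorrect associations when transfers complete out of program order.

```python
from collections import deque
from typing import Iterable, List, Tuple


def pair_in_arrival_order(issued_addresses: Iterable[int],
                          returned_values: Iterable[int]) -> List[Tuple[int, int]]:
    """Naively pair each returned value with the oldest outstanding address."""
    pending = deque(issued_addresses)
    return [(pending.popleft(), value) for value in returned_values]


# Assumed memory contents for the example.
memory = {0x1000: 11, 0x2000: 22}

# Loads are issued in program order, but the second load completes first.
issued = [0x1000, 0x2000]
returned = [memory[0x2000], memory[0x1000]]

pairs = pair_in_arrival_order(issued, returned)
mismatched = [(addr, value) for addr, value in pairs if memory[addr] != value]
print(len(mismatched))  # 2
```

Both associations are wrong here, because nothing in the returned values identifies the earlier item of trace data to which each belongs.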
Some background technological information about tracing out-of-order processors can be found in “The PD Trace Interface and Trace Control Block Specification”, 4 Jul. 2005 (available from http://www.mips.com/products/product-materials/processor/mips-architecture/) and in the ARM ETMv3 architecture (available from http://infocenter.arm.com).
Consequently, it would be desirable to provide an improved technique for generating items of trace data, which would allow a trace unit to address the above-described problems.