It is known to perform tracing of activities of a data processing apparatus in order to verify processor design and to confirm reliable operation of the data processing apparatus when executing program instructions.
It is known to provide a trace unit in association with the data processing apparatus, the trace unit being configured to monitor the processing activities of the data processing apparatus and to generate a sequence of trace data items indicative of those processing activities. An example of such a trace unit is the ARM Embedded Trace Macrocell (ETM), which can be provided either as part of a single System-On-Chip or independently from the processor. The ETM generates trace data for output to a diagnostic apparatus. For modern data processing apparatuses running complex software, the volume of trace data generated during the trace operation is typically very large. Accordingly, it is desirable to provide items of trace data in a compressed form, omitting any information that is expected to be redundant and including only data that is strictly necessary for the particular analysis purpose. U.S. Pat. No. 7,707,394 sets out some techniques for reducing the size of a trace data stream.
Tracing of activities of a data processing apparatus can be complex in a data processing apparatus capable of out-of-order execution of program instructions and/or speculative execution. Speculative execution is a technique often employed in data processing apparatuses because it can improve instruction throughput, for example, by preventing pipeline stages of a pipelined data processing apparatus from remaining idle for any significant period of time. However, speculative execution of instructions can present a tracing unit with particular difficulties because, until the speculation is resolved, i.e. until it is known whether a given instruction that was speculatively executed is actually committed by the data processing apparatus, the trace unit is unable to provide a stream of trace data that definitively indicates the actual operation of the data processing apparatus.
Known techniques for dealing with tracing in a data processing apparatus capable of speculative execution are to buffer all of the trace data associated with speculatively executing instructions until the speculation is fully resolved, or to generate and output trace data speculatively and to cancel certain items of the trace data if it is subsequently found that the instructions to which they corresponded were mis-speculated. For example, the Nexus protocol (“The Nexus 5001 Forum-Standard for a Global Embedded Processor Debug Interface”, IEEE-ISTO 5001-2003, 23 Dec. 2003) supports cancelling a specified number of trace data items. However, even if the data processing apparatus specifically indicates to the trace unit which instructions or groups of instructions should be cancelled, actually identifying the items of trace data that correspond to those cancelled instructions is non-trivial.
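The difficulty of mapping cancelled instructions onto cancelled trace data items can be illustrated with a minimal sketch. It assumes a hypothetical trace unit model in which each instruction may produce a different number of trace items, so that the unit must keep per-instruction bookkeeping in order to convert "cancel the last n instructions" into "cancel the last k items". The class name, method names and item labels are illustrative only and are not taken from the Nexus protocol or any real trace unit.

```python
class SpeculativeTracer:
    """Hypothetical trace unit model that outputs trace items speculatively."""

    def __init__(self):
        self.items = []           # trace items already output, oldest first
        self.items_per_insn = []  # number of items each traced instruction produced

    def trace(self, insn, kinds):
        """Output one trace item per entry in `kinds` for instruction `insn`."""
        self.items.extend(f"{insn}:{kind}" for kind in kinds)
        self.items_per_insn.append(len(kinds))

    def cancel_instructions(self, n):
        """Cancel the n most recently traced instructions.

        Returns the number of trace items removed, i.e. the count that a
        "cancel k items" indication would need to carry.
        """
        if not 0 <= n <= len(self.items_per_insn):
            raise ValueError("cannot cancel more instructions than were traced")
        keep = len(self.items_per_insn) - n
        item_count = sum(self.items_per_insn[keep:])
        del self.items_per_insn[keep:]
        del self.items[len(self.items) - item_count:]
        return item_count


tracer = SpeculativeTracer()
tracer.trace("ADD", ["executed"])
tracer.trace("LDR", ["executed", "address"])  # assumed: a load yields two items
tracer.trace("B", ["executed"])
cancelled = tracer.cancel_instructions(2)     # LDR and B were mis-speculated
print(cancelled, tracer.items)
```

In this sketch, cancelling two instructions removes three trace items, which is why an instruction count alone is not sufficient to identify the items to be cancelled. The sketch never retires bookkeeping for committed instructions; a practical design would also need to do so.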
In a data processing apparatus capable of out-of-order execution, problems can arise in tracing the data processing activities when, for example, dealing with execution of instructions such as load or store instructions, which can take many cycles to complete. For example, even when in-order processing is performed, if a load instruction is executed and a corresponding item of trace data is generated, then by the time the requested data value has been retrieved from the memory system it can be difficult to identify the corresponding item of (previously generated) trace data associated with execution of the load instruction. Thus there can be a problem in correlating data values retrieved from memory with the particular executed load instructions. It will be appreciated that this situation is exacerbated when data transfers such as loads can be performed out of program order, which can make it virtually impossible to identify which data values belong to which memory addresses. Some background technical information regarding the tracing of out-of-order processors can be found in the document “The PD Trace Interface and Trace Control Block Specification”, 4 Jul. 2005 (available from http://www.mips.com/products/product-materials/processor/mips-architecture/) and in the ARM ETM v3 architecture (available from http://infocentre.arm.com).
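The correlation problem described above can be illustrated with a minimal sketch. It assumes a hypothetical analysis tool that pairs each arriving data value with the oldest traced load that has not yet been matched; the addresses, values and function names are invented purely for illustration.

```python
from collections import deque

# Hypothetical memory contents: address -> data value.
memory = {0x1000: 11, 0x2000: 22, 0x3000: 33}

# Addresses of the traced load instructions, in program order.
traced_loads = [0x1000, 0x2000, 0x3000]


def pair_in_arrival_order(loads, arrivals):
    """Pair each arriving data value with the oldest unmatched traced load."""
    pending = deque(loads)
    return [(pending.popleft(), value) for value in arrivals]


def is_correct(pairs):
    """Check every (address, value) pair against the memory contents."""
    return all(memory[addr] == value for addr, value in pairs)


# Data values returned in program order: the simple pairing is correct.
in_order = [memory[a] for a in traced_loads]

# Data values returned out of program order: the same pairing is wrong.
out_of_order = [memory[a] for a in (0x2000, 0x1000, 0x3000)]

in_order_ok = is_correct(pair_in_arrival_order(traced_loads, in_order))
out_of_order_ok = is_correct(pair_in_arrival_order(traced_loads, out_of_order))
print(in_order_ok, out_of_order_ok)
```

Under these assumptions the pairing succeeds only when the data values return in program order; once they return out of order, nothing in the stream of values tells the tool which load each value belongs to.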
A particular problem can arise in tracing of conditional instructions because there is typically a delay between decoding of a conditional instruction and resolution of the particular condition attached to execution of the instruction. Many known instruction sets only allow branches to be executed conditionally. However, the ARM architecture uses conditional evaluation hardware that enables a variety of different instructions to contain a condition field that determines whether or not the data processing apparatus will execute the corresponding instruction. Non-executed instructions typically consume only a single processing cycle. The ability to execute a number of different instructions conditionally removes the need for many branch instructions. Branch instructions can stall the pipeline of a data processing apparatus, requiring a plurality of cycles to refill the pipeline, whereas conditional instructions allow for dense in-line code without branches. The time penalty of not executing several conditional instructions (where the attached conditions are not satisfied) is frequently less than the overhead of the branch instructions that would otherwise be needed. Accordingly, conditional instructions are very useful in improving the efficiency of data processing.
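The trade-off between skipping conditional instructions and branching around them can be expressed as a simple cycle count. The sketch below assumes a deliberately simplified cost model: each non-executed conditional instruction costs one cycle, as stated above, and a taken branch costs one cycle plus a pipeline refill penalty. The refill penalty of three cycles is a hypothetical figure chosen for illustration, not a property of any particular processor.

```python
def skipped_conditional_cost(n_instructions, cycles_per_skipped=1):
    """Cycles spent passing over n conditional instructions whose condition fails."""
    return n_instructions * cycles_per_skipped


def branch_over_cost(refill_penalty, branch_cycles=1):
    """Cycles spent on a taken branch that stalls and refills the pipeline."""
    return branch_cycles + refill_penalty


REFILL_PENALTY = 3  # assumed pipeline refill penalty, in cycles

conditional_cycles = skipped_conditional_cost(3)
branch_cycles = branch_over_cost(REFILL_PENALTY)
print(conditional_cycles, branch_cycles)
```

With these assumed figures, passing over three failed conditional instructions costs three cycles, against four cycles for branching around them. The advantage disappears once the number of skipped instructions exceeds the cost of the branch.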
However, conditional instructions such as conditional non-branch instructions present a particular problem for tracing activity of the data processing apparatus due to the delay between decoding of the instruction and evaluation of the attached condition. The conditional pass/fail information could be traced at the same point as the conditional instruction and hence be traced using a single trace packet, but such tracing requires significant buffering, particularly in an out-of-order processor or a processor capable of speculative execution.
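The buffering cost of single-packet tracing can be illustrated with a minimal sketch. It assumes a hypothetical trace unit that holds each decoded conditional instruction until its condition is resolved and that must output packets in program order; the event format and function name are invented for illustration.

```python
from collections import OrderedDict


def trace_single_packet(events):
    """Output one packet per conditional instruction, combining the
    instruction with its pass/fail result.

    `events` is a list of ("decode", insn_id) and
    ("resolve", insn_id, passed) tuples in the order they occur.
    Returns (packets, peak_buffered).
    """
    pending = OrderedDict()  # insn_id -> result, None until resolved
    packets = []
    peak_buffered = 0
    for event in events:
        if event[0] == "decode":
            pending[event[1]] = None
            peak_buffered = max(peak_buffered, len(pending))
        else:
            _, insn_id, passed = event
            pending[insn_id] = passed
            # Assumed: packets leave in program order, so only the oldest
            # entries can be output, and only once they are resolved.
            while pending:
                oldest_id, result = next(iter(pending.items()))
                if result is None:
                    break
                pending.popitem(last=False)
                packets.append((oldest_id, result))
    return packets, peak_buffered


events = [
    ("decode", 0), ("decode", 1), ("decode", 2),
    ("resolve", 1, True),   # resolved out of order: must wait behind insn 0
    ("resolve", 0, False),
    ("decode", 3),
    ("resolve", 2, True), ("resolve", 3, True),
]
packets, peak_buffered = trace_single_packet(events)
print(packets, peak_buffered)
```

In this sketch three instructions are buffered at once even for a very short sequence. The buffer depth grows with the number of conditional instructions that can be in flight between decode and condition resolution, which is what makes this approach costly in out-of-order or speculative designs.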
Accordingly, there is a requirement to provide a technique that offers more efficient tracing of conditional instructions and that is also applicable to the tracing of instruction sequences in a data processing apparatus capable of speculative and/or out-of-order execution.