A processor trace consists of information that is collected as a processor executes a program. A processor trace may provide a record of which instructions were executed, in what order they were executed, the speed at which they were executed, and other aspects pertaining to a program's execution. To provide trace support in hardware, a typical processor may have one or more processor trace units as shown in FIG. 1.
In FIG. 1, an integrated circuit 100 (also referred to as a “chip”) is a multi-core processor chip that includes processors 101, each operatively coupled to a corresponding trace logic unit 103. Any number of processor/trace unit complexes may be implemented. The trace logic units 103 store the trace information to a memory buffer such as main memory 105, to a dedicated trace memory that may be either on chip or off chip, or to an external trace port 107, which may be connected to an external trace capture unit (not shown). Each trace logic unit 103 is closely coupled with its corresponding processor 101 and examines signals from within that processor to determine the sequence of instructions being executed on that processor. Some details of a trace logic unit are shown in FIG. 2. The trace logic unit 200 has three major functions, performed by trace collection logic 201, trace filtering logic 203, and trace formatting logic 205.
The trace collection logic 201 monitors signals from the processor and records state information to be conveyed in the trace. State information includes, for example, the program counters of completed instructions, the addresses of load and store accesses to memory, and other information that may be useful in the trace. The filtering logic 203 turns the trace on and off according to user-defined parameters (i.e., filters). For example, a filter may specify that a trace should be turned on as soon as an exception handler is entered and turned off as soon as the exception handler completes. The filtering mechanism may be complex, consisting of a sequence of state-dependent actions that result in the trace being turned on or off (e.g., wait for a particular program counter, followed by a load to a particular address, and then capture 100 instructions of trace information). In addition, the filtering mechanism can specify that only certain events, or types of instructions, should appear in the trace. For example, the user may specify that the trace should contain only data and instructions related to load or store operations.
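The sequenced filter described above can be modeled in software as a small state machine. The following is a minimal sketch for illustration only, not a description of any actual hardware: the `TraceEvent` record, the state names, and the functions `sequenced_filter` and `only_kinds` are hypothetical, and the sketch assumes that capture begins with the instruction following the triggering load.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Set


@dataclass(frozen=True)
class TraceEvent:
    """Hypothetical record of one completed instruction."""
    pc: int                     # program counter of the completed instruction
    kind: str                   # "load", "store", or "other"
    addr: Optional[int] = None  # memory address for loads and stores


def sequenced_filter(events: Iterable[TraceEvent], trigger_pc: int,
                     trigger_load_addr: int,
                     capture_count: int) -> Iterator[TraceEvent]:
    """Wait for trigger_pc, then a load to trigger_load_addr, then
    pass through the next capture_count events and stop."""
    state = "WAIT_PC"
    remaining = 0
    for ev in events:
        if state == "WAIT_PC":
            if ev.pc == trigger_pc:
                state = "WAIT_LOAD"
        elif state == "WAIT_LOAD":
            if ev.kind == "load" and ev.addr == trigger_load_addr:
                remaining = capture_count
                # Assumption: a non-positive count captures nothing.
                state = "CAPTURE" if remaining > 0 else "DONE"
        elif state == "CAPTURE":
            yield ev
            remaining -= 1
            if remaining == 0:
                state = "DONE"
        else:  # DONE: the trace stays off
            break


def only_kinds(events: Iterable[TraceEvent],
               kinds: Set[str]) -> Iterator[TraceEvent]:
    """Event-type filter: keep only events of the requested kinds."""
    return (ev for ev in events if ev.kind in kinds)


events = [
    TraceEvent(0x100, "other"),
    TraceEvent(0x104, "load", 0x2000),
    TraceEvent(0x200, "other"),         # matches trigger_pc
    TraceEvent(0x204, "load", 0x8000),  # matches trigger_load_addr
    TraceEvent(0x208, "other"),         # captured
    TraceEvent(0x20C, "store", 0x9000), # captured
    TraceEvent(0x210, "other"),         # capture window has closed
]
captured = list(sequenced_filter(events, 0x200, 0x8000, 2))
memory_only = list(only_kinds(events, {"load", "store"}))
```

In this model the two filters compose naturally: the output of `sequenced_filter` could be fed into `only_kinds` to restrict a captured window to load and store operations.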
The formatting logic 205 addresses, among other things, redundant information that may be contained in the collected trace information. That is, to write the trace information efficiently to either the main memory 105 or the external trace port 107, redundant information should be removed to conserve both space and bandwidth. Formatting operations may be lossless or lossy, depending on the use case.
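As one illustration of removing redundancy, consecutive program counters in straight-line code are predictable and need not each be stored. The sketch below is an assumed scheme chosen for illustration, not the format of any particular trace unit: it supposes a fixed instruction width (`INSN_SIZE`, hypothetical) and collapses each sequential run of program counters into a start address and a length. Because the original sequence can be rebuilt exactly, this is an example of a lossless formatting operation.

```python
from typing import List, Tuple

INSN_SIZE = 4  # assumed fixed instruction width in bytes (hypothetical)


def compress_pcs(pcs: List[int]) -> List[Tuple[int, int]]:
    """Collapse each run of sequential PCs into (start_pc, run_length)."""
    runs: List[Tuple[int, int]] = []
    for pc in pcs:
        if runs and pc == runs[-1][0] + runs[-1][1] * INSN_SIZE:
            # pc is the next sequential instruction: extend the current run.
            start, length = runs[-1]
            runs[-1] = (start, length + 1)
        else:
            # Non-sequential pc (e.g., a taken branch): start a new run.
            runs.append((pc, 1))
    return runs


def expand_runs(runs: List[Tuple[int, int]]) -> List[int]:
    """Rebuild the full PC sequence from the compressed runs."""
    return [start + i * INSN_SIZE
            for start, length in runs
            for i in range(length)]


pcs = [0x100, 0x104, 0x108, 0x200, 0x204, 0x100]
runs = compress_pcs(pcs)
restored = expand_runs(runs)
```

Here six program counters are represented by three runs. A lossy variant might instead drop selected fields or events entirely, trading fidelity for further savings in space and bandwidth.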
The circuits required to implement trace logic units 103, having the three functions of collection, filtering, and formatting, may account for a non-trivial percentage of the total circuits required to implement the processor. More particularly, when the number of processors is large, the circuit overhead of the corresponding trace logic units may make the implementation impractical. In addition, limited bandwidth and space are typically allotted to trace data, such that it is impractical to generate a large number of traces simultaneously. Therefore, only a subset of the trace logic units may be active at any given time.