A primary technique used in the development of complex high performance microprocessors is trace driven modeling. Trace driven modeling involves the use of traces of significant events that occur in the course of executing programs of interest to drive a software model of a proposed hardware mechanism in order to determine how it behaves on such a sequence. A trace may contain various levels of information. An address trace, for example, contains the sequence of memory addresses used to access instructions and operands. An instruction trace contains instruction op-codes and register specifiers in addition to the sequence of memory addresses. Other traces may contain only branch instructions or exceptions. The information in a trace may thus take various forms, each of which captures the sequence in which the events occur.
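The relationship between these trace types can be sketched as follows. This is a minimal illustration only: the record layout, the field names, and the helper functions `to_address_trace` and `to_branch_trace` are hypothetical and are not taken from any particular tracing tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InstructionRecord:
    """One entry of a full instruction trace (hypothetical layout)."""
    fetch_address: int                                          # address the instruction was fetched from
    opcode: str                                                 # instruction op-code
    registers: List[str] = field(default_factory=list)          # register specifiers
    operand_addresses: List[int] = field(default_factory=list)  # operand memory addresses
    is_branch: bool = False                                     # True for branch instructions


def to_address_trace(trace):
    """Reduce an instruction trace to an address trace.

    Each instruction contributes its fetch address followed by the
    addresses of its operands, in the order the records appear.
    """
    addresses = []
    for record in trace:
        addresses.append(record.fetch_address)
        addresses.extend(record.operand_addresses)
    return addresses


def to_branch_trace(trace):
    """Keep only the branch instructions, preserving their order."""
    return [record for record in trace if record.is_branch]


# A three-instruction example trace (made-up values).
example_trace = [
    InstructionRecord(0x1000, "load", ["r1"], [0x8000]),
    InstructionRecord(0x1004, "add", ["r1", "r2"]),
    InstructionRecord(0x1008, "branch", [], [], is_branch=True),
]
```

In this sketch the address trace and the branch trace are both projections of the richer instruction trace, and the order of events is preserved in each.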
A primary benefit of the use of trace driven modeling is that the model does not need to understand how to execute the instructions in a program in order to determine proper program flow and model the behavior of the program. The actual flow is captured in the trace. This allows the use of relatively simple models, thereby speeding up the modeling and allowing a variety of options to be explored in a reasonable time. Generally speaking, a full instruction trace is the most useful for developing microprocessors. A full trace allows all the features of a proposed design to be modeled at once, so that possible interactions between the various mechanisms are taken into account and the overall behavior of the design is reflected most accurately. Moreover, subsets of events can usually be derived from a full trace for more focused modeling.
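As a minimal sketch of what such a simple model might look like, the following function drives a direct-mapped cache model with an address trace. The cache organization, the function name `simulate_direct_mapped_cache`, its parameters, and the sample addresses are all assumptions chosen for illustration. Note that the model never decodes or executes an instruction; it only consumes the recorded sequence of addresses.

```python
def simulate_direct_mapped_cache(address_trace, num_lines=4, line_size=16):
    """Count hits and misses for a direct-mapped cache driven by an address trace.

    address_trace: iterable of integer memory addresses, in program order.
    num_lines:     number of cache lines (hypothetical design parameter).
    line_size:     bytes per cache line (hypothetical design parameter).
    Returns a (hits, misses) tuple.
    """
    tags = [None] * num_lines  # tag stored in each line; None means empty
    hits = 0
    misses = 0
    for address in address_trace:
        block = address // line_size   # memory block containing the address
        index = block % num_lines      # cache line the block maps to
        tag = block // num_lines       # identifies which block occupies the line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag          # replace the line's contents on a miss
    return hits, misses


# Made-up address trace used to compare two candidate cache sizes.
sample_trace = [0x1000, 0x1004, 0x2000, 0x1008, 0x1040, 0x1000]
small_cache = simulate_direct_mapped_cache(sample_trace, num_lines=4)
large_cache = simulate_direct_mapped_cache(sample_trace, num_lines=8)
```

Running the same trace through the model with different parameters is how a variety of design options can be compared quickly.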
Three primary methods for collecting traces are used. A first method is to use the instruction stepping mechanism that processors provide for debug purposes. The instruction stepping mechanism typically allows debug software to gain control after each instruction when a program of interest is executed. The debug software can be set up to write information on the execution of each instruction to a file, thereby collecting a trace of program execution. The EFLAGS TF bit of the X86 architecture is such a mechanism. However, this method runs very slowly, requiring many instructions in the debug software to be executed for each instruction in the program being traced. In addition, the system software typically cannot be traced in this fashion, due to conflicts in the use of system resources between the system software and the debug software. Traces are therefore typically limited to a specific program and cannot reflect the overall behavior of a system in which several programs and much system code may be running.
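The stepping approach can be illustrated by a software analogy. The sketch below does not use the X86 TF bit; instead it assumes Python's standard `sys.settrace` hook as a stand-in, which hands control to a tracing function for each executed source line in much the same way that single stepping hands control to debug software after each instruction. The names `collect_line_trace` and `program_of_interest` are hypothetical.

```python
import sys


def collect_line_trace(func, *args):
    """Run func(*args) and record (function name, line number) per executed line."""
    records = []

    def tracer(frame, event, arg):
        # The interpreter calls this hook for every traced event; only
        # "line" events are recorded, one per executed source line.
        if event == "line":
            records.append((frame.f_code.co_name, frame.f_lineno))
        return tracer  # keep tracing inside the current frame

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)  # always remove the hook
    return result, records


def program_of_interest(n):
    """Stand-in for the program being traced."""
    total = 0
    for i in range(n):
        total += i
    return total


result, line_trace = collect_line_trace(program_of_interest, 3)
```

Even in this analogy, the hook runs many interpreter operations for every traced line, which mirrors the slowdown described above, and only the chosen function is traced rather than the whole system.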
A second method of collecting traces is to add instructions to a particular program that generate a trace of certain information as the program executes. By using this information, typically in conjunction with the binary image of the program, a detailed trace of execution may be constructed. The trace generating instructions are typically inserted by a compiler. However, in order to use this method, the user must have access to the source code. Furthermore, only a single program can be traced, which means that related system activity will not be traced.
A third common method of collecting traces is to collect the cycle-by-cycle pin state of a processor as it executes code, using a device such as a logic analyzer. This method is called a bus trace. Traces can be collected at normal processor speed, at least until the capacity of the trace storage device has been reached. Advantages of this method are that it reflects real-time operation and is transparent to the system. Accordingly, the behavior of system code and all active programs can be captured. However, the external hardware required is often expensive. In addition, for a pipelined processor the bus activity for instruction fetching is asynchronous to the data references that the instructions make, and it can be difficult to link an instruction with its memory references. Moreover, instructions may be fetched but not actually executed due to branches. Finally, bus activity reflects physical memory addressing. In a system which employs virtual memory, it may be desirable to have the virtual addresses for modeling related mechanisms such as translation lookaside buffers.
A fourth method, less commonly used but possible on microcoded microprocessors, is to modify the microcode of each instruction such that, as each instruction executes, it generates an information record which is captured in some manner, such as by writing it to memory. This typically has much less time overhead than debug software based tracing, and since it operates below the instruction level it is transparent to the system software, thereby allowing the system software to be traced as well. However, this technique can only be used with a writeable microcode memory, making it unsuitable for commercial microprocessors, which must use higher density hardwired microcode memory due to space constraints.
Accordingly, there is a need for a mechanism that permits a broader scope of instruction tracing without the disadvantages set forth above.