1. Field of the Invention
This invention relates to the field of microprocessors and, more particularly, to instruction tracing mechanisms within microprocessors.
2. Description of the Relevant Art
Superscalar microprocessors achieve high performance by executing multiple instructions per clock cycle and by choosing the shortest possible clock cycle consistent with the design. As used herein, the term "clock cycle" refers to an interval of time accorded to various stages of an instruction processing pipeline within the microprocessor. Storage devices (e.g. registers and arrays) capture their values according to the clock cycle. For example, a storage device may capture a value according to a rising or falling edge of a clock signal defining the clock cycle. The storage device then stores the value until the subsequent rising or falling edge of the clock signal, respectively. The term "instruction processing pipeline" is used herein to refer to the logic circuits employed to process instructions in a pipelined fashion. Although the pipeline may be divided into any number of stages at which portions of instruction processing are performed, instruction processing generally comprises fetching the instruction, decoding the instruction, executing the instruction, and storing the execution results in the destination identified by the instruction.
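By way of illustration only, the edge-triggered capture behavior described above may be modeled in software. The following is a minimal sketch, assuming a rising-edge-triggered device; the names `EdgeTriggeredRegister`, `tick`, and `simulate` are hypothetical and are not part of any described embodiment.

```python
class EdgeTriggeredRegister:
    """Hypothetical model of a storage device that captures on a rising clock edge."""

    def __init__(self, initial=0):
        self.value = initial     # value currently held by the device
        self._last_clock = 0     # previous clock level, used to detect an edge

    def tick(self, clock, data_in):
        # Capture data_in only on a 0 -> 1 transition of the clock signal;
        # otherwise keep holding the previously captured value.
        if self._last_clock == 0 and clock == 1:
            self.value = data_in
        self._last_clock = clock
        return self.value


def simulate(clock_levels, data_values):
    """Return the value held by the register after each sampled clock level."""
    reg = EdgeTriggeredRegister()
    return [reg.tick(c, d) for c, d in zip(clock_levels, data_values)]
```

In this sketch, a change on the data input between rising edges has no effect on the stored value, which corresponds to the storage device holding its value until the subsequent edge of the clock signal.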
Microprocessor designers often design their products in accordance with the x86 microprocessor architecture in order to take advantage of its widespread acceptance in the computer industry. Because the x86 microprocessor architecture is pervasive, many computer programs are written in accordance with the architecture. x86-compatible microprocessors may execute these computer programs, and are therefore more attractive to computer system designers who desire x86-capable computer systems. Such computer systems are often well received within the industry due to the wide range of available computer programs.
Superscalar microprocessors typically execute instructions speculatively. Accordingly, it is difficult to determine the actual order in which the instructions of a software program are executed. One technique, called instruction tracing, develops a dynamic profile of software being executed by a microprocessor. The dynamic profile indicates the order of instructions executed by the microprocessor. The dynamic profiling information can be used in future processor development and can be used to optimize software with respect to its interaction with other system modules, such as the operating system.
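The distinction between the static order of instructions in a program and the dynamic order recorded by a trace may be illustrated with a toy example. The following is a minimal sketch, assuming a hypothetical three-operation instruction set (`addi`, `blt`, `halt`) that is unrelated to the x86 architecture; it records the address of each instruction in the order actually executed.

```python
def run_and_trace(program):
    """Execute a toy program and return the addresses executed, in order."""
    regs = {"r0": 0}   # single hypothetical register
    pc = 0             # address of the next instruction to execute
    trace = []         # dynamic profile: executed addresses in order
    while pc < len(program):
        op, *args = program[pc]
        trace.append(pc)
        if op == "addi":      # addi reg, imm: reg += imm
            regs[args[0]] += args[1]
            pc += 1
        elif op == "blt":     # blt reg, imm, target: branch if reg < imm
            pc = args[2] if regs[args[0]] < args[1] else pc + 1
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown operation: {op}")
    return trace


# Static program order is 0, 1, 2; the loop makes the dynamic order differ.
program = [("addi", "r0", 1), ("blt", "r0", 3, 0), ("halt",)]
```

In this sketch, the program occupies three addresses, yet the trace contains seven entries because the branch at address 1 is taken twice. A dynamic profile captures exactly this kind of information, which cannot be read from the program listing alone.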
Microprocessors typically provide little hardware support for tracing. When an instruction to be traced is encountered, the microprocessor typically halts the execution of instructions. The state of the processor is then read from the microprocessor by external hardware. For example, the state of the processor can be read via a serial scan path. A serial scan path is a daisy chain connection of the registers of an integrated circuit. The end of the daisy chain is an external pin. The state of each register is detected by serially clocking the state information through the daisy chain. The state information is shifted one position each clock cycle until the entire state has been shifted out of the microprocessor. One example of a serial scan path is defined by IEEE Standard 1149.
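The serial scan operation described above may be modeled in software. The following is a minimal sketch, assuming registers of equal width connected in a single daisy chain whose last bit drives the external pin; the function names `scan_out` and `rebuild` are hypothetical, and the sketch does not model the test access port or state machine of any particular standard.

```python
def scan_out(registers, width):
    """Shift all register contents out of the chain, one bit per clock cycle.

    Returns the bits observed on the external pin and the cycles consumed.
    """
    # Flatten the registers into one chain, most significant bit first.
    chain = [(value >> i) & 1 for value in registers
             for i in reversed(range(width))]
    total = len(chain)
    observed = []
    for _ in range(total):
        observed.append(chain[-1])   # the end of the chain is the external pin
        chain = [0] + chain[:-1]     # every bit moves one position per cycle
    return observed, total


def rebuild(observed, count, width):
    """Reassemble register values from the serially observed bits."""
    bits = observed[::-1]            # bits arrive in reverse chain order
    return [int("".join(str(b) for b in bits[i * width:(i + 1) * width]), 2)
            for i in range(count)]
```

In this sketch, two 4-bit registers require eight clock cycles to shift out, and the external hardware must reverse the observed bit order to recover the original register values.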
Unfortunately, tracing in conventional microprocessors requires special external hardware. When the microprocessor halts execution, special hardware, such as serial scan hardware, is required to detect and save the state of the microprocessor. Additionally, saving the state of the microprocessor is a relatively slow process. A serial scan path serially outputs the state of every register in the microprocessor, and the external hardware must then select and store the desired state information from that data. Because the microprocessor typically contains a large number of registers, all of which must be scanned out even when only a portion of the state is of interest, the time required grows with the total number of register bits in the scan chain.
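The cost of scanning out the full state may be estimated with simple arithmetic. The following is a minimal sketch, assuming one bit is shifted out per scan clock cycle and that all registers have the same width; the function names and every numeric value in the usage note are hypothetical and are chosen only for illustration.

```python
def scan_cycles(num_registers, width):
    """Clock cycles needed to shift every register bit out of the chain."""
    return num_registers * width


def scan_time_seconds(num_registers, width, scan_clock_hz):
    """Time to scan out the full state at the given scan clock frequency."""
    return scan_cycles(num_registers, width) / scan_clock_hz


def useful_fraction(wanted_registers, num_registers):
    """Fraction of the scanned-out bits that belong to registers of interest."""
    return wanted_registers / num_registers
```

For example, under the hypothetical assumptions of 2000 registers of 32 bits each and a 10 MHz scan clock, each traced instruction would cost 64,000 scan cycles, or 6.4 milliseconds. If only 20 of those registers were of interest, 99 percent of the scanned bits would be discarded.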