A processor in a computer executes instructions without giving any indication of what is happening internally, apart from the external signals on its I/O pins. If the processor is functioning properly, the contents of its registers and cache can be inferred, but they are usually unknown unless a specific request is made to read them. Because the internal functions of the processor are effectively hidden, when a hardware or software error occurs during execution of a program, it is often difficult or time-consuming to determine whether the cause lies in the processor, in some other component of the computer or in the program instructions.
Computer program execution tracing is a useful technique for locating hardware and software errors in the performance of a computer by generating, or “capturing,” a “trace” of executed program instructions. The program execution trace may also log certain events as they occur, a so-called event-based profiling technique. The program execution trace is essentially a listing of the instructions executed, the subroutines called and the resources accessed, and sometimes the results thereof. This technique may be used, for example, in a power-on self test (POST) of the computer to discover errors in the performance of the processor, the firmware or the system board. It may also be used after POST to discover errors in programs or peripheral devices operating in the computer.
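As a rough software analogue of such a trace, the following minimal sketch uses Python's standard `sys.settrace` hook to record the subroutines called, the lines executed and the values returned while a function runs. The names `capture_trace`, `square` and `sum_of_squares` are hypothetical and chosen only for illustration; this is not the tracing mechanism of any particular processor or tool.

```python
import sys


def capture_trace(func, *args):
    """Run func(*args) and return (result, trace).

    The trace is a list of tuples describing calls, executed lines and
    return values, in the order in which they occurred.
    """
    trace = []

    def tracer(frame, event, arg):
        name = frame.f_code.co_name
        if event == "call":
            trace.append(("call", name))
        elif event == "line":
            trace.append(("line", name, frame.f_lineno))
        elif event == "return":
            trace.append(("return", name, arg))
        return tracer  # keep tracing inside this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier hook
    return result, trace


# Hypothetical program under test.
def square(x):
    return x * x


def sum_of_squares(n):
    total = 0
    for i in range(n):
        total += square(i)
    return total


result, trace = capture_trace(sum_of_squares, 3)
calls = [entry[1] for entry in trace if entry[0] == "call"]
returns = [entry[1:] for entry in trace if entry[0] == "return"]
```

Here `calls` lists the called subroutines in order and `returns` pairs each one with its result, which corresponds to the "results thereof" mentioned above.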
Some variations of program execution tracing use logic analyzers, in-target probes (ITPs) or in-circuit emulators (ICEs) to view executed instructions or to generate the program execution traces. Each of these devices has its own benefits and uses. However, in addition to their cost, each also has limitations.
The logic analyzer monitors signals within the computer, such as signals on a bus or on the I/O pins of a processor or another component in the computer. The logic analyzer can capture the state of the signals at any given moment and can capture a trace of the signals to record changes in their state over a period of time. The logic analyzer does not, however, control the computer or issue commands to obtain specific data. A significant limitation of logic analyzers is therefore that the captured traces depend on the external signals of the processor, or other component, being monitored. The internal workings of the processor, such as the state of the registers or the cache, remain hidden. Consequently, when the internal cache of the processor is enabled, many instructions cannot be captured, because instructions fetched from the cache do not appear on the external signals. Additionally, significant manual translation and filtering must be done to correlate the captured signal data with the instructions actually executed.
An ITP or an ICE enables debugging of the computer, the processor or the program during hardware/software development not only by monitoring the I/O pins or bus signals, but also by controlling the processor, bus or other component to which it is connected. Thus, the ITP or ICE not only intercedes between the desired component (e.g., the processor) and the circuit board to intercept and/or sense some or all of the signals from the component, but can also issue commands to the component. For example, the ITP or ICE can request data from the registers of the processor in addition to displaying the current state of the signals on the I/O pins. The ITP or ICE cannot, however, access the cache, and the less expensive ITPs or ICEs cannot capture a trace of the executed instructions. The ITP or ICE can be used to step manually through each instruction, but this process is very slow. Additionally, some ICEs have a limited trace capture ability that operates only on the particular bus that the ICE is monitoring, so the ICE captures only the activity on that bus.
Each of these devices (the logic analyzers, ITPs and ICEs) is used within a laboratory setting. In other words, they are used to debug computers, computer components and programs that are under development by a manufacturer, or for which errors have been reported in the field and which have been returned by a consumer. Due to the cost and size of the logic analyzers, ITPs and ICEs, these devices are almost never taken out of the laboratory setting to analyze a computer, component or program in the field.
In order to view the contents of the cache and other internal workings of the processor, special “bond-out” versions of integrated circuit chips have been produced. The bond-out chips resemble the standard versions of their integrated circuits, but have special pins, and sometimes complete buses, that make “internal” signals available at special external bond-out interfaces. The bond-out features, however, take up valuable space in, and can affect the operation of, the integrated circuit. Additionally, special devices and programs are needed to decode and give meaning to the signals provided at the special bond-out interfaces.
Another technique for monitoring internal functions of the processor involves an “on-chip trace cache” and supporting circuitry within the integrated circuit of the processor. Trace information is captured in the on-chip trace cache during operation of the processor. Afterwards, the captured information can be downloaded and analyzed. This technique, however, takes up valuable space within the integrated circuit.
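The capture-then-download cycle described above can be modeled in software as a fixed-size buffer. The following is only a sketch under stated assumptions: the class name `TraceBuffer`, the (address, opcode) entry format and the policy of overwriting the oldest entry when the buffer is full are all hypothetical, since no particular trace cache organization is specified here.

```python
from collections import deque


class TraceBuffer:
    """Software model of a small, fixed-capacity trace store.

    Assumption: once the buffer is full, each new entry displaces the
    oldest one, so only the most recent activity is retained.
    """

    def __init__(self, capacity):
        self._entries = deque(maxlen=capacity)

    def record(self, address, opcode):
        """Capture one trace entry during operation."""
        self._entries.append((address, opcode))

    def download(self):
        """Return the captured entries, oldest first, and clear the buffer."""
        entries = list(self._entries)
        self._entries.clear()
        return entries


buffer = TraceBuffer(capacity=3)
for i in range(4):
    # Hypothetical addresses and opcode, for illustration only.
    buffer.record(0x100 + 4 * i, "nop")

captured = buffer.download()
```

With a capacity of three and four recorded entries, `captured` holds only the last three. This illustrates why the limited space available for such a cache restricts how much history can be analyzed afterwards.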
Another technique for analyzing the performance of a target computer, one that does not necessarily require additional devices (e.g., the logic analyzers, ITPs and ICEs) or additional on-chip circuitry, is “instrumented source code.” In this technique, executable “tag statements” are inserted into various branches and locations of source code, thereby “instrumenting” the source code. After the source code has been compiled and linked, the tag statements are executed along with the rest of the code. As each tag statement is executed, it performs an operation that can be either detected by an analysis device or recorded within the target computer for later examination. For example, each tag statement may write a value to a different address, so that the contents of the addresses indicate which tag statements were executed and in what order. The general flow of the software is thus indicated by the contents of the addresses.
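A minimal sketch of this idea follows, with a Python list standing in for the addresses written by the tag statements. The names `TAG_MEMORY`, `tag`, `classify` and `executed_order`, as well as the use of an incrementing sequence number as the written value, are assumptions made for illustration. They are one way of letting the stored values reveal the order of execution.

```python
# Hypothetical memory region: one slot ("address") per tag statement.
TAG_MEMORY = [0] * 4
_sequence = 0


def tag(slot):
    """Tag statement: write the next sequence number to this tag's slot."""
    global _sequence
    _sequence += 1
    TAG_MEMORY[slot] = _sequence


def classify(n):
    """Instrumented function: a tag at entry, in each branch and at exit."""
    tag(0)
    if n < 0:
        tag(1)
        kind = "negative"
    else:
        tag(2)
        kind = "non-negative"
    tag(3)
    return kind


def executed_order():
    """Recover which tags ran, and in what order, from the slot contents."""
    hit = [(value, slot) for slot, value in enumerate(TAG_MEMORY) if value]
    return [slot for value, slot in sorted(hit)]


kind = classify(5)
order = executed_order()
```

After `classify(5)`, slot 1 still holds zero, showing that the negative branch was never taken, while `order` gives the sequence of tags that did execute. A slot written more than once would retain only its latest sequence number, so this simple scheme indicates the general flow rather than a complete trace.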