1. Technical Field of the Invention
The embodiments of the invention relate to performing diagnostics on an integrated circuit and, more particularly, to a scheme to trace instruction flow in an integrated processor.
2. Description of Related Art
Historically, microprocessors were non-integrated devices, in which a system bus was exposed and connected to support logic that contained a memory controller to control main memory (such as a random-access memory or RAM) and program memory (such as a programmable read-only memory or PROM). This simpler design allowed a logic analyzer to be placed on the processor bus, where address bus accesses could be monitored and/or captured, along with decoded control signals indicating an instruction fetch, to trace an instruction or a program counter. The analyzer could then be stopped, and a user could look back through the trace to determine program flow.
Tracing instruction or program flow is helpful in debugging a program running on a processor. Typically, it is helpful to know where the program counter (PC) points in order to correctly associate the executing instruction with the high-level source code. Tracing instruction fetches provides a captured history of where the code has executed and how it may have reached the present instruction. This is often helpful and sometimes necessary in order to remove programming errors.
However, with the advent of integrated processors, many of these integrated circuits (“chips”) now include at least one level of cache memory (typically referred to as a Level 1 (L1) cache) on the processor chip. In some other integrated processor designs, a secondary cache or caches may also be included on chip. For example, an integrated circuit which includes a complete system on chip (SOC) may have both an L1 cache and a second level (L2) cache on chip. With such integration, not all instruction fetches are visible external to the processor, thereby presenting additional requirements for tracing the instructions.
One of the desired requirements in debugging is to allow the processor to run at or near its real-time speed. Thus, substituting an emulator for the processor to debug is not always a viable solution, since emulators typically run much slower than the rated frequency of the processor itself. However, when utilizing the integrated processor itself instead of an emulator, not all instruction fetches are visible externally, even with the use of special lines and pins to observe the functionality of the processor. For example, in an integrated processor utilizing an integrated L1 instruction cache, tracing may be obtained off of the system bus, but this setup only captures those instructions which miss in the L1 cache. Placing some form of tracing unit or logic between the processor and the L1 cache, to capture those instructions that hit in the L1 cache, would most likely result in a reduction in the maximum operating speed of the chip.
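The visibility gap described above may be illustrated with a minimal software sketch. The sketch assumes a hypothetical direct-mapped L1 instruction cache; the line size, the number of lines, the 4-byte instruction width, and all names (`L1ICache`, `run_loop`, `bus_trace`) are illustrative assumptions and are not taken from any particular processor. It models only which fetches would appear on the system bus, not any actual tracing hardware.

```python
from dataclasses import dataclass, field

LINE_BYTES = 32   # assumed cache line size in bytes
NUM_LINES = 64    # assumed number of lines in a direct-mapped cache
INSTR_BYTES = 4   # assumed fixed instruction width


@dataclass
class L1ICache:
    """Toy direct-mapped instruction cache (hypothetical model)."""
    tags: dict = field(default_factory=dict)       # line index -> stored tag
    bus_trace: list = field(default_factory=list)  # line fills seen on the system bus

    def fetch(self, pc: int) -> bool:
        """Fetch the instruction at pc; return True on a cache hit."""
        line_addr = pc // LINE_BYTES
        index = line_addr % NUM_LINES
        tag = line_addr // NUM_LINES
        if self.tags.get(index) == tag:
            return True  # hit: served on chip, nothing appears on the bus
        # Miss: the line is filled over the system bus, so an external
        # analyzer observes only the address of the fetched line.
        self.tags[index] = tag
        self.bus_trace.append(line_addr * LINE_BYTES)
        return False


def run_loop(cache: L1ICache, start: int, n_instr: int, iterations: int) -> list:
    """Execute a straight-line loop body repeatedly; return the true PC history."""
    full_trace = []
    for _ in range(iterations):
        for i in range(n_instr):
            pc = start + INSTR_BYTES * i
            full_trace.append(pc)
            cache.fetch(pc)
    return full_trace


cache = L1ICache()
full_trace = run_loop(cache, start=0x1000, n_instr=16, iterations=100)

# 1600 instruction fetches were executed, but only 2 line fills reached the bus.
print(len(full_trace), len(cache.bus_trace))
```

In this sketch, a 16-instruction loop executed 100 times produces 1600 instruction fetches, yet a bus-side observer sees only the two line fills from the first iteration; the loop count and the remainder of the program flow cannot be recovered from the bus alone.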
Accordingly, there is a need to obtain tracing of instructions (or program flow), including those instructions that hit in the L1 cache, and to obtain the tracing while the processor operates at or near real-time speed.