Modern processors, such as central processing unit (CPU) chips, often contain multiple processing cores, multiple levels of cache hierarchy, and complex interfaces between many different blocks of logic. Debugging failures in this environment may be very difficult and time-consuming. Scan dumps, which provide an instantaneous view of the state of the processor, may provide some insight into the cause of a failure when one is detected. Often, however, the events that cause the failure occur well before the point at which the failure is detected and the state is captured. As a result, the processor state captured via scan at the time of detection may contain little or no useful information regarding the cause of the failure.
An additional tool to help debug failures is a trace capture buffer (TCB), which keeps track of the sequence of memory references that the processor makes. The TCB may record a limited sequence of transactions arriving at the memory system. The buffer may either be written in a loop, in which older transactions are replaced by new ones once the buffer is full, or the processor may be paused and the buffer contents written to dynamic random access memory (DRAM) to extend its storage capacity.
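The two TCB write modes described above can be illustrated with a minimal software sketch. This is only a toy model under stated assumptions, not a description of any actual hardware: the class name `TraceCaptureBuffer`, the `spill_to_dram` flag, and the use of a Python list to stand in for DRAM are all hypothetical.

```python
from collections import deque


class TraceCaptureBuffer:
    """Toy model of a TCB with a fixed number of entries (hypothetical)."""

    def __init__(self, capacity, spill_to_dram=False):
        self.capacity = capacity
        self.spill_to_dram = spill_to_dram
        self.entries = deque()  # on-chip buffer contents, oldest first
        self.dram = []          # stands in for the DRAM region used to extend storage

    def record(self, transaction):
        """Record one transaction arriving at the memory system."""
        if len(self.entries) == self.capacity:
            if self.spill_to_dram:
                # Models pausing the processor and draining the full buffer to DRAM.
                self.dram.extend(self.entries)
                self.entries.clear()
            else:
                # Loop mode: the oldest transaction is replaced by the new one.
                self.entries.popleft()
        self.entries.append(transaction)

    def history(self):
        """Return every transaction still recoverable, oldest first."""
        return self.dram + list(self.entries)
```

In loop mode only the most recent `capacity` transactions survive, whereas in spill mode the full sequence is retained at the cost of pausing the processor each time the buffer fills.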
Even with the TCB, there are limits to the debug information that may be captured. For example, cache memory (e.g., the L3 cache), which is located between the processing cores and the memory system, may complete some core requests autonomously without generating any transactions to the memory system. As a result, these operations, which may be important for debugging a particular failure, do not reach the TCB and are unobservable.
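The visibility gap described above can be sketched with a small simulation. The function `run_requests`, its parameters, and the fill-on-miss policy are assumptions made purely for illustration; real cache behavior is considerably more involved.

```python
def run_requests(addresses, l3_contents):
    """Split core requests into those a TCB would see and those it would not.

    Hypothetical model: a request that hits in the L3 is completed by the
    cache itself; a miss generates a memory transaction and fills the line.
    """
    cached = set(l3_contents)
    trace = []       # transactions that reach the memory system (visible to the TCB)
    unobserved = []  # requests completed by the L3 without a memory transaction
    for addr in addresses:
        if addr in cached:
            unobserved.append(addr)
        else:
            trace.append(addr)
            cached.add(addr)  # assumed policy: the missing line is filled into the L3
    return trace, unobserved
```

For example, calling `run_requests([0x100, 0x200, 0x100, 0x300, 0x200], l3_contents=[0x200])` places only the two misses in `trace`, while the three requests that hit in the L3 never appear in the captured sequence.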