As is well known, a trace is a sequence of logged information that indicates what events (e.g., instruction executions) have occurred while a program is running. When computer architects want to gather information about a running computer system, they often use trace-driven techniques to collect run-time information (“execution traces”) from a workload. A conventional trace collection system often uses specialized hardware or software that monitors and logs every instruction executed by the computer system. The execution traces are then analyzed offline in detail and are useful for applications such as debugging, fault tolerance, and performing various simulations.
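By way of illustration only, the notion of a software-collected execution trace can be sketched with Python's standard `sys.settrace` hook. This is a minimal sketch under stated assumptions, not the trace collection system discussed here: it logs function-call, source-line, and return events rather than machine instructions, and the names `collect_trace` and `sum_to` are hypothetical.

```python
import sys


def collect_trace(func, *args):
    """Run func(*args) and return (result, trace).

    Each trace record is (event, function_name, line_number). This logs
    Python-level events, not machine instructions (an assumption made
    for illustration).
    """
    trace = []

    def tracer(frame, event, arg):
        # Append one log record per event observed in the traced code.
        trace.append((event, frame.f_code.co_name, frame.f_lineno))
        return tracer  # keep receiving events for this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        # Always restore whatever trace hook was installed before.
        sys.settrace(previous)
    return result, trace


def sum_to(n):
    """Hypothetical workload: sum the integers 0 .. n-1."""
    total = 0
    for i in range(n):
        total += i
    return total


result, trace = collect_trace(sum_to, 3)
for record in trace:
    print(record)
```

Even in this small sketch, the tracing callback runs once per logged event, which suggests why software-based tracing adds run-time cost as the level of detail grows.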
Existing trace collection systems have several shortcomings. One shortcoming is that they are unable to maximize trace completeness and detail level efficiently. A more detailed trace provides more information about the execution's internal state, thereby enabling a wider range of analysis of the running system. However, conventional trace collection systems typically use software-based techniques that incur increasingly high run-time overhead as trace detail increases. Other conventional trace collection systems use hardware-based methods that require expensive and system-specific hardware probing devices.
Another shortcoming of existing trace collection systems is that they often introduce significant trace distortion. For example, a conventional trace collection system may introduce extra memory references into a memory trace, resulting in an inaccurate representation of the running program. Conventional trace collection systems may further introduce time dilation and memory dilation, which occur when tracing causes the traced program to run slower or to consume more memory, respectively.
Finally, existing trace collection systems usually can operate continuously for only a short period of time because of the high bandwidth of the resulting trace data. This limitation prevents long-running executions from being traced, and the traces that are collected produce large trace files that are difficult to store or share.
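The bandwidth concern can be made concrete with simple arithmetic: the trace data rate is the number of logged events per second multiplied by the size of each log record. The sketch below is illustrative only. The event rate and record size are hypothetical values chosen for the example, not figures taken from any particular system.

```python
def trace_data_rate(events_per_second, bytes_per_record):
    """Bytes of trace data produced per second of traced execution."""
    return events_per_second * bytes_per_record


def trace_file_size(events_per_second, bytes_per_record, seconds):
    """Total bytes of trace data after tracing for the given duration."""
    return trace_data_rate(events_per_second, bytes_per_record) * seconds


# Hypothetical values, assumed for illustration only.
EVENTS_PER_SECOND = 1_000_000_000  # one logged instruction per nanosecond
BYTES_PER_RECORD = 8               # one 8-byte record per instruction

rate = trace_data_rate(EVENTS_PER_SECOND, BYTES_PER_RECORD)
size_one_minute = trace_file_size(EVENTS_PER_SECOND, BYTES_PER_RECORD, 60)

print(f"data rate: {rate / 1e9:.1f} GB/s")
print(f"one minute of tracing: {size_one_minute / 1e9:.1f} GB")
```

Under these assumed values, an uncompressed trace grows by several gigabytes per second, which is consistent with the difficulty of tracing long-running executions and of storing or sharing the resulting files.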