When writing code during the development of software applications, developers commonly spend a significant amount of time “debugging” the code to find runtime errors. In doing so, developers may take several approaches to reproduce and localize a source code bug, such as observing the behavior of a program based on different inputs, inserting debugging code (e.g., to print variable values, to track branches of execution, etc.), temporarily removing code portions, etc. Tracking down runtime errors to pinpoint code bugs can occupy a significant portion of application development time.
Many types of debugging applications (“debuggers”) have been developed in order to assist developers with the code debugging process. These tools offer developers the ability to trace, visualize, and alter the execution of computer code. For example, debuggers may visualize the execution of code instructions, may present variable values at various times during code execution, may enable developers to alter code execution paths, and/or may enable developers to set “breakpoints” and/or “watchpoints” on code elements of interest (which, when reached during execution, cause execution of the code to be suspended), among other things.
An emerging form of debugging application enables “time travel,” “reverse,” or “historic” debugging, in which execution of one or more of a program's threads is recorded/traced by tracing software and/or hardware into one or more trace files. Using some tracing techniques, these trace file(s) contain a “bit-accurate” trace of each traced thread's execution, which can then be used to replay each traced thread's execution later for forward and backward analysis. Using bit-accurate traces, each traced thread's prior execution can be reproduced down to the granularity of its individual machine code instructions. Using these bit-accurate traces, time travel debuggers can enable a developer to set forward breakpoints (like conventional debuggers) as well as reverse breakpoints during replay of traced threads.
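As a rough illustration of forward and reverse breakpoints over a recorded trace, the following Python sketch treats a trace as a simple list of executed instruction addresses. The `TraceEvent` record and the `run_to_breakpoint` helper are hypothetical names introduced only for this example; an actual trace file and debugger would be considerably more involved.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class TraceEvent:
    """One replayed instruction (hypothetical, heavily simplified record)."""
    index: int    # position in the recorded execution order
    address: int  # address of the instruction that executed


def run_to_breakpoint(
    trace: List[TraceEvent],
    start: int,
    breakpoints: Set[int],
    reverse: bool = False,
) -> Optional[int]:
    """Step through the trace from `start` until a breakpoint address is hit.

    Returns the trace position of the hit, or None if replay runs off
    either end of the trace without reaching a breakpoint.
    """
    step = -1 if reverse else 1
    position = start + step
    while 0 <= position < len(trace):
        if trace[position].address in breakpoints:
            return position
        position += step
    return None


# A toy trace in which the instruction at 0x14 executes twice (e.g., a loop).
addresses = [0x10, 0x14, 0x18, 0x14, 0x1C]
trace = [TraceEvent(i, a) for i, a in enumerate(addresses)]
```

Because the whole execution is already recorded, a reverse breakpoint in this sketch is simply the same search run with a negative step; for example, `run_to_breakpoint(trace, 4, {0x14}, reverse=True)` finds the most recent earlier execution of the instruction at `0x14`.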
One form of hardware-based trace recording records a bit-accurate trace based, in part, on recording influxes to a microprocessor's cache (e.g., cache misses) during execution of each traced thread's machine code instructions by the processor. These recorded cache influxes enable a time travel debugger to later reproduce any memory values that were read by these machine code instructions during replay of a traced thread.
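The idea of recording cache influxes and later using them to reproduce read values might be sketched as follows. This is a toy model under stated assumptions, not a description of any particular processor or tracer: the thread only performs loads, the cache holds whole values rather than cache lines, eviction is first-in-first-out, and each influx is tagged with the index of the instruction that caused it. The names `record`, `replay`, and `Influx` are hypothetical.

```python
from collections import OrderedDict
from typing import Dict, List, Tuple

# (instruction index, address, value brought into the cache); hypothetical format
Influx = Tuple[int, int, int]


def record(
    loads: List[int], memory: Dict[int, int], cache_lines: int = 2
) -> Tuple[List[Influx], List[int]]:
    """Execute the loads against memory, logging only cache misses."""
    cache: "OrderedDict[int, int]" = OrderedDict()
    influxes: List[Influx] = []
    values_read: List[int] = []
    for index, address in enumerate(loads):
        if address not in cache:
            if len(cache) >= cache_lines:
                cache.popitem(last=False)  # evict the oldest entry (FIFO)
            cache[address] = memory[address]
            influxes.append((index, address, cache[address]))
        values_read.append(cache[address])
    return influxes, values_read


def replay(loads: List[int], influxes: List[Influx]) -> List[int]:
    """Reproduce the values read using only the influx log, not memory."""
    by_index = {index: (address, value) for index, address, value in influxes}
    known: Dict[int, int] = {}
    values_read: List[int] = []
    for index, address in enumerate(loads):
        if index in by_index:
            logged_address, value = by_index[index]
            known[logged_address] = value
        values_read.append(known[address])
    return values_read


memory = {0x100: 7, 0x104: 9, 0x108: 11}
loads = [0x100, 0x104, 0x100, 0x108, 0x100]
influxes, recorded_values = record(loads, memory)
replayed_values = replay(loads, influxes)
```

In this sketch, cache hits produce no log entries at all, which is why an influx-based trace can be smaller than a log of every value read, while still letting `replay` reproduce each read without access to the original memory.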
Modern processors are often not sequentially consistent in their memory accesses, in order to ensure that the processor can stay as busy as practical. As a result, modern processors may reorder memory accesses relative to the order in which they appear in a stream of machine code instructions. One way in which modern processors may reorder memory accesses is by executing a thread's machine code instructions out-of-order (i.e., in a different order than the order in which the instructions were specified in the thread's code). For instance, a processor may execute multiple non-dependent memory loads and/or stores simultaneously across parallel execution units, rather than one-by-one as they appear in a thread's instructions. Another way in which modern processors may reorder memory accesses is by engaging in “speculative” execution of a thread's instructions, such as by speculatively pre-fetching and executing instructions after a branch prior to the condition(s) that determine the outcome of the branch actually being known. Out-of-order and/or speculative execution of a thread's instructions means that the memory values relied upon by these instructions may appear in the processor's cache at times other than when a memory accessing instruction appears to have committed from an architectural perspective (and are thus reordered). In addition, the very act of speculatively pre-fetching instructions alters contents of the processor's cache, even if those instructions are not actually executed, and even if they do not access memory. The degree to which a given processor engages in out-of-order and/or speculative execution can vary depending on the instruction set architecture and implementation of the processor.
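The mismatch between when values reach the cache and when instructions architecturally commit can be illustrated with a small Python sketch. It is only a toy model: the `Load` record, the hand-picked issue order, and the `committed` flag marking a wrong-path speculative load are all assumptions made for illustration, and real processors decide these things dynamically.

```python
from typing import List, NamedTuple


class Load(NamedTuple):
    """A memory load as seen by a hypothetical out-of-order core."""
    program_index: int  # position in the thread's instruction stream
    address: int        # memory address being loaded
    committed: bool     # False if issued speculatively on a wrong path


def cache_influx_order(issue_order: List[Load]) -> List[int]:
    """Addresses in the order they would first enter the cache."""
    seen = set()
    influxes: List[int] = []
    for load in issue_order:
        if load.address not in seen:
            seen.add(load.address)
            influxes.append(load.address)
    return influxes


def commit_order(issue_order: List[Load]) -> List[int]:
    """Addresses in the order the loads architecturally commit."""
    committed = [load for load in issue_order if load.committed]
    return [load.address
            for load in sorted(committed, key=lambda l: l.program_index)]


# The core issues the third load first, and also issues a speculative load
# (program_index 3) that is later discarded and never commits.
issue_order = [
    Load(2, 0x300, True),
    Load(0, 0x100, True),
    Load(3, 0x400, False),
    Load(1, 0x200, True),
]
influx_sequence = cache_influx_order(issue_order)
commit_sequence = commit_order(issue_order)
```

Here the influx sequence both differs in order from the commit sequence and contains an address (`0x400`) that the thread never architecturally accessed, which is the kind of discrepancy a cache-based trace recorder has to account for.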