Threads are commonly used in computer programs to achieve concurrency in computer systems. Very generally, a thread is a sequence of computer-executable instructions that executes independently of other sequences of instructions. Many modern computer processors and computer operating systems are capable of scheduling and executing multiple threads concurrently and allowing threads to interact through shared resources or shared program state. A computer program that executes with multiple threads of execution is often said to be a “multi-threaded” computer program.
Among the types of computer programs that can be multi-threaded are computer programs that execute on some virtual machines. Such programs typically contain instructions that are interpreted at runtime and executed by a virtual machine that interfaces with the operating system of the physical hardware on which the virtual machine executes. The virtual machine is an entity that executes machine language instructions in response to the interpretation of corresponding virtual machine instructions, such as bytecode. The Java Virtual Machine (JVM) is one example of such a virtual machine, although several other virtual machines are capable of executing such programs. Components of the .NET Framework available from the Microsoft Corporation of Redmond, Wash. are other examples of such virtual machines.
It is often desirable to be able to replicate the execution of a virtual machine computer program in a manner such that the executing program exhibits, during replay of a recorded program execution, the same behavior that the program exhibited when the program originally executed. For example, a programmer might wish to record a virtual machine computer program's behavior in a production environment and then replay that program in a debugging environment in order to locate and fix possible errors in the program code—errors that caused unexpected or undesirable results in the production environment. Under such circumstances, if the program behaves differently when replayed in the debugging environment than the program behaved in the production environment, the programmer may have great difficulty in isolating the source of the problems that were previously encountered.
One aspect of a multi-threaded virtual machine application program that may behave differently when replayed is thread context switching by the operating system underlying the virtual machine. In this context, thread context switching refers to the time at which a processor is switched by the operating system from executing one thread to executing another. This thread context switching is largely non-deterministic because this switch time is not readily predictable. Consequently, there is no inherent guarantee that the order of virtual machine application code executed by multiple threads will be the same during separate executions of the virtual machine application program. If the order of application code executed by multiple threads is different when replayed in the debugging environment than the order in the production environment, the programmer may have great difficulty in isolating the source of the problems that were previously encountered, especially if the source of the problems is dependent on the order of application code executed by the multiple threads. Further, if the order differs between the two environments, information recorded about the program's behavior in the production environment may become “out-of-sync” during replay of the program in the debugging environment.
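As an illustration only, the following minimal Java sketch shows why the order of application code executed by multiple threads is not guaranteed to repeat. The class name `InterleavingDemo`, the method `runOnce`, and the log entry format are hypothetical names chosen for this example and do not come from any particular system. Each thread appends its steps to a shared log; the steps of any one thread always appear in that thread's own order, but how the threads interleave with each other is left to the scheduler and may differ from one run to the next.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class: records the order in which threads reach each step.
final class InterleavingDemo {

    // Starts threadCount threads; each appends stepsPerThread entries such as
    // "T0:2" (thread 0, step 2) to a shared log, then the log is returned.
    static List<String> runOnce(int threadCount, int stepsPerThread) {
        // The list itself is synchronized so that appends are safe; only the
        // ORDER of appends across threads is left to the scheduler.
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            final String name = "T" + t;
            threads.add(new Thread(() -> {
                for (int step = 0; step < stepsPerThread; step++) {
                    log.add(name + ":" + step);
                    Thread.yield(); // hint that a context switch may occur here
                }
            }));
        }
        for (Thread thread : threads) {
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        // The two printed orders may or may not match; nothing guarantees either.
        System.out.println(runOnce(2, 3));
        System.out.println(runOnce(2, 3));
    }
}
```

Running `main` several times may print different interleavings on different runs, which is the behavior that makes faithful replay difficult without additional recorded information.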
One source of bugs and errors in virtual machine application programs that depends on the order of application code executed by multiple threads is the race condition. Very generally, a race condition occurs when an output or result of a computer program depends on the sequence or timing of execution events. A race condition can arise during execution of a multi-threaded virtual machine program because of thread context switching. For example, two threads may update a shared data structure at nearly the same time, and the program may execute correctly only when the two threads perform updates in one and only one order. If, because of thread context switching, the order of application code executed by multiple threads is different when replayed in the debugging environment than the order in the production environment, then a race condition that occurred when the program was recorded in the production environment may not be accurately reproduced when the program is replayed in the debugging environment. This is problematic for computer program developers and testers if the race condition is the source of a bug in the program, because the program when replayed may not exhibit the race condition previously encountered in the production environment.
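The dependence of a result on execution order can be sketched with a small, deterministic Java model. This is an assumption-laden illustration rather than a description of any particular program: the class `LostUpdateDemo`, the enum `Step`, and the method `run` are hypothetical, and the example uses a shared counter (a "lost update") as a simple stand-in for the shared data structure mentioned above. Instead of real threads, the code executes an explicit schedule of steps, so that each interleaving can be reproduced exactly.

```java
// Hypothetical model of two threads, A and B, that each increment a shared
// counter as two separate steps: read the counter, then write back value + 1.
final class LostUpdateDemo {

    // One entry in a schedule: which thread acts and what it does.
    enum Step { A_READ, A_WRITE, B_READ, B_WRITE }

    // Executes the steps in the given order and returns the final counter.
    static int run(Step... schedule) {
        int shared = 0; // the shared counter
        int localA = 0; // value most recently read by thread A
        int localB = 0; // value most recently read by thread B
        for (Step step : schedule) {
            switch (step) {
                case A_READ:
                    localA = shared;
                    break;
                case A_WRITE:
                    shared = localA + 1;
                    break;
                case B_READ:
                    localB = shared;
                    break;
                case B_WRITE:
                    shared = localB + 1;
                    break;
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        // No context switch between A's read and write: both increments count.
        System.out.println(run(Step.A_READ, Step.A_WRITE, Step.B_READ, Step.B_WRITE)); // prints 2
        // Context switch after A's read: B reads the stale value, one update is lost.
        System.out.println(run(Step.A_READ, Step.B_READ, Step.A_WRITE, Step.B_WRITE)); // prints 1
    }
}
```

The same four steps yield different results depending only on their order. If a recording captured the second order but a replay happened to follow the first, the lost update would not reappear during debugging.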
The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.