The present invention relates generally to data processing systems for executing multithreaded programs, and more particularly to data processing systems in which bytecodes are executed on multiprocessors that implement a weakly consistent memory model.
In traditional data processing systems, computer programs exist as platform-specific, compiled object code within computer system memory or other computer storage media. More recently, however, some data processing systems have implemented language models designed to support multiple host architectures.
For example, Java® is an object-oriented programming language and environment in which data is represented as objects, and in which methods are defined to manipulate those objects. Java is a trademark of Sun Microsystems, Inc. Java is designed to support applications for many types of computer systems with different central processing units and operating system architectures. To enable a Java application to execute on different types of data processing systems, it is typically compiled into a system-independent format. The compiled code consists of bytecodes, which are instructions that are not specific to any particular computer architecture, and which are designed to be executed on any computer system with an appropriate run-time environment.
In some data processing systems, a Java virtual machine (JVM) is provided to control the execution of bytecodes. The JVM is an abstract computing machine that, like a real computing machine, has an instruction set and manipulates various memory areas at run-time. The JVM does not assume any particular implementation technology, host hardware, or host operating system. The JVM recognizes a particular binary format known as the “class” file format. A class file contains the bytecodes associated with an application or program, as well as a symbol table and other ancillary information.
The JVM will typically also include a Java interpreter, which is a module that alternately decodes and executes individual bytecodes. The interpreter, however, does not examine entire programs to obtain optimizations such as those that may be provided by some traditional compilers. Even frequently executed code must be reinterpreted each time it is invoked. As a result, in performance-critical environments, just-in-time (JIT) compilers may also be employed to dynamically translate bytecodes, typically of one or more methods, into native code consisting of instructions of the machine on which the code is to be executed. The JVM retains the native code associated with these methods, and the next time one of these methods is invoked, the JVM executes the native code associated with the invoked method instead of relying on the interpreter to interpret the method's bytecodes one at a time.
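By way of illustration only, the decode-and-execute cycle of an interpreter may be sketched as follows. The opcodes PUSH, ADD, MUL and HALT and the class name ToyInterpreterSketch are hypothetical and are not actual JVM bytecodes; the sketch assumes a small fixed-size operand stack and is not intended to represent any particular JVM implementation.

```java
class ToyInterpreterSketch {
    // Hypothetical opcodes for illustration only; these are not real JVM bytecodes.
    static final int PUSH = 0, ADD = 1, MUL = 2, HALT = 3;

    // Alternately decodes and executes one instruction at a time.
    static int interpret(int[] code) {
        int[] stack = new int[16]; // assumed small operand stack
        int sp = 0;                // index of the next free stack slot
        int pc = 0;                // index of the next instruction
        while (true) {
            int op = code[pc++];   // decode step
            switch (op) {          // execute step
                case PUSH:
                    stack[sp++] = code[pc++]; // operand follows the opcode
                    break;
                case ADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case MUL: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case HALT:
                    return stack[sp - 1]; // result is the top of the stack
                default:
                    throw new IllegalArgumentException("unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4 in the hypothetical instruction set.
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(interpret(program));
    }
}
```

Each call to interpret repeats the decode step for every instruction, which is the per-invocation cost that retaining translated native code is intended to avoid.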
In operation, a JVM (with its interpreter and/or JIT compiler) is expected to properly execute a Java program that is written in accordance with the Java Language Specification. It is expected that the JVM should neither crash nor produce incorrect answers when executing a correctly written Java program. Furthermore, even if a Java program is not written correctly, it is expected that the JVM will report errors appropriately and possibly abort the execution of the program, rather than enter into a state (e.g., crashed) in which it can no longer continue to respond.
In particular, as a JVM can support many threads of execution at once, certain problems can arise that affect the ability of a JVM to properly execute a program (originally written in Java or in some other programming language, for example) where the program is multithreaded. Threads independently execute code that operates on values and objects residing in a shared main memory. Threads may be supported in a data processing system by having many hardware processors, by time-slicing a single hardware processor, or by time-slicing many hardware processors, for example.
The Java programming language, for example, supports the coding of programs that, though concurrent, still exhibit deterministic behavior, by providing mechanisms for synchronizing the concurrent activity of threads. The Java memory model described in the second edition of the Java Language Specification provides rules that impose constraints on implementations of the Java programming language, and specifically on how threads may interact through memory. These rules, however, also allow for some flexibility in order to permit certain standard hardware and software techniques that might greatly improve the speed and efficiency of concurrent code. For example, an optimizing compiler may be adapted to perform certain kinds of code rearrangement intended to improve performance while preserving the semantics of properly synchronized programs.
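As a minimal sketch of one such synchronization mechanism, the following example uses the synchronized keyword so that concurrent increments of a shared counter yield a deterministic total. The class name SynchronizedCounterSketch and the helper runThreads are hypothetical names introduced for illustration.

```java
class SynchronizedCounterSketch {
    private int count = 0;

    // Only one thread at a time may hold this object's monitor, so the
    // read-modify-write in count++ cannot interleave with another thread's.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Hypothetical helper: starts the given number of threads, each of which
    // increments a shared counter, and returns the final count.
    static int runThreads(int threads, int incrementsPerThread) {
        SynchronizedCounterSketch counter = new SynchronizedCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // wait for each worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 10000));
    }
}
```

Without the synchronized modifier on increment, updates from different threads could be lost and the final count could vary from one execution to the next.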
The concurrent execution of multithreaded programs on multiprocessor systems may result in some unique problems. Many shared memory multiprocessors in current data processing systems implement a weakly consistent memory model, rather than a strongly consistent model (e.g., sequential consistency) that imposes strict constraints on the order in which operations on memory are to be performed. In implementations of a weakly consistent memory model, higher performance can generally be achieved. However, weakly consistent memory models can also produce surprising results when multithreaded programs are not properly synchronized.
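One commonly cited illustration of such a surprising result is the so-called store-buffering pattern, in which each of two threads writes one shared variable and then reads the other. If the variables were ordinary fields, a weakly consistent memory model could permit both threads to read the initial value of zero. The sketch below declares the variables volatile, which, assuming the Java memory model as revised in Java 5 (a revision later than the second-edition model discussed here), rules out that outcome. The class name StoreBufferingSketch and its members are hypothetical names used for illustration.

```java
class StoreBufferingSketch {
    // Accesses to volatile fields are totally ordered, consistently with
    // each thread's program order (an assumption tied to the revised model).
    static volatile int x, y;
    // Plain fields: written by the worker threads, read only after join().
    static int r1, r2;

    // Runs the pattern once; returns true if both threads read zero.
    static boolean runOnce() {
        x = 0;
        y = 0;
        r1 = -1;
        r2 = -1;
        Thread a = new Thread(() -> { x = 1; r1 = y; });
        Thread b = new Thread(() -> { y = 1; r2 = x; });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r1 == 0 && r2 == 0;
    }

    // Counts how many of the given iterations observed both reads as zero.
    static int countBothZero(int iterations) {
        int n = 0;
        for (int i = 0; i < iterations; i++) {
            if (runOnce()) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countBothZero(200));
    }
}
```

With the volatile modifier removed, the count returned by countBothZero would no longer be guaranteed to be zero, although whether a nonzero count is actually observed depends on the hardware and the JIT compiler.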
The surprising results that a weakly consistent memory model may produce can be particularly severe in object-oriented languages and in languages (e.g., Java) that make safety guarantees. In particular, with respect to object-oriented programs, the severity of certain results may be attributed to the fact that a number of “hidden” data structures (e.g., the virtual function table) are usually manipulated by the runtime system. On a multiprocessor system in which a weakly consistent memory model is implemented, multithreaded programs may, for example, compromise an object's type safety where that type safety depends on such “hidden” data.
In one instance, type safety would be violated if a processor attempts to read a field of an object that represents the object's type, and that is supposed to contain a valid reference or pointer, but sees a garbage value instead. Such a violation of type safety could result in a crash of the virtual machine executing the program. This could arise in situations where the value corresponding to the reference is to be stored as “hidden” data associated with the object, but where an attempt to read that value is made before the value is actually stored. Unfortunately, such a sequence of events could occur in certain executions of a multithreaded program on a multiprocessor system in which a weakly consistent memory model is implemented.
Furthermore, on a multiprocessor system in which a weakly consistent memory model is implemented, multithreaded programs may also give rise to issues with an object's initialization safety. With respect to object-oriented programs, if an object is not made visible outside of a constructor until after the constructor terminates, then no code (including unsynchronized code in another thread) should be able to see that object until all of the effects of the constructor for that object can be seen, in order to maintain initialization safety. Unfortunately, premature attempts to see that object could occur in certain executions of a multithreaded program on a multiprocessor system in which a weakly consistent memory model is implemented. While violations of initialization safety may not always result in a crash of the virtual machine executing the program, incorrect computations may be obtained. This may occur despite the fact that the program would be considered “correct” in the sense that the program conforms to the standard specification of the language in which it was written.
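The initialization safety property described above may be illustrated at the source level by the following sketch, in which one thread constructs an object and publishes a reference to it while another thread waits for the reference and then reads the object's fields. The names SafePublicationSketch, Point, shared and readAfterPublish are hypothetical. The sketch assumes the Java memory model as revised in Java 5, under which final fields and volatile publication ensure that the reader sees the completed effects of the constructor; it does not depict how a virtual machine provides that guarantee on weakly consistent hardware.

```java
class SafePublicationSketch {
    // Hypothetical immutable class; its fields are assigned only in the constructor.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // The reference is published through a volatile field, so a reader that
    // sees a non-null value also sees the writes made by the constructor.
    static volatile Point shared;

    // Publishes a Point from one thread and returns x + y as seen by another.
    static int readAfterPublish() {
        shared = null;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            Point p;
            while ((p = shared) == null) {
                Thread.yield(); // wait until the writer publishes the object
            }
            seen[0] = p.x + p.y;
        });
        Thread writer = new Thread(() -> shared = new Point(3, 4));
        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0]; // visible here because join() orders the reader's write
    }

    public static void main(String[] args) {
        System.out.println(readAfterPublish());
    }
}
```

If the fields of Point were not final and the reference were published through an ordinary field, a reader on a weakly consistent multiprocessor could, absent further measures, observe default values for x and y.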