1. Field of the Invention
The present invention relates to compilers for programming languages and, in particular, to compiling code to optimize repetitive synchronization.
2. Description of the Related Art
The Java™ programming language is recognized to provide many benefits to the programmer. Not the least of these benefits relate to the handling of error conditions, support for multiple threads (to be defined hereinafter) and platform independence. Java is a trademark of Sun Microsystems, Inc.
A defined unit of programming code, developed for a particular purpose, has been called a function, a subroutine and a procedure in different programming languages. In the Java programming language, such a unit is called a “method”.
Java includes provisions for handling unusual error conditions. The provisions are included in the form of “exceptions”, which allow unusual error conditions to be handled without including “if” statements to deal with every possible error condition.
Java also includes provisions for multiple execution streams running in parallel. Such execution streams are called “threads”.
One of the desirable qualities of Java is the ability of Java code to be executed on a wide range of computing platforms, where each of the computing platforms in the range can run a Java virtual machine. However, there remains a requirement that the code run on a particular platform be native to that platform. A just-in-time (JIT) Java compiler is typically in place, as part of an environment in which Java code is to be executed, to convert Java byte code into native code for the platform on which the code is to be executed.
Recent JIT compilers have improved features, such as “method inlining” and “synchronization”.
Where a method is called that does something trivial, like add one to an argument and then return the argument, Java programmers have, in the past, been tempted to simply insert the instruction rather than call the method. This act is called “manual method inlining”. Such manual method inlining can improve the speed at which the code runs, as the overhead of an instruction that jumps to the called method, as well as the return from the method, is saved. Some JIT compilers have the capability to recognize where method inlining will improve the speed at which code runs and, thus, can automatically inline methods while converting the byte code into native code.
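By way of illustration only, the following sketch contrasts a call to a trivial method with its manually inlined equivalent. The class and method names (InliningSketch, addOne, sumWithCall, sumInlined) are hypothetical and do not appear in any cited reference; whether a given JIT compiler would inline addOne automatically is not asserted here.

```java
class InliningSketch {
    // A trivial method: it adds one to its argument and returns the result.
    static int addOne(int x) {
        return x + 1;
    }

    // Each loop iteration pays for a call to, and a return from, addOne.
    static int sumWithCall(int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += addOne(i);
        }
        return total;
    }

    // Manually inlined form: the body of addOne replaces the call.
    static int sumInlined(int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += i + 1;
        }
        return total;
    }

    public static void main(String[] args) {
        // Both forms compute the same value; only the call overhead differs.
        System.out.println(sumWithCall(10) == sumInlined(10)); // prints "true"
    }
}
```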
From Venners, Bill, “How the Java virtual machine performs thread synchronization”, http://www.javaworld.com/javaworld/jw-07-1997/jw-07-hood.html, in the Java virtual machine (JVM), each thread is awarded a Java stack, which contains data no other thread can access. If multiple threads need to use the same objects or class variables concurrently, the access of the threads to the data must be properly managed. Otherwise, the program will have unpredictable behavior.
To coordinate shared data access among multiple threads, the Java virtual machine associates a lock with each object. A thread needing to lock a particular object communicates this requirement to the JVM. The JVM may then provide the lock to the thread. When the thread no longer requires the lock, the thread communicates this lack of requirement to the JVM. If a second thread has requested the same lock, the JVM provides the lock to the second thread.
A single thread is allowed to lock the same object multiple times. For each object, the JVM maintains a count of the number of times the object has been locked. An unlocked object has a count of zero. When a thread acquires the lock for the first time, the count is incremented to one. Each time the thread acquires a lock on the same object, the count is incremented. Each time the thread releases the lock, the count is decremented. When the count reaches zero, the lock is released and made available to other threads.
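The counting behavior described above may be observed, as a sketch only, with the library class java.util.concurrent.locks.ReentrantLock, whose getHoldCount() method reports the number of holds by the current thread. This class is used here as an analogue of the per-object lock count, on the assumption that the two behave alike for a single thread; it is not the internal JVM structure itself. The class name LockCountSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

class LockCountSketch {
    // Records the hold count before, during and after nested acquisition.
    static int[] observeCounts() {
        ReentrantLock lock = new ReentrantLock();
        int[] counts = new int[5];
        counts[0] = lock.getHoldCount(); // unlocked: count is zero
        lock.lock();                     // first acquisition
        counts[1] = lock.getHoldCount();
        lock.lock();                     // same thread acquires again
        counts[2] = lock.getHoldCount();
        lock.unlock();                   // each release decrements the count
        counts[3] = lock.getHoldCount();
        lock.unlock();                   // count returns to zero: lock is free
        counts[4] = lock.getHoldCount();
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observeCounts())); // prints "[0, 1, 2, 1, 0]"
    }
}
```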
In Java language terminology, the coordination of multiple threads that must access shared data is called synchronization. The Java language provides two built-in ways to synchronize access to data: with synchronized statements or synchronized methods.
Typically, two bytecodes, namely “monitorenter” and “monitorexit”, are used for synchronization of blocks within methods. That is, when synchronization operations are required to synchronize a given block, the Java compiler places a monitorenter bytecode before the given block and a monitorexit bytecode after the given block. When the code inserted by the JIT compiler to perform the monitorenter bytecode is encountered by the Java virtual machine, the Java virtual machine acquires the lock for the object referred to by a reference to the object on the stack. If the thread already owns the lock for that object, a count is incremented. Each time the code inserted by the JIT compiler to perform the monitorexit bytecode is executed for the thread on the object, the count is decremented. When the count reaches zero, the lock is released.
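As an illustrative sketch, the two built-in forms of synchronization may be written as follows. The class SyncFormsSketch and its members are hypothetical names. In the block form, the compiled byte code contains monitorenter and monitorexit around the guarded statements; in the method form, the method is instead flagged as synchronized in the class file. Both forms below lock the same object (this).

```java
class SyncFormsSketch {
    private int count = 0;

    // Synchronized method: the lock on "this" is held for the whole call.
    synchronized void incrementViaMethod() {
        count++;
    }

    // Synchronized statement: the lock on "this" is held only for the block.
    void incrementViaBlock() {
        synchronized (this) {
            count++;
        }
    }

    synchronized int get() {
        return count;
    }

    // Two threads update the shared counter, one through each form.
    static int runDemo(int perThread) {
        SyncFormsSketch shared = new SyncFormsSketch();
        Thread a = new Thread(() -> {
            for (int i = 0; i < perThread; i++) shared.incrementViaMethod();
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < perThread; i++) shared.incrementViaBlock();
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.get();
    }

    public static void main(String[] args) {
        // No update is lost because both forms use the same lock.
        System.out.println(runDemo(10000)); // prints "20000"
    }
}
```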
In multi-threaded Java programs, synchronization between threads is usually necessary to ensure correct execution. The Java language provides for synchronized methods and synchronized blocks to enable Java programmers to indicate particular sections of code and particular objects that require synchronization between threads to ensure correctness. As a Java JIT compiler optimizes the execution of a Java program, the Java JIT compiler often inlines synchronized methods that are invoked by the method being compiled or by methods that contain synchronized blocks. Because many classes in the Java class library are designed to be safe for use in a multithreaded program, it is common for programmers to write Java programs that execute synchronization primitives without being aware that they are doing so and certainly without expecting any performance degradation due to repeated synchronization.
Aggressive inlining often results in methods that can require repeated locking and unlocking of the same object (say, object 1 or “O1”), either in a nested fashion (e.g., lock O1 . . . lock O1 . . . unlock O1 . . . unlock O1), or in a sequential fashion (e.g., lock O1 . . . unlock O1 . . . lock O1 . . . unlock O1). The acquisition and release of locks is known to cause execution time to increase over programs using non-synchronized methods. As such, programmers typically look for optimizations. In the first (nested) case, the inner lock and unlock operations can be safely removed, so long as the memory barrier actions dictated by the Java Memory Model occur. In the second (sequential) case, the middle unlock and lock operations can be removed to “coarsen” the synchronized regions in the method.
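The nested and sequential patterns, and the result of removing the redundant operations, may be sketched by hand as follows. This is an illustration written at the source level, not the output of any particular JIT compiler, and it does not model the memory barrier actions mentioned above. CoarseningSketch and CountingLock are hypothetical names; CountingLock merely wraps java.util.concurrent.locks.ReentrantLock so that the number of lock operations can be counted.

```java
import java.util.concurrent.locks.ReentrantLock;

class CoarseningSketch {
    // Hypothetical helper standing in for the lock on object O1.
    static final class CountingLock {
        private final ReentrantLock lock = new ReentrantLock();
        int acquisitions = 0;
        void lock() { lock.lock(); acquisitions++; }
        void unlock() { lock.unlock(); }
    }

    // Nested: lock O1 ... lock O1 ... unlock O1 ... unlock O1
    static int nested(CountingLock o1, int[] data) {
        o1.lock();
        try {
            o1.lock();
            try { data[0]++; } finally { o1.unlock(); }
        } finally { o1.unlock(); }
        return o1.acquisitions;
    }

    // Nested case with the inner lock and unlock removed.
    static int nestedInnerRemoved(CountingLock o1, int[] data) {
        o1.lock();
        try { data[0]++; } finally { o1.unlock(); }
        return o1.acquisitions;
    }

    // Sequential: lock O1 ... unlock O1 ... lock O1 ... unlock O1
    static int sequential(CountingLock o1, int[] data) {
        o1.lock();
        try { data[0]++; } finally { o1.unlock(); }
        o1.lock();
        try { data[0]++; } finally { o1.unlock(); }
        return o1.acquisitions;
    }

    // Sequential case with the middle unlock and lock removed (coarsened).
    static int sequentialCoarsened(CountingLock o1, int[] data) {
        o1.lock();
        try {
            data[0]++;
            data[0]++;
        } finally { o1.unlock(); }
        return o1.acquisitions;
    }

    public static void main(String[] args) {
        System.out.println(nested(new CountingLock(), new int[1]));              // prints "2"
        System.out.println(nestedInnerRemoved(new CountingLock(), new int[1]));  // prints "1"
        System.out.println(sequential(new CountingLock(), new int[1]));          // prints "2"
        System.out.println(sequentialCoarsened(new CountingLock(), new int[1])); // prints "1"
    }
}
```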
A more complex example of lock coarsening may be considered in view of five sequential blocks of code: a first block, a second block, a third block, a fourth block and a fifth block. An object may be locked and unlocked within the first block, the second block, the fourth block and the fifth block, while the object is not locked in the third block. Where one strategy of lock coarsening is applied, only one lock and unlock operation is performed. In particular, the object is locked in the first block and unlocked in the fifth block. Note, however, that the object is now locked for the third block, for which the object originally remained unlocked.
We say that the four locked blocks have been coarsened together. This coarsening may improve performance by reducing the number of lock and unlock operations executed by a method, but it can also degrade performance by holding locks longer, since contention (when different threads contend for a lock on a particular object) may increase, which can reduce the amount of parallelism exploited by multiple Java threads.
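The five-block example may be sketched, under stated assumptions, as follows. FiveBlockSketch and its members are hypothetical names, and java.util.concurrent.locks.ReentrantLock stands in for the lock on the object. Each method returns three values: the number of lock operations, whether the lock was held during the third block (1 if held, 0 if not), and the final value of the shared data. The sketch shows only the trade-off described above, namely fewer lock operations in exchange for the third block now running under the lock; it does not measure contention.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

class FiveBlockSketch {
    private final ReentrantLock o1 = new ReentrantLock();
    private int lockOps = 0;
    private boolean heldDuringThird = false;
    private int shared = 0;

    // A block that locks the object, updates shared data and unlocks.
    private void lockedBlock() {
        o1.lock();
        lockOps++;
        try { shared++; } finally { o1.unlock(); }
    }

    // The third block needs no lock; it only records whether one is held.
    private void thirdBlock() {
        heldDuringThird = o1.isHeldByCurrentThread();
    }

    // Original form: blocks 1, 2, 4 and 5 each lock and unlock.
    static int[] original() {
        FiveBlockSketch s = new FiveBlockSketch();
        s.lockedBlock();
        s.lockedBlock();
        s.thirdBlock();
        s.lockedBlock();
        s.lockedBlock();
        return new int[] {s.lockOps, s.heldDuringThird ? 1 : 0, s.shared};
    }

    // Coarsened form: lock in the first block, unlock in the fifth.
    static int[] coarsened() {
        FiveBlockSketch s = new FiveBlockSketch();
        s.o1.lock();
        s.lockOps++;
        try {
            s.shared++;      // first block
            s.shared++;      // second block
            s.thirdBlock();  // third block now runs with the lock held
            s.shared++;      // fourth block
            s.shared++;      // fifth block
        } finally {
            s.o1.unlock();
        }
        return new int[] {s.lockOps, s.heldDuringThird ? 1 : 0, s.shared};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(original()));  // prints "[4, 0, 4]"
        System.out.println(Arrays.toString(coarsened())); // prints "[1, 1, 4]"
    }
}
```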
Clearly, improved methods of lock coarsening are required.