The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
In virtual machines, synchronization is the process of governing the access of multiple threads to a shared object. For example, during the execution of Java applications in a Java Virtual Machine (JVM), a thread may synchronize on a shared object by obtaining a lock on the object. By obtaining the lock, the thread ensures that, while it is operating on the object or a resource associated with the object, the object or the associated resource will not be modified by another thread, as long as all threads attempt to obtain the lock before making any modifications. This helps to ensure data consistency and integrity.
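As an illustration of the kind of synchronization described above, the following is a minimal Java sketch in which several threads obtain the lock on one shared object before modifying it. The class name SharedObjectDemo and the method run are hypothetical names chosen for this example only; they do not come from any particular JVM or library.

```java
import java.util.ArrayList;
import java.util.List;

class SharedObjectDemo {
    // Hypothetical helper: starts `threads` workers that each append
    // `perThread` entries to one shared list, then returns the final size.
    static int run(int threads, int perThread) {
        final List<Integer> shared = new ArrayList<>(); // not thread-safe by itself
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // Obtain the lock on the shared object before modifying it.
                    synchronized (shared) {
                        shared.add(i);
                    }
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until every worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.size();
    }
}
```

Because every thread obtains the lock before calling add, no update is lost; if even one thread skipped the synchronized block, that guarantee would no longer hold.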
In one traditional locking approach for a JVM, a thread obtains a lock on an object by invoking a locking function of the JVM. The locking function, which is now being executed by the thread, creates a heavy-weight lock (HWL) data structure, and associates the HWL data structure with the object that is being locked. In addition, the locking function calls down to the operating system (OS) and requests an OS-level locking structure, such as a mutex. After the mutex is obtained and associated with the HWL data structure, the locking function calls down to the OS again to obtain ownership of the mutex. Once that is done, the thread owns a lock on the object and no other thread will be allowed to lock the object until the thread releases the mutex. When another thread attempts to lock the object (i.e., contends for the lock on the object), the contending thread executes the locking function, which calls down to the OS. The OS determines that the mutex associated with the object is already owned and blocks the contending thread. The OS unblocks the contending thread after the mutex has been released, and at this point the contending thread can obtain ownership of the mutex to lock the object.
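The traditional approach can be modeled roughly as follows. This is only a sketch under stated assumptions: the names HeavyWeightLock, TraditionalLocking, and hwlCreated are hypothetical, java.util.concurrent.locks.ReentrantLock stands in for the OS-level mutex, and an identity map stands in for whatever mechanism a JVM uses to associate the HWL data structure with an object. For brevity, the sketch keeps the HWL data structure after the lock is released instead of destroying it.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the heavy-weight lock (HWL) data structure.
class HeavyWeightLock {
    // Assumption: ReentrantLock plays the role of the OS-level mutex.
    final ReentrantLock mutex = new ReentrantLock();
}

class TraditionalLocking {
    // Associates each locked object with its HWL (by identity, not equals()).
    private final Map<Object, HeavyWeightLock> locks = new IdentityHashMap<>();
    int hwlCreated = 0; // counts HWL structures created, for illustration only

    void lock(Object obj) {
        HeavyWeightLock hwl;
        synchronized (locks) {
            hwl = locks.get(obj);
            if (hwl == null) {
                // Create the HWL and request a "mutex", even if no other
                // thread ever contends for this object.
                hwl = new HeavyWeightLock();
                locks.put(obj, hwl);
                hwlCreated++;
            }
        }
        // Obtain ownership of the mutex; blocks while another thread owns it.
        hwl.mutex.lock();
    }

    void unlock(Object obj) {
        HeavyWeightLock hwl;
        synchronized (locks) {
            hwl = locks.get(obj);
        }
        if (hwl == null) {
            throw new IllegalMonitorStateException("object was never locked");
        }
        hwl.mutex.unlock(); // a blocked contender, if any, can now proceed
    }
}
```

Even when a single thread locks and unlocks an object with no contention at all, the sketch still pays for creating the HWL and the mutex, which is the cost the following paragraphs address.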
In the above traditional locking approach, creating an HWL data structure and setting up an OS-level mutex are relatively resource intensive. It has been observed that, in a majority of cases in which a lock is obtained on an object, no locking contention actually occurs. That is, a thread obtains the lock and releases the lock on the object before any other thread tries to obtain a lock on that object. Thus, in most cases, the HWL data structure and the mutex are not actually needed, and the locking overhead is incurred needlessly. In light of this observation, some JVMs have been enhanced to implement a fast locking approach. According to this approach, a JVM does not create an HWL data structure each time an object is locked. Rather, the JVM utilizes a light-weight, fast lock (FL) data structure, which is much less resource intensive to obtain and initialize than the HWL data structure. Only when there is actual locking contention will the JVM create the HWL data structure and request a mutex from the OS.
One example implementation of the fast locking approach may be as follows. When a first thread desires a lock on an object, it invokes the locking function of the JVM. The locking function (which is now being executed by the first thread) detects that this is the first request to lock the object; hence, the locking function obtains and initializes an FL data structure and associates it with the object. The locking function does not create an HWL data structure, nor does it call down to the OS to obtain a mutex. If the first thread releases the lock on the object before any other thread tries to lock that same object, then the locking function simply destroys the FL data structure, and the HWL data structure is never created.
If, however, a second thread invokes the locking function of the JVM to lock the object, the locking function (which is now being executed by the second thread) detects that the FL data structure has already been obtained and initialized by the first thread, which has already locked the object. Thus, the locking function determines that there is lock contention for the object. In response, the locking function creates an HWL data structure and calls down to the OS to request a mutex. After the mutex is obtained and associated with the HWL data structure, the locking function calls the OS on behalf of the first thread and causes ownership of the mutex to be associated with the first thread. After the first thread obtains ownership of the mutex, the HWL data structure is associated with the object; thus, the first thread now owns an actual lock on the object. Thereafter, the locking function calls down to the OS again and tries to lock the mutex, this time on behalf of the second thread. Because the mutex is now owned by the first thread, the second thread cannot obtain the mutex. As a result, the second thread blocks and waits. The OS unblocks the second thread at some point after the mutex is released by the first thread. At that point, the second thread will be allowed to obtain ownership of the mutex and an actual lock on the object. In this manner, the fast locking approach provides that the JVM creates an HWL data structure and requests a mutex from the OS only when there is actual locking contention.
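The state transitions of the fast locking approach described above can be sketched as a small Java model. It is an assumption-laden illustration, not an actual JVM implementation: the class FastLockModel and its members are hypothetical, Java's own monitor (synchronized, wait, notifyAll) is used only to keep the model's bookkeeping consistent, the model is not reentrant, and it never reverts from the HWL state back to the FL state. It models when an HWL would be created, not how cheap the fast path is.

```java
class FastLockModel {
    private Thread owner;      // thread currently holding the lock, or null
    private boolean inflated;  // false: only an FL record exists; true: HWL and mutex exist
    private int inflations;    // number of times contention forced creation of an HWL

    synchronized void lock() {
        Thread me = Thread.currentThread();
        if (owner == null && !inflated) {
            owner = me;        // first request: record the owner in the FL only
            return;
        }
        if (!inflated) {
            // Contention detected: create the HWL and treat the current
            // owner as the owner of the newly created mutex.
            inflated = true;
            inflations++;
        }
        while (owner != null) { // block until the mutex is released
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        owner = me;            // contender now owns the mutex and the lock
    }

    synchronized void unlock() {
        if (owner != Thread.currentThread()) {
            throw new IllegalMonitorStateException("not the lock owner");
        }
        owner = null;          // without contention this simply discards the FL
        if (inflated) {
            notifyAll();       // wake any thread blocked on the mutex
        }
    }

    synchronized int inflations() {
        return inflations;
    }

    // Usage sketch: the calling thread holds the lock while a second thread contends.
    static int contendOnce() {
        FastLockModel lock = new FastLockModel();
        lock.lock();                          // first thread takes the fast path
        Thread second = new Thread(() -> {
            lock.lock();                      // detects contention, creates the HWL, blocks
            lock.unlock();
        });
        second.start();
        while (lock.inflations() == 0) {      // wait until contention has been detected
            Thread.yield();
        }
        lock.unlock();                        // release; the second thread is unblocked
        try {
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return lock.inflations();
    }
}
```

In the uncontended case the inflated flag is never set, which corresponds to the HWL data structure never being created; in contendOnce the second thread's request is what triggers its creation.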
However, even though the fast locking approach avoids the overhead associated with creating an HWL data structure and obtaining an OS-level mutex, the fast locking approach is still resource-expensive since it still requires the execution of a few dozen extra instructions in order to obtain, initialize, and then release the FL data structure. This overhead caused by the fast locking approach is particularly apparent in cases where a thread executes a fairly trivial synchronized method.
For example, consider the following Java “Counter” class and the synchronized “increment( )” method declared therein:
class Counter {
    public int count;

    public synchronized void increment( ) {
        count = count + 1;
    }
}
A thread that has instantiated an object of the “Counter” class needs to execute only three instructions to increment the public variable “count”. However, since the “increment( )” method is declared with the “synchronized” keyword and thus must be synchronized, a few dozen extra instructions must be executed to implement fast locking when the method is called from the thread. In some JVM implementations, the thread needs to execute 60-80 additional instructions when it calls the method in order to provide for proper locking by using an FL data structure, and an additional 50-60 instructions for proper unlocking after the method is executed. Thus, even though the “increment( )” method is trivial and requires only three instructions, the fast locking approach would require the execution of an additional 110-140 instructions to implement the required synchronization. In this manner, the fast locking approach introduces a significant overhead when it is used to synchronize methods that are fairly simple.
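Taking the instruction counts quoted above at face value, the overhead can be worked out as follows (percentages are rounded):

```latex
\begin{align*}
\text{extra instructions (low end)}  &= 60 + 50 = 110 \\
\text{extra instructions (high end)} &= 80 + 60 = 140 \\
\text{useful fraction (low end)}  &= \frac{3}{3 + 110} \approx 2.7\% \\
\text{useful fraction (high end)} &= \frac{3}{3 + 140} \approx 2.1\%
\end{align*}
```

In other words, under these figures roughly 97 to 98 percent of the instructions executed per call would be spent on synchronization rather than on the increment itself.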
Based on the foregoing, there is a clear need for techniques for executing simple synchronized methods with locking overhead that is less than the overhead caused by the traditional and fast locking approaches described above.