1. Field of Invention
The present invention relates generally to methods and apparatus for improving the performance of software applications. More particularly, the present invention relates to methods and apparatus for reducing the overhead associated with obtaining a lock on an object.
2. Description of the Related Art
In object-based computing systems, objects are generally operated on by threads. An object typically includes a set of operations and a state that remembers the effect of the operations. Since an object has some memory capability, an object differs from a function, which has substantially no memory capability. A thread, as will be understood by those skilled in the art, may be thought of as a "sketch pad" of storage resources, and is essentially a single sequential flow of control within a computer program. In general, a thread, or a "thread of control," is a sequence of central processing unit (CPU) instructions or programming language statements that may be independently executed. Each thread has its own execution stack on which method activations reside.
During the execution of an object-based program, multiple threads may attempt to execute operations which involve a single object. In other words, more than one thread may attempt to operate on a single object. Frequently, only one thread is allowed to invoke one of some number of operations, i.e., synchronized operations, that involve a particular object at any given time. A synchronized operation, e.g., a synchronized method, is block-structured in that it requires that the thread invoking the method first synchronize with the object that the method is invoked on, and desynchronize with that object when the method returns. Synchronizing a thread with an object generally entails controlling access to the object using a synchronization construct before invoking the method.
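By way of illustration only, a synchronized method of the kind described above may be sketched in the Java programming language as follows. The class Counter and its methods are hypothetical names that do not appear in the description; the sketch merely assumes a language that provides block-structured synchronization, in which the invoking thread synchronizes with the object on entry to the method and desynchronizes when the method returns.

```java
// Hypothetical shared object; class and method names are illustrative only.
final class Counter {
    private int value; // the state that remembers the effect of the operations

    // Synchronized method: the invoking thread locks this object on entry
    // and releases the lock when the method returns.
    synchronized void increment() {
        value++;
    }

    synchronized int get() {
        return value;
    }

    // Demo: `threads` threads each invoke increment() `perThread` times
    // on a single shared Counter, then the final value is returned.
    static int runDemo(int threads, int perThread) {
        Counter c = new Counter();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    c.increment();
                }
            });
            ts[i].start();
        }
        try {
            for (Thread t : ts) {
                t.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return c.get();
    }
}
```

Because only one thread at a time may execute a synchronized method on a given Counter, no increment is lost, and runDemo(4, 10000) would be expected to return 40000.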
Since a thread, e.g., a concurrent thread as will be appreciated by those skilled in the art, is not able to predict when it will be forced to relinquish control, synchronization constructs such as locks, mutexes, semaphores, and monitors may be used to control access to shared resources during periods in which allowing a thread to operate on shared resources would be inappropriate. By way of example, in order to prevent more than one thread from operating on an object at any particular time, objects are often provided with locks. The locks are arranged such that only the thread that has possession of the lock for an object is permitted to execute a method on that object.
FIG. 1 is a diagrammatic representation of an object which is provided with a lock and two threads that both require access to the object. An object 102 includes a header field 106 and a word 110 which is arranged to indicate if object 102 is locked, or owned by a thread 114. As shown, object 102 is locked by thread 114a. Accordingly, word 110 identifies thread 114a as owning object 102. Another word 118 is stored in a stack frame 122, e.g., stack frame 122a, of a stack 126a that is associated with thread 114a. Word 118 is arranged to indicate that thread 114a has possession of the lock on object 102 and is, therefore, allowed to operate on object 102.
When thread 114b, which has an associated stack 126b, requires access to object 102, thread 114b reads word 110 and determines whether object 102 is available. When object 102 is locked by thread 114a, thread 114b may not obtain the lock on object 102 until thread 114a has relinquished the lock. In other words, until word 110 indicates that no thread owns object 102, thread 114b may not obtain the lock on object 102.
In general, thread 114b may either repeatedly attempt to lock object 102, or thread 114b may effectively "sleep" until it is notified that object 102 is available. With reference to FIG. 2, the steps associated with the acquisition of an object lock by a thread will be described. A process 202 of acquiring an object lock begins at step 204 in which a thread, e.g., thread 114b of FIG. 1, attempts to lock an object, e.g., object 102 of FIG. 1. In attempting to lock an object or, more generally, in attempting to acquire the ownership of an object, the thread may study the object to determine if the object is locked. By way of example, as discussed above with respect to FIG. 1, the thread may read a specific word stored in the object to determine if the object is available to the thread. When the object is available, the thread may update the specific word to indicate that it has locked, or acquired ownership of, the object.
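A single lock attempt of the kind just described, in which a thread reads a word and updates it when the object is available, may be sketched as follows. The class LockWord is a hypothetical stand-in for the word stored in the object. The use of an atomic compare-and-set operation, which combines the read and the update into one indivisible step, is an assumption of the sketch and is not stated in the description.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the word that records which thread owns an object.
final class LockWord {
    // A null reference means that no thread owns the object.
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    // One lock attempt: succeeds only if the word currently shows no owner.
    // Not reentrant: a second attempt by the owning thread also fails.
    boolean tryLock() {
        return owner.compareAndSet(null, Thread.currentThread());
    }

    // Clears the word, but only when the caller is the recorded owner.
    boolean unlock() {
        return owner.compareAndSet(Thread.currentThread(), null);
    }

    // Returns the owning thread, or null when the object is available.
    Thread owner() {
        return owner.get();
    }
}
```

A thread whose call to tryLock() returns false has found the object locked by another thread and must either retry or wait.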
A determination is made in step 208 as to whether the attempt by the thread to lock the object was successful. If the determination is that the attempt was successful, then the thread has the object lock, and the process of locking the object is completed. Alternatively, when it is determined that the attempt to lock the object was not successful, the indication is that the object is locked by another thread. As such, the thread typically must wait for the other thread to relinquish the object lock before the thread may lock the object.
When the attempt to lock the object was not successful, then process flow proceeds to step 212 where it is determined if the thread is coded to spin or to block when awaiting the availability of the object. When a thread is coded to spin, the thread will periodically check the object to determine if the lock on the object is available. Spinning, or busy-waiting for a resource such as an object to be freed, avoids thread context switches, as will be appreciated by those skilled in the art. Alternatively, when a thread is coded to block, the thread effectively puts itself into a sleep state during which the thread does not attempt to access the object.
If the determination in step 212 is that the thread is coded to spin, then the thread spins for a given period of time in step 216. The given period of time is typically specified by the overall computing system, and is considered to be one "spin cycle." After the thread spins for one spin cycle, process flow returns to step 204 where the thread once again attempts to lock the object.
Alternatively, when it is determined in step 212 that the thread is coded to block, the thread blocks itself in step 220. As will be appreciated by those skilled in the art, computing systems generally specify a maximum number of spin cycles. In some systems, when a thread has spun for the maximum number of spin cycles without successfully locking an object, the thread may then block itself. That is, in some systems, a thread may be coded to first spin, then eventually block if the object lock has not been successfully obtained through spinning.
While the thread is blocked, the thread awaits notification, typically from an operating system, that the object is available for locking. In step 224, the thread receives notification that the object is available for locking. Accordingly, the thread unblocks itself and process flow returns to step 204 in which the thread once again attempts to lock the object.
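The flow of steps 204 through 224 may be sketched as follows. The class SpinOrBlockLock, the constant SPIN_CYCLE_ITERATIONS, and the use of a Java monitor to deliver the notification of step 224 are all assumptions made for illustration; the description attributes the notification to an operating system and leaves the length of a spin cycle to the overall computing system. Thread.onSpinWait() requires Java 9 or later.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical lock that follows the flow of FIG. 2; all names are illustrative.
final class SpinOrBlockLock {
    // Assumed length of one spin cycle; the source leaves this to the system.
    private static final int SPIN_CYCLE_ITERATIONS = 100;

    private final AtomicBoolean locked = new AtomicBoolean(false);
    private final Object monitor = new Object();

    // Acquires the lock; `spin` selects busy-waiting (true) or blocking (false).
    void lock(boolean spin) {
        boolean interrupted = false;
        while (!locked.compareAndSet(false, true)) {      // steps 204 and 208
            if (spin) {                                   // step 212
                for (int i = 0; i < SPIN_CYCLE_ITERATIONS; i++) {
                    Thread.onSpinWait();                  // step 216
                }
            } else {
                synchronized (monitor) {                  // step 220
                    while (locked.get()) {
                        try {
                            monitor.wait();               // sleep until step 224
                        } catch (InterruptedException e) {
                            interrupted = true;           // flag restored below
                        }
                    }
                }
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    // Releases the lock and notifies any blocked threads (step 224).
    void unlock() {
        synchronized (monitor) {
            locked.set(false);
            monitor.notifyAll();
        }
    }

    // Demo: several threads increment an unsynchronized counter under the lock.
    static int contend(boolean spin, int threads, int perThread) {
        SpinOrBlockLock l = new SpinOrBlockLock();
        int[] counter = new int[1];
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    l.lock(spin);
                    try {
                        counter[0]++;
                    } finally {
                        l.unlock();
                    }
                }
            });
            ts[i].start();
        }
        try {
            for (Thread t : ts) {
                t.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return counter[0];
    }
}
```

In either mode, a thread that fails to lock the object returns to the attempt of step 204, so both contend(true, 4, 2000) and contend(false, 4, 2000) would be expected to return 8000.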
Blocking a thread, or putting a thread to sleep such that the thread is effectively not executing, while waiting for an object to be freed is computationally less expensive than allowing a thread to spin if it is expected to take a significant amount of time for the object to become free. However, since blocking a thread typically requires context switches, if an object is expected to be freed in a relatively short amount of time, then allowing the thread to spin may be more efficient from a performance point of view. As will be understood by those skilled in the art, a context switch entails allowing another thread to execute on a particular central processing unit (CPU) if another thread is available. In general, the choice of whether to block a thread or to allow the thread to spin is made on a system-wide basis. That is, either all threads in a system block, or all threads in the system spin.
Since not all threads and objects in a system are typically characterized by the same behavior, e.g., not all threads benefit from spinning, having all threads either block or spin is likely to be inefficient. By way of example, in a system where all threads spin, a particular thread may continue spinning for a significant amount of time. For such a thread, continually spinning may be inefficient, as blocking such a thread would allow system resources to be better allocated for other purposes.
To prevent threads from spinning continually while failing to acquire ownership of an object, some systems allow a thread to spin only for up to a maximum number of spins, at which point the thread is blocked. For such a system in which a thread is allowed to spin for a maximum number of times specified within a system and then block, spinning may still be inefficient when the thread is eventually forced to block. That is, allowing a thread which eventually blocks itself to first spin repeatedly is effectively a waste of system resources. The repeated attempts to lock an object during an overall spinning process often prove to be expensive, and tend to adversely affect program performance by consuming system resources which could be used elsewhere.
Therefore, what is desired is a method for reducing the cost associated with attempts to lock an object. That is, what is needed is a method for efficiently determining when a thread should spin and when a thread should block while awaiting the availability of an object.
The present invention relates to a method for enabling a thread to spin for a substantially optimal period of time while attempting to acquire an object lock before allowing the thread to enter a blocking state. According to one aspect of the present invention, a method for acquiring ownership of an object in an object-based environment using a current thread includes determining when the object is owned by another thread, and locking the object when it is determined that the object is not owned by the other thread. A first spinning process, which is implemented when it is determined that the object is owned by the other thread, is arranged such that the current thread spins for up to a predetermined number of spin cycles associated with the current thread and the object. The predetermined number of spin cycles is determined using historical information, and is not based upon an overall system specification. When it is determined that the object has not been locked by the current thread during the first spinning process, a first blocking process is implemented. Allowing a thread to spin for a predetermined number of times and then block, if necessary, enables resources allocated for locking to be efficiently used, thereby improving the performance of the overall system.
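A first spinning process followed by a first blocking process may be sketched as follows. The class SpinThenBlockLock and its members are hypothetical, and the sketch simply accepts the predetermined number of spin cycles as a parameter; how that number is derived from historical information is outside the sketch. Each call to Thread.onSpinWait() (Java 9 or later) is assumed to stand for one spin cycle.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical lock that spins up to a caller-supplied limit, then blocks.
final class SpinThenBlockLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);
    private final Object monitor = new Object();

    // Acquires the lock. Returns the number of spin cycles used when the
    // lock was obtained without blocking, or -1 when the thread had to block.
    int lock(int spinLimit) {
        if (locked.compareAndSet(false, true)) {
            return 0;                                  // object was not owned
        }
        for (int spins = 1; spins <= spinLimit; spins++) {
            Thread.onSpinWait();                       // assumed: one spin cycle
            if (locked.compareAndSet(false, true)) {
                return spins;                          // locked while spinning
            }
        }
        boolean interrupted = false;
        synchronized (monitor) {                       // blocking process
            while (!locked.compareAndSet(false, true)) {
                try {
                    monitor.wait();
                } catch (InterruptedException e) {
                    interrupted = true;                // flag restored below
                }
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
        return -1;
    }

    // Releases the lock and wakes any blocked threads.
    void unlock() {
        synchronized (monitor) {
            locked.set(false);
            monitor.notifyAll();
        }
    }

    // Demo: the caller holds the lock until a second thread has exhausted
    // its spin limit and blocked; returns what that thread's lock() reported.
    static int demoContended(int spinLimit) {
        SpinThenBlockLock l = new SpinThenBlockLock();
        int[] result = new int[1];
        l.lock(0);
        Thread t = new Thread(() -> {
            result[0] = l.lock(spinLimit);
            l.unlock();
        });
        t.start();
        while (t.getState() != Thread.State.WAITING) {
            Thread.yield();                            // wait until t has blocked
        }
        l.unlock();
        try {
            t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }
}
```

The value returned by lock() is one possible form of the historical information referred to above, since it records how many spin cycles a given acquisition needed or that spinning did not suffice.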
In one embodiment, it is determined if the current thread is in an information gathering position. The first spinning process is implemented when it is determined that the object is owned by another thread and the current thread is not in the information gathering position. In such an embodiment, when it is determined that the current thread is in the information gathering position, a second spinning process is implemented. During the second spinning process, the current thread spins for up to a maximum number of spin cycles that is specified by the object-based environment. After the second spinning process, historical information associated with the second spinning process is stored.
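The choice between the first and second spinning processes may be sketched as follows. The class SpinBudget is hypothetical and is intended to be kept per thread and per object, so it is not synchronized. Two assumptions go beyond the description: that the current thread is in the information gathering position for its first few contended acquisitions, and that the stored historical information is reduced to the largest spin count that led to a successful lock.

```java
// Hypothetical per-thread, per-object record that selects a spin limit.
final class SpinBudget {
    private final int systemMaxSpins;  // maximum specified by the environment
    private final int gatheringRuns;   // assumed length of the gathering phase
    private int runsSeen = 0;
    private int learnedLimit = 0;      // largest spin count that obtained a lock

    SpinBudget(int systemMaxSpins, int gatheringRuns) {
        this.systemMaxSpins = systemMaxSpins;
        this.gatheringRuns = gatheringRuns;
    }

    // True while the thread is still in the information gathering position.
    boolean isGathering() {
        return runsSeen < gatheringRuns;
    }

    // Spin limit for the next contended acquisition: the system maximum
    // while gathering (second spinning process), the learned limit
    // afterwards (first spinning process).
    int nextLimit() {
        return isGathering() ? systemMaxSpins : learnedLimit;
    }

    // Stores the outcome of one gathering run. spinsUsed is the number of
    // spin cycles that obtained the lock, or -1 when the thread blocked.
    // Outcomes reported after the gathering phase are ignored.
    void record(int spinsUsed) {
        if (!isGathering()) {
            return;
        }
        runsSeen++;
        if (spinsUsed > learnedLimit) {
            learnedLimit = spinsUsed;
        }
    }
}
```

For example, with a system maximum of 100 and three gathering runs that report 4, -1, and 9, the limit returned afterwards by nextLimit() is 9.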
According to another aspect of the present invention, an object-based computing environment includes at least one processor and a first thread that has possession of a locking mechanism associated with an object. The computing environment also includes a current thread. The current thread is arranged to determine when the locking mechanism is possessed by the first thread, and is also arranged to implement a first spinning process when it is determined that the locking mechanism is possessed by the first thread. The first spinning process is arranged such that the current thread spins for up to a predetermined number of spin cycles, the predetermined number of spin cycles being determined using historical information associated with the current thread and the object.
In accordance with still another aspect of the present invention, a method for acquiring ownership of an object in an object-based system includes obtaining historical spinning information. The historical spinning information includes data associated with a number of times a current thread has previously spun while attempting to acquire ownership of the object. The method also includes reducing the historical spinning information to determine a suitable number of spin cycles, and spinning the current thread for up to the suitable number of spin cycles. Between spin cycles, the current thread attempts to acquire ownership of the object. Finally, the method includes determining when the current thread has acquired ownership of the object, and blocking the current thread when it is determined that the current thread has not acquired ownership of the object. In one embodiment, the suitable number of spin cycles is indicative of a substantially maximum amount the current thread spins to achieve a predetermined level of probability that the current thread acquires ownership of the object during spinning.
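One possible reduction of historical spinning information to a suitable number of spin cycles may be sketched as follows. The class SpinHistory, the histogram representation, and the rule of returning zero when the predetermined level of probability cannot be reached are assumptions made for illustration; the description does not specify how the reduction is performed.

```java
// Hypothetical reduction of historical spinning information to a spin limit.
final class SpinHistory {
    // successesBySpinCount[k] is the number of past acquisitions that
    // obtained the lock after exactly k spin cycles; blockedCount is the
    // number of past acquisitions that never obtained the lock by spinning.
    // Returns the smallest number of spin cycles whose historical success
    // rate reaches targetProbability, or 0 (assumed to mean "block without
    // spinning") when there is no history or the target cannot be reached.
    static int suitableSpins(int[] successesBySpinCount, int blockedCount,
                             double targetProbability) {
        long total = blockedCount;
        for (int count : successesBySpinCount) {
            total += count;
        }
        if (total == 0) {
            return 0;
        }
        long cumulative = 0;
        for (int k = 0; k < successesBySpinCount.length; k++) {
            cumulative += successesBySpinCount[k];
            if ((double) cumulative / total >= targetProbability) {
                return k;
            }
        }
        return 0;
    }
}
```

For instance, given a history in which 5, 3, 1, and 1 acquisitions succeeded after 1, 2, 3, and 4 spin cycles respectively and none blocked, a target probability of 0.75 yields a limit of 2 spin cycles, because 8 of the 10 recorded acquisitions succeeded within 2 cycles.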
These and other advantages of the present invention will become apparent upon reading the following detailed descriptions and studying the various figures of the drawings.