This invention relates to object-oriented programming and particularly to an object's method execution during run-time.
An object-oriented program can keep track of whether or not an object is thread-local. This can be tracked dynamically using a write barrier, as described in detail in U.S. patent application Ser. No. 09/356,532 by Trotter and Kolodner on thread-local heaps.
A global object is an object that can be accessed by more than one thread. A thread-local object is an object that can be accessed by a single thread only. Similar definitions apply to global roots and local roots (e.g., references in the registers and the stack of a thread). The write barrier is used for stores of references into global roots and into objects, in the following manner:
(a) An indicator is associated with each object to show whether it is global. If the indicator is set, the object is global; otherwise, it is local.
(b) Before a reference is assigned to a global variable:
i) If the referenced object is local, then trace the sub-graph of objects rooted at the referenced object and mark every object in that sub-graph global.
ii) Do the assignment.
(c) Before a reference is assigned to a field of an object whose global indicator is set:
i) If the referenced object is local, then trace the sub-graph of objects rooted at the referenced object and mark every object in that sub-graph global.
ii) Do the assignment.
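The write barrier of steps (a) to (c) can be sketched in Java as follows. This is a minimal, single-threaded model rather than the implementation of the referenced application: the names TrackedObject, WriteBarrierSketch, storeGlobal, storeField and markGlobal are hypothetical, a list stands in for the set of global variables, and each object's reference fields are modeled as a list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of a heap object; all names here are illustrative only.
final class TrackedObject {
    boolean global;                                        // (a) per-object indicator
    final List<TrackedObject> fields = new ArrayList<>();  // outgoing references
}

final class WriteBarrierSketch {
    // Stand-in for the global variables (global roots) of the program.
    static final List<TrackedObject> globalRoots = new ArrayList<>();

    // Trace the sub-graph rooted at 'root' and mark every object in it global.
    static void markGlobal(TrackedObject root) {
        Deque<TrackedObject> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            TrackedObject o = pending.pop();
            // A global object's sub-graph is already global, because the
            // barrier below marks a sub-graph before it becomes reachable.
            if (o == null || o.global) {
                continue;
            }
            o.global = true;
            for (TrackedObject f : o.fields) {
                pending.push(f);
            }
        }
    }

    // (b) Barrier for assigning a reference to a global variable.
    static void storeGlobal(TrackedObject ref) {
        if (ref != null && !ref.global) {
            markGlobal(ref);       // (b)(i)
        }
        globalRoots.add(ref);      // (b)(ii) do the assignment
    }

    // (c) Barrier for assigning a reference to a field of 'target'.
    static void storeField(TrackedObject target, TrackedObject ref) {
        if (target.global && ref != null && !ref.global) {
            markGlobal(ref);       // (c)(i)
        }
        target.fields.add(ref);    // (c)(ii) do the assignment
    }
}
```

In this sketch a store of a local reference into a local object marks nothing; the sub-graph is marked only when it first becomes reachable from a global variable or a global object.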
Thread-locality is a property that can also be shown to hold statically, e.g., by a compile-time analysis. See, for example, Choi, Gupta, et al., "Escape Analysis for Java" in OOPSLA 99, 11/99.
The present invention describes opportunities to exploit the thread-local property of an object in order to reduce the computing overhead on that object, and methods for exploiting those opportunities. The observation that thread-locality can be exploited to reduce computing overhead when it is tracked dynamically is new.
We also describe additional opportunities for exploiting thread-locality that have not been previously known (whether the locality property is obtained from a static analysis or tracked dynamically).
The thread-local property of objects can be exploited in order to reduce the cost of other operations on these objects. In particular, synchronization costs on thread-local objects can be avoided. In the implementation of Java, such savings can be significant.
When objects are used to implement re-useable components, e.g., a hash table, a sparse matrix, etc., good programming practice often leads to the implementations being thread-safe, i.e., supporting synchronization in the event that two or more threads attempt to use the object concurrently. The particular means of implementing the synchronization is language dependent. If the object is not actually used by more than one thread, the cost of synchronization is incurred with no actual run-time benefit.
An example of such re-useable components can be seen in Java, where many classes are thread-safe, i.e., they are implemented in such a way that their operation will be correct even if their instances are accessed simultaneously by multiple threads. These thread-safe classes are used in many cases where a non-thread-safe equivalent would be safe to use. It is easier and safer for the programmer to use the thread-safe classes than to perform the analysis needed to show that a non-thread-safe version could be used. There are also cases where an object may not need to be thread-safe at one point in its lifetime, but may need to be thread-safe at a later point. Furthermore, the advantages of object-orientation and re-use can be better realized by using the thread-safe classes.
The mechanisms provided in Java to ensure thread-safe access are synchronized methods and synchronized statements. These synchronization mechanisms are implemented using monitors. Monitors are a language-level construct for providing mutually exclusive access to shared data structures in a multi-threaded environment. A low overhead locking scheme for Java is described by David F. Bacon et al. in "Thin Locks: Featherweight Synchronization for Java" appearing in the Proceedings of the ACM Conference on Programming Language Design and Implementation, SIGPLAN Notices volume 33, number 6, Jun. 1998. As explained in this article, in Java the methods of an object may be declared synchronized, meaning that the object must be locked for the duration of the method's execution. Such locking imposes an overhead, which is wasted in the event that the object is thread-local and can be accessed only by a single thread.
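The two Java mechanisms, and the wasted locking on a thread-local object, can be illustrated by the following sketch. The class names SyncCounter and ThreadLocalUseSketch are hypothetical and chosen only for illustration. The counter created in countTo never escapes the calling thread, yet each call still enters and exits the object's monitor unless the implementation can establish that the object is thread-local.

```java
// Hypothetical thread-safe component; the names are illustrative only.
final class SyncCounter {
    private int value;

    // Synchronized method: the monitor of 'this' is held for the whole call.
    synchronized void increment() {
        value++;
    }

    // Synchronized statement: the monitor of 'this' is held for the block only.
    int get() {
        synchronized (this) {
            return value;
        }
    }
}

final class ThreadLocalUseSketch {
    // The counter is reachable only from this method's stack frame, so it is
    // thread-local; the n + 1 monitor operations protect against nothing.
    static int countTo(int n) {
        SyncCounter counter = new SyncCounter();
        for (int i = 0; i < n; i++) {
            counter.increment();
        }
        return counter.get();
    }
}
```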
Some experiments suggest that, on typical benchmarks and applications, more than 50% of all monitor operations are in fact carried out on thread-local objects. It is believed that similar results may be seen in other languages.
These experiments indicate the need to provide a mechanism for exploiting the knowledge that an object is local to a particular thread and can be accessed solely by that thread in order to avoid unnecessary computation overhead.
It is therefore an object of the invention to provide a mechanism for exploiting the knowledge that an object is local to a particular thread and can be accessed solely by that thread in order to avoid unnecessary computation overhead.
According to a broad aspect of the invention there is provided a computer-implemented method for reducing a computing overhead associated with an object based on whether or not it is local to a particular thread and can be accessed solely by that thread, comprising the steps of:
(a) dynamically tracking the object during run-time so as to derive information as to whether or not the object is local to a particular thread and can be accessed solely by that thread, and
(b) using said information to reduce a computing overhead associated with said object.
Such a method finds particular application in Java for objects that are determined to be thread-local, where there are very precise rules as to when modifications to an object must be written back to "main memory" such that they can be made visible to other threads. These rules require all updates to be written back to the heap whenever a lock is released. This precludes keeping updates only in thread-local storage, such as registers, when a lock is released. However, if it is known that an object is thread-local, then it is known that no other thread can access the object and the language's semantics will not be violated by keeping modified values in registers without writing them back to main memory. This can be done using the smart code mechanism described in U.S. patent Ser. No. 09/317,421 to Factor et al.
For example, if it is known that only one thread is accessing the object, it is possible to ignore the effects of weak consistency architectures (e.g., the fact that one processor may see the results of writes in an order different than the one in which they were executed). This too can be done using the smart code mechanism described in U.S. patent Ser. No. 09/317,421 to Factor et al.
The method according to the invention finds particular application for improving cache locality for thread-local objects by:
i) storing objects in cache lines each of which is associated with a respective thread, and
ii) storing objects that are accessed by more than one thread in distinct cache lines.
The method according to the invention also finds particular application for detecting deadlock. For example, in a Java application deadlock can be detected by determining that a thread waits indefinitely for an object that is thread-local since it is impossible for another thread to notify the waiting thread.
The method according to the invention also finds particular application for avoiding synchronization costs on thread-local objects. For example, when used with a language/system that uses monitors, the method includes:
i) maintaining a count of a number of entries by said monitor,
ii) obtaining a monitor lock on a thread local object without synchronization,
iii) upon entering the monitor, incrementing the count without synchronization, and
iv) upon exiting the monitor, decrementing the count without synchronization.
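Steps (i) to (iv) can be sketched in Java as follows. This is an illustrative model under stated assumptions, not the claimed implementation: the class LocalAwareMonitor and its methods are hypothetical, a ReentrantLock stands in for the ordinary synchronized path, and markGlobal is assumed to be called by the owning thread (for example from the write barrier) before the object becomes reachable by any other thread. How entries held at that moment are carried over to the real lock is also an assumption of the sketch.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical monitor that skips synchronization while its object is local.
final class LocalAwareMonitor {
    private final ReentrantLock lock = new ReentrantLock(); // used once global
    // Stand-in for the object's global indicator; a VM would keep this bit
    // in the object header rather than in a volatile field.
    private volatile boolean global;
    private int localEntries; // (i) entry count, touched by the owner only

    void enter() {
        if (!global) {
            localEntries++;   // (ii), (iii): no lock and no atomic operation
            return;
        }
        lock.lock();          // ordinary path for a global object
    }

    void exit() {
        if (!global) {
            if (localEntries == 0) {
                throw new IllegalMonitorStateException();
            }
            localEntries--;   // (iv): plain decrement
            return;
        }
        lock.unlock();
    }

    // Assumption: called by the owning thread before the object is published.
    // Entries already held are re-acquired on the real lock so that later
    // exits balance correctly.
    void markGlobal() {
        for (int i = 0; i < localEntries; i++) {
            lock.lock();
        }
        localEntries = 0;
        global = true;
    }

    // Number of entries currently held by the calling thread.
    int entryCount() {
        return global ? lock.getHoldCount() : localEntries;
    }
}
```

While the object remains local, enter and exit reduce to a flag test and a plain increment or decrement; the cost of the real lock is paid only after the object has been marked global.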