1. Field of the Invention
This invention relates to computer systems, and more particularly to the management of thread-local data.
2. Description of the Related Art
Objects
In some systems, which are usually known as “object oriented,” objects may have associated methods, which are routines that can be invoked by reference to the object. Objects may belong to a class, which is an organizational entity that may contain method code or other information shared by all objects belonging to that class. However, the term “object” may not be limited to such structures, but may additionally include structures with which methods and classes are not associated. More generally, the term object may be used to refer to a data structure represented in a computer system's memory. Other terms sometimes used for the same concept are record and structure. An object may be identified by a reference, a relatively small amount of information that can be used to access the object. A reference can be represented as a “pointer” or a “machine address,” which may require, for instance, sixteen, thirty-two, or sixty-four bits of information, although there are other ways to represent a reference.
Threads
Computer systems typically provide for various types of concurrent operation. A user of a typical desktop computer, for instance, may be simultaneously employing a word-processor program and an e-mail program together with a calculator program. A computer may have one processor or several simultaneously operating processors, each of which may be operating on a different program. For computers with a single main processor, operating-system software typically causes that processor to switch from one program to another rapidly enough that the user cannot usually tell that the different programs are not really executing simultaneously. The different running programs are usually referred to as “processes” in this connection, and the change from one process to another is said to involve a “context switch.” In a context switch, one process is interrupted, and the contents of the program counter, call stacks, and various registers, including those used for memory mapping, are stored. Then the corresponding values previously stored for a previously interrupted process are loaded, and execution resumes for that process. Processor hardware and operating system software typically have special provisions for performing such context switches.
A program running as a computer system process may take advantage of such provisions to provide separate, concurrent “threads” of its own execution. Switching threads is similar to switching processes: the current contents of the program counter and various registers for one thread are stored and replaced with values previously stored for a different thread. But a thread change does not involve changing the memory mapping values, as a process change does, so the new thread of execution may have access to the same process-specific physical memory as the same process's previous thread.
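By way of illustration only, the following minimal sketch shows two threads of one process accessing the same memory while each keeps its own local state. It is written in Python for brevity; the names `shared_counts`, `worker`, and `run_workers` are hypothetical and do not come from any system described herein.

```python
import threading

# Hypothetical structure shared by every thread in the process.
shared_counts = {}
lock = threading.Lock()


def worker(name, n):
    # local_total lives on this thread's own stack; other threads never see it.
    local_total = 0
    for i in range(n):
        local_total += i
    # shared_counts is the same object for every thread, because a thread
    # switch does not change the process's memory mapping.
    with lock:
        shared_counts[name] = local_total


def run_workers():
    threads = [
        threading.Thread(target=worker, args=(f"t{i}", 10 * (i + 1)))
        for i in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(shared_counts)
```

The lock is needed only for the shared dictionary; the per-thread accumulator requires no synchronization, which is the property that thread-local data is intended to exploit.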
In some cases, the use of multiple execution threads is merely a matter of programming convenience. For example, compilers for various programming languages, such as the Java™ programming language, readily provide the “housekeeping” for spawning different threads, so the programmer is not burdened with all the details of making different threads' execution appear simultaneous. (Java is a trademark or registered trademark of Sun Microsystems, Inc., in the United States and other countries.) In the case of multiprocessor systems, the use of multiple threads may provide speed advantages. A process may be performed more quickly if the system allocates different threads to different processors when processor capacity is available. To take advantage of this fact, programmers may identify constituent operations within their programs that particularly lend themselves to parallel execution. When a program reaches a point in its execution at which the parallel-execution operation can begin, the program may start different execution threads to perform different tasks within that operation.
Thread-Local Heaps
Some conventional memory management schemes for multithreaded applications may partition memory space (e.g., a heap, such as a Java™ heap) used by a process into thread-local heaps (with one thread-local heap for each thread) and a global, or shared, heap. One approach to thread-local heaps is described in a paper by Domani et al. in the Proceedings of the 2002 International Symposium on Memory Management (ISMM) entitled “Thread-Local Heaps For Java.”
FIG. 1 illustrates an exemplary application with multiple threads in which the heap has been segmented into thread-local heaps and a global memory area. In FIG. 1, application 100 may implement one or more threads 120. Heap 110 may be partitioned to provide a thread-local heap 112 for each thread and a global heap. Thread-local objects 116 may be allocated in the thread-local heaps 112. Global objects 114 may be allocated in the global heap partition. Note that a local object 116 may be pure (i.e., may only reference other local objects 116) or may be impure (i.e., may reference either local objects or global objects).
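The partitioning described above may be sketched, purely as an illustrative model, as follows. The classes `Obj` and `Heap` and the function `is_pure` are hypothetical names introduced for this sketch; they model only the bookkeeping of FIG. 1 and are not an implementation of an actual heap.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Obj:
    name: str
    owner: object = None  # owning thread id for a local object, None if global
    refs: list = field(default_factory=list)  # objects this object references


class Heap:
    """Toy model: one thread-local area per thread plus one global area."""

    def __init__(self, thread_ids):
        self.local = {tid: [] for tid in thread_ids}
        self.global_area = []

    def alloc_local(self, tid, name):
        obj = Obj(name, owner=tid)
        self.local[tid].append(obj)
        return obj

    def alloc_global(self, name):
        obj = Obj(name)
        self.global_area.append(obj)
        return obj


def is_pure(obj):
    # Assumption for this sketch: "pure" means every referenced object is
    # local (has an owner); any reference to a global object makes it impure.
    return all(ref.owner is not None for ref in obj.refs)
```

In this model, a local object that references only other local objects is reported as pure, and adding a single reference to an object in the global area makes it impure.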
Among other advantages, partitioning the heap 110 as illustrated in FIG. 1 and allowing threads 120 to allocate thread-local data in thread-specific heaps allows thread-local data to be garbage collected on a per-thread basis, reducing the need for global garbage collection across all threads 120. Another advantage of partitioning the heap 110 as illustrated in FIG. 1 is that this memory management technique may reduce or eliminate the need for synchronization among threads 120, for example when the threads access thread-local data, and may generally allow for a “weaker” memory model.
In memory management schemes for multithreaded processes that partition memory (e.g., a heap) to provide thread-local heaps, a mechanism may be implemented to identify thread-local data, and more specifically to distinguish thread-local objects 116 from global objects 114. Conventional memory management schemes may use static or dynamic techniques to identify thread-local objects. The paper by Domani, et al. referenced above presents a dynamic technique for identifying thread-local objects, and also reviews conventional static techniques. These conventional techniques for identifying thread-local objects generally take one of two forms. Some rely on the fact that the address space (e.g., the heap) is partitioned, allocate thread-local objects only in particular partitions of the address space, and therefore use some form of address-range check to determine whether an object is thread-local or global. Others use a bit or field in the data structure of the object itself that may be checked to determine whether the object is thread-local or global. For example, in the dynamic technique presented by Domani, et al., a bit is set in each local object that may be checked in a write-barrier.
Conventional techniques for identifying thread-local objects that rely on some sort of address-range check to identify objects as thread-local or global necessarily limit the allocation of thread-local objects to the thread-local heaps. Conventional techniques for identifying thread-local objects that rely on a bit or field within the object itself to identify objects as thread-local or global require a load of an object and a check of the bit or field to determine whether the object in question is thread-local or global.
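The two conventional approaches may be contrasted with the following minimal sketch, which simulates memory as a mapping from addresses to header words. The constant `LOCAL_BIT`, the function names, and the addresses used are assumptions made for illustration and are not taken from the Domani, et al. paper or from any particular virtual machine.

```python
LOCAL_BIT = 0x1  # hypothetical position of the "local" flag in a header word


def is_local_by_range(addr, local_start, local_end):
    # Address-range check: the reference alone decides, with no memory access.
    # It works only if local objects are allocated inside [local_start, local_end).
    return local_start <= addr < local_end


def is_local_by_header(memory, addr):
    # Header-bit check: the object may reside anywhere, but its header word
    # must first be loaded from memory before the bit can be tested.
    header = memory[addr]
    return bool(header & LOCAL_BIT)
```

The range check avoids a load but ties thread-local objects to a fixed region of the address space, whereas the header check removes that placement constraint at the cost of a memory access on each test.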