Virtual machines are abstract computers for which application software can be compiled. The virtual machine is thus an abstraction level for application software that is consistent between different hardware and operating system combinations. Most of the complexity in running the same application on different platforms is handled by the virtual machine, and therefore the virtual machine becomes a very complex piece of software. Modern virtual machines need to manage code generation for the particular processor, as well as operating system dependent resources such as threads, networking and the file system. The virtual machine also manages the heap, within which allocation and freeing of virtual machine objects is performed. Examples of such virtual machines include the Java Virtual Machine (JVM) and implementations thereof, including the JRockit JVM from BEA Systems Inc., and the HotSpot JVM from Sun Microsystems, Inc.
The definition of the Java Virtual Machine (JVM) does not specify any requirements on the performance or the behaviour of the garbage collection process apart from basic assumptions such as: unused memory should be reused for new objects, and finalizers should be called when objects are to be released. The exact details are explained in the book “The Java™ Virtual Machine Specification (2nd Edition)” by Tim Lindholm published by Sun, and incorporated herein by reference. The JVM implementor can therefore choose to optimize different kinds of behaviours depending on the requirements of the application software and the features of the particular hardware used. A perfect garbage collector would be undetectable to the application software and the software user: there would be no pauses and no extra CPU or memory consumption. Unfortunately, no such garbage collector exists, and a lot of work has been invested into achieving high performance object allocation and garbage collection with different algorithms for different goals.
Two of the more important problems to solve within garbage collection are to lower the pause times and to increase pause predictability. Pause times include both stop-the-world times, where all threads are stopped simultaneously while the garbage collector performs some work, and pause times for each thread separately. Stop-the-world pauses are more disruptive to application software than separate thread pauses. However, the sum of all pauses must be limited to allow the application to perform efficiently. For many applications pause predictability is more important than efficiency. Efficiency can, up to a certain limit, be achieved by purchasing more powerful hardware, but predictable pause times cannot simply be achieved by providing faster hardware.
Object allocation is the companion problem to garbage collection. To avoid locking bottlenecks, the standard solution is to give each thread its own thread local area (TLA) on the heap where allocation is performed by pointer bumping. When the TLA is used up, the thread must take a global free list lock to secure a new TLA. Since a TLA is simply an area on the heap where only a single thread is allowed to allocate, the objects allocated are immediately eligible for garbage collection if necessary.
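By way of illustration only, the pointer bumping allocation described above can be sketched as follows. The class names SketchHeap and TlaAllocator, the use of plain long values as addresses, and the single contiguous free range standing in for the free list are all assumptions made for this sketch; they are not taken from any particular JVM implementation.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical heap that hands out fixed-size thread local areas (TLAs). */
final class SketchHeap {
    // Stands in for the global free list lock mentioned in the text.
    private final ReentrantLock freeListLock = new ReentrantLock();
    private final long heapSize;
    private final long tlaSize;
    private long nextFree = 0; // simplification: free space is one contiguous range

    SketchHeap(long heapSize, long tlaSize) {
        this.heapSize = heapSize;
        this.tlaSize = tlaSize;
    }

    long tlaSize() {
        return tlaSize;
    }

    /** Returns the start address of a new TLA, or -1 if the heap is exhausted. */
    long acquireTla() {
        freeListLock.lock(); // the only point where threads contend
        try {
            if (nextFree + tlaSize > heapSize) {
                return -1;
            }
            long start = nextFree;
            nextFree += tlaSize;
            return start;
        } finally {
            freeListLock.unlock();
        }
    }
}

/** Per-thread allocator: bumps a pointer inside its current TLA without locking. */
final class TlaAllocator {
    private final SketchHeap heap;
    private long top = 0;   // next free address in the current TLA
    private long limit = 0; // end address (exclusive) of the current TLA
    int refills = 0;        // number of times the global lock was taken

    TlaAllocator(SketchHeap heap) {
        this.heap = heap;
    }

    /** Returns the address of a block of the given size, or -1 on failure. */
    long allocate(long size) {
        if (size <= 0 || size > heap.tlaSize()) {
            return -1; // objects larger than a TLA are outside this sketch
        }
        if (top + size > limit) {
            // TLA used up: the remainder is abandoned and a new TLA is acquired.
            long start = heap.acquireTla();
            if (start < 0) {
                return -1;
            }
            top = start;
            limit = start + heap.tlaSize();
            refills++;
        }
        long address = top;
        top += size; // the pointer bump: no lock on this path
        return address;
    }
}
```

In this sketch each application thread would own one TlaAllocator instance, so that the common allocation path touches only thread private state and the lock is taken once per TLA rather than once per object.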
Since stop-the-world pauses are undesirable, much work has been spent on ways of splitting the garbage collector work into manageable units, where each unit of work incurs a short pause time, particularly for work that requires a stop-the-world pause. Examples of such solutions are concurrent garbage collectors, generational garbage collectors and thread local garbage collectors.
The concurrent garbage collector performs as much as possible of the garbage collecting process in parallel with the software application. To do this the JVM needs to trap all updates to pointers while the garbage collector is running. Such a trap is called a “write barrier”, and it costs CPU time. The concurrent garbage collector is therefore used when short pause times are more important than efficiency.
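The cost of a write barrier can be illustrated with a minimal sketch in which every pointer store goes through a helper method. The names Cell, WriteBarrierSketch, collectorRunning and modified are hypothetical, and the sketch is single threaded for brevity; an actual JVM would typically emit the barrier inline in generated machine code and may record updates in a different form.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

/** Hypothetical heap object with a single reference field. */
final class Cell {
    Cell next;
}

/** Illustrative write barrier: remembers objects whose pointers change during a collection. */
final class WriteBarrierSketch {
    // Set by the (imagined) collector while it is working concurrently.
    boolean collectorRunning = false;

    // Objects the collector must revisit, compared by identity rather than equals().
    final Set<Cell> modified =
            Collections.newSetFromMap(new IdentityHashMap<Cell, Boolean>());

    /** Every pointer store goes through here instead of a plain field assignment. */
    void storeNext(Cell holder, Cell value) {
        if (collectorRunning) {
            // The barrier proper: extra work performed on each trapped update.
            modified.add(holder);
        }
        holder.next = value;
    }
}
```

Even when no collection is in progress, each store in this sketch pays for the check of collectorRunning, which indicates why a write barrier reduces application throughput.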
The generational garbage collectors allocate objects within a nursery heap. Objects surviving the nursery collection are assumed to be long-lived objects and are therefore moved to the old space on the heap, which is collected less often. The increase in efficiency is based on the assumptions that objects die young and that it is faster for the garbage collector to collect the small nursery heap than to perform a full collection of the larger old space heap. The generational garbage collector also needs write barriers.
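A nursery collection of the kind described above, and the reason a generational collector needs a write barrier, can be sketched as follows. The names GenObj, GenHeapSketch, remembered and collectNursery are assumptions made for this illustration, and the sets of objects stand in for real memory regions; no actual JVM is claimed to be organized this way.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Hypothetical heap object holding references to other objects. */
final class GenObj {
    final List<GenObj> refs = new ArrayList<GenObj>();
}

final class GenHeapSketch {
    final Set<GenObj> nursery = newIdentitySet();
    final Set<GenObj> oldSpace = newIdentitySet();
    // Old objects that may point into the nursery, filled in by the write barrier.
    final Set<GenObj> remembered = newIdentitySet();

    private static Set<GenObj> newIdentitySet() {
        return Collections.newSetFromMap(new IdentityHashMap<GenObj, Boolean>());
    }

    /** New objects always start in the nursery. */
    GenObj allocate() {
        GenObj o = new GenObj();
        nursery.add(o);
        return o;
    }

    /** Pointer store with a write barrier that tracks old-to-nursery references. */
    void store(GenObj holder, GenObj value) {
        if (oldSpace.contains(holder) && nursery.contains(value)) {
            remembered.add(holder);
        }
        holder.refs.add(value);
    }

    /** Collects only the nursery; returns the number of objects moved to the old space. */
    int collectNursery(List<GenObj> roots) {
        Set<GenObj> survivors = newIdentitySet();
        Deque<GenObj> work = new ArrayDeque<GenObj>();
        for (GenObj r : roots) {
            visit(r, survivors, work);
        }
        // Without the remembered set the whole old space would have to be scanned.
        for (GenObj old : remembered) {
            for (GenObj r : old.refs) {
                visit(r, survivors, work);
            }
        }
        while (!work.isEmpty()) {
            for (GenObj r : work.pop().refs) {
                visit(r, survivors, work);
            }
        }
        oldSpace.addAll(survivors); // survivors are assumed to be long-lived
        nursery.clear();            // everything else in the nursery was garbage
        remembered.clear();
        return survivors.size();
    }

    private void visit(GenObj o, Set<GenObj> survivors, Deque<GenObj> work) {
        if (nursery.contains(o) && survivors.add(o)) {
            work.push(o);
        }
    }
}
```

The tracing loop in this sketch never follows references out of old space objects other than those in the remembered set, which is what keeps the nursery collection cheap relative to a full collection.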
The thread local garbage collector splits the heap into one large global heap and one small local heap for each thread in such a way that each thread local heap can be garbage collected separately from the other thread local heaps. Thread local heaps can potentially increase efficiency by avoiding collection of the global heap, by lowering the pause times for each thread, and by reducing the number of stop-the-world pauses. U.S. Pat. No. 6,912,553 (Kolodner, et al.) teaches a thread local heap collector that traps each update to object pointers in such a way that any object that can be accessed by a thread different from the given thread will be moved to the global heap. The traps are implemented as software write barriers generated for the JVM byte code instructions putfield, putstatic and aastore. Unfortunately, in current thread local heap implementations the gain in garbage collector performance is lost due to the write barriers needed and to the cost of moving objects from the local heap to the global heap.
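The general idea of trapping pointer updates so that objects reachable by other threads end up in the global heap can be illustrated with the following sketch. It is not the method of the cited patent; the names LocalObj, EscapeBarrierSketch, storeRef and storeStatic are hypothetical, a boolean flag stands in for the physical location of an object, and the assumption that everything reachable from an escaping object is moved with it is made only for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical object that lives either in a thread local heap or in the global heap. */
final class LocalObj {
    boolean global = false; // false: still in the allocating thread's local heap
    final List<LocalObj> refs = new ArrayList<LocalObj>();
}

final class EscapeBarrierSketch {
    int promotions = 0; // counts simulated moves from a local heap to the global heap

    /** Models a putfield or aastore style store guarded by a write barrier. */
    void storeRef(LocalObj holder, LocalObj value) {
        if (value == null) {
            return; // null stores are ignored in this sketch
        }
        if (holder.global && !value.global) {
            // A globally reachable object is about to point at a local one.
            promote(value);
        }
        holder.refs.add(value);
    }

    /** Models a putstatic style store: static fields are visible to every thread. */
    void storeStatic(LocalObj value) {
        if (value != null && !value.global) {
            promote(value);
        }
    }

    /** Marks the object and everything reachable from it as moved to the global heap. */
    private void promote(LocalObj root) {
        Deque<LocalObj> work = new ArrayDeque<LocalObj>();
        root.global = true;
        promotions++;
        work.push(root);
        while (!work.isEmpty()) {
            for (LocalObj r : work.pop().refs) {
                if (!r.global) {
                    r.global = true;
                    promotions++;
                    work.push(r);
                }
            }
        }
    }
}
```

Both costs named above are visible in this sketch: the check executed on every store, and the traversal performed by promote whenever an object escapes its thread.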