Managing available memory is critically important to the performance and reliability of a computer system. Specifically, data used by a computer program is typically stored in a computer system within a memory that has a limited address space. In many computer systems, data is stored in the form of "objects" that are allocated space in a portion of the memory referred to as an "object heap". Objects also often include "references" (also known as pointers) to other objects so that a computer program can access information in one object by following a reference from another object. Typically each computer program has its own object heap, so if multiple computer programs are active in a computer system, multiple object heaps may be maintained in the system.
Whenever new data is to be used by a computer program, a portion of the free memory is reserved for that data using a process known as "allocating" memory. Given that the amount of memory available in a computer system is limited, it is important to free up, or "deallocate", the memory reserved for data that is no longer being used by the computer system. Otherwise, as available memory is used up, the performance of the computer system typically decreases, or a system failure may occur.
A computer program known as a garbage collector is often used to free up unused memory that has been allocated by other computer programs in a computer system. Often, a garbage collector executes concurrently with other computer programs to periodically scan through the object heap(s) and deallocate any memory that is allocated to unused objects (a process also known as "collecting" objects). Different computer programs that operate concurrently in a computer system often include one or more "threads" that execute concurrently with one another. Moreover, when different computer programs use different object heaps, separate garbage collector computer programs, also referred to as collector threads, may be used to manage each object heap.
One specific type of garbage collector is a concurrent mark sweep collector, which sequences repeatedly through individual collection cycles, with each cycle sequentially operating in mark and sweep stages. In the mark stage, the collector scans through an object heap beginning at its "roots", and attempts to "mark" objects that are still reachable from a root (i.e., that are referenced directly by a root or by a chain of objects reachable from a root). In the sweep stage, the collector scans through the objects and deallocates any memory reserved for objects that are unmarked as of completion of the mark stage.
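The mark and sweep stages described above can be illustrated with a minimal sketch. It is deliberately single-threaded and therefore not concurrent; it only shows the two stages of one collection cycle. The names MarkSweepSketch, Node, mark and sweep are hypothetical, and the "heap" is modeled as a simple list of objects rather than raw memory.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

final class MarkSweepSketch {
    /** Hypothetical heap object: a mark bit plus outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked;
        Node(String name) { this.name = name; }
    }

    /** Mark stage: flag every object reachable from a root. */
    static void mark(List<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) continue;      // already visited
            n.marked = true;
            pending.addAll(n.refs);      // follow references to other objects
        }
    }

    /** Sweep stage: remove unmarked objects and reset marks for the next cycle. */
    static int sweep(List<Node> heap) {
        int collected = 0;
        for (Iterator<Node> it = heap.iterator(); it.hasNext(); ) {
            Node n = it.next();
            if (n.marked) {
                n.marked = false;        // survivor: clear the mark
            } else {
                it.remove();             // unreachable: "deallocate"
                collected++;
            }
        }
        return collected;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        a.refs.add(b);                   // a -> b; c is unreachable
        List<Node> heap = new ArrayList<>(List.of(a, b, c));
        mark(List.of(a));
        System.out.println("collected: " + sweep(heap));
    }
}
```

A concurrent collector would additionally have to cope with program threads changing the refs lists while mark is running, which is the source of the synchronization concerns discussed in this section.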
Concurrent mark sweep collectors are often very desirable for collecting unused data as they often have minimal impact on the responsiveness of program threads. However, allowing program threads to run concurrently with a collector creates a problem: the program threads could interfere with the work of the collector. This interference could confuse the collector and cause it to collect an object that is actually reachable. If such an object is collected, incorrect behavior may occur, possibly resulting in partial or total system failure.
The process of ensuring that data accessed by one computer program in a computer system is not unpredictably affected by the operation of another computer program is generally referred to as "synchronization". Synchronization is typically not a concern for "stop-the-world" garbage collectors, as these types of collectors halt execution of all active program threads during collection, which prevents other program threads from unpredictably modifying data during collection. However, halting all program threads, even for a short time, can significantly degrade both system performance and the responsiveness of program threads. Thus, "stop-the-world" collectors are typically not as desirable as concurrent collectors, and may not be suitable for many applications.
One specific type of object that raises potential synchronization concerns is a "weakly-reachable" object, which is an object that is reachable solely through a "weak" reference. A weak reference is a special type of reference supported by a number of programming environments, and is typically defined as a reference to an object that does not prevent the object from being collected. Objects that are reachable from a root without passing through a weak reference, on the other hand, are sometimes referred to as being "strongly-reachable" to distinguish such objects from weakly-reachable objects.
A number of environments, e.g., the Java programming language from Sun Microsystems, support the use of weak references. Among other uses, an important use of Java weak references is to reference caches of data objects that are shared by a number of program threads. When used in this way, the weak references are associated with identifiers to permit a program thread to locate a shared data object by searching for a particular weak reference matching an identifier passed by the program thread.
As one example, in a customer ordering system, different program threads (e.g., ordering threads, warehousing threads, payment threads, etc.) may attempt to access a cache of customer objects using a unique identifier (e.g., a customer's phone number or social security number). References to the customer objects are defined as reference objects that include pointers to the customer objects referenced thereby. The reference objects are then arranged in a hash table that is searchable by identifier to locate the particular reference object corresponding to a particular identifier. Then, through a process known as "dereferencing", a pointer to the particular customer object itself may be returned to a program thread attempting to locate such an object so that the data stored within the customer object may be accessed directly using the pointer. Through this arrangement, the shared customer objects may be located and accessed in a fast and efficient manner.
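The customer cache arrangement described above might look roughly like the following sketch, which assumes Java's standard java.lang.ref.WeakReference as the reference object and a ConcurrentHashMap as the hash table. The class names CustomerCacheSketch and Customer, the put and lookup methods, and the use of a phone number as the identifier are illustrative assumptions, not details taken from any particular system.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class CustomerCacheSketch {
    /** Hypothetical shared data object. */
    static final class Customer {
        final String phone;
        final String name;
        Customer(String phone, String name) {
            this.phone = phone;
            this.name = name;
        }
    }

    // Hash table searchable by identifier; each value is a reference object
    // that points weakly at the customer object.
    private final Map<String, WeakReference<Customer>> table = new ConcurrentHashMap<>();

    void put(Customer c) {
        table.put(c.phone, new WeakReference<>(c));
    }

    /** Returns the customer, or null if absent or already collected. */
    Customer lookup(String phone) {
        WeakReference<Customer> ref = table.get(phone);
        if (ref == null) {
            return null;                 // no entry for this identifier
        }
        Customer c = ref.get();          // "dereferencing" the reference object
        if (c == null) {
            table.remove(phone, ref);    // referent was collected: drop stale entry
        }
        return c;
    }

    public static void main(String[] args) {
        CustomerCacheSketch cache = new CustomerCacheSketch();
        Customer alice = new Customer("555-0100", "Alice"); // strong reference held here
        cache.put(alice);
        System.out.println(cache.lookup("555-0100") == alice);
    }
}
```

Because the table holds only weak references, an entry by itself does not keep a customer object alive; the object survives only while some program thread holds an ordinary reference to it.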
Since conventional garbage collectors do not ordinarily collect reachable objects, a generic reference (also known as a "strong" reference) from a reference object to a shared data object would typically prevent the shared data object from being collected--even if the shared data object was only reachable through the reference object. However, by making such a reference a weak reference, a garbage collector may be permitted to collect a shared data object that is reachable solely through the reference object, consequently permitting a greater number of unused data objects to be collected by a garbage collector.
One concern that arises with the use of weak references, however, is that a program thread often must take special precautions when accessing an object through a weak reference. Otherwise, a program thread could attempt to access a weak reference to obtain the pointer to the object referenced thereby, and then have the weak reference later cleared by the collector, and the object referenced thereby collected. The pointer obtained by that program thread would then point to a non-existent object, and any attempt to use that pointer could result in unpredictable behavior by the computer system, or even partial or total system failure. As a result, access to weak references must be carefully synchronized with collection to ensure correctness in a computer system.
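In Java specifically, one common form of this precaution is to dereference a weak reference exactly once into an ordinary local variable and to test that variable, rather than testing and then dereferencing again. The sketch below contrasts the two patterns; WeakDerefSketch, unsafeLength and safeLength are hypothetical names, and the explicit clear() call merely stands in for the collector clearing the reference.

```java
import java.lang.ref.WeakReference;

final class WeakDerefSketch {
    /** Racy: the reference may be cleared between the two get() calls. */
    static int unsafeLength(WeakReference<String> ref) {
        if (ref.get() != null) {
            return ref.get().length();   // may throw NullPointerException
        }
        return -1;
    }

    /** Safer: dereference once into a strong local, then test and use it. */
    static int safeLength(WeakReference<String> ref) {
        String s = ref.get();            // s is a strong reference while in use
        return (s != null) ? s.length() : -1;
    }

    public static void main(String[] args) {
        String value = new String("hello");
        WeakReference<String> ref = new WeakReference<>(value);
        System.out.println(safeLength(ref));
        ref.clear();                     // stands in for the collector clearing the reference
        System.out.println(safeLength(ref));
        System.out.println(value.length()); // keeps value strongly reachable until here
    }
}
```

This pattern relies on the Java runtime guaranteeing that an object held through an ordinary local reference is not collected; it does not by itself address how a concurrent collector coordinates the clearing of weak references with running program threads.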
In addition, in many environments, it is desirable for the collector to clear weak references automatically once the referenced objects are no longer strongly-reachable. When clearing these references, it is important that all weak references from which an object is weakly-reachable be cleared in a step that is simultaneous from the point of view of any program threads executing in a computer system. Otherwise, multiple copies of the weakly-referenced object could result, possibly leading to unpredictable behavior in the computer system.
Synchronizing access to weak references in a conventional concurrent mark sweep collector could effectively turn the collector into a stop-the-world collector, however, as all program threads that try to follow weak references would need to be stopped while the collector processes the weak references. Halting such program threads, even for a short time, could significantly degrade system performance.
Therefore, a significant need exists for an improved manner of efficiently collecting weakly-reachable objects with a minimal impact on system performance.