In a computer system, memory is commonly partitioned into a multiplicity of objects, each object being a unit of memory allocation and reclamation. The memory used by an object is reclaimed when there are no references to this object. Further, there is often object-specific processing that needs to take place as part of freeing an object. For example, the object may refer to other objects and thus the object-specific processing needs to dereference any such referenced object, and further free the dereferenced object if it no longer has any references. Consequently, object destruction can potentially cause a cascade of other objects being freed and memory being reclaimed. The object-specific processing is specified by a function called a destructor in C++ and other languages. Consequently, it is common to refer to object destruction rather than memory reclamation, to capture both the object-specific reclamation-associated processing in the destructor as well as memory reclamation. How to adequately and efficiently perform object destruction within an application is a challenging problem.
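By way of illustration only, the cascade described above can be sketched in C++ as follows. The Node type, the g_destroyed log, and the destroy_chain function are hypothetical names introduced for this sketch, and std::shared_ptr is assumed here merely as one convenient way to hold references.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical log of destructor calls, used only to make the cascade visible.
static std::vector<std::string> g_destroyed;

struct Node {
    std::string name;
    std::shared_ptr<Node> child;  // reference to another object

    explicit Node(std::string n) : name(std::move(n)) {}

    // Object-specific processing performed as part of freeing the object.
    ~Node() { g_destroyed.push_back(name); }
};

// Builds the chain a -> b -> c, drops the only external reference to a,
// and returns the order in which the destructors ran.
std::vector<std::string> destroy_chain() {
    g_destroyed.clear();
    auto c = std::make_shared<Node>("c");
    auto b = std::make_shared<Node>("b");
    auto a = std::make_shared<Node>("a");
    b->child = std::move(c);  // b now holds the only reference to c
    a->child = std::move(b);  // a now holds the only reference to b
    a.reset();                // freeing a releases b, which in turn releases c
    assert(g_destroyed.size() == 3);
    return g_destroyed;
}
```

In this sketch, releasing the single reference to a causes three destructors to run, in the order a, b, c.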
In the past, programmers were expected to explicitly call a procedure, e.g., free or delete, to invoke object destruction. This manual approach is error-prone and difficult, because it requires the programmer to know when there is no other reference to the object in any portion of the application. If the programmer-specified call causes an object to destruct while it still has references, a subsequent access to this freed portion of memory can cause the application to access random data, which can lead to catastrophic failure. On the other hand, if the programmer fails to cause an object to destruct when it should, the memory is typically lost for the lifetime of the application. Such memory leaks can cause the application to use an increasing amount of memory over time, often leading to failure. Moreover, an object that is not destructed in a timely manner may be holding other resources as well, such as locks, open files, and references to other objects, further wasting system resources. Finally, the manual approach can result in an object being freed a second time, also causing catastrophic program failure.
An alternative approach is to introduce a reference count variable in each object. This field is then incremented when a new reference to the object is created and decremented when a reference is removed. Then, when the reference count goes to zero, the object can be reclaimed. Languages such as C++ provide mechanisms, often referred to as smart pointer implementations, that semi-automatically implement this reference counting, thereby providing automatic reclamation of objects. One can also maintain explicit lists of the other objects referencing a given object, but this is even more costly than reference counting.
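As a minimal sketch of the reference count field described above, the following C++ fragment embeds a count in each object and adjusts it from a small handle class. The names Counted, Ref, RefCountTrace, and trace_refcount are hypothetical, the count is deliberately non-atomic (single-threaded use is assumed), and the sketch is not intended to represent any particular smart pointer implementation.

```cpp
#include <cassert>
#include <utility>

// Hypothetical object carrying its own reference count field.
struct Counted {
    int refs = 0;
    bool* destroyed_flag;  // demo hook: set to true when the object is destructed

    explicit Counted(bool* flag) : destroyed_flag(flag) {}
    ~Counted() { if (destroyed_flag) *destroyed_flag = true; }
};

// Hypothetical handle that increments the count when a reference is created
// and decrements it when the reference is removed.
class Ref {
public:
    Ref() = default;
    explicit Ref(Counted* p) : p_(p) { acquire(); }
    Ref(const Ref& other) : p_(other.p_) { acquire(); }
    Ref& operator=(Ref other) { std::swap(p_, other.p_); return *this; }
    ~Ref() { release(); }

    int count() const { return p_ ? p_->refs : 0; }

private:
    void acquire() { if (p_) { ++p_->refs; } }
    void release() {
        if (p_ && --p_->refs == 0) { delete p_; }  // count reached zero: reclaim
        p_ = nullptr;
    }
    Counted* p_ = nullptr;
};

struct RefCountTrace {
    int after_create;
    int after_copy;
    int after_drop;
    bool destroyed_at_zero;
};

// Records the count as references are created and removed.
RefCountTrace trace_refcount() {
    bool destroyed = false;
    RefCountTrace t{};
    {
        Ref a(new Counted(&destroyed));
        t.after_create = a.count();
        {
            Ref b = a;  // a second reference is created
            t.after_copy = a.count();
        }               // the second reference is removed
        t.after_drop = a.count();
        assert(!destroyed);  // still referenced by a
    }                   // the last reference is removed here
    t.destroyed_at_zero = destroyed;
    return t;
}
```

Every copy and every destruction of a handle touches the count, which is the bookkeeping that a smart pointer performs on the programmer's behalf.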
FIG. 1 is a data structure diagram illustrating an example of reference-counted objects. In this reference graph, an edge such as 101 represents a reference (e.g., a pointer, a handle, an identifier, an address, and/or another appropriate mechanism that facilitates the access of an object). For each object, the corresponding reference count value indicates the number of references to the object. For instance, object 104 has 3 references to it, namely by objects 102, 103, and 105.
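The count for object 104 can be reproduced with a short C++ sketch, assuming each edge is held as a std::shared_ptr. Only the three edges into object 104 named above are modeled; the Obj type and the references_to_104 function are hypothetical names introduced here.

```cpp
#include <memory>
#include <vector>

// Hypothetical object type whose outgoing edges are shared references.
struct Obj {
    std::vector<std::shared_ptr<Obj>> refs;
};

// Models objects 102, 103, and 105 each referencing object 104, then
// reports how many references to object 104 remain.
long references_to_104() {
    auto o102 = std::make_shared<Obj>();
    auto o103 = std::make_shared<Obj>();
    auto o105 = std::make_shared<Obj>();
    std::weak_ptr<Obj> observe104;  // observes the count without adding to it
    {
        auto o104 = std::make_shared<Obj>();
        o102->refs.push_back(o104);
        o103->refs.push_back(o104);
        o105->refs.push_back(o104);
        observe104 = o104;
    }  // the temporary local reference to 104 is removed here
    return observe104.use_count();
}
```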
The reference counting approach incurs a significant overhead because accurate reference counting of the objects needs to be maintained, and reference counting operations must be invoked each time an object is referenced or dereferenced. The overhead is particularly significant when the increment and decrement operations need to be atomic, as required in a multi-threaded environment, and thus require expensive hardware instructions such as memory fence operations. A further concern with reference counting is the unpredictable overhead that can occur on dereferencing an object. In particular, dereferencing an object that is then destructed can cause an unbounded cascade of object destructions with the associated overhead. Another concern with reference counting is that a circular reference structure can mean that a set of objects that refer to each other in a cyclic graph are not reachable from the rest of the application, yet their reference counts do not go to zero, and thus the objects cannot be reclaimed.
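The circular reference problem can be sketched in C++ with std::shared_ptr as follows. The Peer type, the g_live counter, and the live_after_handles_dropped function are hypothetical; the std::weak_ptr named watch_a is a demonstration hook, assumed here only so that the sketch can break the cycle afterwards and avoid leaking memory itself.

```cpp
#include <memory>

// Hypothetical count of Peer objects that have been constructed but not destructed.
static int g_live = 0;

struct Peer {
    std::shared_ptr<Peer> strong_other;  // counted reference
    std::weak_ptr<Peer> weak_other;      // non-counted reference
    Peer() { ++g_live; }
    ~Peer() { --g_live; }
};

// Links two objects to each other, drops both external handles, and returns
// how many objects are still alive at that point.
int live_after_handles_dropped(bool weak_back_edge) {
    g_live = 0;
    std::weak_ptr<Peer> watch_a;  // demo hook only; does not keep a alive
    {
        auto a = std::make_shared<Peer>();
        auto b = std::make_shared<Peer>();
        a->strong_other = b;
        if (weak_back_edge) {
            b->weak_other = a;    // back edge is not counted: no cycle of counts
        } else {
            b->strong_other = a;  // back edge is counted: a and b keep each other alive
        }
        watch_a = a;
    }
    int live = g_live;
    // Clean-up for the sketch: break the cycle, if any, so both objects are freed.
    if (auto locked = watch_a.lock()) {
        locked->strong_other.reset();
    }
    return live;
}
```

With counted references in both directions, both objects remain alive after the last external handle is dropped; with a non-counted back edge, both are reclaimed.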
Another approach to object destruction is to provide a so-called garbage collection (GC) mechanism that runs periodically to discover objects that no longer have any references and destruct those objects. This avoids the overhead of reference counting, but incurs a significant cost to scan memory to locate all references to each object to identify those that no longer have references.
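One common form of such a mechanism is a tracing mark-and-sweep collector, sketched below in C++ under simplifying assumptions: objects live in a vector, references are indices into that vector, and the set of roots is supplied by the caller. The passage above does not specify a particular algorithm, and HeapObject, collect, and demo_collect are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical heap object; references are indices into the heap vector.
struct HeapObject {
    std::vector<std::size_t> refs;
    bool live = true;     // false once the object has been reclaimed
    bool marked = false;  // set during the mark phase
};

// Marks every object reachable from the roots, then reclaims the rest.
// Returns the number of objects reclaimed.
std::size_t collect(std::vector<HeapObject>& heap,
                    const std::vector<std::size_t>& roots) {
    for (auto& o : heap) {
        o.marked = false;
    }
    // Mark phase: follow references starting from the roots.
    std::vector<std::size_t> work(roots);
    while (!work.empty()) {
        std::size_t i = work.back();
        work.pop_back();
        if (!heap[i].live || heap[i].marked) {
            continue;
        }
        heap[i].marked = true;
        for (std::size_t r : heap[i].refs) {
            work.push_back(r);
        }
    }
    // Sweep phase: every live object is visited, reachable or not.
    std::size_t reclaimed = 0;
    for (auto& o : heap) {
        if (o.live && !o.marked) {
            o.live = false;
            o.refs.clear();
            ++reclaimed;
        }
    }
    return reclaimed;
}

// Object 0 is a root and references object 1; objects 2 and 3 reference
// only each other and are unreachable from the root.
std::size_t demo_collect() {
    std::vector<HeapObject> heap(4);
    heap[0].refs = {1};
    heap[2].refs = {3};
    heap[3].refs = {2};
    const std::vector<std::size_t> roots = {0};
    return collect(heap, roots);
}
```

The sweep loop visits every object in the heap regardless of how few are garbage, which illustrates the scanning cost noted above.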
FIG. 2 is a diagram illustrating a generational GC example. In this diagram, the memory is divided into three portions: process stack 202, newer (e.g., recently created) objects 204, and older objects 206. In some implementations, it is adequate to only scan the segment containing recently created objects in the common case because most objects lose all their references shortly after creation.
With large memories, this GC task can require processing on the order of minutes or longer to complete, substantially interfering with application responsiveness and progress. The application often requires an increased amount of memory in order to run, given that there can be a significant amount of unreferenced yet unreclaimed memory because of GC task delays. The resulting larger amount of memory increases the resource requirements and thus the cost of running the application. While techniques such as so-called generational garbage collection seek to reduce the need to review all of memory, they are dependent on certain application behaviors that can cause unpredictable performance because these behaviors change with scale. Moreover, many garbage collection techniques require memory compaction, which entails relocating many of the objects in memory, with further attendant overhead on, and interference with, application processing.
Application memory requirements are increasing, and thus all of these costs are also increasing. The complexity of reclamation schemes is increasing accordingly, introducing unpredictable performance and, in some cases, failures to application execution. What is needed is an efficient way of destructing objects that is not error-prone.