The invention is generally related to computers and computer software. More specifically, the invention is generally related to the management of objects in an object heap.
Managing available memory is critically important to the performance and reliability of a data processing system such as a computer. Specifically, data used by a computer program is typically stored in a computer within a memory that has a limited address space. In many computers, data is stored in the form of xe2x80x9cobjectsxe2x80x9d that are allocated space in a portion of the memory referred to as an xe2x80x9cobject heapxe2x80x9d. Objects also often include xe2x80x9creferencesxe2x80x9d (also known as pointers) to other objects so that a computer program can access information in one object by following a reference from another object. Typically each computer program has its own object heap, so if multiple computer programs are active in a computer, multiple object heaps may be maintained in the computer.
Whenever new data is to be used by a computer program, a portion of the free memory is reserved for that data using a process known as xe2x80x9callocatingxe2x80x9d memory. Given that the amount of memory available in a computer is limited, it is important to free up, or xe2x80x9cdeallocatexe2x80x9d, the memory reserved for data that is no longer being used by the computer. Otherwise, as available memory is used up, the performance of the computer typically decreases, or a system failure may occur.
A computer program known as a garbage collector is often used to free up unused memory that has been allocated by other computer programs in a computer. Often, a garbage collector executes concurrently with other computer programs to periodically scan through the object heap(s) and deallocate any memory that is allocated to unused objects (a process also known as xe2x80x9ccollectingxe2x80x9d objects). Different computer programs that operate concurrently in a computer often include one or more xe2x80x9cthreadsxe2x80x9d that execute concurrently with one another. Moreover, when different computer programs use different object heaps, separate garbage collector computer programs, also referred to as collector threads, may be used to manage each object heap.
One specific type of garbage collector is a copying garbage collector, which manages an object heap by partitioning the object heap into xe2x80x9cfrom-spacexe2x80x9d and xe2x80x9cto-spacexe2x80x9d partitions, and copying valid and currently-used objects from the from-space to the to-space, in effect leaving unused objects behind. A specific implementation of a copying garbage collector is a generational garbage collector, which partitions an object heap into new and old partitions. A generational garbage collector relies on the tendency for newer objects to xe2x80x9cdiexe2x80x9d, or cease to be used, more frequently than older objects. Put another way, as an object is used over time, it becomes less and less likely that the object will cease being used.
A generational garbage collector manages an object heap by repeatedly scanning through newer objects in the new partition of the object heap, discarding and deallocating the memory for objects that are no longer in use, and moving objects that live beyond a threshold period of time into the old partition of the object heap. Given that older objects tend to be more stable, typically no scanning of the old partition of the object heap is required.
A generational garbage collector typically stores objects in the old partition of the object heap in a linear fashionxe2x80x94that is, one after another as each object is found to meet the criteria for being moved into the old partition. Likewise, objects in the new partition of the object heap are typically widely scattered throughout the new partition due to the deallocation and movement of objects by the garbage collector and the addition of new objects to then-available locations in the partitions.
As with generational garbage collectors, other copying garbage collector implementations also copy objects from a from-space to a to-space in a linear fashion. With the allocation and initial placement of objects in the from-space occurring by a process other than the garbage collector, however, objects tend to become widely scattered throughout the from-space with these copying garbage collector implementations as well.
An inherent result of the conventional manners of storing objects in an object heap is that the various objects used by a computer program can become widely dispersed throughout an object heap over time. Having a relatively wide dispersion of objects in an object heap, however, can result in sub-optimal performance for a computer due to memory swapping concerns.
Specifically, most computers rely on a multi-level memory architecture to balance memory performance with cost. Most computers include one or more levels of small capacity and high speed cache memory interfaced with a larger capacity but relatively slower main memory. As data is needed by a computer program, the data is copied from the main memory to a cache memory for access by a processor in the computer. Data is typically organized into blocks, or xe2x80x9ccache linesxe2x80x9d, with all of the data allocated to a particular cache line swapped into and out of a cache as a group.
Some computers also implement a virtual memory scheme where the addressable memory space is larger than the physical storage available for the main memory. Similar to caching, in a virtual memory scheme data is organized into pages, and data is swapped page-by-page between main memory and an external storage device such as a direct access storage device (DASD).
The primary benefit of a multi-level memory architecture is that more frequently-used data can often be maintained in higher levels of memory so that the data can be accessed more quickly. Whenever data to be accessed from a cache memory is allocated to a cache line that is not currently stored in the cache memory, a cache miss occurs, and retrieval of the data takes longer since the data must be retrieved from a lower level of memory. Similarly, whenever data to be accessed from the main memory is allocated to a page that is not currently stored in the main memory, a page fault occurs, and retrieval of the data takes still longer since the data must be swapped in from external storage.
An object heap typically occupies multiple cache lines, and in many instances, multiple pages. Consequently, with a relatively wide dispersion of objects in an object heap, successive accesses to objects located in different locations in the object heap may require a substantial amount of memory swapping within each cache memory, and possibly within the main memory as well. As a result, accesses to the object heap tend to take longer, and the computer operates at less than peak efficiency.
Therefore, a substantial need exists in the art for a manner of improving the performance of a computer in accessing an object heap, particularly through improved garbage collection techniques.
The invention addresses these and other problems associated with the prior art by providing an apparatus, program product, and method that organize data objects in an object heap based upon access relationships between the data objects. By doing so, data objects that are accessed in close succession with one another are more likely to be located within the same page, and possibly within the same cache line, as one another. Consequently, when accessing such objects, the frequency of memory swapping within a multi-level memory architecture (e.g., within a particular cache memory, or within a virtual memory scheme, as appropriate for the particular architecture) is often reduced, resulting in an overall improvement in system performance.
An access relationship between two or more data objects may be based at least in part upon a likely temporal proximity of the accesses to such data objects, e.g., when a group of objects are typically accessed one after another during execution of a computer program. An access relationship between multiple data objects may also be based at least in part on the relative frequencies of access of different data objects. Other manners of defining an access relationship will become apparent to one of ordinary skill in the art upon reading the disclosure presented herein.
In one specific embodiment of the invention, the ordering of data objects in an object heap relies on profiled access information generated for different classes of data objects used by a computer program. The profiling may be performed at various times, including during compilation, loading, or execution of a computer program. Typically, information such as the number of accesses to data objects and/or the order in which data objects are accessed may be predicted and utilized in determining how to order data objects within an object heap.
These and other advantages and features, which characterize the invention, are set forth in the claims annexed hereto and forming a further part hereof. However, for a better understanding of the invention, and of the advantages and objectives attained through its use, reference should be made to the Drawings, and to the accompanying descriptive matter, in which there is described exemplary embodiments of the invention.