In the Java programming environment (Java is a trademark of Sun Microsystems Inc.), programs are generally run on a virtual machine, rather than directly on hardware. Thus a Java program is typically compiled into byte-code form, and then interpreted by the Java virtual machine (VM) into hardware commands for the platform on which the Java VM is executing. The Java environment is further described in many books, for example “Exploring Java” by Niemeyer and Peck, O'Reilly & Associates, 1996, USA; “Java Virtual Machine” by Meyer and Downing, O'Reilly & Associates, 1997, USA; and “The Java Virtual Machine Specification” by Lindholm and Yellin, Addison-Wesley, 1997, USA.
Java is an object-oriented language. Thus a Java program is formed from a set of class files having methods that represent sequences of instructions. One Java object can call a method in another Java object. A hierarchy of classes can be defined, with each class inheriting properties (including methods) from those classes that are above it in the hierarchy. For any given class in the hierarchy, its descendants (i.e. below it) are called subclasses, while its ancestors (i.e. above it) are called superclasses. At run-time classes are loaded into the Java VM by one or more class loaders, which are themselves organised into a hierarchy. Objects can then be created as instantiations of these class files, and indeed the class files themselves are effectively loaded as objects. The Java VM includes a heap, which is a memory structure used to store these objects.
Once a program has finished with an object stored on the heap, the object can be deleted to free up space for other objects. In the Java environment, this deletion is performed automatically by a system garbage collector (GC). This scans the heap for objects that are no longer referenced, and hence are available for deletion. Note that the precise form of GC is not prescribed by the Java VM specification, and many different implementations are possible.
Some implementations of the Java VM incorporate a card table as an adjunct to the heap. The card table comprises a set of cards, each of which corresponds to a fixed chunk of the heap (say 512 bytes). The card effectively acts as a flag to indicate the status of the corresponding portion of memory; in particular, the card or flag is typically set when a pointer is written into the corresponding portion of memory. The card table is therefore used to keep track of changes to the heap.
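As an illustration only, a card table of this kind can be modelled in a few lines of Python. This is a toy sketch, not the layout of any particular Java VM: the class name `CardTable`, its methods, the one-byte-per-card representation, and the 4096-byte heap are all assumptions made for the example; only the 512-byte chunk size comes from the text.

```python
CARD_SIZE = 512  # bytes of heap covered by one card (the example figure from the text)


class CardTable:
    """Toy card table: one byte-sized flag per fixed-size chunk of the heap."""

    def __init__(self, heap_size: int) -> None:
        # Round up so that a partial final chunk still gets a card.
        self.cards = bytearray((heap_size + CARD_SIZE - 1) // CARD_SIZE)

    def mark(self, heap_offset: int) -> None:
        """Set the card covering the given byte offset into the heap."""
        self.cards[heap_offset // CARD_SIZE] = 1

    def is_set(self, heap_offset: int) -> bool:
        """Report whether the card covering this offset has been set."""
        return self.cards[heap_offset // CARD_SIZE] == 1

    def clear(self) -> None:
        """Reset every card, e.g. once the flagged chunks have been processed."""
        for i in range(len(self.cards)):
            self.cards[i] = 0


table = CardTable(heap_size=4096)  # 4096 / 512 = 8 cards
table.mark(1030)                   # a pointer is written at heap offset 1030
```

After the call to `mark`, every offset from 1024 to 1535 reports as set, because the card records that something in its chunk changed, not which word changed.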
A card table is useful in a variety of circumstances, most of them connected with GC. For example, some known forms of GC rely on the observation that the longer an object has already survived, the longer it is generally likely to survive in the future. Consequently, the heap is split into two components or generations: one holding newly created or young objects, the other holding older objects. In this approach, it is efficient to perform GC more frequently on the young heap than on the old heap, since the hit rate (of deletions) is likely to be higher for the young heap. However, it is important not to garbage collect an object from the young heap while it is still referenced from the old heap (such a reference can be termed a cross-heap pointer); otherwise the reference would be invalid after the GC.
Nevertheless, it is undesirable to have to scan the whole of the old heap for possible cross-heap pointers, since this is time-consuming. This problem can be alleviated by use of the card table. When a GC of the young heap is performed, those cards in the card table that correspond to the old heap portion of memory and that have been set are identified. These cards indicate the only portions of the old heap that could possibly contain a cross-heap pointer, and so only these portions need to be scanned, rather than the whole heap. Note that a set card does not necessarily mean that there is still a cross-heap reference in the corresponding portion of memory; for example, the pointer may have been subsequently nulled. In addition, some implementations may mark the relevant card when any reference is written to the heap, deferring until later any check as to whether the reference is a potentially problematic cross-heap reference or simply a harmless reference to some local object.
If any cross-heap pointers are identified at this stage, then the referenced objects in the young heap are typically transferred (“promoted”) to the old heap. This then allows the card table to be reset, since it is known that there are currently no cross-heap pointers, and the young portion of the heap can be garbage collected.
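The scan of the old heap described above might be sketched as follows. This is a simplified Python model under stated assumptions, not a real collector: the address ranges, the dictionary `old_slots` standing in for old-heap memory, and the function name `find_cross_heap_pointers` are all hypothetical, and promotion itself is left out.

```python
CARD_SIZE = 512                      # bytes of old heap per card (assumed)
OLD_BASE, OLD_TOP = 0, 4096          # hypothetical old-heap address range
YOUNG_BASE, YOUNG_TOP = 4096, 8192   # hypothetical young-heap address range


def find_cross_heap_pointers(old_slots, cards):
    """Return {slot address: young target}, scanning only chunks whose card is set.

    old_slots maps an old-heap slot address to the address it references
    (or None); cards is a list of 0/1 flags, one per CARD_SIZE bytes.
    """
    found = {}
    for index, flag in enumerate(cards):
        if not flag:
            continue  # clean card: this chunk of the old heap is skipped entirely
        start = OLD_BASE + index * CARD_SIZE
        for slot in range(start, start + CARD_SIZE):
            target = old_slots.get(slot)
            # A set card is only a hint: the slot may since have been nulled,
            # or may hold a reference that stays within the old heap.
            if target is not None and YOUNG_BASE <= target < YOUNG_TOP:
                found[slot] = target
    return found


# Three reference stores have set cards 0, 1 and 2; the other five cards are clean.
cards = [1, 1, 1, 0, 0, 0, 0, 0]
old_slots = {100: 4200, 600: 700, 1100: None}

cross = find_cross_heap_pointers(old_slots, cards)
# Once the objects named in `cross` have been promoted, the table can be reset.
cards = [0] * len(cards)
```

Here only the slot at address 100 holds a cross-heap pointer; the slot at 600 refers to another old-heap address and the slot at 1100 has been nulled, so both set cards turn out to be false alarms.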
Another (related) use of card tables is in mostly concurrent garbage collectors. Conventional GC strategies typically involve stopping all threads to determine references between objects, which has a significant impact on system performance. A mostly concurrent garbage collector effectively stops one thread at a time to look at its memory usage. However, it is more difficult in this situation to determine the overall picture of references, because while one thread is stopped, other threads are potentially updating references. The card table can be used to track these updated references, effectively identifying those portions of the heap that must be double-checked before GC can be finalised.
The prior art contains many documents concerning the use of card tables in generational or mostly concurrent garbage collectors, see for example: U.S. Pat. Nos. 5,953,736, 5,845,298, 6,098,089, 6,173,294, 6,185,581, 6,249,793; “A Generational Mostly-concurrent Garbage Collector” by Tony Printezis and David Detlefs, presented at the International Symposium on Memory Management, Oct. 15-16, 2000, Minnesota, USA, (SIGPLAN Not. (USA), Vol. 36/1, January 2001, p143-154), and “Parallel Garbage Collection for Shared Memory Multiprocessors” by Flood, Detlefs, Shavit, Zhang, presented at USENIX Java Virtual Machine Research and Technology Symposium, Apr. 23-24, 2001, California, USA.
Another situation in which card tables are used is in relation to the IBM product: CICS Transaction Server for z/OS Version 2 Release 1. This incorporates a Java VM that is specially designed to be reusable for running successive transactions on the same VM. One of the ways in which this is implemented is by splitting the heap into two components, a transient portion and a persistent component. The former contains objects specific to a particular transaction, and is deleted at the end of the transaction (known as reset); the latter contains middleware objects that effectively provide the transaction processing environment, and so survive from one transaction to another.
To be able to delete the transient heap at the end of a transaction, it is necessary to ensure that there are no cross-heap pointers (from the persistent heap to the transient heap). Again, a card table is used to track pointer updates to the heap. At reset, only those portions of the persistent heap whose corresponding card has been marked need to be checked for cross-heap pointers. Further details about the use of a card table for a reusable VM can be found in: “A Serially Reusable Java Virtual Machine Implementation for High Volume, Highly Reliable Transaction Processing”, IBM Technical Report TR 29.3406, available from the location tlg/tr.nsf/TRbyNumber at http://wwwidd.raleigh.ibm.com/.
In all instances where a card table is used, the setting of a card is performed by a write barrier—i.e. a piece of code that is invoked whenever a reference is written to the heap. It is important that write barriers are extremely efficient pieces of code, since they can be called many times. It is therefore generally desirable to optimize the work of the write barrier for marking the card as much as possible.
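To make the cost concern concrete, a write barrier can be sketched as a reference store followed by a single card store. The Python below is only a model under stated assumptions: the heap base, heap size, the dictionary `heap`, and the function name `write_reference` are hypothetical. It also assumes a power-of-two segment size, so that the division by the segment size can be done as a right shift, and a heap offset to card mapping of the simple subtract-and-divide form.

```python
CARD_SHIFT = 9        # 2**9 = 512-byte segments (assumed power of two)
HEAP_BASE = 0x10000   # hypothetical heap base address
HEAP_SIZE = 0x4000    # hypothetical 16 KiB heap

heap = {}                                        # slot address -> stored reference
card_table = bytearray(HEAP_SIZE >> CARD_SHIFT)  # one flag byte per segment


def write_reference(slot_address: int, reference: int) -> None:
    """Store a reference into a heap slot, then run the write barrier."""
    heap[slot_address] = reference
    # The barrier itself: one subtract, one shift, one unconditional store.
    # The old card value is not tested, so the barrier path has no branch.
    card_table[(slot_address - HEAP_BASE) >> CARD_SHIFT] = 1


write_reference(0x10204, 0x12000)  # offset 0x204 = 516 falls in card 1
```

Because the barrier runs on every reference store, each extra instruction on this path is multiplied by the number of stores the program performs.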
One piece of processing that the write barrier must perform is to map from the heap address that is being updated to the corresponding card location, so that the correct card can be set. Thus a traditional card marking scheme works by mapping areas of memory (segments) of a defined address range, the heap, to cards within the card table, where a card represents a segment within the heap. Typically this mapping can be performed efficiently by calculating an index into the card table: the offset of the address into the heap is determined (i.e. by subtracting the base address of the heap), and that result is then divided by the size of the chunk of memory (segment size) corresponding to a single card.
More particularly, for a heap of a given size, the number of cards T required in the card table can be calculated as follows: T = (heap top − heap base) / segment size. The index C of the card that represents a given address X can then easily be calculated as follows: C = (X − address(heap base)) / segment size. The reverse algorithm, which maps a card index to the heap address of a segment, is: X = (C * segment size) + address(heap base).
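The three formulas can be checked with a short worked example. The Python below is a direct transcription using integer division; the function names and the heap placed at 0x40000000 are hypothetical values chosen for illustration.

```python
def card_count(heap_base: int, heap_top: int, segment_size: int) -> int:
    """T = (heap top - heap base) / segment size."""
    return (heap_top - heap_base) // segment_size


def card_index(x: int, heap_base: int, segment_size: int) -> int:
    """C = (X - heap base) / segment size, using integer division."""
    return (x - heap_base) // segment_size


def segment_address(c: int, heap_base: int, segment_size: int) -> int:
    """X = (C * segment size) + heap base."""
    return c * segment_size + heap_base


# Hypothetical values: a 1 MiB heap at 0x40000000 with 512-byte segments.
BASE, TOP, SEG = 0x40000000, 0x40100000, 512

t = card_count(BASE, TOP, SEG)         # 1048576 / 512 = 2048 cards
c = card_index(0x40000A10, BASE, SEG)  # offset 0xA10 = 2576 -> card 5
x = segment_address(c, BASE, SEG)      # 5 * 512 = 2560 -> 0x40000A00
```

Note that the reverse mapping returns the first address of the segment (0x40000A00), not the address that was originally marked (0x40000A10), since the division discards the position within the segment.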
However, the heap does not represent the full extent of memory. In some situations, described in more detail below, a heap updating program might actually update either a stack or a heap, without knowing which it is updating. An update to the stack is to an address outside of the heap. If the system tries to map an address outside the heap into the card table, the index calculated using the above formulae will be outside the card table. Thus the above algorithm only works provided “X” is guaranteed to be within the heap (i.e. heap base <= X < heap top). If the algorithm is applied to an address outside the heap, it will calculate a card index outside the bounds of the card table, and storage violations or addressing exceptions will result.
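The failure can be seen by applying the index calculation, unchecked, to addresses on either side of the heap. In this Python sketch the heap limits and the two stack addresses are hypothetical, and the heap top is assumed to be an exclusive upper limit. The example deliberately computes the index without using it, because a negative index into a Python sequence wraps around silently, whereas a VM written in a language such as C would read or write memory outside the table.

```python
HEAP_BASE, HEAP_TOP, SEGMENT_SIZE = 0x40000000, 0x40100000, 512  # hypothetical
CARD_COUNT = (HEAP_TOP - HEAP_BASE) // SEGMENT_SIZE              # 2048 cards


def unchecked_card_index(x: int) -> int:
    """Apply C = (X - heap base) / segment size with no range check."""
    return (x - HEAP_BASE) // SEGMENT_SIZE


def in_card_table(index: int) -> bool:
    """True only for indices that name a real card."""
    return 0 <= index < CARD_COUNT


heap_address = 0x40000A10  # inside the heap
stack_above = 0x7FFF0000   # hypothetical stack address above the heap
stack_below = 0x00100000   # hypothetical stack address below the heap

for address in (heap_address, stack_above, stack_below):
    index = unchecked_card_index(address)
    print(hex(address), index, in_card_table(index))
```

The address above the heap yields an index far beyond the last card, and the address below the heap yields a negative index; neither names a card.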
In situations where it is not possible to guarantee that the address X is always within a defined address range of the heap, the problem could be solved in a number of ways. One possibility is to make the card table big enough to map the whole addressing range and then check, for example, only those portions of the card table that correspond to the heap (and possibly the stack) for possible updates. Although this is simple in principle to implement, and certainly feasible on a 32 bit system, it becomes problematic on a 64 bit system because of the size of card table required to reflect the increased address space. An alternative approach is to add range checks to the algorithm, to ensure that an address is within the heap before calculating the card index. Again this is relatively straightforward to implement (provided the heap limits are known). However, it has the significant drawback of increasing the path length of the write barrier, and therefore has a detrimental effect on overall performance. Because administration of a heap, and especially garbage collection of a heap, is so important to system performance, it is important to find improved administration algorithms.
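Both alternatives can be put in rough numbers. The Python sketch below is illustrative only: it assumes 512-byte segments (the example figure used earlier in the text), one card per segment, and a heap top that is an exclusive limit; the function names are hypothetical.

```python
SEGMENT_SIZE = 512  # bytes per card (assumed)


def cards_for_address_space(address_bits: int) -> int:
    """Cards needed to cover an entire address space of the given width."""
    return (1 << address_bits) // SEGMENT_SIZE


def checked_mark(cards: bytearray, x: int, heap_base: int, heap_top: int) -> bool:
    """Range-checked barrier: mark the card only if X lies inside the heap.

    Returns True if a card was marked. The two comparisons are the extra
    path length incurred on every reference store.
    """
    if x < heap_base or x >= heap_top:
        return False  # e.g. a store to the stack: nothing to record
    cards[(x - heap_base) // SEGMENT_SIZE] = 1
    return True


cards_32 = cards_for_address_space(32)  # 2**23 cards
cards_64 = cards_for_address_space(64)  # 2**55 cards

cards = bytearray(8)                                    # covers a 4 KiB heap at 0x1000
marked = checked_mark(cards, 0x1200, 0x1000, 0x2000)    # inside the heap: card 1 set
skipped = checked_mark(cards, 0x3000, 0x1000, 0x2000)   # outside the heap: ignored
```

At one byte per card, a table covering a full 32 bit address space would occupy 8 MiB, whereas covering a full 64 bit address space would need 2**55 bytes (32 PiB). The range-checked version keeps the table small, at the price of two comparisons and a branch on every store.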