Most programming languages and run-time systems support dynamic memory allocation and reclamation. In object-oriented languages, memory can be reserved and released on a per-object basis, e.g., through object allocation and reclamation. In some languages, for example, C++, the memory occupied by an object is freed explicitly, by calling a special system function. Other object-oriented languages, e.g., Java, feature so-called automatic memory management, in which memory occupied by objects that are no longer in use is reclaimed automatically by a run-time subsystem called a garbage collector. In Java, an object is considered unused and available for reclamation if it is not reachable, directly or transitively, from any object graph root. These roots (omitting some second-order, implementation-specific details) are stack frames, i.e., object-type local variables of currently executing methods, and object-type static variables of currently loaded classes.
A memory leak in a program written in a language with manual memory management, such as C++, is a well-known problem that occurs when the program does not explicitly free some objects or raw memory areas that it previously reserved. If, in the course of program execution, allocations without reclamation repeat over and over again, they may ultimately exhaust all the available memory, causing the program to crash.
A language such as Java, which features automatic memory management, is in theory designed to avoid exactly this kind of problem. Thus memory leaks in the “C++ sense” are not possible in Java, since every object that is not reachable will sooner or later be reclaimed automatically. However, another kind of memory leak is still possible in Java. Such leaks happen when an object remains reachable but is not used anymore, i.e., the program no longer reads or writes its data fields. For example, a program may allocate a temporary object, attach it to some permanent, automatically growable data structure (such as an instance of java.util.Vector), use this object for some time, and then (logically) discard it. However, the object remains attached to the permanent data structure and, though not used, cannot be reclaimed by the garbage collector (GC). Over time, a large number of such unused objects can exhaust the memory available to the program, causing it to stop.
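The pattern described above can be illustrated with a minimal sketch. The class name VectorLeakSketch, the field REGISTRY, and the methods processLeaky, processFixed, and retainedCount are hypothetical names chosen for illustration only; the byte array stands in for any temporary object.

```java
import java.util.Vector;

public class VectorLeakSketch {
    // Hypothetical "permanent" structure: referenced from a static field,
    // so everything attached to it stays reachable from a GC root.
    private static final Vector<byte[]> REGISTRY = new Vector<>();

    // Leaky variant: the temporary buffer is attached and never detached.
    public static int processLeaky(int size) {
        byte[] temp = new byte[size];
        REGISTRY.add(temp);
        int result = temp.length;   // stand-in for real work on temp
        return result;              // temp is logically discarded, yet still reachable
    }

    // One possible fix: detach the temporary object once it is no longer needed.
    public static int processFixed(int size) {
        byte[] temp = new byte[size];
        REGISTRY.add(temp);
        try {
            return temp.length;     // stand-in for real work on temp
        } finally {
            REGISTRY.remove(temp);  // arrays compare by identity, so this removes temp itself
        }
    }

    // Number of objects currently held by the permanent structure.
    public static int retainedCount() {
        return REGISTRY.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            processLeaky(1024);
        }
        System.out.println("retained after leaky calls: " + retainedCount());
        for (int i = 0; i < 1000; i++) {
            processFixed(1024);
        }
        System.out.println("retained after fixed calls: " + retainedCount());
    }
}
```

In this sketch the retained count grows by one on every call to processLeaky and stays flat across calls to processFixed, which is the difference between a reachable-but-unused object and one that becomes eligible for reclamation.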
A more subtle kind of memory leak occurs when a data structure is designed poorly and keeps growing without bound when it should not. A classic example is a persistent object cache that is not flushed properly. Strictly speaking, objects in such a cache are not unused, since the program can request any of them at any moment. However, if the cache does not evict some objects periodically, it may ultimately grow too large, again exhausting the memory available to the program.
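By way of illustration only, the following sketch contrasts a cache that never evicts with one that does. The least-recently-used size bound shown here is just one possible eviction policy, chosen as an assumption for the example; the names CacheGrowthSketch, boundedLru, fillUnbounded, and fillBounded are hypothetical. The bound relies on the standard removeEldestEntry hook of java.util.LinkedHashMap.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class CacheGrowthSketch {
    // A cache with an assumed LRU policy: once maxEntries is exceeded,
    // the least recently accessed entry is evicted on insertion.
    public static <K, V> Map<K, V> boundedLru(final int maxEntries) {
        // Third constructor argument 'true' selects access order (needed for LRU).
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Cache without eviction: every inserted entry stays reachable.
    public static int fillUnbounded(int n) {
        Map<Integer, String> cache = new HashMap<>();
        for (int i = 0; i < n; i++) {
            cache.put(i, "value-" + i);
        }
        return cache.size();
    }

    // Cache with eviction: size never exceeds maxEntries.
    public static int fillBounded(int n, int maxEntries) {
        Map<Integer, String> cache = boundedLru(maxEntries);
        for (int i = 0; i < n; i++) {
            cache.put(i, "value-" + i);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("unbounded size: " + fillUnbounded(100000));
        System.out.println("bounded size:   " + fillBounded(100000, 1000));
    }
}
```

The unbounded variant holds as many entries as were ever inserted, whereas the bounded variant levels off at its configured limit. Whether a size bound, a time-based expiry, or some other policy is appropriate depends on the application.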
In light of the foregoing, it is desirable to provide a method for identifying memory leaks occurring in an object-oriented program. More specifically, a programmer looking for a memory leak typically needs to (a) identify the particular objects that are leaking, and (b) find out why they are leaking, i.e., what other objects reference the leaking one(s) and thus prevent them from being reclaimed by the garbage collector.
In U.S. patent application Ser. No. 10/893,069, a method that identifies particular objects that are leaking is provided. A tool that uses this method can “pinpoint” objects that are likely leaking, giving the programmer, for example, their addresses in memory, their contents, the locations in the program where they were allocated, and so on.
However, problem (b) above is not addressed. Existing tools typically provide a partial solution to this problem through the so-called “heap dump” feature. One skilled in the art will appreciate that the contents of the entire object heap of the application in question can be dumped and analyzed through this feature. By analyzing a heap dump, the programmer can identify chains of references from garbage collector roots to leaking objects. This information is sometimes sufficient to determine the root cause of the leak.
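The kind of analysis a heap dump enables can be sketched on a toy model. The example below assumes the dump has already been reduced to a map from each object (identified by a label) to the objects it references; this representation, the class ReferenceChainSketch, and the method chainFromRoots are hypothetical and do not correspond to any particular tool or dump format. A breadth-first search from the roots yields one shortest chain of references to the object of interest.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

public class ReferenceChainSketch {
    // Returns one shortest chain root -> ... -> target, or an empty list
    // if the target is not reachable from any of the given roots.
    public static List<String> chainFromRoots(Map<String, List<String>> refs,
                                              Collection<String> roots,
                                              String target) {
        Map<String, String> parent = new HashMap<>(); // visited node -> node that referenced it
        Deque<String> queue = new ArrayDeque<>();
        for (String root : roots) {
            if (!parent.containsKey(root)) {
                parent.put(root, null);               // roots have no referrer
                queue.add(root);
            }
        }
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(target)) {
                LinkedList<String> chain = new LinkedList<>();
                for (String n = current; n != null; n = parent.get(n)) {
                    chain.addFirst(n);                // walk back toward the root
                }
                return chain;
            }
            List<String> outgoing = refs.get(current);
            if (outgoing == null) {
                continue;                             // object holds no references
            }
            for (String next : outgoing) {
                if (!parent.containsKey(next)) {
                    parent.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("static Registry.items", Arrays.asList("Vector@1"));
        refs.put("Vector@1", Arrays.asList("Object[]@2"));
        refs.put("Object[]@2", Arrays.asList("Temp@3", "Temp@4"));
        System.out.println(chainFromRoots(
                refs, Arrays.asList("static Registry.items"), "Temp@4"));
    }
}
```

Such a chain tells the programmer which objects hold the leaking object in memory, but not which program actions created those references.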
In certain situations, however, knowing just which objects hold a leaking object in memory is still insufficient. This is typically the case when these other objects are created and managed by third-party libraries or other code with which the programmer is unfamiliar. In that case, knowing the types, contents, etc., of the objects responsible for a leak may not be of much use to the programmer. What the programmer needs to understand is what actions (e.g., function calls) in the program resulted in a leaking object being attached to certain data structures that prevent it from being garbage collected. Described below are a method and a system that implement a scheme to identify memory leaks occurring in an object-oriented program.