1. Field of the Invention
The present invention generally relates to performing a memory leak analysis in a computing system. More particularly, the present invention relates to a method, system, computer program product, and computer program storage device for performing a memory leak analysis inside a virtual machine by utilizing a depth first search (DFS) algorithm.
2. Description of the Prior Art
Many computer programs suffer from memory leaks, in which the amount of memory used by a program increases over time because of a bug. In the C language, a memory leak is created by losing the pointer to a section of allocated memory. Java and other managed runtime environments (e.g., .NET) do not suffer from this class of memory leak (i.e., a memory leak caused by losing a pointer), because they use automatic garbage collection. However, in Java and .NET, it is still possible for a programmer to write a program that creates objects, uses them, and then fails to make them available for garbage collection. For example, a memory leak in Java or .NET is usually caused by a collection class that holds a large number of objects. The programmer mistakenly fails to remove objects from the collection class when the objects are no longer needed, which usually results in a very large collection class. There are currently two main ways to analyze this kind of memory leak.
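The collection-based leak pattern described above can be illustrated with a minimal sketch. The class LeakyRegistry and its methods are hypothetical names introduced only for this illustration; they are not taken from any program or tool named herein.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class illustrating the leak pattern; all names are assumptions.
class LeakyRegistry {
    // A long-lived collection: every object it references stays reachable,
    // so the garbage collector cannot reclaim it.
    private final List<byte[]> buffers = new ArrayList<>();

    // Creates an object and registers it in the collection.
    byte[] acquire() {
        byte[] buffer = new byte[1024];
        buffers.add(buffer);
        return buffer;
    }

    // The mistake: the caller is done with the buffer, but it is never
    // removed from the collection, so it remains reachable.
    void releaseLeaky(byte[] buffer) {
        // intentionally empty
    }

    // The correct behavior: removing the reference makes the buffer
    // eligible for garbage collection.
    void releaseFixed(byte[] buffer) {
        buffers.remove(buffer);
    }

    int size() {
        return buffers.size();
    }

    public static void main(String[] args) {
        LeakyRegistry registry = new LeakyRegistry();
        for (int i = 0; i < 1000; i++) {
            registry.releaseLeaky(registry.acquire());
        }
        // The collection has grown by one entry per iteration.
        System.out.println(registry.size());
    }
}
```

In this sketch no pointer is ever lost; the objects stay reachable through the collection, which is why automatic garbage collection does not help.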
A first way is simply counting the number of objects of each class. The first way does not require a complex algorithm and is therefore fast. The first way may be able to identify objects that are not being cleaned up, but it does not give a user any information about the thread, method, or object that is causing a memory leak. Since the first way is fast, it can be performed at runtime, inside a runtime environment (e.g., the Java Virtual Machine). In addition, the first way can be performed outside a runtime environment, for example on a dump of a heap (i.e., a file to which the contents of a heap in main memory have been transferred). Example:
BEA® JRockit® Memory Leak Detector (http://e-docs.bea.com/jrockit/tools/usingmmleak/index.html)
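The counting approach of the first way can be sketched as follows. This is an illustration only and does not represent how the tool listed above is implemented. The class ClassHistogram is a hypothetical name, and the list passed to count is an assumed stand-in for the set of live objects, which a real tool would obtain from the virtual machine or from a heap dump.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of per-class object counting; names are assumptions.
class ClassHistogram {
    // Counts how many objects of each class appear in the given list.
    // Only one counter per class is kept, so little memory is needed.
    static Map<String, Integer> count(List<?> liveObjects) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Object object : liveObjects) {
            counts.merge(object.getClass().getName(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in for the live objects of a heap.
        List<Object> live = Arrays.asList("a", "b", 1, 2, 3, 4.0);
        System.out.println(count(live));
    }
}
```

The result shows which classes have many instances, but it carries no information about which object holds the references, which is the limitation noted above.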
A depth first search (DFS) is a method of traversing or searching a tree structure (e.g., a binary tree, heap) to reach all nodes in the tree. Consider the tree shown in FIG. 1. The DFS algorithm visits child nodes before sibling nodes, moving down the tree until it is impossible to go further. Traversal by the DFS usually starts at a root node and explores as far as possible along each branch before backtracking (i.e., retracing one step to return to a parent node and then exploring the other available paths). By performing a DFS traversal, all nodes in a tree can be searched. For example, the DFS algorithm traverses the nodes in FIG. 1 in the following order (starting from node A):
Current node: A
Get children of this node: B and C
Move down to first child: B
Get children of this node: D and E
Move down to first child: D
Get children of this node: none
Move back up to parent: B
Move down to next child: E
etc.
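The traversal steps listed above can be sketched in code. The classes TreeNode and DepthFirstDemo are hypothetical names. The sample tree contains only the nodes named in the steps above (A, B, C, D and E); because FIG. 1 is not reproduced here, node C is assumed to have no children in this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical tree node; names are assumptions for illustration.
class TreeNode {
    final String name;
    final List<TreeNode> children;

    TreeNode(String name, TreeNode... children) {
        this.name = name;
        this.children = Arrays.asList(children);
    }
}

class DepthFirstDemo {
    // Visits the node, then moves down into each child in turn.
    // Returning from the recursive call is the "move back up to parent" step.
    static void visit(TreeNode node, List<String> order) {
        order.add(node.name);
        for (TreeNode child : node.children) {
            visit(child, order);
        }
    }

    static List<String> traverse(TreeNode root) {
        List<String> order = new ArrayList<>();
        visit(root, order);
        return order;
    }

    // Builds the assumed tree: A -> (B, C), B -> (D, E).
    static TreeNode sampleTree() {
        TreeNode b = new TreeNode("B", new TreeNode("D"), new TreeNode("E"));
        return new TreeNode("A", b, new TreeNode("C"));
    }

    public static void main(String[] args) {
        System.out.println(traverse(sampleTree())); // prints [A, B, D, E, C]
    }
}
```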
A second way is walking (i.e., searching) a heap (i.e., the region of main memory from which a program's objects are dynamically allocated, as distinct from the stack, a smaller region that holds the local state of the methods currently executing) to ascertain which objects reference which other objects, and hence how much memory each object references. The second way is performed by means of a depth first search (DFS). A number of tools utilize the second way: HeapRoots (http://www.alphaworks.ibm.com/tech/heaproots), HeapAnalyzer (http://www.alphaworks.ibm.com/tech/heapanalyzer), and FindRoots (http://www-1.ibm.com/support/docview.wss?rs=3182&context=SSSTCZ&dc=D400&uid=swg24009436&loc=en_US&cs=UTF-8&lang=en). The second way (i.e., a method utilizing the DFS) has the advantage of providing more useful information than the first way. Specifically, the second way provides a chain of objects that are related to a memory leak and the amount of memory referenced. This information is not provided by the first way. However, the second way has the disadvantage of using very large amounts of memory, because it involves maintaining state information for each object (e.g., the amount of memory referenced by each object) while traversing the heap. Therefore, the second way can only be used outside a runtime environment (e.g., an executing virtual machine, which provides software services for processes or programs while a computer is running), acting on a dump of a heap.
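The heap walk of the second way can be sketched as follows. This is a simplified illustration and does not represent the implementation of any tool named above. The classes HeapObject and HeapWalk, the field names, and the object sizes are all assumptions. The visited set is an example of the per-object state information that makes this approach memory intensive, since it grows with the number of objects in the heap.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical model of an object in a heap dump; names are assumptions.
class HeapObject {
    final String label;
    final long sizeBytes;
    final List<HeapObject> references = new ArrayList<>();

    HeapObject(String label, long sizeBytes) {
        this.label = label;
        this.sizeBytes = sizeBytes;
    }
}

class HeapWalk {
    // Returns the total size of all objects reachable from the given root.
    static long reachableBytes(HeapObject root) {
        // One entry per visited object: this state grows with the heap.
        Set<HeapObject> visited =
                Collections.newSetFromMap(new IdentityHashMap<HeapObject, Boolean>());
        return walk(root, visited);
    }

    // Depth first search over references; already visited objects are
    // skipped so that shared objects and cycles are counted only once.
    private static long walk(HeapObject object, Set<HeapObject> visited) {
        if (!visited.add(object)) {
            return 0;
        }
        long total = object.sizeBytes;
        for (HeapObject referenced : object.references) {
            total += walk(referenced, visited);
        }
        return total;
    }

    public static void main(String[] args) {
        HeapObject cache = new HeapObject("cache", 16);
        HeapObject first = new HeapObject("entry1", 100);
        HeapObject second = new HeapObject("entry2", 200);
        cache.references.add(first);
        cache.references.add(second);
        second.references.add(cache); // a cycle back to the collection
        System.out.println(reachableBytes(cache)); // prints 316
    }
}
```

A recursive walk is used here for brevity; on a very deep object graph an explicit stack would be needed to avoid exhausting the call stack.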
Thus, it is highly desirable to provide a memory leak analysis that uses a small amount of memory, provides useful information (e.g., a chain of objects, an amount of memory referenced), and is executable inside a runtime environment.