1. Field of the Invention
The present invention relates to techniques for detecting memory leaks in computer systems. More specifically, the present invention relates to a method and apparatus for detecting memory leaks by observing the historical memory usage of processes in a computer system.
2. Related Art
Memory leaks can adversely affect computer system performance and can significantly decrease software reliability. Moreover, memory leaks can remain in programs despite extensive testing during the development phase and despite the use of garbage collection techniques during runtime. Long-running programs with memory leaks and programs that allocate memory extensively can consume enough resources in a computer system to seriously hinder performance or, even worse, to cause application or system crashes. This problem is more acute in a multi-user environment, where a large number of users can be affected by a single application or a single process which has a memory leak. If applications or processes with memory leaks are detected in advance, preventive actions can be taken to avoid serious problems affecting many users.
Many programming languages provide mechanisms for explicit dynamic allocation and deallocation of memory during program execution. Once a dynamically allocated object is no longer needed, the memory consumed by the object should be explicitly released. Failure to release the memory consumed by the object can cause a memory leak.
Memory leaks are common in programming languages such as C and C++, which rely heavily on pointer arithmetic and which do not provide a garbage-collection mechanism.
However, garbage collection does not necessarily prevent memory leaks from occurring. Note that a garbage collection mechanism cleans up objects only if there are no references to the objects. Consequently, memory leaks can occur in a garbage-collected system if applications continually generate a large number of referenced objects which eventually become unused but remain referenced.
A memory leak causes the computer system as a whole, not merely the erroneous process, to use an ever-growing amount of memory. Eventually, much (or all) of the available memory will be allocated (and not freed), thereby causing the entire system to become severely degraded or to crash.
System administrators typically do not get a warning that there is a problem until 95% to 98% of the available memory has been used up. At this point, the system administrator typically identifies the processes consuming the largest amounts of memory, and then terminates these processes in an effort to prevent a system crash. However, the terminated processes may not be the ones that actually have a memory leak; they may simply be processes that use a lot of memory. Moreover, well before the system administrator starts taking remedial actions, individual user processes may request more memory than is available, which can cause processes to swap to disk, thereby greatly decreasing the performance of those processes.
Tools are available for debugging programs and for detecting memory leaks when the source code is available. However, these tools cannot be used when the source code is not available; for example, when third-party and off-the-shelf software is used.
Another technique to detect memory leaks involves detecting gradual resource exhaustion in computer systems. This technique uses time-series analysis to detect trends in resource usage and to estimate the time until resource exhaustion. Preventive actions, such as software rejuvenation operations, can be taken to avoid any impending failure. The drawback of this technique is that it does not pinpoint the offending process, and hence, the entire system may have to be rebooted. Furthermore, this technique provides no feedback to facilitate root-cause analysis. Another drawback is that subtle memory leaks cannot be detected when the memory usage is large and “noisy,” which commonly occurs in multi-user server systems.
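As an illustration of the trend-based approach described above, the following Python fragment fits a straight line to system-wide memory-usage samples and extrapolates to the point at which usage would reach capacity. It is a sketch only: the ordinary least-squares model, the function names fit_line and time_to_exhaustion, and the sample values are all assumptions, and the technique described above is not limited to any particular time-series model.

```python
def fit_line(times, usage):
    """Ordinary least-squares fit of usage = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_u = sum(usage) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (u - mean_u) for t, u in zip(times, usage))
    slope = sxy / sxx
    intercept = mean_u - slope * mean_t
    return slope, intercept


def time_to_exhaustion(times, usage, capacity):
    """Estimate the time remaining after the last sample until usage
    reaches capacity, or return None if usage is not trending upward."""
    slope, intercept = fit_line(times, usage)
    if slope <= 0:
        return None  # no upward trend, so no exhaustion is predicted
    t_full = (capacity - intercept) / slope
    return max(0.0, t_full - times[-1])


# Hypothetical samples: time in hours, system-wide memory usage in MB.
times = [0, 1, 2, 3, 4]
usage = [100, 110, 120, 130, 140]

remaining = time_to_exhaustion(times, usage, capacity=200)
flat = time_to_exhaustion(times, [100, 100, 100, 100, 100], capacity=200)
```

Because the estimate is computed from aggregate usage, it indicates that memory is being exhausted but not which process is responsible, and a small per-process trend can be masked when the aggregate signal is large and noisy.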
Hence, what is needed is a method and an apparatus for detecting the onset of memory leaks in computer systems without the problems described above.