As defined by Microsoft® Computer Dictionary, Fourth Edition, Microsoft Press (1999), the heap is a portion of memory in a computer that is reserved for a program to use for the temporary storage of data structures whose existence or size cannot be determined until the program is running. To build and use such elements, programming languages such as C and Pascal include functions and procedures for requesting free memory from the heap, accessing it, and freeing it when it is no longer needed. In contrast to stack memory, heap memory blocks are not freed in reverse of the order in which they were allocated, so free blocks may be interspersed with blocks that are in use. As the program continues running, the blocks may have to be moved around so that small free blocks can be merged together into larger ones to meet the program's needs.
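As a minimal illustration of the request, access, and free cycle described above, the following C sketch allocates three heap blocks and releases the middle one first, which a stack discipline would not permit. The function name `free_out_of_order` and the block sizes are hypothetical, chosen only for this example. Whether the three blocks are adjacent in memory depends on the allocator and is not assumed here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocates three heap blocks, frees the middle one
 * first, and checks that the other two are still usable.
 * Returns 0 on success, -1 if an allocation fails. */
int free_out_of_order(void)
{
    char *a = malloc(16);
    char *b = malloc(16);
    char *c = malloc(16);
    if (a == NULL || b == NULL || c == NULL) {
        free(a);                 /* free(NULL) is a no-op */
        free(b);
        free(c);
        return -1;
    }

    /* Access the blocks through the pointers returned by malloc. */
    strcpy(a, "first");
    strcpy(b, "second");
    strcpy(c, "third");

    /* Release the middle block first: not the reverse of allocation order. */
    free(b);

    /* a and c remain in use and keep their contents. */
    assert(strcmp(a, "first") == 0);
    assert(strcmp(c, "third") == 0);

    free(c);
    free(a);
    return 0;
}
```

After `free(b)` the heap holds a free block whose lifetime is unrelated to those of the blocks still in use, which is the interleaving of free and in-use blocks that the definition refers to.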
Modern software packages allocate and manage a vast amount of information on the heap. Object-oriented languages such as Java and C# almost exclusively use the heap to represent and manipulate complex data structures. The growing importance of the heap necessitates detection and elimination of heap-based bugs. These bugs manifest themselves in different forms, such as dangling pointers, memory leaks, and inconsistent data structures.
Unfortunately, heap-based bugs are hard to detect. Their effect is often delayed, and may become apparent only after significant damage has been done to the heap. In some cases, the effect of the bug may never become apparent at all. For instance, a dangling pointer bug does not crash the program unless the pointer in question is dereferenced, and on occasion may not cause a crash even then. Consequently, software testing is not very effective at identifying heap-based bugs. Because these bugs behave non-deterministically, executing the buggy statement on a test run is not guaranteed to crash the program or to produce unexpected results. Moreover, because the effect is delayed, testing does not reveal the root cause of the bug.
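The dangling pointer case can be made concrete with a small C sketch. The function `latent_dangling_pointer` below is hypothetical and written only for illustration: it leaves a second pointer referring to a freed block, yet behaves correctly on every run because that pointer is never used again.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical example of a latent dangling-pointer bug.
 * Returns the stored value (42), or -1 if the allocation fails. */
int latent_dangling_pointer(void)
{
    int *owner = malloc(sizeof *owner);
    if (owner == NULL)
        return -1;

    *owner = 42;
    int *alias = owner;       /* second reference to the same block */
    assert(alias == owner);   /* both names refer to one heap block */

    int value = *alias;       /* valid: the block is still allocated */

    free(owner);              /* from here on, alias dangles */

    /* BUG (latent): alias is not reset and still holds the old address.
     * This function never reads it again, so nothing goes wrong here.
     * Any later dereference of alias would be undefined behavior. */
    return value;
}
```

A test that calls this function observes the expected result and no crash, so the defect stays hidden until some later change dereferences `alias` after the call to `free`.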
Static analysis techniques, such as shape analysis (see, e.g., M. Sagiv, T. W. Reps, and R. Wilhelm, “Parametric Shape Analysis Via 3-Valued Logic,” ACM Trans. Prog. Lang. Syst. (TOPLAS), 24(3):217-298, May 2002), overcome these limitations. They examine all valid code paths, and can also provide soundness guarantees about the results of the analysis. Shape analysis has enjoyed success at determining the correctness of, or finding bugs in, algorithms that manipulate heap data structures. However, in spite of recent advances (such as described by B. Hackett and R. Rugina, “Region-Based Shape Analysis With Tracked Locations,” Proc. 32nd Symp. on Princ. of Prog. Lang. (POPL), January 2005; and E. Yahav and G. Ramalingam, “Verifying Safety Properties Using Separation And Heterogeneous Abstractions,” Proc. ACM SIGPLAN Conf. On Prog. Lang. Design and Impl., pages 25-34, June 2004), shape analysis algorithms are expensive, and apply only to limited classes of data structures and of properties to be checked on them. Moreover, the results of static analysis, while sound, are often overly conservative, and over-approximate the possible set of heap configurations.
On the other hand, dynamic analysis techniques have the advantage of precisely capturing the set of heap configurations that arise. Several dynamic analysis tools have been developed to detect special classes of heap-based bugs. (See, e.g., T. M. Chilimbi and M. Hauswirth, “Low-Overhead Memory Leak Detection Using Adaptive Statistical Profiling,” Proc. 11th Intl. Conf. on Arch. Support for Prog. Lang. and Op. Sys. (ASPLOS), pages 156-164, October 2004; B. Demsky and M. Rinard, “Automatic Detection And Repair Of Errors In Data Structures,” Proc. 18th ACM SIGPLAN Conf. on Object-Oriented Prog., Systems, Lang. and Appls. (OOPSLA), pages 78-95, October 2003; R. Hastings and B. Joyce, “Purify: Fast Detection Of Memory Leaks And Access Errors,” Winter USENIX Conference, pages 125-136, January 1992; and N. Nethercote and J. Seward, “Valgrind: A Program Supervision Framework,” Elec. Notes in Theor. Comp. Sci. (ENTCS), 89(2), 2003.) However, there has been relatively little research on understanding the runtime behavior of the heap, and on applying this information to bug finding.
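To give a sense of what observing the heap at run time can mean in its simplest form, the following C sketch wraps the allocator with a counter of outstanding blocks. It is a toy written for this illustration and does not represent how any of the cited tools work. All names (`tracked_malloc`, `tracked_free`, `tracked_live_blocks`, `forgetful_workload`, `count_outstanding_after_workload`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Blocks handed out by tracked_malloc and not yet passed to tracked_free. */
static size_t live_blocks = 0;

void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_blocks++;
    return p;
}

void tracked_free(void *p)
{
    if (p == NULL)
        return;
    assert(live_blocks > 0);   /* more frees than allocations is a bug */
    live_blocks--;
    free(p);
}

size_t tracked_live_blocks(void)
{
    return live_blocks;
}

/* Hypothetical workload: allocates two blocks but releases only one.
 * Returns the block it failed to release so the demo can clean up. */
void *forgetful_workload(void)
{
    void *released = tracked_malloc(32);
    void *forgotten = tracked_malloc(64);
    tracked_free(released);
    return forgotten;
}

/* Runs the workload and reports how many blocks it left outstanding. */
size_t count_outstanding_after_workload(void)
{
    size_t before = tracked_live_blocks();
    void *leftover = forgetful_workload();
    size_t outstanding = tracked_live_blocks() - before;
    tracked_free(leftover);    /* keep the demo itself leak-free */
    return outstanding;
}
```

Assuming the allocations succeed, the counter reports one outstanding block after the workload, a fact about this particular execution rather than an approximation over all possible executions. A counter of this kind says nothing about which block leaked or why.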