1. Field of the Invention
The present invention generally relates to computer programming and, more particularly, to a method and variant of the method for a compiler (either static or dynamic), programming development environment or tool, or programmer to enable transformation of a program or part of a program so as to reduce the overhead of allocating and deallocating objects, while strictly preserving the exact semantics of the program or parts of the program that existed before the transformation.
2. Background Description
Programming languages such as the Java™, C, and C++ programming languages support heap allocation for data that is dynamically created at arbitrary points during program execution (see D. Gay and A. Aiken, "Memory Management with Explicit Regions", Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, Montreal, Canada, June 1998). Variables representing data allocated in a program shall be referred to as objects, and variables that point to the objects will be referred to as pointers. In some languages, like the C and C++ programming languages, the programmer is also given explicit control over deallocation of objects, using a free statement. However, the programmer must take care not to deallocate an object while any variable referenced after the deallocation point still points to it. Otherwise, the dangling pointer to an object that has already been deallocated can lead to an error during program execution. To free programmers from the burden of determining when it is safe to deallocate objects, some languages like Java™ support garbage collection, where the run-time system assumes responsibility for determining when the storage for an object can safely be reclaimed. (Java is a trademark of Sun Microsystems, Inc.)
Every algorithm for garbage collection leads to some overhead being incurred at run-time while identifying the objects whose storage can be reclaimed. A compiler can bypass garbage collection of objects with known lifetimes. In particular, if the lifetime of an object is bounded by the lifetime of the stack frame associated with a procedure, the object can be allocated on the stack rather than the heap. The storage associated with the stack is automatically reclaimed when the procedure returns.
There are two different ways of allocating storage on the stack: dynamic allocation and static allocation. Dynamic stack allocation can be done by dynamically extending the size of the stack during execution. A well-known mechanism for such dynamic extension of the stack is a function named alloca on UNIX®-based systems. Static allocation of an object on the stack involves using a fixed location within the stack frame for that object. In order to use static allocation for an object determined to be stack-allocatable, the compiler must also know the size of the object and check that one of the following two conditions holds: (i) the original heap allocation does not take place inside a loop in the given procedure, or (ii) instances of the object allocated in different loop iterations have non-overlapping lifetimes. (UNIX is a registered trademark of SCO.)
An alternate approach to stack allocation is to use a region in the heap for allocating objects whose lifetimes are bounded by the lifetime of the stack frame, and deallocating the entire region when the corresponding procedure returns. For simplicity of presentation, due to the conceptual similarity of these approaches, such a region-based approach shall be regarded as equivalent to performing stack allocation of data.
Many compilers use a representation called a call graph to analyze an entire program. A call graph has nodes representing procedures, and edges representing procedure calls. The term procedure is used to refer to subroutines, functions, and also methods in object-oriented languages. A direct procedure call, where the callee (called procedure) is known at the call site, is represented by a single edge in the call graph from the caller to the callee. A procedure call where the callee is not known, such as a virtual method call in an object-oriented language or an indirect call through a pointer, is represented by edges from the caller to each possible callee. It is also possible that, for a particular callee procedure, not all of its callers are known. In that case, the call graph conservatively includes edges from all possible callers to that callee.
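The call graph described above can be illustrated with a minimal sketch. The function name build_call_graph, the input format, and the procedure names in the example are hypothetical and chosen only for illustration. A direct call contributes one edge, and a call with an unknown callee contributes one edge per possible callee.

```python
from collections import defaultdict

def build_call_graph(direct_calls, indirect_calls, possible_targets):
    """Return {caller: set of callees}.

    direct_calls:     (caller, callee) pairs where the callee is known.
    indirect_calls:   (caller, call_key) pairs where the callee is unknown.
    possible_targets: {call_key: procedures that may be invoked}.
    """
    graph = defaultdict(set)
    for caller, callee in direct_calls:
        graph[caller].add(callee)        # single edge for a direct call
    for caller, key in indirect_calls:
        for callee in possible_targets[key]:
            graph[caller].add(callee)    # one edge per possible callee
    return dict(graph)

# Hypothetical program: render makes a virtual call to Shape.draw.
direct = [("main", "parse"), ("main", "render")]
indirect = [("render", "Shape.draw")]
targets = {"Shape.draw": ["Circle.draw", "Square.draw"]}
call_graph = build_call_graph(direct, indirect, targets)
```

In this sketch the virtual call in render yields edges to both Circle.draw and Square.draw, which is the conservative treatment described above.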
Within a procedure, many compilers use a representation called the control flow graph (CFG). Each node in a CFG represents a basic block and the edges represent the flow of control among the basic blocks. A basic block is a straight-line sequence of code that has a single entry (at the beginning) and a single exit (at the end). A statement with a procedure call does not disrupt a straight-line sequence of code. In the context of languages that support exceptions, such as the Java™ programming language, the definition of a basic block is relaxed to include statements which may throw an exception. In those cases, there is an implicit possible control flow from a statement throwing an exception to the block of code handling the exception. The basic block is not forced to end at each such statement, and instead, such a basic block bb is referred to as having a flag bb.outEdgeInMiddle set to true.
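The relaxed definition of a basic block can be sketched as follows. This is an illustrative simplification, not the representation used by any particular compiler: the statement kinds ("plain", "call", "may_throw", "branch", "label") and the helper split_into_blocks are assumptions introduced here, and the field out_edge_in_middle stands in for the bb.outEdgeInMiddle flag.

```python
from dataclasses import dataclass, field

@dataclass
class BasicBlock:
    statements: list = field(default_factory=list)
    out_edge_in_middle: bool = False    # stands in for bb.outEdgeInMiddle

def split_into_blocks(stmts):
    """stmts: list of (text, kind) pairs; kind is one of the assumed values
    'plain', 'call', 'may_throw', 'branch', 'label'."""
    blocks, current = [], BasicBlock()
    for text, kind in stmts:
        if kind == "label" and current.statements:
            blocks.append(current)      # a branch target starts a new block
            current = BasicBlock()
        current.statements.append(text)
        if kind == "may_throw":
            # Implicit edge to the handler; the block is not forced to end.
            current.out_edge_in_middle = True
        elif kind == "branch":
            blocks.append(current)      # explicit control transfer ends the block
            current = BasicBlock()
    if current.statements:
        blocks.append(current)
    return blocks

stmts = [
    ("x = new T()", "plain"),
    ("foo(x)", "call"),                 # a call does not end the block
    ("y = a[i]", "may_throw"),          # may throw, block continues
    ("if y > 0 goto L1", "branch"),
    ("L1:", "label"),
    ("return y", "plain"),
]
blocks = split_into_blocks(stmts)
```

Here the first block keeps all four statements, including the call and the statement that may throw, and only its flag records the implicit edge to the exception handler.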
A topological sort order enumeration of nodes in a graph refers to an enumeration in which, if the graph contains an edge from node x to node y, then x appears before y. If a graph has cycles, then such an enumeration is not guaranteed for nodes involved in a cycle. A reverse topological sort order lists nodes in the reverse order of a topological sort.
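As an illustration of these orderings, the following sketch uses a depth-first traversal, which is one common way to obtain a topological sort order; the function names and the example graph are hypothetical. For a graph with cycles the traversal still terminates, but, as noted above, the ordering property is not guaranteed for the nodes on a cycle.

```python
def topological_order(graph):
    """graph: {node: iterable of successors}. For an acyclic graph, x appears
    before y in the result whenever the graph has an edge from x to y."""
    visited, post_order = set(), []

    def visit(node):
        if node in visited:
            return
        visited.add(node)
        for succ in graph.get(node, ()):
            visit(succ)
        post_order.append(node)         # emitted after all of its successors

    for node in graph:
        visit(node)
    return post_order[::-1]             # reversed post-order is a topological order

def reverse_topological_order(graph):
    return topological_order(graph)[::-1]

# Hypothetical call graph: edges go from caller to callee.
graph = {"main": ["parse", "render"], "render": ["draw"], "parse": [], "draw": []}
order = topological_order(graph)
bottom_up = reverse_topological_order(graph)
```

On a call graph, the reverse topological sort order visits callees before their callers, which is the order a bottom-up interprocedural analysis needs.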
Prior art for a similar goal of reducing the overhead of heap allocation and deallocation can be found in the following papers: A. Aiken, M. Fahndrich, and R. Levien, "Better Static Memory Management: Improving Region-Based Analysis of Higher-Order Languages", Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, San Diego, Calif., June 1995; L. Birkedal, M. Tofte, and M. Vejlstrup, "From Region Inference to von Neumann Machines via Region Representation Inference", Proceedings of 23rd ACM Symposium on Principles of Programming Languages, St. Petersburg, Fla., January 1996; B. Blanchet, "Escape Analysis: Correctness, Proof, Implementation and Experimental Results", Proceedings of 25th ACM Symposium on Principles of Programming Languages, January 1998; A. Deutsch, "On the Complexity of Escape Analysis", Proceedings of 24th ACM Symposium on Principles of Programming Languages, San Diego, January 1997; D. Gay and A. Aiken, "Memory Management with Explicit Regions", Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, Montreal, Canada, June 1998; J. Hannan, "A Type-Based Analysis for Stack Allocation in Functional Languages", Proceedings of 2nd International Static Analysis Symposium, September 1995; Y. G. Park and B. Goldberg, "Escape Analysis on Lists", Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, July 1992; and C. Ruggieri and T. P. Murtagh, "Lifetime Analysis of Dynamically Allocated Objects", Proceedings of 15th ACM Symposium on Principles of Programming Languages, January 1988. Those methods do not handle programs with explicit constructs for multithreading and exceptions (e.g., the try-catch constructs in the Java™ programming language).
Prior art for replacing dynamic heap allocation of objects by stack allocation can be found in the paper by C. Ruggieri and T. P. Murtagh, supra. The analysis presented by C. Ruggieri and T. P. Murtagh is quite conservative. In particular, this method makes pessimistic assumptions about aliasing between different function parameters and variables accessed through the parameters.
Prior art for replacing dynamic heap allocation of objects by stack allocation for functional languages is presented in the papers by J. Hannan and by Y. G. Park and B. Goldberg, supra. The methods described in these papers are restrictive, as they do not handle imperative languages like the Java™, C and C++ programming languages.
Prior art for replacing dynamic heap allocation of objects by stack allocation for list-like objects in functional languages is presented in the papers by A. Deutsch and by B. Blanchet, supra. These methods do not handle programs with explicit constructs for multithreading and exceptions.
Prior art for replacing dynamic heap allocation of objects by stack allocation for the Java™ programming language is found in the paper by D. Gay and B. Steensgaard, "Stack Allocating Objects in Java", Research Report, Microsoft Research, 1999. The method described by D. Gay and B. Steensgaard simply marks an object as not stack-allocatable if a reference to the object is ever stored in a field of another object. Therefore, this method fails to identify as stack-allocatable many objects that can safely be allocated on the stack.
The present invention is a method for a compiler, programming development environment or tool, or programmer to enable transformation of a program or parts of a program written in some machine language so as to reduce the overhead of allocating objects on the heap, or deallocating objects from the heap, or both.
More particularly, the present invention provides a method to analyze a computer program so that the information can be used to reduce the overhead of allocation and deallocation of objects in the program. The invention identifies those objects allocated on the heap that can instead be allocated on the stack frame of a procedure, without changing the semantics of the original program. The storage for these objects is automatically reclaimed when the procedure, on whose stack frame the objects were allocated, returns.
The preferred method of the present invention performs an interprocedural analysis of the program. For each procedure analyzed by the compiler, reachability relationships are identified among the different objects, fields of objects, and pointers referenced in the procedure. For each object allocated in the procedure, if at the end of the procedure, it is not reachable from any global variable, parameter of the procedure, or the return value of the procedure, the object is identified as stack-allocatable. A later pass of the compiler can use stack allocation for a stack-allocatable object in place of heap allocation.
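The stack-allocatability test described above can be sketched under strong simplifying assumptions. The sketch assumes that the reachability relationships at the end of the procedure are already available as a plain directed graph (points_to), which is a stand-in for whatever representation the analysis actually uses; all names below are hypothetical.

```python
def reachable_from(roots, points_to):
    """Return every node reachable from roots by following points_to edges."""
    seen, worklist = set(), list(roots)
    while worklist:
        node = worklist.pop()
        if node in seen:
            continue
        seen.add(node)
        worklist.extend(points_to.get(node, ()))
    return seen

def stack_allocatable(allocated, points_to, global_vars, params, return_value):
    """Objects allocated in the procedure that are not reachable from any
    global variable, parameter, or the return value at procedure exit."""
    escaping_roots = set(global_vars) | set(params) | {return_value}
    escaped = reachable_from(escaping_roots, points_to)
    return {obj for obj in allocated if obj not in escaped}

# Hypothetical procedure that allocates o1, o2 and o3:
#   o1 is referenced only by the local variable tmp,
#   o2 is stored into a field of the object passed as parameter p,
#   o3 is returned to the caller.
points_to = {
    "tmp": {"o1"},
    "p": {"p_obj"},
    "p_obj": {"o2"},
    "ret": {"o3"},
}
result = stack_allocatable({"o1", "o2", "o3"}, points_to,
                           global_vars=[], params=["p"], return_value="ret")
```

Only o1 is identified as stack-allocatable in this example, because o2 and o3 remain reachable by the caller after the procedure returns.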
The preferred method computes a succinct summary of the effect of a procedure call on the reachability relationships among objects visible to the caller of that procedure. The method is able to summarize this effect for different calling contexts of the procedure (i.e., for different aliasing relationships between parameters and objects reachable from parameters that hold at different call sites for that procedure) using a single summary representation. During the interprocedural analysis of the program, in a bottom-up traversal over the call graph of the program, the summary reachability information computed for a callee procedure is used to update the reachability information for actual parameters, return value, and variables reachable from those parameters and return value at the call site within the caller procedure. If there are cycles in the call graph (due to the use of recursion in the program), the analysis is performed in an iterative manner until a fixed point solution is obtained.
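The iteration to a fixed point can be illustrated with a deliberately reduced model. In this sketch the summary of a procedure is assumed to be just the set of parameter positions that may escape, which is far coarser than the reachability summary described above; the input format and all names are hypothetical.

```python
def summarize(procedures):
    """procedures: {name: {"escapes": parameter indices that escape directly,
                           "calls": [(callee, {callee_param: caller_param})]}}
    Returns {name: set of parameter indices that may escape}."""
    summary = {name: set(info["escapes"]) for name, info in procedures.items()}
    changed = True
    while changed:                      # repeat until nothing changes
        changed = False
        for name, info in procedures.items():
            for callee, arg_map in info["calls"]:
                for callee_param, caller_param in arg_map.items():
                    # Apply the callee's summary at the call site.
                    if (callee_param in summary[callee]
                            and caller_param not in summary[name]):
                        summary[name].add(caller_param)
                        changed = True
    return summary

procedures = {
    # store saves its parameter 0 in a global variable.
    "store": {"escapes": {0}, "calls": []},
    # wrap passes its parameter 1 as parameter 0 of store.
    "wrap": {"escapes": set(), "calls": [("store", {0: 1})]},
    # rec is recursive and passes its parameter 0 as parameter 1 of wrap.
    "rec": {"escapes": set(), "calls": [("rec", {0: 0}), ("wrap", {1: 0})]},
}
summary = summarize(procedures)
```

The sketch sweeps over all procedures until no summary changes. Visiting procedures in a bottom-up order over the call graph, as the text describes, would typically reach the same result in fewer sweeps, with repeated sweeps needed only for recursive cycles.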
The method of the present invention can deal with language features like destructive updates of variables, multithreading, exceptions, and finalizer methods, which can complicate the compiler analysis. In particular, the preferred method of this invention correctly handles exceptions in the program, while taking advantage of the information about the visibility of variables in the exception handler.
An alternative embodiment of the method deals with cycles in the call graph (arising from the use of recursion) conservatively. It imposes an upper limit on the number of times each procedure is analyzed, rather than iteratively analyzing procedures involved in the cycle until the computed solution converges.
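A bounded variant of the iteration can be sketched as follows. The worklist scheme, the fallback to a most conservative summary (top) once the limit is reached, and the example transfer function are all assumptions made for illustration; the text above states only that an upper limit is imposed on the number of times each procedure is analyzed.

```python
def bounded_fixed_point(procs, transfer, bottom, top, max_visits):
    """procs: procedure names; transfer(name, summaries) -> new summary.
    bottom is the optimistic starting summary, top the conservative one.
    Each procedure is analyzed at most max_visits times."""
    summaries = {p: bottom for p in procs}
    visits = {p: 0 for p in procs}
    worklist = list(procs)
    while worklist:
        p = worklist.pop(0)
        if visits[p] >= max_visits:
            new = top                   # limit reached: assume the worst
        else:
            visits[p] += 1
            new = transfer(p, summaries)
        if new != summaries[p]:
            summaries[p] = new
            # Re-examine every procedure that is not already queued.
            worklist.extend([q for q in procs if q not in worklist])
    return summaries

N = 10                                  # hypothetical: f has parameters 0..9
TOP = frozenset(range(N))               # "every parameter may escape"

def transfer(name, summaries):
    # Hypothetical recursive procedure f: parameter 9 escapes directly, and f
    # calls itself passing its parameter i as parameter i + 2.
    callee = summaries[name]
    return frozenset({9}) | frozenset(i for i in range(N - 2) if i + 2 in callee)

limited = bounded_fixed_point(["f"], transfer, frozenset(), TOP, max_visits=3)
converged = bounded_fixed_point(["f"], transfer, frozenset(), TOP, max_visits=10)
```

With a limit of three visits the sketch gives up and reports every parameter as escaping, while the unrestricted iteration finds that only the odd-numbered parameters escape. The bounded result is less precise but still safe, and its cost is capped.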
Another alternative embodiment of the method takes advantage of the type information of variables when analyzing programs written in a type-safe language, such as the Java™ programming language. The type information is used to obtain less conservative information about the reachability effects on variables of a procedure that the compiler does not analyze, for example because the code for the procedure is unavailable or because skipping it makes the analysis faster.
Another alternative embodiment of the method uses a less precise representation to distinguish between different fields of an object, to reduce the storage requirements of the method and to make it faster, at the expense of some precision in the computed result.
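The trade-off between the two representations can be illustrated with a small sketch. Both encodings below are assumptions chosen for illustration: the more precise one keys reachability edges by (object, field), and the less precise one merges all fields of an object into a single entry.

```python
def field_sensitive(stores):
    """stores: (object, field, target) triples. One entry per (object, field)."""
    edges = {}
    for obj, fld, target in stores:
        edges.setdefault((obj, fld), set()).add(target)
    return edges

def field_insensitive(stores):
    """Same input, but all fields of an object share one entry."""
    edges = {}
    for obj, _fld, target in stores:
        edges.setdefault(obj, set()).add(target)
    return edges

# Hypothetical stores: a.left = x; a.right = y
stores = [("a", "left", "x"), ("a", "right", "y")]
precise = field_sensitive(stores)
merged = field_insensitive(stores)
```

In the merged form a query about a.left can only be answered with both x and y, so the result is less precise, but the analysis tracks one entry per object instead of one per field.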