1. Field of Invention
The present invention relates to the programming of a data processing system. More specifically, the invention relates to the implementation of advanced computer language features such as tail-recursion optimization, first-class continuations and garbage collection when implementing applications in a stack-oriented computer language.
2. Field of the Prior Art
Stack-oriented computer languages--e.g., Algol, PL/I, Pascal, Ada, C, C++, to name just a few--are utilized to program a large fraction of the computer applications software in use today, and the architectures of a large fraction of computers are optimized for these types of languages. The techniques for efficiently implementing these languages are well-known in the art of computer science, and are covered in a number of textbooks--e.g., [Aho86]. (References in square brackets appearing in the specification are described in Appendix A attached hereto.)
However, a number of advanced computer languages--e.g., Scheme [Scheme90], Smalltalk [Goldberg83], and ML [Milner90], to name just a few--incorporate advanced features that are not easily implemented on computer architectures optimized for stack-oriented languages. Some of these advanced features include tail recursion, first-class continuations, and garbage collection.
Tail recursion is often considered an optimization, which is a program transformation that improves the efficiency of the program in execution time or storage space. Tail recursion is a process of replacing a certain type of recursive computation by an iterative computation which more efficiently produces the same effect and/or result.
According to [Aho86, pp. 52-53], "Certain recursive calls can be replaced by iterations. When the last statement executed in a procedure body is a recursive call of the same procedure, the call is said to be tail recursive. . . . We can speed up a program by replacing tail recursion by iteration. For a procedure without parameters, a tail-recursive call can be simply replaced by a jump to the beginning of the procedure."
According to [Abelson85, p. 33] "Tail recursion has long been known as a compiler optimization trick. A coherent semantic basis for tail recursion was provided by Carl Hewitt . . . Inspired by this, Gerald Jay Sussman and Guy Lewis Steele Jr. [Sussman75] constructed a tail-recursive interpreter for Scheme. Steele later showed how tail recursion is a consequence of the natural way to compile procedure calls."
An advanced language that guarantees tail recursion--e.g., IEEE standard Scheme [Scheme90]--does not require special iterative or looping constructs in order to execute loops and iteration efficiently. Thus, the `do`, `for`, `while`, and `until` constructs found in languages like Fortran, Algol, PL/I, Pascal, C, C++, to name just a few, can be replaced by recursions which can be compiled as efficiently as the iterative looping constructs. This efficiency means that the complexity of compilers and programming languages which deal specially with looping constructs can be reduced by utilizing an efficient form of recursion in their place. Furthermore, according to [Kernighan88], ". . . recursive code is more compact, and often much easier to write and understand than the non-recursive equivalent."
Stack-oriented languages do not guarantee tail-recursive implementation of subprogram calls. Some particular compilers may provide for a tail recursion optimization in some particular cases, but a portable program written in a stack-oriented language cannot rely upon tail recursion on a wide variety of implementations of the language.
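The replacement of a tail-recursive call by a jump, as described in the [Aho86] passage quoted above, can be illustrated with a minimal sketch in C. The function names `sum_to_rec` and `sum_to_iter` are hypothetical and chosen only for this illustration; the second function is a hand-written rendering of what a tail-recursion optimization might produce, not the output of any particular compiler.

```c
#include <assert.h>

/* Tail-recursive form: the recursive call is the last action performed,
   so nothing in the current frame is needed after the call is made. */
unsigned long long sum_to_rec(unsigned long long n, unsigned long long acc)
{
    if (n == 0)
        return acc;
    assert(acc + n >= acc);              /* sketch-only guard against wrap-around */
    return sum_to_rec(n - 1, acc + n);   /* tail call */
}

/* Equivalent form: the tail call is replaced by updating the parameters
   in place and jumping back to the beginning of the procedure. */
unsigned long long sum_to_iter(unsigned long long n, unsigned long long acc)
{
top:
    if (n == 0)
        return acc;
    assert(acc + n >= acc);              /* sketch-only guard against wrap-around */
    acc += n;
    n -= 1;
    goto top;                            /* jump instead of call: no new frame */
}
```

Both functions compute the sum 1 + 2 + ... + n when called with an initial accumulator of zero. Whether a given C compiler performs this transformation on `sum_to_rec` is implementation-dependent, which is the portability concern noted above.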
Another advanced computer language feature is that of first-class continuations. Continuations can be used to implement non-local transfers of control. A relatively simple kind of non-local transfer of control is that of the ANSI C language setjmp/longjmp pair [Kernighan88] [Harbison91] [Plauger92]. A C program may execute a setjmp function and then call a number of other nested functions. Within this nesting, a longjmp function can be executed which transfers immediately back to the context saved by the setjmp function. Any functions in the nest which have been called, but have not yet returned, will never return, and will simply be abandoned. A more sophisticated use of continuations is to implement multiple processes by means of interrupts and time-sharing [Wand80]. A still more sophisticated use of first-class continuations is Prolog-like back-tracking [Haynes87].
Stack-oriented languages implement only the simplest kind of continuations, if they implement continuations at all. ANSI C [Kernighan88] [Harbison91] [Plauger92] is typical in that it defines the meaning of setjmp/longjmp only in the cases where the longjmp is dynamically nested within the enclosing setjmp. Furthermore, C makes no provision whatsoever for saving the result of a setjmp as a first-class continuation data object which can be passed as an argument to a function, returned as the result of a function, or stored into a programmer-defined data structure.
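The restricted, dynamically nested use of setjmp/longjmp described above can be sketched as follows. The names `run_with_escape`, `middle`, `innermost`, `escape_code` and `frames_returned` are hypothetical and exist only for this illustration; the sketch assumes nothing beyond the standard `<setjmp.h>` facilities.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf escape_point;       /* context saved by setjmp */
static volatile int escape_code;   /* value handed back through the jump */
static int frames_returned;        /* counts nested calls that returned normally */

static void innermost(int code)
{
    assert(code != 0);             /* zero is reserved here for "no escape" */
    escape_code = code;
    longjmp(escape_point, 1);      /* transfer directly back to the setjmp */
}

static void middle(int code)
{
    innermost(code);
    frames_returned++;             /* never reached: this frame is abandoned */
}

/* Saves a context, enters the nest of calls, and reports the escape code. */
int run_with_escape(int code)
{
    if (setjmp(escape_point) == 0) {
        middle(code);              /* longjmp is nested within this setjmp */
        frames_returned++;         /* never reached */
        return 0;
    }
    return escape_code;            /* reached only by way of longjmp */
}
```

A call such as `run_with_escape(7)` returns 7 and leaves `frames_returned` at zero, since the functions in the nest never return. The saved `jmp_buf` is usable only while `run_with_escape` is still active, which is why it cannot serve as a first-class continuation object.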
Another advanced computer language feature is automatic storage reclamation or garbage collection. Computer languages, including stack-oriented computer languages such as PL/I, Pascal, Ada, C, C++, to name a few, have long offered dynamic allocation of data objects both on the stack and in a separate area usually called the `heap` [Aho86]. The heap allocation of objects is utilized whenever the lifetime of these objects does not match the LIFO (Last-In, First-Out) allocation/deallocation behavior of a stack. In these languages, it is the responsibility of the programmer to deallocate data objects which are no longer in use, so that the storage they occupy can be reused for a new data object.
There is a problem, however. When pointers/references to a data object are stored in other data objects, it may be quite difficult for a programmer to make sure that a data object is no longer in use before he or she deallocates it. This problem is particularly severe in large applications which have developed over a number of years with a large number of programmers and which interface to software for which the programmer may not have access to the source code. As a result, a programmer may inadvertently deallocate an object which is still in use and subsequently reuse this storage for another purpose. When the deallocated object is referenced again, the application will usually fail--sometimes in a catastrophic manner. This problem is known as the `dangling pointer` problem.
One attractive solution to the dangling pointer problem is to move the responsibility of deallocation from the programmer to the programming language implementation. The parts of the system which take on this responsibility are often called automatic memory managers or garbage collectors. A garbage collector operates by looking at all of the application program variables which are directly accessible, and then following all chains of pointers from these variables to objects in the stack and the heap. Any object found in this way is called `accessible`, because it can conceivably be accessed by the application program by following a finite chain of pointers from a directly accessible program variable. The storage for inaccessible objects can then be reclaimed for reuse. Alternatively, the accessible objects can all be copied and relocated to a new area of memory, and the entire old area can then be reused for a new purpose. Such an automatic memory manager or garbage collector is said to implicitly deallocate inaccessible objects.
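The tracing step described above can be sketched in C under simplifying assumptions: every object has the same hypothetical layout (`Object`, with a mark flag and `MAX_FIELDS` pointer fields), the heap is a plain array, and the roots are supplied explicitly. None of these names come from the cited references, and a practical collector would differ in many respects.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FIELDS 2

typedef struct Object {
    int marked;                          /* set when the object is found accessible */
    struct Object *fields[MAX_FIELDS];   /* pointers to other objects, or NULL */
} Object;

/* Follow chains of pointers from obj, marking every object encountered.
   The mark flag also stops the traversal from looping on cyclic structures. */
void mark(Object *obj)
{
    if (obj == NULL || obj->marked)
        return;
    obj->marked = 1;
    for (size_t i = 0; i < MAX_FIELDS; i++)
        mark(obj->fields[i]);
}

/* Mark from the roots, then count the objects left unmarked; their storage
   is what an implicit deallocator would reclaim for reuse. */
size_t collect(Object *heap, size_t heap_len, Object **roots, size_t root_count)
{
    size_t reclaimed = 0;

    assert(heap != NULL || heap_len == 0);
    for (size_t i = 0; i < heap_len; i++)
        heap[i].marked = 0;
    for (size_t r = 0; r < root_count; r++)
        mark(roots[r]);
    for (size_t i = 0; i < heap_len; i++)
        if (!heap[i].marked)
            reclaimed++;
    return reclaimed;
}
```

In this sketch `mark` is itself recursive and therefore consumes stack space in proportion to the longest pointer chain; it is meant only to show the accessibility rule, not an efficient traversal.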
The art of automatic memory management and garbage collection is quite advanced. [Cohen81], [McEntee87], and [Bekkers92] review some of this art.
Stack-oriented computer languages are not inconsistent with implicit storage deallocation and garbage collection--e.g., the Algol-68 computer language offers garbage collection--but few implementations offer it. The most popular stack-oriented computer languages--e.g., Pascal, Ada, C, C++, to name just a few--do not utilize implicit storage deallocation and garbage collection, and therefore applications programs written in these languages run the risk of creating `dangling references` and thereby causing catastrophic software failures called `crashes`. The number of crashes in commercially distributed software due to these dangling references is testimony to the ubiquity and seriousness of this problem.
There are two major problems in retrofitting garbage collection into a stack-oriented language. The first is in tagging all of the data objects so that the garbage collector can know the boundaries of each object and can find and trace all of the pointers within it. The second is in finding all of the directly accessible program variables or `roots` for the garbage collection. Some of these program variables are global and/or static, and are not usually difficult to locate and identify to the garbage collector. The more difficult problem is that of locating and identifying the program variables that have been allocated on the stack, but for which a map to their location has not been provided by the compiler.
One general approach to these problems has been called `conservative` garbage collection [Boehm88] [Bartlett88]. Conservative garbage collectors do not attempt to precisely locate and identify all of the program variables or accessible objects, but only guess at their locations. These collectors err on the side of conservatism, in that any bit pattern that looks like it might be a pointer is assumed to actually be a pointer. A conservative garbage collector will treat the entire stack area as a source of `ambiguous roots` [Bartlett88], and any storage location which is pointed at by a suspected pointer is considered to be an accessible object if it is located in one of the storage areas in which accessible objects can be found. Suspected pointers found within `objects` located in this manner are also traced by the conservative garbage collector, in case the suspected object really is accessible to the running program.
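The guessing rule used by a conservative collector can be sketched as a simple range test. The function name `count_ambiguous_roots` and its parameters are hypothetical; the sketch assumes a single contiguous heap area bounded by `heap_lo` (inclusive) and `heap_hi` (exclusive), and it is not taken from [Boehm88] or [Bartlett88].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scan a block of words (standing in for the stack area) and count those
   whose bit pattern falls inside the heap range. Each such word is treated
   as a suspected pointer, whether or not the program uses it as one. */
size_t count_ambiguous_roots(const uintptr_t *words, size_t n,
                             uintptr_t heap_lo, uintptr_t heap_hi)
{
    size_t hits = 0;

    assert(heap_lo <= heap_hi);
    for (size_t i = 0; i < n; i++)
        if (words[i] >= heap_lo && words[i] < heap_hi)
            hits++;                 /* looks like a pointer: assume it is one */
    return hits;
}
```

An integer that merely happens to fall within the heap range is counted exactly as a genuine pointer would be, so the object it appears to reference would be retained.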
Conservative garbage collectors have two significant problems. The most common problem is that their conservatism causes them to consider too much storage as accessible. Some researchers [Zorn92] have found that a conservative garbage collector may be less efficient because it may `hold onto` significantly more storage than a more precise collector would. It may also incur a greater cost in scanning time due to its imprecise knowledge about where accessible pointers are to be found. A less common, but more troubling problem, is that a conservative collector may fail to be conservative, and may miss some accessible objects, possibly due to aggressive compiler optimizations [Chase88]. Since an object which is accessible to the program, but is not considered accessible to the conservative garbage collector, will eventually be reallocated for a new purpose, a `conservative` garbage collector may actually cause a crash due to a dangling pointer in a program that would have operated correctly without the conservative collector. Although the known occurrences of dangling reference problems with conservative garbage collectors are very rare, the mere possibility of such problems raises serious doubts about the usability of this form of garbage collector for many applications.
The art of directly implementing advanced language features like tail recursion, first-class continuations and garbage collection in machine (or machine-like) languages is well-advanced. [Hanson90] is a recent review of some techniques of tail recursion; [Clinger88] is a review of some techniques of first-class continuations; and [Bekkers92] includes reviews of some techniques for garbage collection. Appel's approach to the Standard ML of New Jersey implementation [Appel88] [Appel89] [Appel90] [Appel92] of the ML programming language [Milner90] is particularly elegant and efficient.
Unfortunately, compilers which target machine languages are expensive and time-consuming to build, and with the increased complexity of generating code for highly pipelined RISC architectures, compilers targeting machine language will become even more expensive. Thus, the costs of supporting a language implementation on a wide variety of different instruction set architectures are growing quickly. This trend has caused a tendency for machine vendors to provide one or two compilers which directly target machine code--usually C and Fortran--and those wishing to support advanced languages such as Scheme or ML will seriously consider building compilers which translate those languages into C or Fortran, so that their language implementation will remain portable over a wide variety of instruction set architectures.
Although portability (and hence lower cost) is the major advantage for compilers to target languages like C or Fortran instead of machine language, there are other advantages. There are significant execution efficiencies to be gained through proper `instruction scheduling` of complex pipelines, and since the existing C and Fortran compiler vendors already have enormous incentives to provide these difficult optimizations, a compiler which targets C or Fortran instead of machine code can `piggy-back` on these efforts to gain the advantages at very low cost. There are also a substantial number of development and debugging tools available for C and Fortran programs that may not be available for machine language programs, so additional leverage is gained for these purposes. Finally, a large number of third-party subprogram `packages` already exist in C or Fortran--e.g., for computing transcendental functions--and the compiler targeting C or Fortran can utilize these, as well.
Some of the options facing the writer of an application which requires advanced computer language features such as tail recursion, first-class continuations, and/or garbage collection are to 1) find an advanced language implementation of a language like Scheme or ML which compiles directly into native machine code for his chosen hardware processor; or 2) program his application in a less-advanced stack-oriented language such as C; or 3) write a Scheme or ML compiler which compiles directly into native machine code; or 4) write part or all of his application in assembly language for the native machine code. Options 1), 3) and 4) are very expensive, and option 2) is very difficult, error-prone, and most likely very non-portable.
Some of the options facing the writer of a compiler for a language having advanced features are 1) compile directly into native machine code, or 2) target an existing efficient implementation of C or Fortran. Option 1) can result in very efficient execution performance, but is very expensive. Option 2) can be, and has been, done, but has significant problems of its own.
There are major problems implementing programs requiring tail recursion, first-class continuations, and garbage collection in a stack-oriented language like C that does not already have these features. Implementing tail recursion can sometimes be done by converting the recursion into iteration, which can only be done within a single `block` compilation unit, and sometimes not even then [Bartlett89]. A more general method for achieving proper tail recursion in a stack-oriented language uses a trampoline, also called a dispatch loop. A trampoline is an outer function which iteratively calls an inner function. The inner function returns the address of another function to call, and the outer function then calls this new function. In other words, when an inner function wishes to call another inner function tail-recursively, it returns the address of the function it wants to call back to the trampoline, which then calls the returned function. By returning before calling, the stack is first popped so that it does not grow without bound on a simple iteration. Unfortunately, such a trampoline function call is 2-3 times slower than a normal subprogram call, and it requires that arguments be passed in global variables [Tarditi90]. Another alternative is to tamper with the C compiler itself, but this alternative is also not portable.
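A minimal sketch of such a trampoline in C follows. The names `Step`, `even_step`, `odd_step`, `is_even`, `g_n` and `g_result` are hypothetical and chosen only for this illustration; it is not the code of [Tarditi90]. Because a C function type cannot name itself as its own return type, the returned function address is wrapped in a small structure.

```c
#include <assert.h>
#include <stddef.h>

/* An inner function returns the next function to call, or NULL to stop. */
typedef struct Step {
    struct Step (*next)(void);
} Step;

/* Arguments and result travel in global variables, as noted above. */
static long g_n;
static int g_result;

static Step odd_step(void);

static Step even_step(void)
{
    Step s = { NULL };
    if (g_n == 0) { g_result = 1; return s; }   /* zero is even: stop */
    g_n -= 1;
    s.next = odd_step;     /* "tail call" odd_step by returning its address */
    return s;
}

static Step odd_step(void)
{
    Step s = { NULL };
    if (g_n == 0) { g_result = 0; return s; }   /* zero is not odd: stop */
    g_n -= 1;
    s.next = even_step;    /* "tail call" even_step by returning its address */
    return s;
}

/* The trampoline: an outer loop that repeatedly calls whatever function
   the previous inner function returned, so the stack never deepens. */
int is_even(long n)
{
    Step s = { even_step };

    assert(n >= 0);
    g_n = n;
    while (s.next != NULL)
        s = s.next();
    return g_result;
}
```

A call such as `is_even(1000000)` performs a million mutual "tail calls" while using a constant amount of stack, at the cost of one return and one indirect call per step.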
Implementing first-class continuations on top of a stack-oriented language like C typically requires non-portable machine language access to the details of the stack format [Bartlett89]. Furthermore, many mutable objects such as assignable cells cannot be allocated on the stack due to the multiplicity of copies of the stack that can exist, so some optimizations are impossible to perform [Clinger88].
Implementing garbage collection on top of a stack-oriented language like C requires the use of a secondary stack [Yuasa90] [Chailloux92] and/or the use of a conservative garbage collector [Boehm88] [Bartlett88], which may be both inefficient and insecure. Discussions of additional references are attached hereto in Appendix B ("Additional Related Art"). References from Appendices A, B, and C are incorporated by reference herein.
In summary, to utilize and/or compile advanced language features such as tail recursion, first-class continuations, or garbage collection, we have either Appel's elegant, efficient and expensive method of utilizing native machine language [Appel88] [Appel89] [Appel90] [Appel92], or we have cheaper methods of using stack-oriented languages like C which are crude, complex, and potentially catastrophic [Bartlett88] [Bartlett89] [Boehm88] [Tarditi90] [Chase88]. The existing art tries to either ignore the stack [Sussman75] [Appel90] [Tarditi90], or to utilize the normal Last-In, First-Out (LIFO) behavior of the stack as much as possible through complex optimizations [Steele78] [Bartlett89] [Tarditi90].
There accordingly exists a need for an improved method to utilize and/or compile advanced language features such as tail recursion, first-class continuations, and garbage collection which is both efficient and reliable ("crash proof"), and cost effective to implement.