This invention relates to computer program compilers and, more particularly, to a computer program compilation system that supports register allocation across procedure and compilation unit boundaries where there are global variables and a limited number of registers for storage and manipulation of data.
In a traditional compiler, register allocation is performed on each procedure one at a time. In some compilers, the register allocator has access to register allocation information from other procedures within the same compilation unit. The compiler can use this information to improve the register allocation in the callers of these routines. This type of technique is limited in scope to the procedures of a single compilation unit.
The traditional intraprocedural register allocation process is effective, but in the absence of interprocedural information the following situations occur:
• Local values in different procedures are assigned to the same register. As a result, procedures must execute code to save and restore these registers in order to preserve the values needed by the calling procedure.
• Global variables are referenced out of different registers in different procedures. This requires a modified value of a global variable to be stored to memory before any procedure call, and loaded back from memory before any subsequent use of that variable. This also requires each procedure which references that variable to load the variable from memory if it is used before being redefined, and to store the global variable to memory before the exit point if the variable is modified within that procedure.
• Status registers are registers which are designated to hold specific values which may not be used to hold variables or other temporary values. Examples include a stack pointer and a global data pointer.
• Caller-saves registers are registers which may be used within a procedure to hold values, but these values are not guaranteed to remain unchanged after executing a call to another procedure. These registers may be used by a procedure without being preserved in memory before they are used. The name "caller-saves" refers to the fact that the caller of a procedure must save any needed values in these registers so the called routine may use those registers.
• Callee-saves registers are registers which may be used within a procedure to hold values, and these values are guaranteed to remain unchanged after executing a call to another procedure. However, the values in these registers must be spilled before they are used and then restored to the register before exiting the procedure. The name "callee-saves" refers to the fact that the called routine is responsible for saving these registers before they are used.
For most programming languages, improving this situation is complicated by the need to support multiple compilation units. For example, if one wishes to keep a certain global variable in a register when compiling module A, one must ensure that any reference to that variable in a different module uses the same register.
One possible solution is to delay register assignment until link time, when the code for the entire application is visible. This solution is difficult to implement with traditional compiler architectures, however, because of the need for dataflow and live range information at register allocation time. Moreover, computing this information would create an unreasonable delay each time a user needed to re-link an application.
There are two known significant research efforts that have addressed the weaknesses of procedure-at-a-time register allocation. The first was carried out at DEC's Western Research Lab in 1986 and described by David W. Wall in an article entitled "Global Register Allocation At Link Time" in the Proceedings of the SIGPLAN '86 Symposium On Compiler Construction, SIGPLAN Notices, Vol. 21, No. 7, July 1986, pages 264-275. In this technique, the compiler does a simple register allocation on each procedure and generates register relocation information for the linker. The user may optionally enable interprocedural register allocation at link time. To promote a variable to a register, the linker only needs to follow the prescribed relocation actions. This technique showed some good results. Some benchmarks improved by as much as 8% on a 64-register RISC machine, with a majority of the benefit attributed to the promotion of global variables.
Global variable promotion is an optimization technique where memory references to global variables are converted into register references. In effect, the global variable is promoted from being a memory object to a register object. Traditional compilers sometimes promote global variables to registers locally within a procedure. Such locally promoted global variables are still accessed out of memory across procedures. Before procedure calls and at the exit point, the compiler inserts instructions to store the register containing the promoted global variable back to memory. Similarly, just after procedure returns and at the entry point, the optimizer inserts instructions to load the promoted global variable from memory to register.
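The local form of promotion described above can be illustrated with a minimal sketch. The tuple-based instruction format, the function name promote_locally, and the register name "r1" are all hypothetical and chosen only for illustration; a real optimizer would also use dataflow information to omit loads and stores that are not needed.

```python
def promote_locally(body, var, reg="r1"):
    """Rewrite references to global `var` in one procedure body to use `reg`.

    `body` is a list of tuples such as ("add", dest, src, const) or
    ("call", name). This toy format is an assumption, not a real IR.
    """
    out = [("load", reg, var)]  # entry point: fetch the global into the register
    for op in body:
        if op[0] == "call":
            out.append(("store", var, reg))  # callee may read var from memory
            out.append(op)
            out.append(("load", reg, var))   # callee may have changed var
        else:
            # Replace every operand naming the global with the register.
            out.append(tuple(reg if x == var else x for x in op))
    out.append(("store", var, reg))  # exit point: write the value back
    return out


body = [("add", "g", "g", 1), ("call", "f"), ("add", "g", "g", 2)]
promoted = promote_locally(body, "g")
for op in promoted:
    print(op)
```

In this sketch the arithmetic on the global runs out of a register, but each call is still bracketed by a store and a load. That memory traffic is what promotion across procedure boundaries would aim to remove.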
The second significant research effort was produced at MIPS Computer Systems. It is described by Fred C. Chow in "Minimizing Register Usage Penalty at Procedure Calls", published in the Proceedings of the SIGPLAN '88 Conference on Programming Language Design and Implementation, July 1988, pages 85-94, and in an article authored with others, "Cross-Module Optimizations: Its Implementation and Benefits", published in the Proceedings of the Summer 1987 USENIX Conference, pages 347-356. In the MIPS system, the multiple compilation unit problem is solved by exposing an intermediate code representation to the user. Instead of linking object code, the user must link the intermediate code files into a single, large intermediate program file. The intermediate code linker then completes the code generation and optimization process. As part of this process, the optimizer tries to minimize register spill by performing register allocation on procedures in a reverse hierarchical order and propagating register usage information upwards in the call graph. This technique showed generally positive results, although exceptions were noted. In one example discussed, the process produced object code which executed more slowly than a version compiled without interprocedural register allocation. Other computer systems and compilers have been implemented which use a similar technique within a single compilation unit.
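The idea of allocating in reverse hierarchical order and propagating register usage upwards can be sketched as follows. This is a simplified illustration and not the MIPS implementation: the function name allocate_bottom_up, the dictionary-based call graph, and the per-procedure register demand are hypothetical, and the call graph is assumed to be acyclic (no recursion).

```python
def allocate_bottom_up(call_graph, demand, num_regs):
    """Assign registers to procedures, visiting callees before callers.

    call_graph: procedure name -> list of callee names (assumed acyclic).
    demand:     procedure name -> number of registers the procedure wants.
    Returns (assigned, subtree): the registers each procedure uses itself,
    and the registers used by the procedure or anything it may call.
    """
    assigned = {}
    subtree = {}

    def visit(proc):
        if proc in subtree:
            return
        below = set()
        for callee in call_graph.get(proc, []):
            visit(callee)
            below |= subtree[callee]  # propagate usage up the call graph
        # Prefer registers no callee touches; these need no save/restore.
        free = [r for r in range(num_regs) if r not in below]
        # When free registers run out, fall back to ones a callee clobbers.
        picks = (free + sorted(below))[:demand[proc]]
        assigned[proc] = set(picks)
        subtree[proc] = below | assigned[proc]

    for proc in call_graph:
        visit(proc)
    return assigned, subtree


call_graph = {"main": ["a", "b"], "a": ["leaf"], "b": ["leaf"], "leaf": []}
demand = {"main": 2, "a": 2, "b": 1, "leaf": 2}
assigned, subtree = allocate_bottom_up(call_graph, demand, num_regs=8)
print(assigned)
```

In this example every procedure receives registers that none of its callees use, so no values need to be saved around calls. Once the fallback path is taken near the top of a large call graph, spill code becomes necessary again.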
On many contemporary computer architectures, machine registers are divided by software conventions into three classes: status registers, caller-saves registers, and callee-saves registers.
In the absence of interprocedural information, callee-saves register spilling is necessary in every procedure which needs to use a register of that class. This creates significant overhead in many programs.
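A rough way to see the scale of this overhead is to count the spill instructions executed. The sketch below assumes one store and one load per callee-saves register per invocation. The function name and the procedure figures are hypothetical and are not measurements from any real program.

```python
def callee_save_overhead(procs):
    """Count save/restore instructions executed for callee-saves registers.

    procs maps a procedure name to (callee_saves_registers_used, call_count).
    Each invocation is assumed to store every such register on entry and
    reload it before exit, giving two memory operations per register.
    """
    return sum(2 * used * calls for used, calls in procs.values())


# Hypothetical profile: a small leaf routine called very often dominates.
profile = {"leaf": (3, 1000), "helper": (2, 50), "main": (4, 1)}
print(callee_save_overhead(profile))
```

Under these assumptions most of the cost comes from frequently called procedures, even when each one uses only a few callee-saves registers.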
Some other references to related work include:
"LISP on a Reduced Instruction Set Processor: Characterization and Optimization", by P. A. Steenkiste of Stanford University Computer Systems Laboratory, PhD Thesis, Chapter 5, March 1987. This approach is similar to that of MIPS, except that it reverts to ordinary intraprocedural register allocation when the interprocedural registers are exhausted in upper regions of the call graph.
"Data Buffering: Run-Time Versus Compile Time Support" by Hans Mulder, Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems, Apr. 3-6, 1989, pages 144-151. This approach is also similar to that of MIPS except that it is limited in scope to single compilation units.
"The Impact of Interprocedural Analysis and Optimization in the R^n Programming Environment" by Keith D. Cooper, Ken Kennedy, and Linda Torczon of Rice University, published in the ACM Transactions on Programming Languages and Systems, October 1986, pages 491-523. This paper describes a program compiler which computes interprocedural optimization information, but does not address the register allocation problem.
Hewlett-Packard's Apollo Division uses an interprocedural register allocation scheme within a single compilation unit in its DN10000 architecture compilers. Like the references above other than the DEC paper, this approach does not attempt to keep global variables in registers across procedures.
What is needed is a method and apparatus for optimizing register usage where there is a limited number of available register resources in a computer processor and where a plurality of procedures and variables are involved.