1. Field of the Invention
The present invention relates to the field of computer systems. More specifically, the present invention relates to register allocation, instruction scheduling and loop unrolling performed by compilers of these computer systems.
2. Background
Traditionally, register allocation and instruction scheduling are performed independently during code generation, one process before the other, with little communication between the two. Register allocation focuses on minimizing the number of loads and stores, while instruction scheduling focuses on maximizing parallel instruction execution. Typically, the process performed first is favored. If register allocation is performed first, it creates false dependencies for the instruction scheduler. On the other hand, if instruction scheduling is performed first, it might pessimize the code, because the subsequent register allocation may determine that more spilling is necessary. Intuitively, it would appear that a coordinated approach to register allocation and instruction scheduling would increase the overall program execution efficiency.
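The false dependencies mentioned above can be illustrated with a minimal sketch. The instruction encoding (a destination plus a list of source registers) and the helper name `dependences` are assumptions made for illustration only and do not come from any particular compiler. The sketch compares two independent load/add chains before and after a hypothetical allocator maps them onto two physical registers.

```python
def dependences(instrs):
    """Return sorted (earlier, later, kind) register dependences.

    Each instruction is a hypothetical (dest, [sources]) pair.
    Memory operands are not modeled.
    """
    deps = set()
    last_def = {}   # register -> index of its most recent writer
    readers = {}    # register -> indices that read it since that write
    for j, (dest, srcs) in enumerate(instrs):
        for s in srcs:
            if s in last_def:
                deps.add((last_def[s], j, "true"))      # read after write
            readers.setdefault(s, []).append(j)
        for i in readers.get(dest, []):
            if i != j:
                deps.add((i, j, "anti"))                # write after read
        if dest in last_def:
            deps.add((last_def[dest], j, "output"))     # write after write
        last_def[dest] = j
        readers[dest] = []
    return sorted(deps)


def false_dependences(instrs):
    """Keep only the dependences caused by register reuse."""
    return [d for d in dependences(instrs) if d[2] != "true"]


# Two independent chains written with pseudo registers.
pseudo = [("v1", []), ("v2", ["v1"]), ("v3", []), ("v4", ["v3"])]

# The same code after an assumed allocation that reuses r1 and r2.
allocated = [("r1", []), ("r2", ["r1"]), ("r1", []), ("r2", ["r1"])]

print(false_dependences(pseudo))
print(false_dependences(allocated))
```

With pseudo registers the two chains share no registers, so a scheduler is free to interleave them. After the assumed allocation, instruction 2 may not move above instruction 1 because it overwrites the register that instruction 1 still reads, even though no value flows between the two chains.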
In "Integrated Register Allocation and Instruction Scheduling for RISCs," ACM, pp. 122-131 (ACM 0-89791-380-9/91/00003-0122), David G. Bradlee et al. described the results of their comparative study of three approaches to register allocation and instruction scheduling for RISC computers. The first approach is a simple postpass approach, in which global register allocation is performed prior to instruction scheduling. The second approach is a variation of integrated prepass scheduling, in which the scheduler is invoked before register allocation and must schedule within a predetermined local register limit. The third approach integrates register allocation and instruction scheduling by giving the register allocator cost estimates that quantify the effect of its allocation choices on the subsequently generated schedule.
More specifically, under the integrated approach, a prescheduler is invoked to compute, for each basic block of the program being compiled, a schedule cost estimate for each value in a series of register limits. A schedule cost estimate is the estimated number of machine cycles required to execute the instructions in a basic block while remaining within the particular register limit. Then, during register allocation, the register allocator allocates registers to global pseudo registers based on the incremental costs to scheduling as well as spill costs. The incremental cost to scheduling is computed from the schedule cost estimates and reflects the additional cycles required to execute a basic block if it is scheduled with a reduced register limit. Finally, the scheduler schedules each basic block, allocating registers to local pseudo registers within the register limit for that basic block.
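The role of the schedule cost estimates can be sketched as follows. This is an illustrative simplification, not the procedure of Bradlee et al.: the table contents, the block names, the function names, and the simple rule of comparing the spill cost against the summed incremental scheduling cost are all assumptions, and execution frequencies are ignored.

```python
# Hypothetical prescheduler output:
# basic block -> {register limit: estimated cycles within that limit}.
SCHEDULE_COST = {
    "B1": {4: 20, 5: 16, 6: 14, 7: 14},
    "B2": {4: 9, 5: 9, 6: 9, 7: 9},
}


def incremental_cost(block, limit):
    """Extra cycles if the block's register limit drops from limit to limit - 1."""
    costs = SCHEDULE_COST[block]
    return costs[limit - 1] - costs[limit]


def should_allocate(live_blocks, limits, spill_cost):
    """Decide whether a global pseudo register gets a real register.

    Allocating it removes one local register from every block in which
    it is live. Under the rule assumed here, that is worthwhile only if
    spilling would cost more cycles than the resulting schedule slowdown.
    Returns (decision, scheduling_cost).
    """
    scheduling_cost = sum(incremental_cost(b, limits[b]) for b in live_blocks)
    return spill_cost > scheduling_cost, scheduling_cost


# With six local registers per block, giving one up costs 2 cycles in B1
# and nothing in B2, so a spill cost of 5 cycles favors allocation.
print(should_allocate(["B1", "B2"], {"B1": 6, "B2": 6}, spill_cost=5))

# With B1 already down to five registers, the next one costs 4 cycles,
# which exceeds a spill cost of 3 cycles.
print(should_allocate(["B1", "B2"], {"B1": 5, "B2": 6}, spill_cost=3))
```

In this sketch, each successful allocation would also lower the affected blocks' limits by one, so later decisions see the steeper part of each cost curve.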
Bradlee et al. reported a 12% improvement in execution performance for their benchmark for both the second and the third approach relative to the first approach. In light of the insignificant difference in execution performance improvement between the second and the third approach, Bradlee et al. concluded that some level of integration between register allocation and instruction scheduling is beneficial, but that highly integrated register allocation and instruction scheduling is perhaps unnecessary.
Thus, it is desirable to have a loosely coupled register allocation and instruction scheduling approach that is relatively easy and inexpensive to implement, yet produces a significant improvement in the execution performance of the generated code. As will be disclosed, the present invention provides a method and apparatus for integrated register allocation, instruction scheduling, instruction combining, and loop unrolling that achieves the above-described desired results.