A computer system typically includes, among other things, a processor serving as a Central Processing Unit (CPU) and a separate memory system (i.e., main memory) that stores information processed by the CPU. One problem with this computer architecture is that the main memory tends to limit the performance of the processor. This is because the access speed of a typical main memory tends to be slower than the processing speed of the processor, which typically causes the processor to stall when it attempts to access a location in the main memory.
In order to achieve higher performance for computer processors, it has been proposed to include a local memory (or cache) within the processor. The local memory is organized like high-speed registers. FIG. 1 shows the memory layout of one prior local memory 10. As can be seen from FIG. 1, the local memory 10 can be viewed as an indexed register file. Any specific local memory entry (e.g., the entry 12) is selected based on the value in a base address register plus an offset. Because the local memory 10 is divided into several contiguous blocks, the value of the base address register is required to be aligned on the block size. The local memory 10 can be read and written as fast as general registers; it supplies source operands to the execution data paths and receives results as destination operands.
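The base-plus-offset selection described above can be sketched as follows. This is only an illustrative model, not the hardware's actual interface: the name `select_entry`, the block size of 16 entries, and the assumption that an offset must stay within its block are all hypothetical.

```python
BLOCK_SIZE = 16  # hypothetical number of entries per local memory block


def select_entry(base: int, offset: int, block_size: int = BLOCK_SIZE) -> int:
    """Return the index of the local memory entry addressed by base + offset."""
    # The base address register must be aligned on the block size.
    if base % block_size != 0:
        raise ValueError("base address must be aligned on the block size")
    # Assumption: an offset only reaches entries inside the current block,
    # so an entry in another block needs a new base address.
    if not 0 <= offset < block_size:
        raise ValueError("offset must stay within the block")
    return base + offset
```

Under these assumptions, `select_entry(32, 5)` selects entry 37, while a base of 3 is rejected because it is not aligned on the block size.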
Compilation is a process in which a compiler transforms source code into processor-executable native (machine) code. During compilation of a program, every variable used in the program must at some time be placed in a processor register for execution of some instruction. Deciding which variable occupies which register is referred to as register allocation. However, a computer processor typically has only a limited number of registers, usually far fewer than the number of variables in a program executing on the processor. This makes it impossible to simply assign a register to each variable.
To solve this problem, every variable is placed in a “symbolic register” by the compiler. The compiler then keeps only those symbolic registers needed for the current execution in the hardware registers. When there are conflicts over hardware registers, it spills the other symbolic registers to another storage (usually the main memory) and reloads them only when needed. This technique is referred to as “spilling”. The inclusion of the local memory allows the compiler to use the faster local memory instead of the main memory as the spilling home location, thus reducing the cost of reloading and storing.
One problem associated with the spilling-to-local-memory technique is that if the symbolic registers are not stored in proper locations within the local memory, accessing the spilled registers may require a relatively large number of initialization operations to the base address register. As is known, an initialization operation to the base address register is relatively expensive (e.g., a 3-cycle delay between the write to the base address register and the value changed on IXP). Thus, a relatively large number of initialization operations typically has a negative impact on the runtime performance of the compiled program. FIG. 2 illustrates this problem.
As shown in FIG. 2 for the purpose of illustration, a local memory block is assumed to contain only two entries, and three symbolic registers A, B and C need to be spilled to home locations in the local memory. The spilling order of these spilling home locations is also shown in FIG. 2. The spilling home locations A and B are in the memory entries 21 and 22 of one memory block, while the spilling home location C is in a different memory block that contains the memory entry 23 (the other memory entry of that memory block is not shown in FIG. 2). For this spilling order, the spilling home locations A and B can be accessed with the same base address, while the home location C must be accessed with a different base address. FIG. 2 also shows the access order (i.e., A, B, A, C, A, C) of the home locations (either for spilling or reloading).
In this case, and as can be seen from the pseudo code accessing sequence in FIG. 2, four initialization operations to the base address register are needed for the spilling order and access order shown in FIG. 2: one before the first access to A, one before the first access to C, one before the return to A, and one before the final access to C. It is also assumed here that each instruction can access only one register spilling location with a constant address, which is always true for each spilling and reloading. If, however, the spilling order of these home locations were rearranged (e.g., by putting the home locations A and C into the same local memory block), then only three initialization operations to the base address register would be needed.
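The counts in this example can be checked with a short sketch. It is a simplified model rather than the pseudo code of FIG. 2: the function name `count_base_writes`, the entry numbering starting at 0, and the rule that the base address register is written only when an access moves to a different block are assumptions made for illustration.

```python
def count_base_writes(layout, access_order, block_size=2):
    """Count base address register initializations for a spill layout.

    layout maps each symbolic register to its local memory entry index;
    entries are grouped into blocks of block_size consecutive entries.
    """
    current_block = None  # no base address has been written yet
    writes = 0
    for reg in access_order:
        block = layout[reg] // block_size
        if block != current_block:
            # Accessing a different block requires a new base address.
            writes += 1
            current_block = block
    return writes


ACCESS_ORDER = ["A", "B", "A", "C", "A", "C"]

# Layout of FIG. 2: A and B share one block, C sits in the next block.
fig2_layout = {"A": 0, "B": 1, "C": 2}
# Rearranged layout: A and C share one block, B sits in the next block.
rearranged_layout = {"A": 0, "C": 1, "B": 2}

print(count_base_writes(fig2_layout, ACCESS_ORDER))        # 4
print(count_base_writes(rearranged_layout, ACCESS_ORDER))  # 3
```

With the layout of FIG. 2 the model counts four writes to the base address register, and with A and C placed in the same block it counts three, matching the figures given above.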
Thus, there exists a need for a method and system of optimally allocating register locations in a memory during compilation of program code in order to increase the runtime performance of the compiled code.