The present disclosure relates generally to data processing and, more particularly, to retrieval of expressions.
Most computer programs need to process large numbers of different data items. However, most CPUs can only perform operations on a small fixed number of “slots” called registers. Even on machines that support memory operands, register access is considerably faster than memory access. Therefore, it is more efficient to load data items to be processed into registers and unload them to memory when they are no longer needed.
For a computer to execute a program, the program must be compiled into a machine-readable form. In a compiler, source code is translated into a machine-readable, executable program. An example of a compiler is shown in FIG. 1. The compiler comprises a program which reads statements, i.e., source code written in a human-readable programming language, such as C++, and translates them into a machine-readable, executable program. The compiler includes four main components: a parser 10, an optimizer 20, a register allocator 30, and a code generator 40. The parser 10 translates the source code into an intermediate language (IL), which is understood by the compiler. The optimizer 20 performs various optimizing operations on the intermediate language to improve the execution performance of the compiled code. The register allocator 30 rewrites the symbolic registers generated in the intermediate language program to hardware registers defined on the target machine (computer). The code generator 40 translates the instructions in the intermediate language into executable instructions for the target machine and produces an executable program.
The register allocator 30 multiplexes a number of target program variables into a small number of CPU registers. The goal is to keep as many operands as possible in registers to maximize the speed of execution of the software program. Register allocation can happen over a basic block of a function within a program (local register allocation) or over a whole function/procedure (global register allocation) of a program.
Register allocation presents challenges because the number of variables in a typical program is much larger than the number of registers in a processor. So, the contents of some variables have to be saved or “spilled” into memory. The costs of spilling may be minimized by spilling the least frequently used variables first. However, it is not easy to know which variables will be used the least. Also, hardware and operating systems may impose restrictions on the usage of some registers.
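As a rough illustration of the spilling heuristic described above, the following Python sketch ranks variables by a static use count and spills the least used ones. The function name `choose_spills` and the flat list of uses are hypothetical; a static count is only a proxy for actual run-time frequency, which, as noted, is not easy to know in advance.

```python
from collections import Counter


def choose_spills(uses, num_registers):
    """Return (kept, spilled) for a list of variable uses.

    Assumption: the static use count stands in for how often a
    variable is actually used at run time.
    """
    counts = Counter(uses)
    # Most-used variables first; ties are broken by name so the
    # result is deterministic.
    ranked = sorted(counts, key=lambda v: (-counts[v], v))
    return ranked[:num_registers], ranked[num_registers:]


# Hypothetical sequence of variable uses and a machine with 2 registers.
uses = ["a", "b", "a", "c", "a", "b", "d"]
kept, spilled = choose_spills(uses, 2)
```

In this example, the two most frequently used variables stay in registers and the remaining two are candidates for spilling.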
It is typical for compiler optimizers to perform expression commoning (coalescing) early in the compilation process. The benefits of commoning are two-fold. First, the number of expressions that need to be processed by the optimizer is reduced, lowering compilation overhead (desirable in dynamic compilers, such as a Just in Time (JIT) compiler). Second, commoning removes redundant computation from the resulting compiled code.
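Expression commoning can be sketched in Python as follows. This is a minimal illustration, not the method of any particular compiler; the tuple-based instruction format and the name `common_expressions` are assumptions made for the example, and the sketch assumes each destination is assigned exactly once and operands are never redefined.

```python
def common_expressions(instructions):
    """Replace repeated computations with a copy of the first result.

    instructions: list of (dest, op, arg1, arg2) tuples (hypothetical
    format). Assumes single assignment, so a previously computed
    value is still valid wherever it is reused.
    """
    available = {}  # (op, arg1, arg2) -> dest already holding that value
    result = []
    for dest, op, a, b in instructions:
        key = (op, a, b)
        if key in available:
            # Same expression seen before: reuse its result.
            result.append((dest, "copy", available[key], None))
        else:
            result.append((dest, op, a, b))
            available[key] = dest
    return result


code = [
    ("t1", "add", "x", "y"),
    ("t2", "mul", "t1", "z"),
    ("t3", "add", "x", "y"),  # redundant with t1
]
commoned = common_expressions(code)
```

Here the second `add` of `x` and `y` becomes a copy of `t1`, so the optimizer has one fewer expression to process and the compiled code performs one fewer addition.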
Despite the advantages of expression commoning, it can have negative effects on register allocation. In modern processing architectures, differences in data access time for values kept in registers as compared to those in memory may be quite high. Thus, compilers need efficient register allocation strategies to improve the runtime performance of code. In expression commoning, expressions are held in registers for longer durations in the compiled code. This may result in a greater overlap of expressions, hence greater competition for computational resources, such as registers. If the number of co-existing expressions exceeds the number of physical registers on the device, so-called “register spilling” occurs, i.e., the compiler has to transfer or “spill” some expressions from registers to memory. Spilling uses stack-local memory to cache expressions until their next use. Thus, overhead for storing and reloading expressions is added to the compiled code.
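The effect of longer lifetimes on register demand can be illustrated with a small Python sketch that computes the peak number of simultaneously live expressions from their live intervals. The interval values and the name `max_pressure` are hypothetical and chosen only to show how a commoned expression that stays live longer raises the peak.

```python
def max_pressure(intervals):
    """Return the peak number of simultaneously live expressions.

    intervals: dict mapping a name to (start, end), the inclusive
    instruction indices over which that expression is live.
    """
    events = []
    for start, end in intervals.values():
        events.append((start, 1))      # becomes live at start
        events.append((end + 1, -1))   # no longer live after end
    # At equal positions, -1 sorts before +1, so a register freed
    # at a point can be reused by an expression starting there.
    events.sort()
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


# Without commoning: x + y is computed twice, each copy short-lived.
separate = {"t1": (0, 1), "u": (2, 4), "v": (3, 5), "t1_again": (6, 7)}
# With commoning: one copy of x + y is held from first to last use.
commoned = {"t1": (0, 7), "u": (2, 4), "v": (3, 5)}

pressure_separate = max_pressure(separate)
pressure_commoned = max_pressure(commoned)
```

On a hypothetical machine with two registers, the first arrangement fits without spilling, whereas the commoned arrangement needs three registers at its peak and would force a spill.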
Register rematerialization is a technique that has been used to improve register allocation by improving “spill” code generation. Rematerialization selectively reverses commoning by breaking up an expression into several copies. Rematerialization saves time by recomputing a value instead of loading it from memory. Thus, it is generally used when expressions can be easily re-constructed/recomputed at a lower cost than storing and retrieving them from memory. The typical use is for constant expressions that are generally cheap to construct.
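The trade-off described above can be expressed as a simple cost comparison. The following Python sketch is illustrative only; the cost units, parameter names, and the function `should_rematerialize` are assumptions rather than values from any real target machine.

```python
def should_rematerialize(recompute_cost, store_cost, load_cost, num_reloads):
    """Decide whether recomputing a value beats spilling it.

    All costs are hypothetical cycle counts. Spilling pays for one
    store plus one load per later use; rematerialization pays the
    recompute cost at each later use instead.
    """
    spill_cost = store_cost + load_cost * num_reloads
    remat_cost = recompute_cost * num_reloads
    return remat_cost < spill_cost


# A constant that takes 1 cycle to rebuild and is needed twice more.
cheap_constant = should_rematerialize(
    recompute_cost=1, store_cost=3, load_cost=3, num_reloads=2
)
# An expensive expression that takes 10 cycles to recompute.
costly_expression = should_rematerialize(
    recompute_cost=10, store_cost=3, load_cost=3, num_reloads=2
)
```

Under these assumed costs, the constant is worth rematerializing, while the expensive expression is better stored to memory and reloaded.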
Previous approaches to rematerialization have included identifying easy-to-recompute values that are known to be constant for a given duration in the program, for example immediate values or target addresses. The recomputation involved in rematerialization adds overhead and slows down compilation.
Another approach begins by spilling aggressively to facilitate subsequent register allocation. This approach requires heavy overhead and slows down compilation, making it unsuitable for a dynamic compiler, such as a JIT compiler.
Thus, there is a need for a technique for selectively retrieving expressions from memory in a manner that is effective and efficient.