The present invention relates to digital data processing systems, and in particular to compilers used to compile programming code for execution on a computer processor.
A modern computer system typically comprises a central processing unit (CPU) and supporting hardware necessary to store, retrieve and transfer information, such as communication buses and memory. It also includes hardware necessary to communicate with the outside world, such as input/output controllers or storage controllers, and devices attached thereto such as keyboards, monitors, tape drives, disk drives, communication lines coupled to a network, etc. The CPU is the heart of the system. It executes the instructions which comprise a computer program and directs the operation of the other system components.
From the standpoint of the computer""s hardware, most systems operate in fundamentally the same manner. Processors are capable of performing a limited set of very simple operations, such as arithmetic, logical comparisons, and movement of data from one location to another. But each operation is performed very quickly. Programs which direct a computer to perform massive numbers of these simple operations give the illusion that the computer is doing something sophisticated. What is perceived by the user as a new or improved capability of a computer system is made possible by performing essentially the same set of very simple operations, but doing it much faster. Therefore continuing improvements to computer systems require that these systems be made ever faster.
In the very early history of the digital computer, computer programs which instructed the computer to perform some task were written in a form directly executable by the computer""s processor. Such programs were very difficult for a human to write, understand and maintain, even when performing relatively simple tasks. As the number and complexity of such programs grew, this method became clearly unworkable. As a result, alternate forms of creating computer programs were developed. In particular, a number of high-level languages for writing computer programs have been created.
High-level languages vary in their characteristics, but all such languages, are intended to make it easier for a human to write a program to perform some task. Typically, high-level languages represent instructions, fixed values, variables, and other constructs in a manner readily understandable to the human programmer rather than the computer. Such programs are not directly executable by the computer""s processor. In order to run on the computer, the programs must first be transformed into a form that the processor can execute.
The process of transforming a high-level language program into a processor-executable form is called compilation. In general, a compiler is a special-purpose program which transforms one representation of a computer program into another representation. Usually, this means transforming a human-readable program in a high-level language (source code) to a processor-executable form (object code). However, some compilers perform only an intermediate step in the full transformation from high-level language to executable object code.
A computer processor typically contains a fixed number of high-speed registers for the temporary storage of data. These are the work space of the processor, much as a desk might be the work space of a person. Numerical and logical operations performed by the processor are performed on data in its registers. Demand for registers is typically at a premium. An office worker can not keep on the desk every paper or other object he or she might possibly need; some things must go in drawers, others in file cabinets, still others may be put in archive boxes. Similarly, the computer processor usually keeps only a small amount of data in its registers, and stores most of it in caches, main memory, or on mass storage devices such as disk drives.
Every variable used in a computer program must at some time be placed in a processor register for execution of some instruction. Since the number of registers in a processor is usually much smaller than the number of variables in a program executing on the processor, it is not possible to simply assign a different register to each variable. The process of assigning variables to registers at various parts of the program""s instruction stream is known as register allocation. Just as it would be inefficient for our office worker to clutter the desk with infrequently used items, the performance of a computer system depends upon the efficiency of the register allocation. If registers are allocated in an inefficient manner, the computer will keep infrequently used data in its registers while it spends an inordinate amount of time retrieving data it needs from memory or remote storage locations.
Register allocation is typically performed as part of the program compilation process. Register allocation is a problem of almost infinite complexity. It is virtually impossible to discern in advance what is the best register allocation for a program, because, even laying aside the enormous number of possible permutations of register assignments, there is no foolproof way of knowing whether one allocation is better than another without knowing what data will be input to a program and what branches will be taken when the program executes. The best one can hope for is a good register allocation.
Numerous prior art register allocation schemes exist. One common method uses the concept of xe2x80x9clive rangesxe2x80x9d or xe2x80x9clifetimesxe2x80x9d of program variables. The xe2x80x9clive rangexe2x80x9d of a particular program variable is the span of instructions for which the variable contains valid data, and may be computed in a number of different ways. An interference graph is constructed of all live ranges in an instruction stream. Hardware registers are assigned to nodes (representing live ranges) in the interference graph, a process known as xe2x80x9ccoloringxe2x80x9d. This is analogous to coloring a map, where adjacent countries are given different colors, since adjacent nodes in the interference graph must be assigned different hardware registers. If a live range in the interference graph cannot be colored, it must be spilled, meaning that the variable must be stored in another location (usually memory) rather than keeping its value in a register. Since the processor can only operate on data stored in its registers, spilling a live range variable implies that the value must be loaded from memory into a register when it is needed, and stored back to memory when changed.
Spilling a live range variable requires the insertion of instructions into the instruction stream to perform the necessary stores to memory and loads from memory. These instructions are known as xe2x80x9cspill codexe2x80x9d. The presence of spill code in the instruction stream reduces the performance of the computer program, because additional instructions are added, and because each execution of a load implies that the processor must wait to receive the data.
It is to be expected that the current trend toward increased programming complexity will continue, and that the problem of efficient register allocation will increase in scope absent improved methods for assigning program variables to registers.
In accordance with the present invention, register allocation during program compilation is accomplished by determining a set of spill candidates, by evaluating a cost function for each spill candidate using a plurality of spill strategies, and by selecting the spill candidate having the lowest cost function value.
In the preferred embodiment, the set of possible spill candidates is determined by the Chaitin method of constructing an interference graph of all live ranges of symbolic registers. Live ranges (nodes) are then removed from the graph iteratively, and placed on a stack. Nodes are eligible for removal from the graph if the number of edges connecting a node to other nodes is less than the number of processor registers. Once a live range node has been removed, all its edges are removed, which may have the effect of making other nodes eligible for removal. When it is not possible to remove any more nodes, a spill candidate is selected, and removed from the graph. If additional nodes can then be removed from the graph, they are removed. The process repeats until no more live ranges remain.
In the preferred embodiment, multiple strategies for spilling are represented as varying measures of xe2x80x9cdistancexe2x80x9d N between successive mentions of a symbolic register, a spill strategy N signifying that spill code is introduced whenever the xe2x80x9cdistancexe2x80x9d occurring between two successive mentions of a symbolic register equals or exceeds N. Specifically, in the preferred embodiment the measure of distance is the number of live ranges of other symbolic registers which terminated (known as the number of xe2x80x9cdeathsxe2x80x9d) between successive mentions of the symbolic register under consideration.
In the preferred embodiment, the cost function is a summation of the number of each type of operation (such as a load or a store) introduced by the spill times the cost of the type of operation times a weighted frequency factor. This raw cost is further refined by dividing by a benefit factor representing an estimate of the benefit of removing the node from the graph. Preferably, the benefit factor is the number of edges connected to the node representing the symbolic register, known as the degree of the node.
The details of the present invention, both as to its structure and operation, can best be understood in reference to the accompanying drawings, in which like reference numerals refer to like parts, and in which: