Among the standard works on compiler construction, Aho et al, "Principles of Compiler Design", Addison-Wesley Publishing Co., copyright 1977, and Waite et al, "Compiler Construction", Springer-Verlag, copyright 1984, point out that the conversion of a computer source language such as PASCAL or FORTRAN into code executable by a target machine is done through a series of transformations. First, the source symbol string is lexically analyzed to ascertain the atomic units or words for translation, and then syntactically analyzed for ascertaining the grammatical relations among the words. The output is expressed in the form of a "parse tree". The parse tree is transformed into an intermediate language representation of the source code. Most compilers do not generate a parse tree explicitly, but form the intermediate code as the syntactic analysis takes place. Optimization is then applied to the intermediate code after which the target machine-executable or object code is generated.
Among the tasks that a compiler must perform are the allocation and assignment of computing resources so that the computation specified by a stream of source code instructions can be efficiently completed. The "resources" available include computational facilities such as an ALU, input/output, memory including registers, operating system elements, etc. The objectives of the optimization portion of the compiler are to (a) shrink the size of the code, (b) increase the speed of execution where possible, and (c) minimize costs through efficient resource allocation. A schedule of resource use or consumption pattern is then embedded in the code being compiled.
It is well known that an instruction stream can be mapped onto graphical structures and advantage taken of graph-theoretic properties. Code sequences may be analyzed by way of the graphical properties of basic blocks with respect to local optimization, and flow graphs of blocks with respect to global optimization.
A basic block is a sequence of consecutive statements. This sequence may be entered only at the beginning and when entered the statements are executed in sequence without halt or possibility of branch, except at the end thereof.
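The definition above can be illustrated by a minimal Python sketch that partitions an instruction list into basic blocks. The instruction format (tuples of an opcode and an argument, with "jmp" and "br" carrying a target index) and the function name find_basic_blocks are assumptions made only for this illustration; the sketch uses the conventional "leader" rule, under which the first instruction, every branch target, and every instruction following a branch begin a new block.

```python
def find_basic_blocks(instrs):
    """Split instrs into basic blocks, returned as lists of instruction indices.

    instrs is a list of (op, arg) tuples (a hypothetical format); for the ops
    "jmp" and "br", arg is the index of the branch target.
    """
    leaders = {0} if instrs else set()
    for i, (op, arg) in enumerate(instrs):
        if op in ("jmp", "br"):
            leaders.add(arg)            # a branch target begins a block
            if i + 1 < len(instrs):
                leaders.add(i + 1)      # so does the instruction after a branch
    starts = sorted(leaders)
    ends = starts[1:] + [len(instrs)]
    return [list(range(s, e)) for s, e in zip(starts, ends)]


program = [("set", "i"), ("add", "i"), ("br", 1), ("ret", None)]
blocks = find_basic_blocks(program)
```

Here the branch at index 2 targets index 1, so the program is split into three blocks: the initialization, the loop body ending in the branch, and the return.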
A flow graph describes the flow of control among basic blocks. The flow graph would, for example, show the looping, branching, and nesting behavior among basic blocks necessary for iterative or recursive computation.
A directed acyclic graph (DAG) of the data dependencies is a data structure for analyzing basic blocks. For instance, a=b+c is rendered by b and c as starting nodes, each connected to common node a through respective edges. A DAG is not a flow graph, although each node (or basic block) of a flow graph could be represented by a DAG.
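As a rough illustration, the following Python sketch builds such a DAG for a basic block. The statement format (target, operator, left operand, right operand) and the name build_dag are hypothetical, chosen for this example only. Operands become leaf nodes, each statement becomes an operator node with edges to its operands, and a node is reused when the same operand or the same operator-operand combination appears again.

```python
def build_dag(statements):
    """Build a data dependence DAG for one basic block.

    statements is a list of (target, op, left, right) tuples (assumed format).
    Returns (nodes, labels): nodes[i] is ("leaf", name) or (op, left_id, right_id),
    and labels maps each name to the node currently holding its value.
    """
    nodes, labels, cache = [], {}, {}

    def node_for(name):
        # Create a leaf the first time a name is read before being defined.
        if name not in labels:
            nodes.append(("leaf", name))
            labels[name] = len(nodes) - 1
        return labels[name]

    for target, op, left, right in statements:
        key = (op, node_for(left), node_for(right))
        if key not in cache:            # reuse an identical operator node
            nodes.append(key)
            cache[key] = len(nodes) - 1
        labels[target] = cache[key]
    return nodes, labels


nodes, labels = build_dag([("a", "+", "b", "c")])
```

For a statement such as a = b + c, the result has two leaf nodes for b and c and one "+" node labeled a whose edges point to both leaves.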
"Live variable analysis" refers to a set of techniques for ascertaining whether a name has a value which may be subsequently used in a computation. A name is considered "live" coming into a block if the name is either used before redefinition within a basic block, or is "live" coming out of the block and is not "redefined" within the block. Thus, after a value is computed in a register, and is presumably used within a basic block, it is not necessary to store that value if it is "dead" at the end of the block. Also, if all registers are full and another register is needed, assignment could be made to a register presently containing a "dead" value.
Conceptually, the first compiler transformation consists of mapping strings of source code onto a flow graph, each of whose nodes is a basic block and whose control and data path relationships are defined by the directed edges of the flow graph. Optimization in the allocation and assignment of resources can be considered first at the local or basic block level, and then at the global or flow graph level.
In local optimization, each basic block is treated as a separate unit and optimized without regard to its context. A data dependence graph is built for the basic block, transformed, and used to generate the final machine code. It is then discarded and the next basic block considered. A "data dependence graph" is a graph-theoretic attribute representation within a basic block. Since a basic block cannot contain cycles, the data dependence graphs of all basic blocks can be represented by DAGs. Parenthetically, a DAG is not necessarily a tree. Illustratively, if a basic block consisted of two computational statements x=u+v and y=u+w, the DAG would not be a tree although it is acyclic, since the node for u is shared by both statements. Lastly, global optimization performs global rearrangement of the flow graph and provides contextual information at the basic block boundaries.
A computer includes memory, the fastest form of which is the most expensive. A finite number of physical registers store operands for immediate use for computation and control. Computer instructions operating register to register are the fastest executing. If a register is unavailable, an intermediate result must either be stored to main memory, where the bulk of programs and data are located, or loaded from said main memory into a register when a register becomes available. Loads from memory into registers and stores from registers to memory take a substantially longer time. Thus, when evaluating either a flow graph or a basic block, one objective is to keep as many computational names or variables as possible in the registers, or to have a register available as needed.
Register allocation involves identifying the names in the software stream which should reside in registers (i.e. the number of registers needed), while assignment is the step of assigning registers to nodes following an underlying scheme, rule, or model. One allocation strategy used in the prior art was to have the assignment fixed; that is, specific types of quantities in an object program were assigned to certain registers. For instance, subroutine links could be assigned to a first register group, base addresses to a second register group, arithmetic computations to a third register group, runtime stack pointers to a fixed register, etc. The disadvantage of such fixed mapping is that register usage does not dynamically follow execution needs. This means that some registers are not used at all, are overused, or are underused.
Global register allocation relates to the observation that most programs spend most of their time in inner loops. Thus, one approach to assignment is to keep a frequently used name in a fixed register throughout a loop. Therefore, one strategy might be to assign some fixed number of registers to hold the most active names in each inner loop. The selected names may differ in different loops. Other nondedicated registers may be used to hold values local to one block. This allocation and assignment has the drawback that no given number of registers is the universally right number to make available for global register allocation.
Chaitin et al, "Register Allocation Via Coloring", Computer Languages, Vol. 6, copyright 1981, pp. 47-57, Pergamon Press Limited, and Chaitin, "Register Allocation and Spilling Via Graph Coloring", Proceedings SIGPLAN 82, Symposium on Compiler Construction, SIGPLAN Notices, copyright 1982, pp. 98-105, describe a method of global register allocation across entire procedures. In Chaitin, all registers but one are considered to be part of a uniform pool, and all computations compete on the same basis for these registers. Indeed, no register subsets are reserved.
Chaitin points out that it is intended to keep as many computations as possible in the registers, rather than in storage, since load and store instructions are more expensive than register-to-register instructions. Chaitin notes that it is the responsibility of code generation and optimization to take advantage of the unlimited number of registers, i.e. considered as a pool, allowed in the intermediate language in order to minimize the number of loads and stores in the program.
The critical observation in Chaitin is that register allocation can be analyzed as a graph-coloring problem. The coloring of a graph is an assignment of a color to each of its nodes in such a manner that if two nodes are adjacent (connected by an edge of the graph), they have different colors. The "chromatic number" of the graph is the minimal number of colors in any of its colorings. In Chaitin, register allocation utilizes the construct termed a "register interference graph". Two computations or names which reside in machine registers are said to "interfere" with each other if they are "live" simultaneously at any point in the program.
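A register interference graph in the sense described above can be sketched in Python as follows. The input format (a list of live sets, one per program point) and the name build_interference_graph are assumptions for this illustration, not taken from Chaitin. The sketch applies the definition as stated here: any two names appearing together in a live set are joined by an edge.

```python
from itertools import combinations


def build_interference_graph(live_sets):
    """Return a dict mapping each name to the set of names it interferes with.

    live_sets is a list of sets of names, one set per program point
    (an assumed input, e.g. produced by live variable analysis).
    """
    graph = {}
    for live in live_sets:
        for name in live:
            graph.setdefault(name, set())   # isolated names still get a node
        for a, b in combinations(sorted(live), 2):
            graph[a].add(b)                 # edges are undirected, so record
            graph[b].add(a)                 # the interference on both ends
    return graph


graph = build_interference_graph([{"a", "b"}, {"b", "c"}, {"c"}])
```

In this example a and c are never live at the same point, so they share no edge and could occupy the same register, whereas b interferes with both.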
Chaitin's graph-coloring method includes the steps of (a) building an interference graph from the names with reference to a specific text ordering of code; (b) ascertaining the chromatic number for the graph, and coloring (assigning registers to nodes) if the chromatic number does not exceed the number of available registers, otherwise reducing the graph by retiring a node (excising the node and its connecting edges) having the highest in/out degree; (c) repeating step (b) until the values converge, and (d) accounting for and managing the "spills" by embedding in the compiled code stream appropriate writes to and loads from memory.
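Steps (b) and (c) above can be approximated by the following Python sketch. It is a simplification under stated assumptions and not Chaitin's published procedure: because determining the chromatic number exactly is expensive, a greedy coloring in decreasing order of degree is substituted as a stand-in test, so a failure of the greedy attempt does not prove that the graph cannot be colored with k colors. The names greedy_color and allocate, and the graph format (a dict mapping each name to its set of neighbors), are hypothetical.

```python
def greedy_color(graph, k):
    """Try to color graph with k colors; return a name -> color dict or None."""
    colors = {}
    for node in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        used = {colors[nb] for nb in graph[node] if nb in colors}
        free = [c for c in range(k) if c not in used]
        if not free:
            return None                 # greedy attempt failed (heuristic only)
        colors[node] = free[0]
    return colors


def allocate(graph, k):
    """Color with k registers, retiring the highest-degree node on each failure.

    Returns (colors, spilled); spilled names must be kept in memory.
    """
    graph = {n: set(nbs) for n, nbs in graph.items()}   # work on a copy
    spilled = []
    while True:
        colors = greedy_color(graph, k)
        if colors is not None:
            return colors, spilled
        victim = max(sorted(graph), key=lambda n: len(graph[n]))
        for nb in graph.pop(victim):    # excise the node and its edges
            graph[nb].discard(victim)
        spilled.append(victim)


g = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
colors, spilled = allocate(g, 2)
```

With two registers, the triangle formed by a, b, and c cannot be colored, so the highest-degree node a is retired and the remaining names are colored. Step (d), inserting the stores and loads for the spilled names, is not shown.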