This invention relates to optimizing computer code, and more particularly to a method of register allocation in an optimizing compiler or the like.
A compiler usually operates by accepting as an input computer language in one form, e.g., source code in a high level language, and producing as an output computer language of another form, usually object code for a particular target architecture. The objective, of course, is for the high level language to be as easy as possible for the programmer to use in expressing the algorithms, and yet for the object code to be as fast and efficient as possible at runtime. Alternatively, the compiler may operate as a translator for converting assembly code written for one machine to object code for another machine, as set forth in copending applications Ser. No. 666,196, filed Mar. 7, 1991 by Richard L. Sites, for "Automatic Flowgraph Generation for Program Analysis and Translation", now pending, or Ser. No. 666,083, filed Mar. 7, 1991 by Thomas R. Benson, for "Use of Stack Depth to Identify Architecture and Calling Standard Dependencies in Machine Code", now pending. In either event, whether operating as a conventional compiler or as a code translator, the input code is first converted to an intermediate representation by performing a lexical and syntactic analysis, producing a stream of n-tuples or like data structures, where each tuple represents in intermediate language a primitive operation. A symbol table is also generated for all references to variables, labels, subroutines, etc. A flow graph is generated using the tuple stream and the symbol table, where the program is broken into blocks, with each block being a section of linear code with an entry at the beginning and exits (branch, etc.) at the end. The blocks are connected in the flow graph to form a representation of the program as written.
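The partitioning of a tuple stream into blocks can be illustrated by the following minimal sketch. It is an assumption-laden illustration, not the intermediate language of any particular compiler: the tuple shape `(op, arg)`, the operation names `label`, `jump`, `branch`, and `ret`, and the function name `build_flow_graph` are all hypothetical.

```python
def build_flow_graph(tuples):
    """Split a stream of (op, arg) tuples into blocks and connect them.

    Hypothetical ops: 'label' (arg = label name), 'jump' (unconditional,
    arg = target label), 'branch' (conditional, arg = target label),
    'ret' (no successor). Any other op is treated as straight-line code.
    Returns (blocks, edges) where edges maps a block index to the
    indices of its successor blocks.
    """
    if not tuples:
        return [], {}

    # A block starts at the first tuple, at every label, and at the
    # tuple following any transfer of control.
    leaders = {0}
    for i, (op, _arg) in enumerate(tuples):
        if op == "label":
            leaders.add(i)
        elif op in ("jump", "branch", "ret") and i + 1 < len(tuples):
            leaders.add(i + 1)

    starts = sorted(leaders)
    blocks = []
    for n, start in enumerate(starts):
        end = starts[n + 1] if n + 1 < len(starts) else len(tuples)
        blocks.append(tuples[start:end])

    # Map each label to the block that begins with it.
    label_block = {blk[0][1]: b for b, blk in enumerate(blocks)
                   if blk[0][0] == "label"}

    edges = {}
    for b, blk in enumerate(blocks):
        op, arg = blk[-1]
        successors = []
        if op in ("jump", "branch"):
            successors.append(label_block[arg])
        # Everything except an unconditional jump or a return can also
        # fall through into the next block.
        if op not in ("jump", "ret") and b + 1 < len(blocks):
            successors.append(b + 1)
        edges[b] = successors
    return blocks, edges


stream = [
    ("assign", "i"),
    ("label", "loop"),
    ("add", "i"),
    ("branch", "loop"),
    ("ret", None),
]
blocks, edges = build_flow_graph(stream)
```

In this example the stream splits into three blocks: the initial assignment, the loop body ending in the conditional branch, and the return. The loop block has an edge back to itself and an edge to the return block.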
This intermediate language expression of the program is updated, annotated, and rearranged by the compiler, using various optimization techniques, to generate an improved representation of the program. A code generator then produces object code from the optimized intermediate representation. At some point in this process, either before or after generation of the object code, actual register and memory references are substituted for the generic references used in the source code.
Most modern computers contain a form of high performance memory elements, called registers, that need to be used effectively to achieve high performance at runtime. The process of choosing language elements to allocate to registers and the data movement required to use them is called "register allocation." Register allocation has a major impact on the ultimate quality and performance of the code. A poor allocation can degrade both code size and runtime performance. The performance penalty caused by using a memory reference instead of a register reference, even for a processor using hierarchical memory with high speed cache, is perhaps several cycles, on average. The object, then, is to allocate the register usage so that a maximum amount of the code involves manipulation of operands in registers and so that data movement between registers and lower levels of the memory hierarchy is minimized. However, finding a truly optimal solution has been proven to be computationally intractable. Two general approaches to global register allocation are the bin packing method and the graph coloring method. Register allocation by graph coloring was described by Chaitin et al. in Computer Languages, Vol. 6, pp. 47-57, and in U.S. Pat. No. 4,571,678 to Chaitin for "Register Allocation and Spilling via Graph Coloring". While graph coloring uses an n-squared algorithm to find a "good" solution (registers allocated for all quantities that need them) if one exists, the bin packing method uses a linear algorithm which runs faster but may miss some "good" solutions. It is the object of the present invention to obtain some register allocation solutions achievable with graph coloring, but using a method requiring less compiler execution time.
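For orientation, the graph coloring approach can be sketched as follows. This is a simplified illustration in the general simplify-then-select style commonly associated with graph coloring allocators, not a reproduction of the procedure in the cited references. The function name `color_graph`, the dictionary representation of the interference graph, and the highest-degree spill choice are assumptions made for the example.

```python
def color_graph(interference, k):
    """Assign one of k registers to each name, spilling when necessary.

    interference: dict mapping each name to the set of names that are
    live at the same time. Assumed symmetric, with every name present
    as a key. Returns (assignment, spilled).
    """
    graph = {name: set(adj) for name, adj in interference.items()}
    stack, spilled = [], []

    # Simplify: repeatedly remove a node with fewer than k neighbours.
    # Such a node is guaranteed a free register when it is put back.
    while graph:
        node = next((n for n in graph if len(graph[n]) < k), None)
        if node is None:
            # Assumed heuristic: spill the node with the most neighbours.
            node = max(graph, key=lambda n: len(graph[n]))
            spilled.append(node)
        else:
            stack.append(node)
        for neighbour in graph.pop(node):
            graph[neighbour].discard(node)

    # Select: put nodes back in reverse order, giving each the lowest
    # register not used by an already-assigned neighbour.
    assignment = {}
    while stack:
        node = stack.pop()
        used = {assignment[nb] for nb in interference[node]
                if nb in assignment}
        assignment[node] = next(r for r in range(k) if r not in used)
    return assignment, spilled


# a, b, c are mutually live at the same time; d overlaps only a.
example = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
three_regs, spilled_three = color_graph(example, 3)
two_regs, spilled_two = color_graph(example, 2)
```

With three registers every name receives a register; with two, the mutually interfering triple cannot all be held and one name is spilled. The repeated scans of the graph are what make this style of allocator more expensive than a single linear pass.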
Allocation with lifetime holes as described herein is an improvement to one instance of the bin packing algorithm which allows a better register allocation by finding some of the "good" solutions not found by the standard bin packing algorithm, while retaining good compile time performance.
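The idea can be illustrated with a small sketch, offered only as an assumption about the general shape of such an allocator and not as the method claimed. Each lifetime is modelled as a sorted list of half-open `(start, end)` segments over instruction positions, and the gaps between segments are the "holes". The names `overlaps` and `bin_pack` and the `use_holes` flag are hypothetical, and the overlap test is written for clarity, not for linear running time.

```python
def overlaps(segments_a, segments_b):
    """True if any half-open segment in one list intersects one in the other."""
    return any(s1 < e2 and s2 < e1
               for s1, e1 in segments_a
               for s2, e2 in segments_b)


def bin_pack(lifetimes, num_regs, use_holes):
    """Place each lifetime in the first register (bin) where it fits.

    lifetimes: dict mapping a name to a sorted list of (start, end)
    segments. With use_holes=False each lifetime is flattened to one
    interval from its first start to its last end, as in a standard
    bin packing pass. Names that fit nowhere map to None.
    """
    bins = [[] for _ in range(num_regs)]  # segments occupying each register
    assignment = {}
    # Consider candidates in order of where their lifetime begins.
    for name, segs in sorted(lifetimes.items(), key=lambda kv: kv[1][0][0]):
        if not use_holes:
            segs = [(segs[0][0], segs[-1][1])]
        for reg, occupied in enumerate(bins):
            if not overlaps(segs, occupied):
                occupied.extend(segs)
                assignment[name] = reg
                break
        else:
            assignment[name] = None  # no register available
    return assignment


# x is live over 0..2 and again over 6..8, with a hole in between.
lifetimes = {
    "x": [(0, 2), (6, 8)],
    "y": [(3, 5)],
    "z": [(1, 7)],
}
without_holes = bin_pack(lifetimes, num_regs=2, use_holes=False)
with_holes = bin_pack(lifetimes, num_regs=2, use_holes=True)
```

When lifetimes are flattened, `y` finds no free register because `x` appears to occupy its register for the whole range 0..8. When holes are respected, `y` fits in the gap in `x`'s lifetime and all three names are allocated with two registers.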