This invention relates generally to code generation, such as the generation of native machine code from an intermediate representation, and more particularly to such code generation that is performed in one pass, based on a greedy pattern-matching finite-state-machine scheme.
Traditionally, most computer programs have been specific to the type of computer on which they are intended to run, and more""specifically, to the type of processor within the computer. For example, an executable computer program compiled from a source code for the computer program may only execute on xc3x9786 processor-based computers, such as Intel Pentium processor-based computers. Generally, this means that software developers may have to recompile their programxe2x80x94and even may have to rewrite some of the source code for their programxe2x80x94if they wish to xe2x80x9cportxe2x80x9d the program so that it can run on a different processor-based computer.
However, more recently, there has been broader interest in computer languages that are intended to allow a source code for a program to be written and compiled only once, and executed on any type of computer, and by any type of processor. Programs written in these languages run on a xe2x80x9cvirtual machine,xe2x80x9d (VM) as opposed to a specific type of machine. Such a computer language, for example, is the Java computer programming language. With these types of computer languages, source code is compiled into an intermediate representation (IR) or VM code orxe2x80x94when the code is a stream of bytesxe2x80x94byte code. For the IR or VM code to be run on a specific computer having a specific type of processor, a program on the computer either interprets IR or VM code directly, or a just-in-time (JIT) translator compiles the IR or VM code to native code on an as-needed basis for subsequent execution. In general, the compiled code runs faster than interpreted code, but takes longer to load and requires more space (e.g., memory). Thus, conventional wisdom dictates that the space/time tradeoff favors the interpreter approach where space is scarce, but favors the JIT compilers when speed is more important than size or simplicity. Interpreters typically give up a factor of 10xc3x97 in execution speed compared to JIT-translated code.
JIT compilers are examples of a class of computer programs called code generators. Code generators thus take an input, such as an IR or VM code, and based on that input, generate a corresponding output of native code that can be directly run on a given type of processor. Furthermore, another type of computer program is called a code-generator generator, which based on a specific specification or grammar, generates a code generator that can read IR or VM code, for example, according to that specification or grammar, and generate native code. The code-generator generator program may generate a code generator program in a source code such as code written in the C programming language, which must then be compiled by a further compiler to yield an executable program.
With respect to code generators, one type of code generator is referred to as a pattern-matching code generator. Pattern-matching code generators map patterns of intermediate representation operators to equivalent target instructions. More specifically, tree pattern-matching code generators generally use tree patterns and, optionally, target-machine instruction costs to select least-cost (viz., optimal) mappings via dynamic programming, which requires two tree-walk passes: a first bottom-up pass for the dynamic programming, and a second top-down pass for selecting the least-cost match. Other systems heuristically search for better target instructions to emit. In general, however, these types of code generators can be large (viz., they can take up a lot of space in the host machine), and are slow, even if their generated code is generally good (viz., optimal).
Tree-pattern-matching technologies combined with dynamic programming generally yield efficient, optimal local code generation for tree-based intermediate representations. However, all dynamic programming systems require two passes over the intermediate representation: the first pass annotating the tree with dynamic programming information, and the second pass selecting the optimal match. But, while generating optimal code from trees, these code generators do so at the cost of two passes over the intermediate representation, and thus are relatively slow. Furthermore, the second pass usually requires a top-down tree walk of the intermediate representation, which can require transforming the input (the intermediate representation) into a tree data structure. This transformed input is generally larger than the original input, and allocating and initializing the structure can significantly increase the cost of code generation.
Thus, there is a need for faster code generators that-nevertheless generate optimal or near-optimal local (native) code. For this and other reasons, then, there is a need for the present invention.
The invention relates to one-pass, greedy-pattern-matching, finite-state-machine code generation. In one embodiment of the invention, a computer-implemented method for generating local code from intermediate code first includes receiving a current element (e.g., a current instruction or literal operand) of a postfix-notated intermediate code. The method receives a current element of a postfix-notated intermediate code and matches the current element to a xe2x80x9cbase rulexe2x80x9d within a predetermined grammar for translating intermediate code to local code. The rules that comprise a tree grammar can be partitioned into xe2x80x9cbasexe2x80x9d and xe2x80x9cchainxe2x80x9d rules. Chain rules match a single non-terminal, but base rules include operators. It can be assumed without loss of generality that each base rule matches a single operator and the non-terminals that were matched by its non-literal operands.
In one embodiment, matching a base rule includes matching a most recent element to a chain rule within the grammar and applying the chain rule to the most recent element to determine whether the current element matches a given base rule. Next, the method selects and applies a matching base rule based on a predetermined criteria, such as using the first matching base rule, or using the least costly matching base rule. A local code is then generated according to the matching base rule applied to the current element. In one embodiment, this includes initially generating a local code according to the chain rule applied to the most recent element. A next element of the intermediate code is then advanced to the current element. In one embodiment, this process is repeated until the current element is past the last element of the intermediate code.
The described embodiment results in generation of local code in only one passxe2x80x94that is, once the intermediate code has been processed one time, local code has been generated such that it can be executed, without requiring a second pass of the intermediate code prior to generating local code. This process results in faster execution as compared to two-pass code generators. The described embodiment is greedy in its pattern matching in that if a base rule can be applied to the current element, it is according to a predetermined criteria (least costly, or first base rule matched)xe2x80x94thus, the element is not xe2x80x9csavedxe2x80x9d to determine if a different rule can be applied later on (once subsequent elements are known, for example). The described embodiment is a finite-state machine in that it saves only one state at any given time for application of chain rules, as opposed to theoretically infinite states in a code generator that requires the use of a stack. That is, to determine if a base rule can be applied to a current element, chain rules in one embodiment are examined to determine whether they can be applied to a most recent element (such that a given base rule is then applicable to the current element). The greedy pattern matching and finite state machine aspects of embodiments of the invention also provide for relatively fast code generation.