1. Field of the Invention
The present invention relates to the translation of program code for a digital computer, and more particularly to the translation of program code from one language to another in cases where the location of all of the program code to be translated is not known until the program code is actually executed.
2. Description of the Background Art
Computer language translation programs are well known for translating high-level languages such as Pascal, Fortran, Cobol, PL/I or C into machine language. For these languages, programs are coded in an English-like style. A language translation program called a compiler reads the high level language program (called the source program) and translates it into a machine language program (called the object program).
One major advantage of high level languages, besides the ease with which they can express algorithms, is their machine independence. They hide the specifics of the hardware machine and instruction set. Nonetheless, there are a number of applications where machine language programming is desirable. To increase the execution speed of a program, it is often desirable to write code for repetitively executed procedures in machine language to minimize the number of machine cycles that are needed to execute the procedures. Machine language programming is also required in many computer systems to directly control specific hardware features of a particular computer. For example, the parts of an operating system that manage memory and input/output devices are often written in machine language.
Machine language programs are usually written in assembly language code instead of binary code. Assembly language permits the programmer to specify machine operations using symbolic names for memory locations and instructions. A program called an assembler translates the assembly language program into binary machine code. The assembler does all of the work of remembering the values of symbols and the addresses of data elements. However, unlike a statement in a high level language, which may compile into many machine instructions, each assembly language instruction corresponds to exactly one machine instruction.
More recently there has arisen a need to translate machine language for one kind of computer to machine language for another kind of computer. This need has arisen due to rapid advances in computer hardware that have made new computer architectures more cost effective. In particular, for more than a decade most high performance computers for general purpose applications used a "complex instruction set architecture" (CISC) characterized by having a large number of instructions in the instruction set, often including variable-length instructions and memory-to-memory instructions with complex memory accessing modes. The VAX™ instruction set is a primary example of CISC and employs instructions having one to two byte opcodes plus from zero to six operand specifiers, where each operand specifier is from one byte to eighteen bytes in length. The size of the operand specifier depends upon the addressing mode, size of displacement (byte, word or longword), etc. The first byte of the operand specifier describes the addressing mode for that operand, while the opcode defines the number of operands: zero to six. The opcode itself, however, does not always determine the total length of the instruction, because many opcodes can be used with operand specifiers of different lengths. Another characteristic of the VAX™ instruction set is the use of byte or byte string memory references, in addition to quadword or longword references; that is, a memory reference may be of a length variable from one byte to multiple words, including unaligned byte references.
The CISC architecture provided compactness of code, and also made assembly language programming easier. When central processing units (CPUs) were much faster than memory, it was advantageous to do more work per instruction, because otherwise the CPU would spend an inordinate amount of time waiting for the memory to deliver instructions. Recently, however, advances in memory speed as well as techniques such as on-chip cache and hierarchical cache have eliminated the primary advantages of the CISC architecture. Therefore the selection of the instruction architecture is now dictated by the required complexity of the CPU for maximizing execution speed at reasonable cost. These considerations indicate that a reduced instruction set architecture (RISC) has superior performance and cost advantages.
Reduced instruction set or RISC processors are characterized by a smaller number of instructions which are simple to decode, and by the requirement that all arithmetic/logic operations are performed register-to-register. Another feature is that complex memory accesses are not permitted; all memory accesses are register load/store operations, and there are only a small number of relatively simple addressing modes, i.e., only a few ways of specifying operand addresses. Instructions are of only one length, and memory accesses are of a standard data width, usually aligned. Instruction execution is of the direct hardwired type, as distinct from microcoding. There is a fixed instruction cycle time, and the instructions are defined to be relatively simple so that they all execute in one short cycle (on average, since pipelining will spread the actual execution over several cycles).
Unfortunately there is a vast amount of computer software already written for established instruction architectures, and much of that software includes machine language programming that did not originate from the compilation of a high-level language. In these cases the software cannot be "ported" to the new computer architecture by the usual method of re-compiling the source code using a compiler written for the new instruction architecture.
In some cases, assembly language code exists for the machine language programming of existing computer software. Therefore it should be possible to write a translator program for translating each assembly language instruction into one or more machine instructions in the new instruction architecture that perform the same basic function. The practicality of such a direct translation is dependent upon the compatibility of the new instruction architecture. For translating CISC code including VAX™ instructions to RISC code, for example, the practicality of the translation is improved significantly by innovations in the RISC CPU hardware and the RISC instruction set, as further described in Richard L. Sites and Richard T. Witek, "Branch Prediction in High-Performance Processor," U.S. application Ser. No. 07/547,589 filed Jun. 29, 1990, herein incorporated by reference.
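The idea of translating one instruction into one or more instructions of the new architecture can be illustrated with a minimal sketch. The instruction names (`ADDMEM`, `MOVMEM`, `LOAD`, `STORE`, `ADD`), the register names, and the tuple encoding below are hypothetical and are chosen only for illustration; they are not VAX™ or any real RISC encoding.

```python
# Hypothetical toy instruction sets, for illustration only.
# Source (CISC-like): ("ADDMEM", src_addr, dst_addr) means mem[dst] += mem[src]
#                     ("MOVMEM", src_addr, dst_addr) means mem[dst] = mem[src]
# Target (RISC-like): register-to-register arithmetic, memory touched only
#                     by LOAD and STORE.

def translate(instr):
    """Expand one memory-to-memory instruction into load/store-style code."""
    op, src, dst = instr
    if op == "ADDMEM":
        return [
            ("LOAD", "r1", src),    # r1 = mem[src]
            ("LOAD", "r2", dst),    # r2 = mem[dst]
            ("ADD", "r2", "r1"),    # r2 = r2 + r1
            ("STORE", "r2", dst),   # mem[dst] = r2
        ]
    if op == "MOVMEM":
        return [
            ("LOAD", "r1", src),    # r1 = mem[src]
            ("STORE", "r1", dst),   # mem[dst] = r1
        ]
    raise ValueError(f"no translation rule for {op!r}")

def run_risc(code, memory):
    """Execute translated code against a dict-based memory and return it."""
    regs = {}
    for ins in code:
        if ins[0] == "LOAD":
            regs[ins[1]] = memory[ins[2]]
        elif ins[0] == "STORE":
            memory[ins[2]] = regs[ins[1]]
        elif ins[0] == "ADD":
            regs[ins[1]] += regs[ins[2]]
        else:
            raise ValueError(f"unknown target instruction {ins[0]!r}")
    return memory

memory = {100: 7, 104: 5}
run_risc(translate(("ADDMEM", 100, 104)), memory)   # memory[104] becomes 12
```

In this sketch a single memory-to-memory add expands into four target instructions, which is the kind of one-to-many expansion the paragraph above describes.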
In many cases existing computer software includes binary machine language code for which there does not exist a complete or coherent set of high-level or assembly language source code. This presents a very difficult problem of locating all of the binary machine code to translate. In the usual case a portion of the binary machine code in a program cannot be found prior to execution time because the code includes at least one execution transfer instruction, such as a "Jump" or "Call" instruction, having a computed destination address. At execution time, the destination address is computed, and execution is transferred from the instruction to the "missing" code.
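The difficulty with computed destination addresses can be sketched as follows. This is a minimal illustration, assuming a hypothetical toy machine in which every instruction occupies one address; the opcode names (`LOAD`, `ADD`, `JUMP`, `JUMP_REG`, `HALT`) and the `discover_code` helper are invented for the example and do not describe any particular translator.

```python
# Toy program: address -> (mnemonic, operand). Hypothetical encoding.
PROGRAM = {
    0: ("LOAD", 5),           # ordinary instruction, falls through to 1
    1: ("JUMP", 3),           # direct jump: target is known before execution
    2: ("ADD", 1),            # not reached by any statically known path
    3: ("JUMP_REG", None),    # computed jump: target known only at run time
    4: ("ADD", 2),            # could be a target of the computed jump
    5: ("HALT", None),
}

def discover_code(program, entry):
    """Return the addresses reachable by following statically known flow."""
    found = set()
    worklist = [entry]
    while worklist:
        addr = worklist.pop()
        if addr in found or addr not in program:
            continue
        found.add(addr)
        op, arg = program[addr]
        if op == "JUMP":
            worklist.append(arg)        # follow the known target
        elif op in ("JUMP_REG", "HALT"):
            pass                        # no statically known successor
        else:
            worklist.append(addr + 1)   # fall through to the next address
    return found

found = discover_code(PROGRAM, 0)
missing = sorted(set(PROGRAM) - found)
```

Here `found` is `{0, 1, 3}` and `missing` is `[2, 4, 5]`. The search alone cannot tell whether the missing addresses hold dead code, data, or code that the computed jump at address 3 will reach at execution time.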
In more unusual cases, some of the binary machine code in a program is not created until execution time. These unusual cases are typically due to poor programming techniques, although code is often created at execution time for security purposes, for example, as part of a "license check" routine. The "license check" routine, for example, writes a sequence of instructions to a scratch memory area and then executes the sequence of instructions. To circumvent the licensing routine, one must discern the mode of operation of the routine from the sequence of instructions written to the scratch area. But the sequence of instructions is absent from the usual print-out or dump of the program, because the sequence of instructions does not exist until execution time.
When not all of the machine code in an original program can be located, the code has instead been interpreted at execution time. The translation process of the interpreter is similar to that of an assembly language translator, but interpretation at execution time is about two orders of magnitude slower than execution of translated code, because a multiplicity of instructions in the interpreter program must be executed to interpret each machine instruction. To permit the use of an incomplete translation, an interpreter program, a copy of the original program, and address conversion information are provided together with the translated code when porting the original program to a CPU using the new instruction architecture. When execution of the translated code reaches a point corresponding to an execution transfer in the original program to untranslated code, the CPU transfers execution instead to the interpreter program to interpret the untranslated code in the original program. The interpreter successively translates each untranslated machine instruction into one or more machine instructions for the new architecture and executes them. The interpreter uses the address conversion information to transfer execution back to the translated program at an appropriate time. The presence of untranslated code, however, has a substantial impact on performance unless almost all of the instructions can be located and translated prior to execution time.
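The interplay of translated code, interpreter, and address conversion information described above can be sketched on a hypothetical toy machine with a single accumulator. All names here (`ORIGINAL`, `ADDRESS_MAP`, `block_0`, `block_20`, `interpret_one`, `run`) are assumptions made for illustration. In particular, consulting the address map at every instruction boundary is a simplification; the text says only that the interpreter transfers back "at an appropriate time".

```python
# Copy of the original program: address -> (op, arg). Hypothetical encoding.
ORIGINAL = {
    0: ("ADDI", 10),          # acc += 10
    1: ("JUMP_ACC", None),    # computed jump: pc = acc
    10: ("ADDI", 5),          # assumed not found by the translator
    11: ("JUMP", 20),
    20: ("ADDI", 1),
    21: ("HALT", None),
}

# "Translated code": Python functions standing in for native blocks.
# Each returns the next ORIGINAL address to execute, or None to halt.
def block_0(state):           # translation of addresses 0-1
    state["acc"] += 10
    return state["acc"]       # computed jump target, as an original address

def block_20(state):          # translation of addresses 20-21
    state["acc"] += 1
    return None

# Address conversion information: original address -> translated entry point.
ADDRESS_MAP = {0: block_0, 20: block_20}

def interpret_one(state, pc):
    """Interpret one untranslated instruction; return the next address."""
    op, arg = ORIGINAL[pc]
    if op == "ADDI":
        state["acc"] += arg
        return pc + 1
    if op == "JUMP":
        return arg
    if op == "JUMP_ACC":
        return state["acc"]
    if op == "HALT":
        return None
    raise ValueError(f"unknown op {op!r}")

def run(entry=0):
    state, trace, pc = {"acc": 0}, [], entry
    while pc is not None:
        block = ADDRESS_MAP.get(pc)
        if block is not None:             # translated code exists: use it
            trace.append(("translated", pc))
            pc = block(state)
        else:                             # fall back to the interpreter
            trace.append(("interpreted", pc))
            pc = interpret_one(state, pc)
    return state["acc"], trace

result, trace = run()
```

In this sketch the computed jump in `block_0` lands on address 10, which has no entry in `ADDRESS_MAP`, so addresses 10 and 11 are interpreted; execution then returns to translated code at address 20. Every interpreted step pays for a lookup and a dispatch, which hints at why untranslated code is costly.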