This invention relates generally to intermediate languages, and more particularly to inferring operand types within such intermediate languages.
Intermediate language-type models for programming languages have become increasingly popular. In an intermediate language model, source code is generally compiled into a desirably substantially platform-independent intermediate language. When the code is to be run on a particular platform, an execution engine on that platform then interprets the intermediate language or compiles it to native code understandable by the platform. An example of a system that uses an intermediate language is the Java virtual machine.
As an example of native code, it is noted that processors such as x86-type processors known in the art generally require a separate instruction for an operation as the operation is applied to each different data type. For example, an "add" operation will usually have a separate "add integer" instruction for integer data types, and an "add real" instruction for real (i.e., floating point) data types. Other data types for which separate instructions may be required include "short" integers, "long" integers, etc.
Execution speed and code size are important considerations in the usability of code written in any programming language in general, and in an intermediate language in particular. These two metrics in general are at odds with one another. For example, because intermediate language code generally is compiled to native code as it is being executed, its execution speed is usually slower than that of comparable programs already pre-compiled into native code. On the other hand, intermediate language code is also more likely to be stored in small consumer electronics devices and more likely to be transmitted over the Internet than more traditional computer programs. This makes code size a more important metric for the usability of intermediate language code than for native computer programs, which are usually stored, for example, on voluminous CD-ROMs and hard disk drives.
Like most computer languages, intermediate languages have instruction sets, with each instruction having a corresponding "opcode" that identifies the instruction. For expressiveness and ease of programming, a large number of instructions, and therefore corresponding opcodes, is desirable. However, a single byte can distinguish at most 256 values, so an instruction set with more than 256 instructions needs at least one additional byte to identify each opcode uniquely.
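The byte-count arithmetic above can be sketched as follows. This is only an illustration: the function name `opcode_bytes` is hypothetical, and it assumes a fixed-width encoding in which every opcode occupies the same whole number of bytes, which the text does not specify.

```python
def opcode_bytes(instruction_count: int) -> int:
    """Minimum whole bytes needed to give every instruction a unique opcode.

    Assumes a fixed-width encoding (an assumption of this sketch).
    """
    if instruction_count < 1:
        raise ValueError("instruction_count must be positive")
    # Opcodes are numbered from 0, so the largest value is instruction_count - 1.
    bits = max(1, (instruction_count - 1).bit_length())
    # Round the bit count up to a whole number of bytes.
    return (bits + 7) // 8


if __name__ == "__main__":
    for count in (200, 256, 257):
        print(count, "instructions ->", opcode_bytes(count), "byte(s) per opcode")
```

Under that assumption, 256 instructions still fit in one byte, while the 257th instruction forces every opcode to two bytes.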
Having more than 256 instructions, however, while allowing for a richer instruction set, is disadvantageous when execution speed and code size are considered. Having each instruction take up two bytes instead of one byte increases the size of the resulting code. Furthermore, the extra size implies more memory accesses (and potentially more page faults), so the code generally takes longer to process than code with shorter instructions. There is a need, therefore, for a robust instruction set that nevertheless provides for the execution speed and code size advantages that one-byte opcodes provide. For these and other reasons, there is a need for the present invention.
The invention relates to inferring operand types within an intermediate language. In one embodiment, a computer-implemented method first inputs an intermediate language code that includes type-indefinite opcodes. The method transforms the input code into a second stream of opcodes, in which the type of each type-indefinite opcode has been inferred from context. Finally, the method generates native code from the typed opcode stream.
In one embodiment, for example, a program already in intermediate language code may have an "add" instruction, to add two numbers like 4 and 5, or 4.5 and 5.5. In the former case, both of these numbers are integers, while in the latter case, both are real numbers. Therefore, one embodiment of the invention would note that in the first case the add instruction is adding two integers, and would resolve that instruction to a specific "add integer" instruction; in the latter case, the embodiment would note that the add instruction is adding two real numbers, and would resolve the instruction to a specific "add real" instruction, which is a different instruction than the "add integer" instruction. When generating native code, which is the specific code that is executed by a computer's processor, for example, the method would thus generate a corresponding "add integer" native opcode for the typed "add integer" instruction, and a corresponding "add real" native opcode for the typed "add real" instruction.
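The resolution just described can be illustrated with a minimal sketch. It assumes a stack-based intermediate language with only two hypothetical instructions, `push` and `add`; the typed opcode names (`add.int`, `add.real`) and the native mnemonics (`ADD_I`, `ADD_R`, and so on) are invented for illustration and are not taken from the invention or from any real instruction set. Rejecting mixed operand types is likewise an assumption of the sketch, since the text only discusses operands of matching type.

```python
def infer_types(code):
    """Rewrite each type-indefinite 'add' as 'add.int' or 'add.real'.

    The operand types are inferred from context by tracking the types
    of the values that would be on the evaluation stack.
    """
    type_stack = []  # types of the values currently on the stack
    typed = []       # the second, typed opcode stream
    for instr in code:
        op = instr[0]
        if op == "push":
            value = instr[1]
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise TypeError(f"unsupported operand: {value!r}")
            kind = "int" if isinstance(value, int) else "real"
            type_stack.append(kind)
            typed.append((f"push.{kind}", value))
        elif op == "add":
            right = type_stack.pop()
            left = type_stack.pop()
            if left != right:
                # Assumption of this sketch: mixed operand types are rejected.
                raise TypeError(f"cannot add {left} and {right}")
            type_stack.append(left)  # the result has the operands' type
            typed.append((f"add.{left}",))
        else:
            raise ValueError(f"unknown opcode: {op!r}")
    return typed


# Hypothetical native mnemonics, one per typed opcode.
NATIVE = {
    "push.int": "PUSH_I",
    "push.real": "PUSH_R",
    "add.int": "ADD_I",
    "add.real": "ADD_R",
}


def to_native(typed_code):
    """Map each typed opcode to its (made-up) native mnemonic."""
    return [NATIVE[instr[0]] for instr in typed_code]


if __name__ == "__main__":
    integer_program = [("push", 4), ("push", 5), ("add",)]
    real_program = [("push", 4.5), ("push", 5.5), ("add",)]
    print(to_native(infer_types(integer_program)))
    print(to_native(infer_types(real_program)))
```

The same untyped `add` opcode appears in both input programs; only the typed stream and the generated native mnemonics differ.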
The invention provides for advantages not found in the prior art. For example, the invention allows for a robust instruction set that still has a relatively small total number of opcodes. Rather than having an "add" instruction for each type of operand (e.g., an "add floating point (real)" instruction, an "add short (integer)" instruction, an "add long (integer)" instruction, an "add integer" instruction, etc.), an embodiment of the invention instead only needs a single "add" instruction, since the specific type of this instruction is later resolved by the invention. An instruction set can therefore still be robust, while nevertheless using only a single byte to identify each opcode, thus ensuring the speed and size advantages that result from using a single byte.
The invention includes computer-implemented methods, machine-readable media, computerized systems, devices and computers of varying scopes. Other aspects, embodiments and advantages of the invention, beyond those described here, will become apparent by reading the detailed description and with reference to the drawings.