Migration to new microprocessor architectures requires emulation of foreign instruction sets to preserve existing software investment. For example, many personal computers (PCs) have been based on the so-called "x86" instruction set, which began with the 8086 microprocessor developed by the Intel Corporation of Santa Clara, Calif. Because of the large number of x86-based PCs purchased, much software was written for the x86 instruction set. However, it would be desirable to execute this existing software in computers based on other types of microprocessors.
Two known approaches to solving this problem are translation and interpretive execution. Translation converts opcodes from the foreign architecture into a sequence of native operations. Translation may be static or dynamic. Static translation is limited to the information that can be gathered statically about the program, and thus dynamically loaded libraries and self-modifying programs pose a problem for the static translation approach. Dynamic translation generates native code "on the fly" while the program is executing. In either approach, the key phases of instruction set emulation are the dispatch phase (corresponding to the fetch and decode phases of a microprocessor) and the execute phase. The dispatch phase creates most of the overhead associated with the emulation process. Overhead in the execute phase comes from architectural mismatch between the foreign and the native architectures.
Interpretive execution uses a fast dispatch mechanism to decode the foreign instruction set opcode and execute a native routine to perform the equivalent function. The interpreter decodes each guest instruction's opcode and uses it to dispatch the semantic routine for that instruction. A state-machine-like mechanism is used to control the (possibly multi-byte) decoding. Translation, by contrast, maps a sequence of guest instructions into native code. During execution, the simulated guest program counter is mapped to the native program counter of the location containing the translated instruction sequence. The mapping is usually performed with a hash table.
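The two mechanisms described above can be sketched as follows. This is a minimal illustration only, assuming a hypothetical three-opcode guest instruction set; the opcode values and the names `SEMANTIC_ROUTINES`, `TRANSLATION_CACHE`, `interpret`, and `run_translated` are invented for the example, and a Python function stands in for a block of translated native code.

```python
# Hypothetical toy guest ISA: an opcode byte followed by at most one operand byte.
OP_HALT, OP_LOADI, OP_ADDI = 0x00, 0x01, 0x02

def sem_loadi(state, code):
    # Semantic routine: load the immediate operand into the accumulator.
    state["acc"] = code[state["pc"] + 1]
    state["pc"] += 2

def sem_addi(state, code):
    # Semantic routine: add the immediate operand (8-bit wraparound).
    state["acc"] = (state["acc"] + code[state["pc"] + 1]) & 0xFF
    state["pc"] += 2

# Interpretive execution: the opcode selects the semantic routine.
SEMANTIC_ROUTINES = {OP_LOADI: sem_loadi, OP_ADDI: sem_addi}

def interpret(code, state):
    """Decode and dispatch one guest instruction at a time; return the decode count."""
    decoded = 0
    while code[state["pc"]] != OP_HALT:
        SEMANTIC_ROUTINES[code[state["pc"]]](state, code)
        decoded += 1
    return decoded

def translated_block_at_0(state):
    # Stand-in for native code produced by translating "LOADI 7; ADDI 5".
    state["acc"] = (7 + 5) & 0xFF
    state["pc"] = 4  # guest address of the instruction following the block

# Translation: a hash table maps the guest program counter to translated code.
TRANSLATION_CACHE = {0: translated_block_at_0}

def run_translated(code, state):
    """Execute by looking up each guest PC in the translation cache."""
    while code[state["pc"]] != OP_HALT:
        TRANSLATION_CACHE[state["pc"]](state)

PROGRAM = bytes([OP_LOADI, 7, OP_ADDI, 5, OP_HALT])
```

Under these assumptions, both `interpret(PROGRAM, state)` and `run_translated(PROGRAM, state)` leave the same guest state; the interpreter decodes two instructions, while the translated path performs a single hash lookup for the whole block.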
Interpretive execution suffers the overhead of decoding each instruction again every time it is encountered. Translation avoids this overhead, because the instruction is decoded once, at translation time, and may then be executed many times. Furthermore, optimization can be performed at translation time, resulting in more efficient code. However, translation incurs code size expansion overhead.
A better solution is to combine interpretive execution with translation. This combined approach uses interpretive execution for low-frequency instructions and translates into native code those instruction sequences that take up most of the execution time. The combined approach retains the low code-size overhead of interpretation while allowing for the speed improvements of translation. The key problem with the combined approach is the transition between interpretive and translated execution: the interpreter must recognize when it reaches the first instruction of a block that has been translated into native code. The usual solution is to introduce a new opcode that triggers this transition. Introducing a new opcode requires a change to the executable program, which entails problems for shared executables and requires operating system support. Furthermore, the combined approach adds overhead for interpreting the new opcode.
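The new-opcode transition described above can be sketched as follows. This is an illustrative sketch only, assuming a hypothetical toy guest instruction set; the opcode `OP_NATIVE`, the table `NATIVE_BLOCKS`, and the function names are invented for the example, and a Python function stands in for translated native code.

```python
# Hypothetical toy guest ISA; OP_NATIVE is the added "enter translated code" opcode.
OP_HALT, OP_ADDI, OP_DEC, OP_JNZ, OP_NATIVE = 0x00, 0x02, 0x03, 0x04, 0xFF

def native_loop(state):
    # Stand-in for the translation of the hot loop "ADDI 5; DEC; JNZ 0".
    while True:
        state["acc"] = (state["acc"] + 5) & 0xFF
        state["cnt"] = (state["cnt"] - 1) & 0xFF
        if state["cnt"] == 0:
            break
    state["pc"] = 5  # guest address just past the loop

def run(code, state, native_blocks):
    """Interpret guest code; OP_NATIVE hands control to a translated block.
    Returns the number of dispatches the interpreter performed."""
    dispatched = 0
    while True:
        pc = state["pc"]
        op = code[pc]
        dispatched += 1
        if op == OP_HALT:
            return dispatched
        if op == OP_NATIVE:
            native_blocks[pc](state)  # transition to translated execution
        elif op == OP_ADDI:
            state["acc"] = (state["acc"] + code[pc + 1]) & 0xFF
            state["pc"] = pc + 2
        elif op == OP_DEC:
            state["cnt"] = (state["cnt"] - 1) & 0xFF
            state["pc"] = pc + 1
        elif op == OP_JNZ:
            state["pc"] = code[pc + 1] if state["cnt"] else pc + 2
        else:
            raise ValueError(f"unknown opcode {op:#x} at {pc}")

PROGRAM = bytes([OP_ADDI, 5, OP_DEC, OP_JNZ, 0, OP_HALT])

# Patching the first byte of the hot block changes the executable image itself,
# which is the drawback noted in the text for shared executables.
PATCHED = bytearray(PROGRAM)
PATCHED[0] = OP_NATIVE
NATIVE_BLOCKS = {0: native_loop}
```

With `cnt` initially 4, `run(PROGRAM, state, {})` dispatches every loop instruction on every iteration, whereas `run(PATCHED, state, NATIVE_BLOCKS)` dispatches only the new opcode and the final halt. Both runs end in the same guest state. Note that the patched image no longer holds the original opcode at address 0; a practical system would need to preserve it somewhere.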
Another solution is to use "turbo" sequences of machine idioms. Machine idioms are sequences of guest machine instructions that occur frequently. An example of such a sequence is a tight loop that polls the status of an input/output (I/O) port. Such idioms have strict timing requirements that cannot be met by regular interpretation. The turbo sequence recognition approach expands the interpreter's decoding routine to recognize an extended instruction set that includes the idioms. However, turbo sequence recognition is restricted to the set of idioms that is known in advance. Idiom sequences that depend on program execution cannot be recognized in advance.
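Idiom recognition of this kind might look like the following sketch. It assumes a hypothetical toy guest instruction set and uses a countdown delay loop as the idiom rather than a real I/O polling loop; the opcode values, the `IDIOMS` table, and `fast_countdown` are invented for illustration.

```python
# Hypothetical toy guest ISA. OP_JNZ_BACK n jumps back n bytes from its own
# address when the counter is non-zero.
OP_HALT, OP_DEC, OP_JNZ_BACK = 0x00, 0x03, 0x05

def fast_countdown(state):
    # Fused routine for the idiom "DEC; JNZ_BACK 1", a loop that decrements
    # the 8-bit counter until it reaches zero.
    state["cnt"] = 0
    state["pc"] += 3

# The idiom set is fixed in advance: byte pattern -> fused native routine.
IDIOMS = {bytes([OP_DEC, OP_JNZ_BACK, 1]): fast_countdown}

def run(code, state, idioms):
    """Interpret guest code, first trying to match a known idiom at each PC.
    Returns the number of dispatches performed."""
    dispatched = 0
    while True:
        pc = state["pc"]
        dispatched += 1
        for pattern, routine in idioms.items():
            if code.startswith(pattern, pc):
                routine(state)  # extended "instruction" recognized
                break
        else:
            op = code[pc]
            if op == OP_HALT:
                return dispatched
            elif op == OP_DEC:
                state["cnt"] = (state["cnt"] - 1) & 0xFF
                state["pc"] = pc + 1
            elif op == OP_JNZ_BACK:
                state["pc"] = pc - code[pc + 1] if state["cnt"] else pc + 2
            else:
                raise ValueError(f"unknown opcode {op:#x} at {pc}")

PROGRAM = bytes([OP_DEC, OP_JNZ_BACK, 1, OP_HALT])
```

Calling `run(PROGRAM, state, IDIOMS)` replaces hundreds of dispatches with one when the counter starts at a large value, while `run(PROGRAM, state, {})` interprets every iteration. The limitation noted in the text is visible here: only byte patterns already present in `IDIOMS` are accelerated, so a loop that becomes hot only at run time gains nothing.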
What is needed, then, is a method for improving on these known techniques of emulation so that new microprocessors may be able to run existing software. The present invention provides such a method and a corresponding data processor, whose features and advantages will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings.