1. Field of the Invention
This invention relates to the field of microprocessors and, more particularly, to instruction translation mechanisms within microprocessors.
2. Description of the Relevant Art
Computer systems employ one or more microprocessors, and often employ digital signal processors (DSPs). The DSPs are typically included within multimedia devices such as sound cards, speech recognition cards, video capture cards, etc. The DSPs function as coprocessors, performing complex and repetitive mathematical computations demanded by multimedia devices and other signal processing applications more efficiently than general purpose microprocessors. Microprocessors are typically optimized for performing integer operations upon values stored within a main memory of a computer system. While DSPs perform many of the multimedia functions, the microprocessor manages the operation of the computer system.
Digital signal processors include execution units which comprise one or more arithmetic/logic units (ALUs) coupled to hardware multipliers which implement complex mathematical algorithms in a pipelined manner. The DSP instruction set primarily comprises DSP-type instructions (i.e., instructions optimized for the performance of complex mathematical operations) and also includes a small number of non-DSP instructions. The non-DSP instructions are in many ways similar to instructions executed by microprocessors, and are necessary for allowing the DSP to function independently of the microprocessor.
The DSP is typically optimized for mathematical algorithms such as correlation, convolution, finite impulse response (FIR) filters, infinite impulse response (IIR) filters, Fast Fourier Transforms (FFTs), matrix computations, and inner products, among other operations. Implementations of these mathematical algorithms generally comprise long sequences of systematic arithmetic/multiplicative operations. These operations are interrupted on various occasions by decision-type commands. In general, the DSP sequences are a repetition of a very small set of instructions that are executed 70% to 90% of the time. The remaining 10% to 30% of the instructions are primarily boolean/decision operations. An exemplary DSP is the ADSP 2171 available from Analog Devices, Inc. of Norwood, Mass.
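As an illustration only, the following Python sketch shows the shape of one such algorithm, a direct-form FIR filter. The function name fir_filter and its arguments are hypothetical and are not taken from any particular DSP; the point is that the inner loop is a repeated multiply-accumulate, interrupted only by an occasional decision-type operation (here, a boundary check).

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * samples[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, tap in enumerate(taps):
            # Decision-type operation: skip terms before the first sample.
            if n - k >= 0:
                # Arithmetic/multiplicative operation: multiply-accumulate.
                acc += tap * samples[n - k]
        out.append(acc)
    return out


# Two-tap moving average over a short ramp (illustrative input only).
smoothed = fir_filter([1, 2, 3, 4], [0.5, 0.5])
```

In this sketch the multiply-accumulate line accounts for nearly all of the work, which is the kind of small, repeated instruction set that the text above describes.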
As used herein, the term “instruction set” refers to a plurality of instructions defined by a particular microprocessor or digital signal processor architecture. The instructions are differentiated from one another via particular encodings of the bits used to form the instructions. In other words, each instruction within the instruction set may be uniquely identified from other instructions within the instruction set via the particular encoding. A pair of instructions from different instruction sets may have the same encoding of bits, even if the instructions specify dissimilar operations. Additionally, instruction sets may specify different encoding schemes. For example, one instruction set may specify that the operation code (or opcode), which uniquely identifies the instruction within the instruction set, be placed in the most significant bit positions of the instruction. Another instruction set may specify that the opcode be embedded within the instructions. Still further, the number and size of available registers and other operands may vary from instruction set to instruction set.
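The difference between encoding schemes may be sketched as follows. Both 16-bit formats, the field positions, and the opcode tables SET_A and SET_B are assumptions invented for this illustration; they do not correspond to the x86 instruction set or to any actual DSP.

```python
def opcode_msb(word):
    """Hypothetical set A: 4-bit opcode in the most significant bits (15..12)."""
    return (word >> 12) & 0xF


def opcode_embedded(word):
    """Hypothetical set B: 4-bit opcode embedded in bits 7..4 of the word."""
    return (word >> 4) & 0xF


# Invented opcode tables; each table is meaningful only within its own set.
SET_A = {0x1: "add", 0x2: "load"}
SET_B = {0x1: "mac", 0x3: "shift"}

word = 0x1234                          # one bit pattern, two interpretations
as_set_a = SET_A[opcode_msb(word)]     # opcode field is 0x1
as_set_b = SET_B[opcode_embedded(word)]  # opcode field is 0x3
```

Under these assumed formats the same 16 bits decode to dissimilar operations, which is why an instruction is identified only relative to the instruction set in which it is coded.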
An instruction sequence comprising a plurality of instructions coded in a particular order is referred to herein as a code sequence. A code sequence which represents a larger function (such as a code sequence which, when executed, performs a fast Fourier transform) is referred to as a routine.
Unfortunately, many routines which perform complex mathematical operations are coded in the x86 instruction set. Such mathematical routines often may be more efficiently performed by a DSP. Microprocessors often execute instructions from the x86 instruction set, due to its widespread acceptance in the computer industry. This widespread acceptance also explains why many complex mathematical routines are coded in the x86 instruction set. Conversely, DSPs employ instruction sets which are optimized for mathematical operations common to signal processing. Because the DSP instruction set is optimized for performing mathematical operations, it is desirable to detect when a routine may be more efficiently executed in a DSP and to route such a routine to the DSP for execution.
The problems outlined above are in large part solved by a microprocessor in accordance with the present invention. The microprocessor includes an instruction translation unit and a storage control unit. The instruction translation unit scans the instructions to be executed by the microprocessor. The instructions are coded in the instruction set of a CPU core included within the microprocessor. The instruction translation unit detects code sequences which may be more efficiently executed in a DSP core included within the microprocessor, and translates detected code sequences into one or more DSP instructions. Advantageously, the microprocessor may execute the code sequences more efficiently. Performance of the microprocessor upon computer programs including the code sequences may be increased due to the efficient code execution.
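The scan-and-translate behavior described above may be sketched, purely for illustration, as a pattern substitution over a list of instruction mnemonics. The pattern ("mul", "add"), the replacement "dsp.mac", and the function name translate are hypothetical; the detection logic of an actual instruction translation unit is not specified here.

```python
# Hypothetical CPU-core code sequence that a DSP core could execute as a
# single multiply-accumulate instruction.
PATTERN = ("mul", "add")
DSP_REPLACEMENT = "dsp.mac"


def translate(code):
    """Scan CPU-core instructions and replace each detected pattern."""
    out = []
    i = 0
    while i < len(code):
        if tuple(code[i:i + len(PATTERN)]) == PATTERN:
            out.append(DSP_REPLACEMENT)   # one DSP instruction replaces the pair
            i += len(PATTERN)
        else:
            out.append(code[i])           # left in the CPU core's instruction set
            i += 1
    return out


translated = translate(["load", "mul", "add", "store"])
```

Instructions outside the detected pattern pass through unchanged, so the result is a mixed sequence containing instructions from both instruction sets.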
The instruction translation unit conveys the translated code sequences to a storage control unit. The storage control unit stores the translated code sequences along with the addresses of the original code sequences. As instructions are fetched, the storage control unit is searched. If a translated code sequence is stored for the instructions being fetched, the translated code sequence is substituted for the original code sequence. Advantageously, a code sequence may be translated once and the stored translation used upon subsequent fetches of the code sequence. Particularly in cases where the instruction translation mechanism occupies numerous clock cycles, performance of the microprocessor may be increased. A large portion of the computer program may be scanned, or the translation cycles may be bypassed in the instruction processing pipeline, depending upon the embodiment.
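A minimal software model of this translate-once behavior is sketched below. The class names StorageControlUnit and Fetcher, the stand-in toy_translate function, and the address 0x1000 are all hypothetical, and the model simplifies matters by storing the result for every fetched sequence rather than only for sequences that were actually translated.

```python
class StorageControlUnit:
    """Hypothetical model: original code address -> stored translation."""

    def __init__(self):
        self._entries = {}

    def store(self, address, translated):
        self._entries[address] = translated

    def lookup(self, address):
        return self._entries.get(address)   # None on a miss


class Fetcher:
    def __init__(self, memory, translate):
        self.memory = memory                # address -> original code sequence
        self.translate = translate          # stand-in for the translation unit
        self.scu = StorageControlUnit()
        self.translations_performed = 0

    def fetch(self, address):
        stored = self.scu.lookup(address)
        if stored is not None:
            return stored                   # substitute the stored translation
        self.translations_performed += 1    # translation cost is paid here only
        translated = self.translate(self.memory[address])
        self.scu.store(address, translated)
        return translated


def toy_translate(code):
    """Invented rule: a mul/add pair becomes one DSP instruction."""
    return ["dsp.mac"] if code == ["mul", "add"] else list(code)


fetcher = Fetcher({0x1000: ["mul", "add"]}, toy_translate)
first = fetcher.fetch(0x1000)    # miss: translated, then stored
second = fetcher.fetch(0x1000)   # hit: stored translation substituted
```

The counter translations_performed stays at one after both fetches, which reflects the benefit claimed in the text when translation occupies numerous clock cycles.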
Broadly speaking, the present invention contemplates a microprocessor comprising an instruction translation circuit and a storage control unit. The instruction translation circuit is configured to translate a first plurality of instructions coded in a first instruction set into at least one instruction coded in a second instruction set. Coupled to receive the instruction from the second instruction set, the storage control unit is configured to cause storage of the instruction such that, upon execution of a code sequence including the first plurality of instructions, the instruction is substituted for the first plurality of instructions.
The present invention further contemplates a method of executing instructions in a microprocessor. A first plurality of instructions from a first instruction set is translated into at least one instruction from a second instruction set. The first plurality of instructions define an operation which is efficiently performed via execution in the second instruction set. A code sequence including the instruction and a second plurality of instructions coded in the first instruction set is executed in a first execution core and a second execution core within the microprocessor. The first execution core is configured to execute instructions from the first instruction set and the second execution core is configured to execute instructions from the second instruction set. The first execution core thereby executes the second plurality of instructions and the second execution core thereby executes the instruction from the second instruction set. The instruction from the second instruction set is stored via a storage control unit within the microprocessor, such that the instruction is executed in lieu of the first plurality of instructions upon execution of the code sequence.
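The division of work between the two execution cores may be pictured with the following sketch. It assumes, only for illustration, that instructions from the second instruction set carry a "dsp." prefix; the function name dispatch is likewise hypothetical, and no ordering or synchronization between the cores is modeled.

```python
def dispatch(code_sequence):
    """Split a mixed code sequence between two hypothetical execution cores."""
    first_core, second_core = [], []
    for instr in code_sequence:
        if instr.startswith("dsp."):
            second_core.append(instr)   # instruction from the second instruction set
        else:
            first_core.append(instr)    # instruction from the first instruction set
    return first_core, second_core


cpu_work, dsp_work = dispatch(["load", "dsp.mac", "store"])
```

Here the first execution core receives the remaining instructions coded in the first instruction set, and the second execution core receives the substituted instruction.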