The invention relates to computers and microprocessor architectures. Particularly, this invention relates to emulation of an instruction set architecture (ISA) using another instruction set architecture. More particularly, this invention relates to a method of and apparatus for emulating branch instructions of an instruction set architecture using instructions from another instruction set architecture.
Designing a microprocessor architecture includes providing a set of basic instructions (typically referred to as the “instruction set”) that comprises the basic building block instructions, e.g., instructions that manipulate register contents and/or move data between registers. An Instruction Set Architecture (ISA) refers to the design of the instruction set of the microprocessor architecture.
A particular ISA may be better than other ISAs in some respects, e.g., providing a wider and richer instruction set to promote easier programming, while being inferior in other respects, e.g., requiring complex hardware to support the more complex and more numerous instructions in the instruction set. Thus, selecting the most suitable ISA may be one of the most significant aspects of designing a computer architecture.
Even when a new ISA is employed to realize the benefits associated therewith (e.g., improved performance), some newer computer architectures are provided with the capability to run legacy applications that were written for a legacy ISA (and which may represent a substantial capital investment). Typically this is done by emulating the instructions of the legacy instruction set with a series of one or more instructions of the native ISA. For example, as shown in FIG. 1, an instruction of the legacy instruction set 101 (e.g., the instruction A 103), referred to hereinafter as a “macroinstruction”, is expanded into, or emulated by, a sequence of the native ISA instructions 102, referred to hereinafter as “microinstructions” (e.g., the sequence 104, or “flow”, of instructions 1-4). The microinstructions may be generated by a decoder/sequencer of the emulation hardware (not shown).
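By way of illustration only, the expansion of macroinstructions into microinstruction flows may be sketched as a table lookup. The table contents, the mnemonics (`u1`, `u2`, etc.), and the function names below are hypothetical and are not taken from FIG. 1 or from any actual ISA; only the idea that the instruction A expands into a flow of four microinstructions follows the description above.

```python
# Hypothetical expansion table: macroinstruction -> flow of microinstructions.
# All mnemonics are placeholders chosen for this sketch only.
FLOWS = {
    "A": ["u1", "u2", "u3", "u4"],  # instruction A expands into a four-instruction flow
    "B": ["u5", "u6"],              # assumed flow for a second macroinstruction
}


def expand(macro_program):
    """Return the microinstruction stream that emulates a list of macroinstructions."""
    micro_stream = []
    for macro in macro_program:
        micro_stream.extend(FLOWS[macro])
    return micro_stream


def expansion_ratio(macro_program):
    """Average number of microinstructions executed per macroinstruction."""
    return len(expand(macro_program)) / len(macro_program)
```

In this sketch, the ratio returned by `expansion_ratio` is a simple measure of the emulation overhead: the more microinstructions each flow needs, the more instructions the native hardware must execute for the same legacy program.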
Unfortunately, an emulation of the legacy ISA instructions as described above usually results in an increased number of instructions executed and/or requires elaborate and complex additional hardware due at least in part to the differences in the semantics between instructions of the legacy ISA and the native ISA.
This is particularly true for emulation of a branch instruction. Because the branch semantics (e.g., target prediction and/or branch conditions, etc.) of the legacy ISA and the native ISA may be significantly different, it is often not possible to map a macroinstruction branch to a single microinstruction. Consequently, multiple microinstructions are needed to implement a macroinstruction branch, further increasing the number of instructions that must be executed, and thus reducing the performance of the computer system.
Moreover, the branch target prediction of a macroinstruction branch, made at the time the flow of microinstructions is generated by the decoder/sequencer of the emulation mechanism, is not always accurate. For example, FIG. 2 shows a number of macroinstructions A, B, C, D, etc., 201 being emulated by the microinstructions 202. The macroinstruction B may be a branch instruction, e.g., branch to D. The branching may be unconditional, i.e., the macroinstruction D is executed following the retirement of the macroinstruction B, or it may be conditional, i.e., branch to D only when a given condition is met, and otherwise proceed to the next instruction C. A macroinstruction branch, e.g., the branch 203, is referred to hereinafter as a macrobranch.
On the other hand, the microinstruction 7 may be a branch instruction which, depending on whether a condition is met, may take one of several possible flow paths. For example, if the condition is met, instructions 10 and 11 may be executed after the microinstruction 7, before the particular flow emulating the macroinstruction B ends. If the condition is not met, the flow may instead continue on to instruction 8 before it ends. A microinstruction branch, e.g., the branch 204, is referred to hereinafter as a microbranch.
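The two flow paths of the microbranch described above may be sketched as follows. The sketch models only the portion of the flow from the microinstruction 7 onward, since the earlier microinstructions of the flow are not specified here; the function name and the integer representation of the microinstructions are assumptions made for illustration.

```python
def flow_b_tail(condition_met):
    """Microinstructions executed from the microbranch (7) to the end of the flow.

    Instruction numbers follow the FIG. 2 description; representing them as
    plain integers is an assumption of this sketch.
    """
    executed = [7]               # the microbranch itself
    if condition_met:
        executed += [10, 11]     # path taken when the condition is met
    else:
        executed += [8]          # fall-through path when the condition is not met
    return executed
```

Either path ends within the flow that emulates the macroinstruction B, so in this sketch the microbranch changes only which microinstructions are executed and does not by itself select the next macroinstruction.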
The decoder/sequencer of the native hardware generally is not able to generate the branch target of a macrobranch, e.g., the macrobranch 203, when the macroinstruction, e.g., instruction B, is decoded. This is because the target prediction semantics and/or hardware of the two ISAs are very different from each other. For example, the target of a microbranch instruction may be stored in a register, whereas the target of a macrobranch may be specified within the instruction itself. Moreover, the instruction lengths of the two ISAs may be different, making it difficult for the hardware of one ISA to determine where, in the instruction of the other ISA, the branch target is specified.
Because the decoder/sequencer is unable to calculate the correct target of a macrobranch, additional hardware must be added to ensure that the decoder/sequencer does not fill the execution pipeline of the computer system with erroneous macroinstructions fetched according to the native ISA prediction mechanism, and to allow the legacy ISA emulation fetch engine to calculate and branch to the correct target.
Furthermore, the differences in the ISAs make it difficult to share the same branch prediction hardware. Extra hardware would be required to ensure that predictions made during the execution of the native instruction set and during the execution of the emulated instruction set do not affect each other.
Thus, in order to emulate a branch instruction of one ISA using the instructions of the other ISA, new instructions and additional hardware must be added to handle the different semantics used by the other ISA, negatively impacting the physical layout size requirement and/or performance of the system.
In addition, it may be desirable to have a mechanism to facilitate microbranches that re-steer the execution path of a particular microinstruction flow independently of any macroinstruction branches. For instance, the instruction decoder/sequencer, while expanding a macroinstruction into a corresponding microinstruction flow, may require the ability to select a different flow path (e.g., at the microinstruction 7 as shown in FIG. 2) based, e.g., on the value of a bit in a register. The instruction decoder/sequencer would need this ability in order to reduce the number of microinstructions required to implement the macroinstruction.
However, because of the differences in the remedial actions required in the event of mispredictions, and in order to ensure that the microbranches do not affect the predictions made by the macrobranch prediction mechanism, additional hardware components must be added to distinguish between the microbranches and the macrobranches.
Moreover, performing these predicted microbranches requires the ability, in the event of misprediction of the target, to flush the execution pipeline and to redirect the sequencer/decoder to proceed through a different flow path. To this end, an event, e.g., a fault condition, may be inserted into the pipeline, which will cause the pipeline to be flushed if a target of a microinstruction branch is mispredicted (or will be ignored if the target is correctly predicted).
However, due to timing delays caused by detecting the misprediction, signaling the misprediction to the legacy ISA emulation control block, and then injecting the flush event back into the pipeline, it becomes necessary to add extra “padding” instructions after the branch instruction, or otherwise provide a mechanism, to avoid execution of any subsequent instructions that should not be executed in the event of misprediction. The padding instructions reduce the performance of the system.
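The cost of such padding may be illustrated with a minimal sketch. It assumes, purely for illustration, that the delay between the branch and the arrival of the flush event can be expressed as a fixed number of instruction slots (`delay_slots`), and that branches are identified by a hypothetical `br` mnemonic prefix; neither assumption comes from the description above.

```python
NOP = "nop"  # hypothetical padding instruction


def pad_after_branches(flow, delay_slots):
    """Insert `delay_slots` padding instructions after every branch in a flow.

    A branch is any instruction whose mnemonic starts with "br" (an assumption
    of this sketch).
    """
    padded = []
    for instr in flow:
        padded.append(instr)
        if instr.startswith("br"):
            padded.extend([NOP] * delay_slots)
    return padded


def padding_overhead(flow, delay_slots):
    """Number of extra instructions introduced by padding."""
    return len(pad_after_branches(flow, delay_slots)) - len(flow)
```

Under these assumptions, every branch in a flow adds `delay_slots` instructions that do no useful work, which is the performance penalty referred to above.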
Furthermore, there are times when it is desirable to restart fetching anew, independently of any branching instructions. For instance, when a retiring instruction has changed instruction fetch resources, affecting the fetching of the instructions subsequent to the retiring instruction (each referred to herein as a stale instruction), the pipeline may need to be completely flushed and refilled with new instructions in light of the resource update.
Thus, there is a need for an efficient branch mechanism that does not require substantial new hardware and/or new instructions to implement macrobranches and/or microbranches.
There is also a need for an efficient branch mechanism that causes a pipeline flush immediately upon a mispredicted branch. This is needed to avoid execution of any instructions that may be present in the pipeline, without the delay in detecting and signaling the misprediction to, and waiting for a response from, the instruction fetch control mechanism.
There is also a need for an efficient branch mechanism which ensures that a branch operation for a macrobranch does not affect the flow of microinstructions, and that a microbranch does not affect the proper fetching of macroinstructions.
There is a further need for an efficient branch mechanism that does not require an addition of complex hardware for branch predictions and/or control of macrobranches.
There is also a need for a mechanism to flush any stale instructions in the instruction pipeline whenever desired, e.g., when fetch resources are changed by an earlier retired instruction.
A method and apparatus for implementing branch instructions in a computer system are described. More particularly described are a method of, and an apparatus for, implementing a branch instruction in a computer system having an execution pipeline, comprising the steps of, and means for, setting a test condition representing the correctness of a predicted target and/or whether a branch condition is met, and flushing the execution pipeline immediately upon a failure of the test condition.
Also described is a method of executing instructions in a microprocessor having an execution pipeline, the method comprising providing an instruction set having a conditional flush instruction that flushes the execution pipeline immediately upon a failure of a predetermined condition.
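The behavior of a conditional flush instruction may be sketched with a toy software model. The model below is an assumption-laden illustration, not the described hardware: the `Pipeline` class, the `cflush` and `spec_*` mnemonics, and the representation of the test condition as a comparison of a predicted target with an actual target are all hypothetical.

```python
from collections import deque


class Pipeline:
    """Toy in-order pipeline holding fetched but not yet executed instructions."""

    def __init__(self, instructions):
        self.in_flight = deque(instructions)  # oldest instruction first
        self.executed = []
        self.flushes = 0

    def step(self, test_condition=True):
        """Execute the oldest instruction; a failed 'cflush' empties the pipeline."""
        instr = self.in_flight.popleft()
        self.executed.append(instr)
        if instr == "cflush" and not test_condition:
            self.in_flight.clear()  # younger instructions are discarded at once
            self.flushes += 1


def emulate_branch(predicted_target, actual_target):
    """Run a short flow whose speculative tail was fetched from the predicted target."""
    pipe = Pipeline(["cmp", "cflush", "spec_1", "spec_2"])
    pipe.step()  # cmp: stands in for setting the test condition
    pipe.step(test_condition=(predicted_target == actual_target))  # cflush
    while pipe.in_flight:
        pipe.step()
    return pipe.executed
```

In this model a correct prediction lets the speculatively fetched instructions execute, while a misprediction discards them as soon as the conditional flush executes, without padding instructions and without a round trip to a separate control block.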