The invention relates to generating flags. More particularly, the invention relates to methods and apparatus for emulating arithmetic status flags specified in an IA-32 ISA on an IA-64 machine.
Many of today's highest performance computers implement a Reduced Instruction Set Computer (RISC) Instruction Set Architecture (ISA). A vast amount of today's software, however, is written for the IA-32 ISA, a Complex Instruction Set Computer (CISC) architecture. In order to run IA-32 software on a RISC machine, the RISC machine must emulate the instructions of the CISC ISA. This emulation is accomplished by converting each CISC macroinstruction into a sequence of one or more RISC microinstructions. To meet the performance requirements of emulation mode, often-executed macroinstructions must be converted into the fewest microinstructions possible. Examples of often-executed macroinstructions are add, subtract, and conditional branch. In the IA-32 ISA, these instructions either produce or consume arithmetic flags (EFLAGS). Thus, the problem is how to efficiently emulate these IA-32 instructions on a RISC machine that does not have native support for arithmetic flag generation and consumption.
In the IA-32 architecture, certain instructions (e.g., ADD and SUB) produce arithmetic flags that are implicitly written into bits of the IA-32 EFLAGS register. Prior art RISC processors convert each of these producing macroinstructions into a single microinstruction that requires two cycles to execute: the first cycle produces the arithmetic result, and the second cycle produces the arithmetic flags. Also in the IA-32 architecture, certain instructions (e.g., Jump if Condition is Met, Jcc) consume or test bits of the EFLAGS register. Prior art RISC processors have converted these consuming macroinstructions into microinstruction sequences that may include a single instruction capable of testing a bit of EFLAGS in a single cycle. Thus, these prior art processors require a total of three cycles to produce and consume EFLAGS.
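By way of illustration only, the producer and consumer roles described above may be sketched in software. The following Python sketch models a 32-bit IA-32 ADD that eagerly computes all six arithmetic flags, and a JZ-style consumer that tests one flag bit. The bit positions follow the IA-32 EFLAGS layout; the function names `add32_with_flags` and `jz_taken` are hypothetical and are not taken from this document.

```python
MASK32 = 0xFFFFFFFF

# IA-32 EFLAGS bit positions for the arithmetic status flags.
CF = 1 << 0   # carry
PF = 1 << 2   # parity of the low byte of the result
AF = 1 << 4   # auxiliary carry (carry out of bit 3)
ZF = 1 << 6   # zero
SF = 1 << 7   # sign
OF = 1 << 11  # signed overflow


def add32_with_flags(a, b):
    """Model a 32-bit ADD that produces the result and all flags eagerly."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32
    flags = 0
    if full > MASK32:
        flags |= CF
    if bin(result & 0xFF).count("1") % 2 == 0:
        flags |= PF
    if (a ^ b ^ result) & 0x10:
        flags |= AF
    if result == 0:
        flags |= ZF
    if result & 0x80000000:
        flags |= SF
    # Overflow: operands share a sign and the result's sign differs.
    if (~(a ^ b) & (a ^ result)) & 0x80000000:
        flags |= OF
    return result, flags


def jz_taken(flags):
    """Model a consumer (JZ/JE): the branch is taken when ZF is set."""
    return bool(flags & ZF)
```

In this model every ADD pays for all six flags even when no later instruction tests them, which is the cost the second execution cycle represents in the prior art described above.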
Unfortunately, by producing and consuming flags in this manner, the prior art has these drawbacks:
1) Taking an extra cycle to generate flags for a basic instruction such as ADD degrades performance if a consuming instruction immediately follows the producing instruction. This is because the consuming instruction will be required to stall for one cycle to wait for the producer to generate the flags.
2) Extra complexity is required to detect the above-mentioned condition and generate the necessary pipeline stall.
3) Extra complexity is required to bypass the flag results if the consumer is following the producer by 1-3 cycles. This bypass hardware is typically a critical part of the implementation that affects overall speed and area.
4) If the IA-64 machine uses multiple ALUs to emulate the IA-32 ISA, then each ALU may be required to have additional hardware to implicitly generate these flags.
A method consistent with the present invention generates arithmetic flags. The method includes executing an instruction via an execution unit to produce a result; performing, once the result is obtained, partial computation of one or more of the flags based on the result; and storing the partial computation of the one or more flags. A final flag is obtained upon completing the computation of the one or more flags.
An apparatus consistent with the present invention generates arithmetic flags. The apparatus includes an execution unit to execute an instruction and produce a result. The execution unit performs partial computation on the result to produce an intermediate format of the result. The intermediate format is stored in an eflags register. A consuming instruction decodes the intermediate format and produces a final result for the flags.
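The division of work between producer and consumer summarized above may be illustrated by a minimal software sketch. The intermediate format used here (the result, the carry out, and the operand sign bits) is an assumption chosen only for illustration; this summary does not specify the actual encoding, and the names `PartialFlags`, `add_producer`, and the `consume_*` helpers are hypothetical.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


@dataclass
class PartialFlags:
    """Assumed intermediate format held in the eflags register."""
    result: int      # 32-bit arithmetic result
    carry_out: bool  # carry out of bit 31, available alongside the sum
    sign_a: bool     # sign bit of the first operand
    sign_b: bool     # sign bit of the second operand


def add_producer(a, b):
    """Producer: compute the result and store only a partial flag computation."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32
    partial = PartialFlags(result, full > MASK32,
                           bool(a & SIGN32), bool(b & SIGN32))
    return result, partial


# Consumers: each decodes the intermediate format into the one flag it tests.
def consume_zf(p):
    return p.result == 0


def consume_cf(p):
    return p.carry_out


def consume_sf(p):
    return bool(p.result & SIGN32)


def consume_of(p):
    sign_r = bool(p.result & SIGN32)
    return p.sign_a == p.sign_b and sign_r != p.sign_a
```

Under these assumptions the producer does no flag-specific work beyond recording values it already has, and a consumer such as a conditional branch completes only the computation for the flag it actually tests.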
Those skilled in the art will appreciate these and other advantages and benefits of various embodiments of the invention upon reading the following detailed description of a preferred embodiment with reference to the below-listed drawings.