This invention relates to a method for executing a flag generating instruction and a subsequent instruction. This method is particularly useful for rapidly executing flag generating CISC instructions such as Intel x86 instructions.
When many computer languages that are in widespread use today were developed, memory was a very expensive commodity. In addition, during the development of the early microprocessors, instructions took a long time to fetch compared with the time needed to execute them. Thus, the language developers needed to ensure that instructions could be stored compactly. As a result, they developed computer languages that contained a very rich vocabulary of computer operations. A microprocessor that executes these operations is known as a complex instruction set computer (CISC). While CISC microprocessors can rapidly handle CISC instructions, such microprocessors are often more complex, more expensive, and slower than a microprocessor designed to handle simpler instructions.
Today, because memory is relatively inexpensive, modern computer languages emphasize speed of execution rather than compactness of code. Thus, more modern reduced instruction set computer (RISC) languages utilize a smaller number of simple instructions that may be rapidly handled by a RISC microprocessor. One of the primary reasons that RISC instructions may be handled faster than CISC instructions is that RISC instructions are easier to schedule and execute in parallel than CISC instructions. A microprocessor that can execute instructions in parallel is known as a superscalar microprocessor.
Most microprocessor instructions generate at least one result that is utilized by subsequent instructions. If a subsequent instruction utilizes a result produced by a previous instruction, the subsequent instruction is said to have a data dependency on the previous instruction. Thus, if a superscalar microprocessor attempts to execute the two instructions at the same time, the execution of the subsequent instruction must wait until the previous instruction has produced its result. Typically, any instruction must be delayed until all of its inputs have been produced. This requirement places a significant performance limitation on superscalar microprocessors.
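The data dependency described above can be illustrated with a minimal Python sketch. The `Instr` record, the `depends_on` helper, and the register names X4 through X8 are hypothetical and are introduced only for illustration; the sketch models each instruction as one destination and a set of source operands, and is not a description of any particular scheduler.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record: one destination, several sources."""
    dest: str
    sources: Tuple[str, ...]


def depends_on(later: Instr, earlier: Instr) -> bool:
    """Return True if `later` reads the result written by `earlier`."""
    return earlier.dest in later.sources


# X1 = X2 + X3, followed by an instruction that consumes X1.
first = Instr(dest="X1", sources=("X2", "X3"))
second = Instr(dest="X4", sources=("X1", "X5"))
# An unrelated instruction that reads neither X1 nor any other result of `first`.
third = Instr(dest="X6", sources=("X7", "X8"))

print(depends_on(second, first))  # second must wait for first
print(depends_on(third, first))   # third may issue alongside first
```

In this sketch, `second` would have to be delayed until `first` has produced X1, whereas `third` has no such constraint and could be executed in parallel with `first`.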
The above-described execution, known as out-of-order execution, is a conventional method to significantly increase the performance of modern microprocessors. It is known in the art that RISC instructions can be more easily executed out-of-order than their CISC counterparts.
In an effort to ensure compatibility with older CISC software, the vast majority of computer programmers still use CISC instructions today. However, in an effort to take advantage of the RISC performance enhancements, some microprocessor designers break down lengthy CISC instructions into simpler operations that more closely resemble RISC instructions.
When a CISC arithmetic instruction, such as ADD X1=X2+X3, executes, it generates a result equal to X2+X3. X2 and X3 are known as operands. In addition, the arithmetic instruction generates several flags. CISC logical instructions, such as AND, OR, and XOR, also generate similar flags.
A flag is a bit that may indicate the condition of a microprocessor. A flag may also control the operation of the microprocessor. Flags are typically grouped together into a flag register. For example, a microprocessor could group arithmetic flags into the AFLAGS register as shown in FIG. 1(a). Descriptions of these arithmetic flags are shown in FIG. 1(b).
The arithmetic portion of the above CISC arithmetic instruction efficiently translates into a single RISC instruction or at least a relatively small number of RISC instructions. However, the flag generation portion conventionally translates into many RISC instructions. Numerous instructions are required because of the complexity of flag generation. As an example, the previously discussed ADD instruction could modify the contents of the sign, zero, carry, auxiliary carry, parity, and overflow flags. As a result of the large number of RISC instructions necessary to generate the above flags, the performance of the microprocessor decreases dramatically. Thus, translating CISC flag generating arithmetic instructions into RISC instructions is not an optimal solution.
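To make the cost of flag generation concrete, the following Python sketch computes the six flags named above for an ADD in software. It assumes x86-style flag semantics (parity taken over the low byte of the result, auxiliary carry taken out of bit 3) and a 32-bit operand width; the function name `add_with_flags` and the flag abbreviations used as dictionary keys are illustrative choices, not part of the described method.

```python
from typing import Dict, Tuple


def add_with_flags(a: int, b: int, width: int = 32) -> Tuple[int, Dict[str, int]]:
    """Add two unsigned `width`-bit values and derive x86-style flags.

    Illustrative sketch only: every flag is computed as a separate step.
    """
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    a &= mask
    b &= mask

    full = a + b            # unbounded sum, keeps the carry out
    result = full & mask    # the arithmetic portion of the instruction

    flags = {
        # Carry: the sum did not fit in `width` bits.
        "CF": int(full > mask),
        # Parity: low byte of the result has an even number of 1 bits.
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),
        # Auxiliary carry: carry out of bit 3 into bit 4.
        "AF": int(((a ^ b ^ result) & 0x10) != 0),
        # Zero: the result is zero.
        "ZF": int(result == 0),
        # Sign: most significant bit of the result.
        "SF": int((result & sign_bit) != 0),
        # Overflow: both operands differ in sign from the result.
        "OF": int(((a ^ result) & (b ^ result) & sign_bit) != 0),
    }
    return result, flags


result, flags = add_with_flags(0x7FFFFFFF, 1)
print(hex(result), flags)
```

The addition itself is a single operation, while each flag requires its own masks, comparisons, or bit counts. This mirrors why a straightforward translation of the flag generation portion expands into many simple instructions.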
There is a need for a more efficient method to execute flag generating arithmetic and logic instructions. Such a method should avoid the expense, complexity, and slow speed of CISC microprocessors. In addition, such a method should take advantage of RISC performance enhancements while avoiding generating a large number of RISC instructions that decrease performance.