1. Technical Field
The present invention relates in general to a method and system for data processing and, in particular, to a method and apparatus for computing condition code bits. Still more particularly, the present invention relates to a method and apparatus for computing less than (LT), greater than (GT), and equal to (EQ) condition code bits concurrent with the execution of an instruction.
2. Description of the Related Art
A state-of-the-art superscalar processor can comprise, for example, an instruction cache for storing instructions, an instruction buffer for temporarily storing instructions fetched from the instruction cache for execution, one or more execution units for executing sequential instructions, a branch processing unit (BPU) for executing branch instructions, a dispatch unit for dispatching sequential instructions from the instruction buffer to particular execution units, and a completion buffer for temporarily storing sequential instructions that have finished execution but have not yet completed.
Branch instructions executed by the BPU of the superscalar processor can be classified as either conditional or unconditional. Unconditional branch instructions change the flow of program execution from a sequential execution path to a specified target execution path and do not depend upon a condition supplied by the occurrence of an event. Thus, the branch specified by an unconditional branch instruction is always taken. In contrast, conditional branch instructions are branch instructions for which the indicated branch in program flow may be taken or not taken depending upon a condition within the processor, for example, the state of specified condition register bits or the value of a counter.
Conditional branch instructions can be further classified as either resolved or unresolved, based upon whether or not the condition upon which the branch depends is available when the conditional branch instruction is evaluated by the BPU. Because the condition upon which a resolved conditional branch instruction depends is known prior to execution, resolved conditional branch instructions can typically be executed, and instructions within the target execution path fetched, with little or no delay in the execution of sequential instructions. Thus, it is advantageous to determine condition register bits, or any other condition upon which a conditional branch instruction may depend, as quickly as possible so that the conditional branch instruction can be resolved prior to execution. Even if a conditional branch instruction is not resolved prior to its execution, and the branch is therefore speculatively predicted, it is still advantageous to compute the condition upon which the branch instruction depends as quickly as possible, because a misprediction is then detected sooner and the resulting performance penalty is reduced.
Condition register bits upon which conditional branch instructions may depend are set in response to predetermined architecturally defined instructions, for example, compare instructions and certain "recording" forms of add, subtract, and other arithmetic and logical instructions. The condition register bits set by compare instructions and recording instructions include a less than (LT) bit, a greater than (GT) bit, and an equal to (EQ) bit, which indicate whether the result of a particular instruction is less than, greater than, or equal to zero, respectively. Conventional processors first determine the result of an instruction (e.g., add) and then compare the result with zero in subsequent cycle(s) to produce the condition register bits. As will be appreciated, this serial architecture places an inherent limitation upon how early the condition register bits can be determined. More recently, various techniques have been employed in order to determine the value of the EQ bit in parallel with the execution of certain types of instructions. Although the early determination of the value of the EQ bit provides some performance advantages over the prior serial approach, there remains a need in the art for a method and apparatus for computing the value of all of the LT, GT, and EQ bits concurrent with arithmetic and logical operations in order to enhance branch processing performance.
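By way of illustration only, the conventional serial approach described above can be modeled in software as two dependent steps: the add is executed first, and only the finished result is compared with zero to form the LT, GT, and EQ bits. The following is a minimal sketch and not a description of any particular processor. The 32-bit operand width, the names MASK32, to_signed32, and add_then_compare, and the decision to ignore arithmetic overflow are all assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit operand width (illustrative only)


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def add_then_compare(a, b):
    """Hypothetical model of the serial approach: add, then compare with zero.

    Overflow handling is deliberately omitted from this sketch.
    """
    # Step 1: execute the arithmetic instruction to obtain the result.
    result = (a + b) & MASK32

    # Step 2: compare the completed result with zero. This step cannot
    # begin until step 1 has produced the full result.
    signed = to_signed32(result)
    lt = int(signed < 0)
    gt = int(signed > 0)
    eq = int(signed == 0)
    return result, (lt, gt, eq)
```

In this model, every condition bit depends on the complete sum, which mirrors the limitation noted above: none of the LT, GT, or EQ bits can be known before the add itself has finished.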