The present invention relates to a predicate register file. More particularly, the present invention relates to scoreboarding and renaming such a predicate register file.
Microprocessors often use instruction pipelining to increase instruction throughput. An instruction pipeline processes several instructions concurrently, each in a different stage of execution, using an assembly line-type approach. The pipeline stages are often controlled by predicate registers. One predicate register may be assigned to each stage in the pipeline, and all instructions for that stage may then share the same predicate register. The value of that register determines whether the instructions for the stage are executed or not. In other applications, such as Itanium applications, each instruction (referred to as a “syllable”) has its own “qualifying predicate” that determines whether it executes or not.
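The per-instruction qualifying predicate described above can be illustrated with a minimal sketch. It is not taken from any particular processor: the `Instruction` class, its `qp` field, and the `run_predicated` function are hypothetical names introduced only for illustration, and the predicate register file is modeled as a plain list of booleans.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Instruction:
    name: str
    qp: int  # index of the qualifying predicate register (hypothetical field)
    action: Callable[[Dict[str, int]], None]


def run_predicated(program: List[Instruction],
                   predicates: List[bool],
                   state: Dict[str, int]) -> List[str]:
    """Execute only those instructions whose qualifying predicate is true."""
    executed = []
    for insn in program:
        if predicates[insn.qp]:
            insn.action(state)
            executed.append(insn.name)
    return executed


def add_five(state: Dict[str, int]) -> None:
    state["r1"] += 5


def sub_three(state: Dict[str, int]) -> None:
    state["r1"] -= 3


predicates = [True, False]  # assumed values: p0 is true, p1 is false
state = {"r1": 0}
program = [
    Instruction("add", 0, add_five),   # qualified by p0, so it executes
    Instruction("sub", 1, sub_three),  # qualified by p1, so it is skipped
]
executed = run_predicated(program, predicates, state)
```

In this sketch only the instruction qualified by a true predicate changes the register state; the other instruction has no effect.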
However, the performance of pipelined computers may be degraded by data dependencies. A data dependency exists between two instructions if the execution of one depends upon the results of executing the other. Each instruction has the potential to stall later instructions that depend on it.
Accordingly, each predicate register may be associated with a bit that indicates whether the data in that register is up to date and ready to be used, or is still being modified or produced and is therefore not available. This bit is often referred to as a “scoreboard” bit. For example, if the scoreboard bit for a particular predicate register is set, then the next instruction that needs to access this register cannot execute until that scoreboard bit has been cleared. The scoreboard bit is cleared when the preceding operation that produces the register's value completes execution.
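The scoreboard behavior just described can be sketched in software as follows. This is only an illustrative model under stated assumptions: the class name `PredicateScoreboard` and its methods are hypothetical, one scoreboard bit is kept per predicate register, and a set bit means the register's value is still being produced.

```python
from typing import List


class PredicateScoreboard:
    """One scoreboard bit per predicate register (illustrative model only)."""

    def __init__(self, num_registers: int) -> None:
        # False: value is ready. True: value is still being produced.
        self.busy = [False] * num_registers

    def can_issue(self, sources: List[int]) -> bool:
        """An instruction may issue only if no source register is busy."""
        return not any(self.busy[r] for r in sources)

    def begin_write(self, dest: int) -> None:
        """Set the scoreboard bit when a producing operation is issued."""
        self.busy[dest] = True

    def complete_write(self, dest: int) -> None:
        """Clear the scoreboard bit when the producing operation completes."""
        self.busy[dest] = False


sb = PredicateScoreboard(4)
sb.begin_write(2)                     # an operation starts producing p2
stalled = not sb.can_issue([2])       # a consumer of p2 must wait
independent_ok = sb.can_issue([1])    # a consumer of p1 is unaffected
sb.complete_write(2)                  # the producer finishes
ready_after = sb.can_issue([2])       # the consumer of p2 may now issue
```

A consumer of the busy register stalls until the producer completes, while an instruction that reads only other registers is not held back.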
Out-of-order execution may also be used to substantially reduce the effect of stalls due to data dependencies. Upon encountering an instruction that depends on data that is not yet available, an out-of-order execution processor checks for later independent instructions in the program and executes these later instructions before the dependent instruction. This reduces the impact of execution stalls because the execution of later independent instructions is overlapped with the execution of instructions requiring multiple clock cycles to complete.
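The selection of a later independent instruction can be sketched as shown below. The sketch makes simplifying assumptions: the function `pick_next`, the instruction window format (a name plus a list of source register numbers), and the set of busy registers are all hypothetical, and hazards on destination registers are ignored.

```python
from typing import List, Optional, Set, Tuple


def pick_next(window: List[Tuple[str, List[int]]],
              busy: Set[int]) -> Optional[int]:
    """Return the index of the oldest instruction whose source registers
    are all ready, or None if every instruction in the window must wait."""
    for index, (_name, sources) in enumerate(window):
        if not any(reg in busy for reg in sources):
            return index
    return None


# Program order: the first instruction needs p2, which is still being produced.
window = [("use_p2", [2]), ("independent", [1]), ("use_p3", [3])]
busy = {2}

first_choice = pick_next(window, busy)   # skips the stalled instruction
busy.discard(2)                          # the producer of p2 completes
second_choice = pick_next(window, busy)  # the oldest instruction is now ready
```

While the register is busy, the later independent instruction is chosen ahead of the stalled one; once the register becomes available, the oldest instruction is eligible again.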