The present invention relates in general to the field of computing. More specifically, the present invention relates to systems and methodologies for improving the performance of a processor's handling of internal registers.
An Out of Order (“OoO”) processor typically contains multiple execution pipelines that may opportunistically execute instructions in an order different from that specified by the program sequence (or “program order”). Doing so maximizes the average instructions-per-cycle rate by reducing the impact of data dependencies and by maximizing utilization of the execution pipelines allocated for various instruction types. Results of instruction execution are typically held temporarily in the physical registers of one or more register files of limited depth. Registers are quickly accessible storage locations internal to a central processing unit. In a processor that uses registers, data is loaded from a larger memory, such as a main memory or a cache, into the registers, where it is used for arithmetic operations or other manipulations by machine instructions. An OoO processor typically employs register renaming to avoid unnecessary serialization of instructions due to the reuse of a given architected register by subsequent instructions in the program order.
Under register renaming operations, each architected (i.e., logical) register targeted by an instruction is mapped to a unique physical register in a register file. In current high-performance OoO processors, a unified main mapper is utilized to manage the physical registers within multiple register files while the instructions that write them are being processed. A secondary architected mapper manages the physical registers after the instructions that write them have completed. In addition to storing the logical-to-physical register translation (i.e., in mapper entries), the unified main mapper is also responsible for storing dependency data (i.e., queue position data), which is important for instruction ordering upon completion.
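By way of illustration only, the logical-to-physical mapping described above may be sketched in Python as follows. The class name `Renamer`, its methods, the register names, and the free-list allocation policy are hypothetical and are assumed purely for this example; the sketch does not model dependency data or multiple register files and is not intended to represent any particular mapper implementation.

```python
class Renamer:
    """Toy logical-to-physical register mapper (illustrative assumption only)."""

    def __init__(self, logical_regs, num_physical):
        # Assumed starting state: logical register i lives in physical register i.
        self.mapping = {name: i for i, name in enumerate(logical_regs)}
        # The remaining physical registers are available for renaming.
        self.free_list = list(range(len(logical_regs), num_physical))

    def rename(self, dest, sources):
        # Sources read whichever physical register currently holds each value.
        phys_sources = [self.mapping[s] for s in sources]
        if not self.free_list:
            raise RuntimeError("no free physical register; dispatch must stall")
        # Each write targets a fresh physical register, so reusing a logical
        # register name does not serialize the two writers.
        phys_dest = self.free_list.pop(0)
        self.mapping[dest] = phys_dest
        return phys_dest, phys_sources


renamer = Renamer(["r1", "r2", "r3"], num_physical=8)
first = renamer.rename("r1", ["r2", "r3"])  # e.g. r1 = r2 + r3
second = renamer.rename("r1", ["r2"])       # e.g. r1 = r2 * 2 (reuses logical r1)
```

In this sketch, the two instructions that both target logical register r1 receive different physical destination registers, so neither must wait for the other merely because the register name is reused.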
In a unified main mapper-based renaming scheme, it is desirable to free mapper entries as soon as possible for reuse by the OoO processor. However, in the prior art, a unified main mapper entry cannot be freed until the instruction that writes to a register mapped by the mapper entry is completed. This constraint is enforced because, until completion, there is a possibility that an instruction that has “finished” (i.e., the particular execution unit (EU) has successfully executed the instruction) will still be flushed before the instruction can “complete” and before the architected, coherent state of the registers is updated. Because of the small number of registers available to a processor, an efficient way of managing the use of registers is desirable.
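The completion constraint described above may be illustrated with the following minimal Python sketch. The names `MainMapper`, `allocate`, `finish`, `complete`, and `flush`, the instruction tags, and the two-entry capacity are assumptions made for this example only; the sketch models just the entry lifetime and is not a description of any actual mapper design.

```python
from enum import Enum


class State(Enum):
    DISPATCHED = 1  # entry allocated, instruction not yet executed
    FINISHED = 2    # execution unit has executed the instruction


class MainMapper:
    """Toy model of mapper-entry lifetime (illustrative assumption only)."""

    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.entries = {}  # instruction tag -> (logical register, State)

    def allocate(self, tag, logical_reg):
        if len(self.entries) >= self.num_entries:
            return False  # mapper full: dispatch would have to stall
        self.entries[tag] = (logical_reg, State.DISPATCHED)
        return True

    def finish(self, tag):
        # Finishing does NOT free the entry; the instruction may still be flushed.
        logical_reg, _ = self.entries[tag]
        self.entries[tag] = (logical_reg, State.FINISHED)

    def complete(self, tag):
        # Only at completion is the entry released for reuse.
        logical_reg, state = self.entries[tag]
        if state is not State.FINISHED:
            raise ValueError("cannot complete an unfinished instruction")
        del self.entries[tag]
        return logical_reg

    def flush(self, tag):
        # A flushed instruction is discarded without updating architected state.
        self.entries.pop(tag, None)


mapper = MainMapper(num_entries=2)
mapper.allocate("i0", "r1")
mapper.allocate("i1", "r2")
mapper.finish("i0")
stalled = not mapper.allocate("i2", "r3")  # i0 has finished but still holds its entry
handed_off = mapper.complete("i0")         # entry for i0 is released here
accepted = mapper.allocate("i2", "r3")     # the freed entry can now be reused
```

In this sketch, the third instruction cannot obtain a mapper entry while the first instruction is merely finished; the entry becomes reusable only once that instruction completes, which is the constraint that makes earlier freeing of entries desirable.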