Field of the Invention
The present invention relates in general to microprocessor instruction translation and processing, and more particularly to a compressing instruction queue that improves the process of queuing microinstructions translated from architectural instructions to more efficiently provide the queued microinstructions for subsequent processing.
Description of the Related Art
Many microprocessors incorporate at least one instruction translator that translates architectural instructions into microinstructions that can be processed by the execution units of the microprocessor. Architectural instructions are the instructions of an instruction set architecture (ISA) of the microprocessor. Examples of architectural instructions include instructions according to the x86 ISA by Intel or the Advanced RISC Machines (ARM®) ISA or the like. In some configurations, the translator converts each architectural instruction into one or more microinstructions. Modern processors are typically superscalar, meaning that multiple instructions are processed simultaneously in each clock cycle. Superscalar microprocessors often include multiple translators in parallel that collectively process multiple architectural instructions during each clock cycle.
Although each architectural instruction may be independently transformed by each of multiple translators into separate microinstructions or microinstruction groups, many processors further perform fusion in which the architectural instructions are combined and microinstructions and/or source information and the like are shared. As a result, the collective set of translators may output any one of many possible combinations of microinstructions in any given clock cycle. Under certain conditions the translators may output zero microinstructions in some clock cycles or may produce up to a predetermined maximum number of microinstructions in other clock cycles. The varied and somewhat unpredictable translated microinstruction combinations have caused processing inefficiencies.
The translated microinstructions are eventually provided to pre-execution logic, such as a register alias table (RAT) or the like. The RAT generates dependency information for each microinstruction based on its program order, based on the operand sources it specifies, and also based on renaming information. The RAT, however, is configured to receive up to a limited number of microinstructions at a time. In one embodiment, for example, the RAT is able to receive and process up to three microinstructions in each clock cycle. In prior configurations, a spill buffer or queue or the like was provided before or at the front end of the RAT to decouple the translators from the RAT. The spill buffer was organized into rows in which each row normally held up to the number of microinstructions that could be handled by the RAT in each clock cycle. Thus, for example, if the RAT could process up to three microinstructions, then each row of the spill buffer included three entries.
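As a rough illustration of the row organization described above, the following Python sketch models a spill buffer whose rows each hold as many entries as the RAT accepts per clock cycle. The names RAT_WIDTH, SpillBuffer, push_row, and pop_row are hypothetical, and the one-row-per-cycle drain is an assumption made for illustration rather than a description of any particular design.

```python
from collections import deque

RAT_WIDTH = 3  # assumed: the RAT accepts up to three microinstructions per cycle


class SpillBuffer:
    """Hypothetical row-organized buffer that decouples translators from the RAT."""

    def __init__(self, num_rows):
        self.num_rows = num_rows
        self.rows = deque()  # each row is a list of exactly RAT_WIDTH slots

    def is_full(self):
        return len(self.rows) >= self.num_rows

    def push_row(self, microinstructions):
        """Store up to RAT_WIDTH microinstructions in a new row; None marks an empty slot."""
        if self.is_full():
            raise OverflowError("spill buffer is full")
        if len(microinstructions) > RAT_WIDTH:
            raise ValueError("a row holds at most RAT_WIDTH microinstructions")
        row = list(microinstructions) + [None] * (RAT_WIDTH - len(microinstructions))
        self.rows.append(row)

    def pop_row(self):
        """Hand the oldest row to the RAT, returning only its valid entries."""
        if not self.rows:
            return []
        return [uop for uop in self.rows.popleft() if uop is not None]


# Illustrative use: a full row followed by a partially filled row.
buf = SpillBuffer(num_rows=4)
buf.push_row(["uop0", "uop1", "uop2"])
buf.push_row(["uop3"])
first = buf.pop_row()
second = buf.pop_row()
```

In this sketch the second row occupies three slots but delivers only one valid microinstruction to the RAT, which is the kind of under-utilized row the fixed row width can produce.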
The variation of translator output and the fixed RAT input caused unused buffer storage and inefficient operation. Any additional microinstructions output from the translators that could not be stored in the current row of the spill buffer or that could not be readily consumed by the RAT in the current clock cycle were simply discarded. The unprocessed architectural instructions were shifted and translated again in the next clock cycle. The re-translation of architectural instructions wasted valuable processing cycles and consumed additional power. Furthermore, prior configurations did not store new microinstructions into a partially filled row of the spill buffer, leaving one or more empty and unused storage locations in the row. An empty storage location means that the storage location does not store a valid microinstruction. One or more empty storage locations left holes in the spill buffer. Such holes or unused storage locations within the spill buffer led to inefficient processing (e.g., by the RAT or other pre-execution logic).
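The two inefficiencies described above, discarded microinstructions and holes in partially filled rows, can be tallied with a small Python sketch. It is a simplification under stated assumptions: the function name simulate_prior_spill_buffer and the constant ROW_WIDTH are hypothetical, each cycle with output is assumed to open one new row, and the per-cycle output counts are taken as given rather than modeling how re-translation changes later cycles.

```python
ROW_WIDTH = 3  # assumed: one row matches the RAT's per-cycle capacity


def simulate_prior_spill_buffer(outputs_per_cycle):
    """Model the assumed prior-art policy: one new row per cycle, no back-filling.

    outputs_per_cycle: number of microinstructions the translators produce
    in each clock cycle. Returns (rows, discarded, holes), where rows lists
    the number of valid entries stored in each row.
    """
    rows = []
    discarded = 0
    holes = 0
    for count in outputs_per_cycle:
        if count == 0:
            continue  # nothing translated this cycle, so no row is written
        stored = min(count, ROW_WIDTH)
        discarded += count - stored  # extras are dropped and must be re-translated
        holes += ROW_WIDTH - stored  # unused slots left in a partially filled row
        rows.append(stored)
    return rows, discarded, holes


# Illustrative per-cycle translator output counts (made-up values).
rows, discarded, holes = simulate_prior_spill_buffer([2, 5, 0, 1, 3])
print(rows, discarded, holes)  # prints: [2, 3, 1, 3] 2 3
```

For these made-up counts, two microinstructions are discarded and three storage locations are left empty across four rows, even though the nine stored microinstructions would fit in three full rows.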