This invention relates generally to superscalar processors, and more particularly to providing operand and result forwarding between differently sized operands in a superscalar processor.
The efficiency and performance of a processor may be measured in terms of the number of instructions that are executed per cycle. In a superscalar processor, instructions of the same or different types are executed in parallel in multiple execution units. The decoder feeds an instruction queue, from which up to the maximum allowable number of instructions is issued per cycle to available execution units. This is called grouping of the instructions. The average number of instructions in a group, called the group size, depends upon the degree of instruction-level parallelism (ILP) that exists in a program. Data dependencies among instructions usually limit ILP and, in some cases, result in a smaller instruction group size. If two instructions are dependent, they cannot be grouped together, since the result of the first (older) instruction is needed before the second instruction can be executed; the two instructions are therefore executed serially.
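By way of illustration only, the effect of data dependencies on group size may be sketched in software as follows. The sketch assumes a simplified in-order issue model in which a group is closed when it reaches the maximum issue width or when the next instruction reads a register written by an instruction already in the group; only this read-after-write case is modeled. The names `Instr`, `group_instructions`, `average_group_size`, the register names, and the issue width of three are hypothetical and are not taken from any particular processor.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction: one destination register, several sources."""
    name: str
    dest: str
    srcs: Tuple[str, ...]


def group_instructions(queue: List[Instr], max_width: int) -> List[List[Instr]]:
    """Greedily form in-order issue groups (illustrative model only).

    A group is closed when it is full, or when the next instruction reads
    a register written by an instruction already in the current group.
    """
    groups: List[List[Instr]] = []
    current: List[Instr] = []
    written = set()  # destination registers of the current group
    for instr in queue:
        depends = any(src in written for src in instr.srcs)
        if current and (depends or len(current) == max_width):
            groups.append(current)
            current, written = [], set()
        current.append(instr)
        written.add(instr.dest)
    if current:
        groups.append(current)
    return groups


def average_group_size(groups: List[List[Instr]]) -> float:
    """Average number of instructions per group (0.0 for no groups)."""
    return sum(len(g) for g in groups) / len(groups) if groups else 0.0


program = [
    Instr("i0", "r1", ("r2", "r3")),
    Instr("i1", "r4", ("r5", "r6")),  # independent of i0
    Instr("i2", "r7", ("r1", "r4")),  # needs results of i0 and i1
    Instr("i3", "r8", ("r7",)),       # needs result of i2
]
groups = group_instructions(program, max_width=3)
print([[i.name for i in g] for g in groups])
print(average_group_size(groups))
```

In this sketch the two independent instructions are issued together, while each dependent instruction starts a new group, so the average group size falls below the assumed issue width of three.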