The present invention generally relates to a super-scalar microprocessor of the type which executes more than one instruction during each operating cycle of the microprocessor. The present invention more particularly relates to an instruction decoder for use in such a super-scalar microprocessor which is configured for decoding and dispatching a plurality of instructions to functional units for execution during each microprocessor operating cycle.
Scalar microprocessors are well known in the art. Such processors generally employ a pipelined architecture for obtaining and executing instructions in sequential stages including a fetch stage, a decode stage, an execute stage, and a write-back stage. When such a microprocessor operates continuously on instructions in consecutive operating cycles, during the fetch stage, a first instruction is fetched from a source of instructions, such as, for example, either an internal instruction cache or an external instruction memory. During the decode stage of the first instruction, the first instruction is decoded to determine, for example, its required read operands and the type of execution it requires. During that same decode stage, a second instruction enters the fetch stage and is fetched. During the execute stage of the first instruction, the first instruction is executed by the required functional unit while the second instruction advances to its decode stage and a third instruction enters its fetch stage. Lastly, during the write-back stage of the first instruction, the functional unit which executed the first instruction writes the result of the execution back to memory. During the write-back stage of the first instruction, the second instruction advances to its execute stage, the third instruction advances to its decode stage, and a fourth instruction enters its fetch stage. As can thus be seen from the foregoing, a processor of this type can operate on as many as four instructions at a time and, given no traps or exception conditions, can execute at a rate of up to one instruction per microprocessor operating cycle. Also, by virtue of the pipelined architecture, such processors are capable of performing a great number of complex operations per unit of time with a minimum of hardware.
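By way of illustration only, the overlap of stages described above can be sketched in a few lines of Python. The sketch assumes an idealized pipeline with no traps, stalls, or exception conditions; the names `STAGES` and `pipeline_schedule` are hypothetical and are not taken from any actual processor design.

```python
# Idealized four-stage scalar pipeline: one new instruction enters per cycle.
# Assumption: no traps, stalls, or exceptions ever interrupt the flow.
STAGES = ("fetch", "decode", "execute", "write-back")


def pipeline_schedule(num_instructions):
    """Return one dict per cycle mapping stage name -> instruction number (1-based)."""
    total_cycles = num_instructions + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        occupancy = {}
        for depth, stage in enumerate(STAGES):
            # The instruction at pipeline depth `depth` was fetched `depth` cycles ago.
            instruction = cycle - depth + 1
            if 1 <= instruction <= num_instructions:
                occupancy[stage] = instruction
        schedule.append(occupancy)
    return schedule


# Show which instruction occupies each stage in every cycle.
for cycle_number, occupancy in enumerate(pipeline_schedule(6), start=1):
    print(cycle_number, occupancy)
```

In the fourth cycle of this sketch, the first instruction is in write-back while the second, third, and fourth instructions are in execute, decode, and fetch respectively, which corresponds to the four-instructions-in-flight condition described above.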
Although such scalar processors have been commercially successful and proven reliable and capable of performance speeds suitable for many different applications, the concept of utilizing such a pipelined architecture in a processor for executing more than one instruction during each operating cycle has recently been introduced. In order to support the execution of a plurality of instructions during each operating cycle, such processors, known in the art as super-scalar processors, must be capable of fetching, decoding, and dispatching to the functional units a plurality of instructions during each operating cycle. These functions must be carefully coordinated.
For example, program instruction order must be maintained to assure that instructions are dispatched to the functional units in the predicted order of execution. Dispatching instructions that are not in the predicted execution path would result in execution of instructions not intended to be executed. Also, if required operands are stored in a register file or a buffer, it must be assured that such operands are available for the required functional unit and that the required functional unit itself is available before the corresponding instruction is dispatched. Further, if one or more instructions of a previously fetched plurality of instructions cannot be dispatched, it is necessary to once again provide those instructions for decoding. This requires coordination between the instruction decoder and the source of instructions, such as, for example, an instruction cache. Still further, nonsequential branch instructions must also be accommodated. All of the foregoing places extreme coordination requirements upon a super-scalar microprocessor for the simultaneous fetching, decoding, dispatching, execution, and write-back of multiple instructions at each stage.
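The dispatch conditions enumerated above may be illustrated by the following minimal Python sketch. It is offered under stated assumptions and is not the decoder of the present invention: it assumes that dispatch halts at the first instruction whose operands or functional unit are unavailable, so that program order is preserved, and that every instruction from that point onward is presented again for decoding. The names `Instruction` and `dispatch_in_order`, the register names, and the functional unit names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    name: str
    unit: str            # type of functional unit required (hypothetical names)
    sources: tuple = ()  # registers that must hold valid operands


def dispatch_in_order(group, ready_registers, free_units):
    """Dispatch from the front of `group` in program order.

    Assumption: dispatch stops at the first instruction that lacks a ready
    operand or a free functional unit. Returns (dispatched, held_back), where
    held_back must be presented to the decoder again.
    """
    free = dict(free_units)  # local copy so the caller's counts are untouched
    dispatched = []
    for position, instruction in enumerate(group):
        operands_ready = all(reg in ready_registers for reg in instruction.sources)
        unit_free = free.get(instruction.unit, 0) > 0
        if not (operands_ready and unit_free):
            return dispatched, list(group[position:])
        free[instruction.unit] -= 1  # reserve the unit for this cycle
        dispatched.append(instruction)
    return dispatched, []


group = [
    Instruction("add", "alu", ("r1", "r2")),
    Instruction("load", "load_store", ("r3",)),
    Instruction("mul", "alu", ("r9",)),  # r9 is not yet available
    Instruction("sub", "alu", ("r1",)),
]
dispatched, held_back = dispatch_in_order(
    group, ready_registers={"r1", "r2", "r3"}, free_units={"alu": 2, "load_store": 1}
)
print([i.name for i in dispatched], [i.name for i in held_back])
```

In this sketch the `sub` instruction is held back even though its operand and an ALU are available, because dispatching it ahead of the stalled `mul` instruction would violate program order. Nonsequential branch handling is deliberately omitted.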
The microprocessor of the present invention provides such coordination. It includes an instruction decoder embodying the present invention which keeps track of the availability of required operands, the availability of functional units, and the program instruction order. In addition, the interface between the instruction decoder and the instruction cache is elegant in its simplicity and operates in accordance with the same predefined protocol whether a plurality of instructions is newly presented to the instruction decoder, a previous plurality of instructions is once again presented to the instruction decoder, or a plurality of instructions presented to the instruction decoder contains a nonsequential branch instruction.