1. Field of the Invention
The present invention relates to scheduling instructions for computer processing, and more particularly, to a scheduling scheme and mechanism to process instructions through a functional unit in a processor.
2. Description of the Related Art
A conventional reservation station is a mechanism in which operational entries of a data flow wait for their dependent component parts before being scheduled and issued for execution in a functional unit of a processor. The dependent component parts are operands that must be provided in order for the operational entries to proceed with processing. A functional unit may be, for example, an arithmetic logic unit or a multiplier unit. A benefit of this design is that data is distributed only once.
Conventional reservation stations use one of two approaches to schedule a data flow operation through a reservation station. The first approach is a distributed reservation station approach. In this approach, a multitude of reservation stations are distributed throughout the processor. Each reservation station is typically connected with one or two functional units. As each operation in the distributed reservation station receives its full complement of operands, it is scheduled for execution by a respective functional unit.
However, this first approach introduces a significant timing problem when scheduling back-to-back dependent operations, each of which is carried out within a single clock cycle. Specifically, the following steps must be completed within a single clock cycle: (1) select an operation to be scheduled, (2) read the result tag that is associated with the operation to be scheduled, (3) distribute the result tag to the other reservation stations, (4) match the distributed result tag against all operands having a non-ready status in each local reservation station, and (5) update the status of all matching operands from a non-ready state to a ready state.
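By way of illustration only, the five steps above can be modeled in software as follows. This is a minimal sketch and not a description of any particular processor: the names `Operand`, `Entry`, `ReservationStation`, and `schedule_cycle` are hypothetical, the oldest-first selection policy is an assumption, and the steps that hardware must complete in parallel within one clock cycle are executed here sequentially.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Operand:
    tag: str             # result tag of the operation that produces this operand
    ready: bool = False  # non-ready until a matching result tag is distributed


@dataclass
class Entry:
    name: str
    result_tag: str
    operands: List[Operand] = field(default_factory=list)
    issued: bool = False

    def is_ready(self) -> bool:
        # An entry may be scheduled once every operand is ready.
        return all(op.ready for op in self.operands)


class ReservationStation:
    def __init__(self, entries: List[Entry]):
        self.entries = list(entries)

    def select(self) -> Optional[Entry]:
        # Step 1: select an operation (assumed policy: oldest ready entry first).
        for entry in self.entries:
            if not entry.issued and entry.is_ready():
                return entry
        return None

    def wakeup(self, tag: str) -> None:
        # Step 4: match the distributed tag against all non-ready operands.
        # Step 5: update each matching operand from non-ready to ready.
        for entry in self.entries:
            for op in entry.operands:
                if not op.ready and op.tag == tag:
                    op.ready = True


def schedule_cycle(local: ReservationStation,
                   stations: List[ReservationStation]) -> Optional[str]:
    entry = local.select()       # step 1
    if entry is None:
        return None
    entry.issued = True
    tag = entry.result_tag       # step 2: read the result tag
    for station in stations:     # step 3: distribute the tag to every station
        station.wakeup(tag)      # steps 4 and 5 happen inside each station
    return entry.name


# Two stations: "sub" in station B depends on the result of "add" in station A.
rs_a = ReservationStation([Entry("add", "t1")])
rs_b = ReservationStation([Entry("sub", "t2", [Operand("t1")])])
stations = [rs_a, rs_b]

blocked = rs_b.select()                   # "sub" cannot be selected yet
first = schedule_cycle(rs_a, stations)    # issues "add" and wakes up "sub"
second = schedule_cycle(rs_b, stations)   # "sub" is now ready to issue
```

In this sketch the dependent operation in the second station becomes schedulable only after the tag from the first station has been distributed and matched, which is why the distribution step lies on the critical timing path for back-to-back dependent operations.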
With the conventional distributed reservation station approach, as clock rates increase, it becomes increasingly difficult to meet the timing of the above five steps. One reason for this is that the third step, distribution of the result tag to other reservation stations, takes a significant amount of time, particularly in systems in which the reservation stations are not proximally located within the processor. Hence, the distributed reservation station approach can cause significant processing delays as additional clock cycles are needed to schedule and issue back-to-back single clock cycle operational entries.
The second approach for scheduling a data flow through a reservation station uses a centralized reservation station. In this approach, the distributed reservation stations are eliminated in favor of a single, central reservation station. An advantage of the central reservation station is that the time-consuming step for distributing the result tag to other reservation stations is effectively eliminated.
However, a problem with this second approach is that the operand data must be distributed both to and from the reservation station as the operations are scheduled. The flow of operand data to and from the central reservation station increases the number of data buses in the processor more than three-fold because each dyadic functional unit requires two source buses and one result bus rather than the single result bus used in the distributed reservation station approach. The additional bus architecture and structure increase chip complexity, implementation costs, and the amount of space needed on an integrated circuit chip.
Therefore, there is a need for a reservation station system that provides timely scheduling and issuance of instructions without requiring additional bus architecture and structure within a processing system.