High-performance central processing unit (CPU) design requires that instruction scheduling circuitry be able to schedule and issue instructions to the execution units every cycle. In particular, when a first instruction issues, the issuing logic must be able to schedule a second instruction, which depends on the first instruction for one of its source operands, immediately after the first instruction is issued.
Refer now to FIG. 1 in which is illustrated a mechanism 100 for instruction scheduling according to the prior art. Instructions are loaded in order from an external memory device (not shown) into dispatch queue 110 from which they are sent to scheduling and issuing mechanism 100, which is a portion of an execution unit (not shown) in the CPU (not shown).
Instruction operands are associated with architected register devices into which result (or target) operands are stored, and from which source operands are retrieved. Architected register devices are logical pointers associated with a physical register device via rename register device (mapper) 120, which receives instructions dispatched from dispatch queue 110.
Rename register device 120 includes an operand tag portion 130 and a W-field portion 140. Operand tags correspond to physical operand addresses and are associated with the logical source and target registers. They are sent along with the corresponding instruction to issue queue 150. Issue queue 150 includes a plurality of entries 160, each including a W-field portion 162, a target tag portion 164, and a source tag portion 166. Each entry 160 also includes an opcode portion (not shown) and a control information portion (not shown). Target and source tags are received from operand tag portion 130 in rename register device 120 and loaded into target tag portion 164 and source tag portion 166, respectively. W data values stored in W-field 162 are used to determine when the source operands are available to their corresponding instruction in issue queue 150. W-field 162 may contain a plurality of bits (W-bits), each of which is associated with an instruction source operand. When all source operands are available, that is, when all of the instruction's W-bits are active, the corresponding instruction may be issued. Instruction select logic 170 selects the next instruction to issue from among all ready instructions using a select algorithm which, for example, may select the oldest ready instruction.
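For illustration only, the entry layout and the oldest-ready selection described above may be modeled in software as follows. This is a minimal sketch, not a description of the circuitry: the names `Entry`, `age`, and `select_oldest_ready` are hypothetical, and the use of a dispatch-order counter to represent instruction age is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    """Software stand-in for one entry 160 of issue queue 150."""
    age: int                # hypothetical dispatch-order counter; smaller is older
    target_tag: int         # target tag portion 164
    source_tags: List[int]  # source tag portion 166
    w_bits: List[bool]      # W-field portion 162, one bit per source operand

    def ready(self) -> bool:
        # An instruction may be issued when all of its W-bits are active.
        return all(self.w_bits)


def select_oldest_ready(queue: List[Entry]) -> Optional[Entry]:
    """One possible select algorithm: pick the oldest ready entry."""
    ready = [entry for entry in queue if entry.ready()]
    return min(ready, key=lambda entry: entry.age) if ready else None


queue = [
    Entry(age=0, target_tag=5, source_tags=[1, 2], w_bits=[True, False]),
    Entry(age=1, target_tag=6, source_tags=[3], w_bits=[True]),
    Entry(age=2, target_tag=7, source_tags=[4], w_bits=[True]),
]
selected = select_oldest_ready(queue)
```

In this example the entry with `age=0` is the oldest but is still waiting on one source operand, so the entry with `age=1` is selected.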
When an instruction is selected and issued, its target operand tag in portion 164 is broadcast to all entries 160 in issue queue 150. The broadcast operand tag is compared to all source operand tags in portion 166 of entries 160 by tag compare logic 172. If a source operand tag in an entry 160 corresponds with the broadcast tag, then the W-bit in portion 162 of entry 160 for the corresponding source operand is set.
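The broadcast-and-compare step performed by tag compare logic 172 may be sketched in software as below. This is an illustrative model under stated assumptions: the dictionary keys `source_tags` and `w_bits` and the function name `broadcast_target_tag` are hypothetical, and the sequential loop stands in for comparisons that the hardware performs on all entries in parallel.

```python
from typing import Dict, List


def broadcast_target_tag(entries: List[Dict], broadcast_tag: int) -> None:
    """Set the W-bit of every source operand whose tag matches the broadcast tag."""
    for entry in entries:
        for i, source_tag in enumerate(entry["source_tags"]):
            if source_tag == broadcast_tag:
                entry["w_bits"][i] = True


# Three waiting entries; tag values are arbitrary example data.
issue_queue = [
    {"source_tags": [7, 9], "w_bits": [False, True]},
    {"source_tags": [4, 7], "w_bits": [True, False]},
    {"source_tags": [3, 8], "w_bits": [True, False]},
]

# An instruction whose target operand tag is 7 issues.
broadcast_target_tag(issue_queue, 7)
```

After the broadcast, the first two entries have all W-bits set and become candidates for selection, while the third entry still waits on the operand tagged 8.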
Similarly, at an instruction issue, the W-bit in W-field 140 of rename register device 120 corresponding to the target operand tag of the issuing instruction is set. An instruction dispatched from dispatch queue 110 reads the corresponding location in rename register device 120 to obtain the corresponding physical tag and, at the same time, obtains the W-bit in the corresponding W-field 140, which is then in-gated into issue queue 150 in portion 162 along with the source operand tag in portion 166. In this way, the dispatched instruction is informed that the corresponding source operand is already available.
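The role of W-field 140 in the mapper may be modeled as in the following sketch. It is not the rename register device itself: the class `Mapper`, its method names, the identity initial mapping, and the simple incrementing allocation of physical tags are all assumptions made for illustration, since the text does not describe how tags are allocated.

```python
from typing import List, Tuple


class Mapper:
    """Software stand-in for rename register device 120."""

    def __init__(self, num_logical: int) -> None:
        # Assumed initial state: identity mapping, all operands available.
        self.tags: List[int] = list(range(num_logical))   # operand tag portion 130
        self.w_bits: List[bool] = [True] * num_logical    # W-field portion 140
        self._next_tag = num_logical                      # hypothetical tag allocator

    def rename_target(self, logical: int) -> int:
        """Assign a new physical tag to a target register; its value is pending."""
        tag = self._next_tag
        self._next_tag += 1
        self.tags[logical] = tag
        self.w_bits[logical] = False
        return tag

    def read_source(self, logical: int) -> Tuple[int, bool]:
        """A dispatching instruction obtains the tag and its W-bit together."""
        return self.tags[logical], self.w_bits[logical]

    def mark_issued(self, target_tag: int) -> None:
        """At issue, set the W-bit corresponding to the issuing target tag."""
        for logical, tag in enumerate(self.tags):
            if tag == target_tag:
                self.w_bits[logical] = True


mapper = Mapper(num_logical=4)
producer_tag = mapper.rename_target(2)      # producer writes logical register 2
before_issue = mapper.read_source(2)        # consumer dispatched before producer issues
mapper.mark_issued(producer_tag)
after_issue = mapper.read_source(2)         # consumer dispatched after producer issues
```

A consumer dispatched after the producer has issued reads an already-set W-bit and therefore does not need to observe the broadcast in the issue queue.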
A dispatching instruction that is being in-gated into issue queue 150 in the same cycle that another instruction is issued from issue queue 150 uses the broadcast tag described hereinabove to set its W-bits. Tag compare logic 174 compares the broadcast operand tag from the issuing instruction with the source operand tag of the dispatching instruction that is being in-gated into issue queue 150. If a match occurs, the W-bit corresponding to the source operand matching the broadcast operand tag is set as the W-bit is in-gated to issue queue 150.
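The in-gating comparison performed by tag compare logic 174 may be sketched as follows. The function name `ingate_w_bits` and the use of `None` to represent a cycle in which no instruction issues are hypothetical conventions adopted for this example.

```python
from typing import List, Optional


def ingate_w_bits(
    source_tags: List[int],
    mapper_w_bits: List[bool],
    broadcast_tag: Optional[int],
) -> List[bool]:
    """Return the W-bits written into the issue queue for a dispatching instruction.

    Each W-bit read from the mapper is kept, and is additionally set when the
    matching source tag equals the tag broadcast by an instruction issuing in
    the same cycle. broadcast_tag is None when nothing issues (assumption).
    """
    return [
        w_bit or (broadcast_tag is not None and tag == broadcast_tag)
        for tag, w_bit in zip(source_tags, mapper_w_bits)
    ]


# The first source operand (tag 12) is produced by the instruction issuing now.
with_match = ingate_w_bits([12, 5], [False, True], broadcast_tag=12)
no_issue = ingate_w_bits([12, 5], [False, True], broadcast_tag=None)
no_match = ingate_w_bits([12, 5], [False, True], broadcast_tag=99)
```

Without this comparison, an instruction dispatched in the same cycle as its producer's issue would miss the broadcast and its W-bit would not be set by either of the paths described earlier.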
As the speed of CPUs increases, and the cycle time correspondingly becomes shorter, the task of setting instruction W-bits to schedule dependent instructions becomes prohibitive in scheduling/issuing mechanism 100 according to the prior art. If scheduling/issuing mechanism 100 cannot resolve the instruction dependencies within one cycle time, then dependent instructions cannot be issued in a pipelined fashion. Thus, there is a need in the art for a self-initiated issuing mechanism that permits the pipelined issuing of dependent instructions in a high-speed CPU.