The present invention relates to the field of computer processor architecture. In particular, the present invention discloses a method and apparatus for scheduling computer instructions.
Early computer processors executed computer instructions one at a time in the original program order. Specifically, each computer instruction was loaded into the processor and then executed. After execution, the results of the computer instruction were written into a register or into main memory. The next sequential computer instruction was then loaded into the processor and executed.
To improve performance, pipelined computer processors were introduced. Pipelined computer processors process multiple computer instructions simultaneously. However, early pipelined computer processors still executed the instructions in the original program order. Pipelined processors operate by dividing the processing of instructions into a series of pipeline stages such as instruction fetch, instruction decode, execution, and result write-back. The processor is then divided into a set of linked processing stages that each perform one of these instruction processing steps. In the previously described example, the processor would be divided into an instruction fetch stage, an instruction decode stage, an execution stage, and a write-back stage. During each clock cycle, each processing stage processes an instruction and then passes it to the next sequential processing stage. Thus, the processor is processing several instructions simultaneously in the original program order. In an ideal single-pipeline processor, once the pipeline is full, the processor will complete the execution of an instruction during every clock cycle.
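The in-order pipelined operation described above can be illustrated with a minimal Python sketch. It assumes an ideal pipeline with the four named stages and no stalls; the function name `simulate_pipeline` and the instruction labels are hypothetical and serve only as illustration.

```python
# Minimal sketch of an ideal four-stage in-order pipeline (no stalls, no hazards).
STAGES = ["fetch", "decode", "execute", "write-back"]


def simulate_pipeline(instructions):
    """Advance instructions through the stages, one stage per clock cycle.

    Returns (snapshots, completed). snapshots[c] maps each stage name to the
    instruction occupying it during cycle c + 1 (or None). completed maps each
    instruction label to the cycle in which it reached write-back.
    Instruction labels are assumed to be unique.
    """
    pending = list(instructions)
    pipeline = [None] * len(STAGES)  # pipeline[i] is the instruction in stage i
    snapshots = []
    completed = {}
    retired = 0
    cycle = 0
    while retired < len(instructions):
        cycle += 1
        # Every instruction moves to the next stage; a new one enters fetch.
        incoming = pending.pop(0) if pending else None
        pipeline = [incoming] + pipeline[:-1]
        snapshots.append(dict(zip(STAGES, pipeline)))
        if pipeline[-1] is not None:
            completed[pipeline[-1]] = cycle
            retired += 1
    return snapshots, completed


if __name__ == "__main__":
    snapshots, completed = simulate_pipeline(["I1", "I2", "I3", "I4", "I5"])
    for number, snapshot in enumerate(snapshots, start=1):
        print(number, snapshot)
    print(completed)
```

In this sketch, five instructions finish in eight cycles (five instructions plus three cycles to fill the four-stage pipeline). From the fourth cycle onward, one instruction completes during every clock cycle.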
To further improve processor performance, superscalar processors have been introduced. Superscalar processors process more than one instruction at a time using parallel pipeline stages. By executing instructions in parallel, superscalar processors take advantage of the parallelism that exists in the instructions. Parallelism exists when sequential computer instructions are not dependent upon each other for source operands. These non-dependent sequential instructions can be executed in parallel without any data conflicts.
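The notion of non-dependent instructions can be illustrated with a short Python sketch. The `Instruction` record, the register names, and the function `can_execute_in_parallel` are hypothetical. The check covers only the source operand dependency described above, in which one instruction reads a register that the other writes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Instruction:
    """Hypothetical instruction record: one destination, several sources."""
    text: str
    dest: str
    sources: Tuple[str, ...]


def depends_on(consumer, producer):
    """True if consumer reads the register that producer writes."""
    return producer.dest in consumer.sources


def can_execute_in_parallel(first, second):
    """True if neither instruction needs the other's result as a source operand."""
    return not depends_on(second, first) and not depends_on(first, second)


if __name__ == "__main__":
    add = Instruction("add r1, r2, r3", "r1", ("r2", "r3"))
    sub = Instruction("sub r4, r5, r6", "r4", ("r5", "r6"))
    mul = Instruction("mul r7, r1, r4", "r7", ("r1", "r4"))
    print(can_execute_in_parallel(add, sub))  # no shared operands
    print(can_execute_in_parallel(add, mul))  # mul reads r1, written by add
```

Here the add and sub instructions share no operands and could be executed in parallel, while the mul instruction must wait for both of their results.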
One of the difficult aspects of designing superscalar processors is finding instructions that can be executed in parallel and scheduling them such that no data dependencies are violated and sufficient processor resources are available.
According to one embodiment, a method of scheduling instructions in a computer processor is provided. The method comprises fetching instructions to create an in-order instruction buffer, and scheduling instructions from the in-order instruction buffer into instruction slots within instruction vectors in an instruction vector table. Instruction vectors are then dispatched from the instruction vector table to a prescheduled instruction cache, and, in parallel, to an instruction issue unit.
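The flow of this embodiment can be sketched in Python as follows. The sketch is illustrative only and rests on several assumptions that are not taken from the description above: a greedy placement policy, two instruction slots per instruction vector, modeling of read-after-write dependencies only (register names are assumed not to be reused in a conflicting way), and the hypothetical names `schedule` and `dispatch`. The prescheduled instruction cache and the instruction issue unit are modeled as simple lists.

```python
from dataclasses import dataclass
from typing import Tuple

SLOTS_PER_VECTOR = 2  # hypothetical number of instruction slots per vector


@dataclass(frozen=True)
class Instruction:
    """Hypothetical instruction record: one destination, several sources."""
    text: str
    dest: str
    sources: Tuple[str, ...]


def schedule(in_order_buffer, slots_per_vector=SLOTS_PER_VECTOR):
    """Place each instruction, in program order, into an instruction vector.

    Assumed greedy policy: use the earliest vector that follows every vector
    holding a producer of the instruction's sources and that has a free slot.
    Returns the instruction vector table as a list of vectors.
    """
    table = []
    last_writer = {}  # register -> index of the vector holding its latest writer
    for instr in in_order_buffer:
        earliest = 0
        for reg in instr.sources:
            if reg in last_writer:
                earliest = max(earliest, last_writer[reg] + 1)
        row = earliest
        while row < len(table) and len(table[row]) >= slots_per_vector:
            row += 1  # vector is full; try the next one
        while len(table) <= row:
            table.append([])  # extend the table with empty vectors
        table[row].append(instr)
        last_writer[instr.dest] = row
    return table


def dispatch(table):
    """Send every vector to both the prescheduled cache and the issue unit."""
    prescheduled_cache = []
    issue_unit = []
    for vector in table:
        prescheduled_cache.append(tuple(vector))
        issue_unit.append(tuple(vector))
    return prescheduled_cache, issue_unit


if __name__ == "__main__":
    program = [
        Instruction("I0", "r1", ("r2", "r3")),
        Instruction("I1", "r4", ("r5", "r6")),
        Instruction("I2", "r7", ("r1", "r4")),
        Instruction("I3", "r8", ("r9", "r10")),
        Instruction("I4", "r11", ("r7", "r8")),
    ]
    for vector in schedule(program):
        print([instr.text for instr in vector])
```

With two slots per vector, this example program is packed into three instruction vectors: I0 and I1, then I2 and I3, then I4. Each instruction lands in a later vector than the instructions that produce its source operands.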