1. Field of the Invention
This invention relates generally to the processing of program instructions in a microprocessor and, more particularly, to expanding program instructions.
2. Description of the Related Art
Data structures, such as register files or queue structures, store data for use in a digital system. Present microprocessors require multi-ported queue structures to allow more than one data entry to be written into the queue during a single clock cycle. Due to the data requirements, each port of the queue structure is wide (100+ bits). As the number of ports increases, the area occupied by the queue structure also increases. Due to the increased size, a queue structure with a large number of ports may also encounter speed problems. Typically, there is a tradeoff between the performance of the microprocessor (based on the number of ports) and the size of the queue structure.
Present microprocessors are capable of executing instructions out of order (OOO). Instructions are decoded in program order and stored into a queue structure. The instructions are read out of the queue structure by the OOO portion of the microprocessor. The OOO portion renames the instructions and executes them in an order based on the available resources of the microprocessor and the interdependency relationships between the various instructions. The queue structure represents the boundary between the in order portion of the microprocessor and the OOO portion.
One type of instruction executed out of order is a load instruction. Load instructions require that data be read from a storage device such as a register, cache memory, main memory, or external data storage device (e.g., hard drive). In order to hide the latency of load instructions (i.e., the time required to locate and load the requested data), it is desirable to execute the load instruction as soon as possible.
Referring to FIG. 1A, a program sequence of a computer program as seen by the in order portion of the microprocessor is shown. The program sequence includes instructions A, B, and C, a store instruction 100, a load instruction 105, and instructions D, E, and F. If the load instruction 105 is not dependent on instructions A, B, or C, the OOO portion can schedule the load instruction 105 ahead of any or all of the other instructions. The early execution hides the latency of the load instruction, such that the microprocessor can complete the load before it actually needs the data (e.g., in instructions D, E, or F) and will not have to stall while the load is completing.
The early execution of the load instruction 105 is effective, as long as there is no conflict between the store address of the store instruction 100 and the load address of the load instruction 105. If there is a conflict, then the load instruction 105 has loaded incorrect data. To address such a conflict, the load instruction 105 is expanded into a speculative load instruction 110 and an architectural load instruction 115, as represented by the program sequence of FIG. 1B.
The speculative load instruction 110 is free of any dependency restrictions and can be scheduled by the OOO portion at any time. Conversely, the architectural load instruction 115 is always executed in program order. As a result, any conflict is identified when the architectural load instruction 115 is executed, at which point the load, and the instructions following it, can be reissued to retrieve the correct data.
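The interplay between the speculative load instruction 110 and the architectural load instruction 115 can be illustrated with a minimal sketch. The function name `run_sequence`, the dictionary model of memory, and the direct address comparison used to detect a conflict are all assumptions made for illustration only; they are not taken from the described microprocessor.

```python
def run_sequence(memory, store_addr, store_value, load_addr):
    """Model the FIG. 1B sequence (hypothetical simplification).

    The speculative load is hoisted above the store; the architectural
    load runs in program order and checks for an address conflict.
    Returns (loaded_value, reissued).
    """
    # Speculative load: scheduled early, before the store has executed.
    loaded_value = memory.get(load_addr, 0)

    # Store executes at its position in program order.
    memory[store_addr] = store_value

    # Architectural load: executed in program order. Assumption: a conflict
    # is detected by comparing the store address with the load address.
    reissued = store_addr == load_addr
    if reissued:
        # The speculative load read stale data, so the load is reissued.
        loaded_value = memory.get(load_addr, 0)

    return loaded_value, reissued


if __name__ == "__main__":
    initial = {0x10: 1, 0x20: 2}
    # No conflict: the early load result stands.
    print(run_sequence(dict(initial), 0x10, 99, 0x20))
    # Conflict: the store wrote the load address, so the load is reissued.
    print(run_sequence(dict(initial), 0x20, 99, 0x20))
```

In this sketch, only the conflicting case pays for a second read of memory; the non-conflicting case keeps the benefit of the early load.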
When a load instruction 105 is decoded, both the speculative load instruction 110 and the architectural load instruction 115 are entered into the queue structure. Accordingly, two ports must be used for each load instruction. Assuming the queue structure has 5 ports, instructions A, B, C, the store instruction 100, and the load instruction 105 cannot all be loaded during the same clock cycle, because the expansion of the load instruction 105 brings the total to six entries. To increase the performance of the queue structure, an additional port would be required, thus increasing the area of the queue structure and introducing the potential for speed problems due to the increase in both the number and the length of the wires.
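The port arithmetic in the preceding paragraph can be checked with a short sketch. The function names and the instruction labels (`"alu"`, `"store"`, `"load"`) are hypothetical, and the model assumes that queue entries are written strictly in order and may be split freely across clock cycles, which may not match an actual implementation.

```python
def queue_entries_needed(instructions):
    """Count queue entries, assuming each load expands into two entries
    (a speculative load and an architectural load) and every other
    instruction occupies one entry."""
    return sum(2 if kind == "load" else 1 for kind in instructions)


def cycles_to_enqueue(instructions, ports):
    """Clock cycles needed to write all entries, with one entry per port
    per cycle (ceiling division)."""
    entries = queue_entries_needed(instructions)
    return -(-entries // ports)


if __name__ == "__main__":
    # Instructions A, B, C, the store instruction, and the load instruction.
    group = ["alu", "alu", "alu", "store", "load"]
    print(queue_entries_needed(group))   # 6 entries after load expansion
    print(cycles_to_enqueue(group, 5))   # 2 cycles with 5 ports
    print(cycles_to_enqueue(group, 6))   # 1 cycle with 6 ports
```

Under these assumptions, the five decoded instructions need six entries, so a 5-port queue structure takes two cycles where a 6-port queue structure would take one.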
The present invention is directed to overcoming, or at least reducing the effects of, one or more of the problems set forth above.