The present invention relates to computer systems and more particularly to reducing the bypass network in the pipeline of a processor by providing data from a speculative register file.
Computer systems, from small handheld electronic devices to medium-sized mobile and desktop systems to large servers and workstations, are becoming increasingly pervasive in our society. Computer systems typically include one or more processors. A processor manipulates and controls the flow of data in a computer by executing instructions. Increasing the speed at which instructions are executed tends to increase the computational power of the computer. Processor designers employ many different techniques to increase processor speed to create more powerful computers for consumers. One such technique is to implement a pipeline in a processor.
A pipeline is an assembly line for instructions. When an instruction is issued to a processor pipeline, the instruction is progressively processed through separate stages in the pipeline. At any given moment, the pipeline may contain many instructions, each at a different stage of processing.
After the processor has finished executing an instruction and has ensured that all prior instructions will also complete, the instruction is “retired.” This means that the result of the instruction may be stored in an architectural register file (i.e. committed to an architectural state) for later use as a source of a subsequently processed instruction. The stage at which an instruction is retired (often called a retirement or write-back stage and collectively referred to herein as a retirement stage) may be several stages beyond the stage at which the result of the instruction has been calculated by the processor (such as an execute stage).
The reason for the delay in retirement is that the register values are considered speculative until predictions or assumptions (i.e. “speculations”) made by the processor during processing of the instruction are verified to be correct. For example, an explicit prediction occurs if a processor makes a branch prediction. The processor may make this branch prediction at the front end of the pipeline, process a sequence of instructions beginning at the predicted instruction address, and resolve the branch prediction at the back end of the pipeline. All the register value results calculated by the processor during execution of the predicted sequence of instructions are considered speculative. This speculative data becomes architectural if the branch prediction is determined to have been correct, and only at that point is the architectural register file updated with the data.
If the prediction is determined to have been incorrect (i.e. mispredicted or misspeculated), then the speculative data may be erroneous. As a result, the speculative data may be flushed from the pipeline, and the processor begins executing a new sequence of instructions beginning at the correct instruction address. Other predictions may be implicit, such as implicitly predicting that no prior instructions take an exception. There are many other types of speculations that modern processors make.
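The retire-or-flush behavior described above can be illustrated with a minimal Python sketch. The class `ToyPipeline` and its method names are hypothetical and chosen only for illustration; the model ignores pipeline stages and timing, and simply shows that speculative results reach the architectural state only when the prediction is verified.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPipeline:
    # Committed (architectural) register state.
    arch_regs: dict = field(default_factory=dict)
    # Speculative (register, value) results, oldest first.
    in_flight: list = field(default_factory=list)

    def execute(self, reg, value):
        """Record a result computed under an unresolved prediction."""
        self.in_flight.append((reg, value))

    def resolve(self, prediction_correct):
        """Retire the speculative results if the prediction held; otherwise flush them."""
        if prediction_correct:
            for reg, value in self.in_flight:
                self.arch_regs[reg] = value  # speculative data becomes architectural
        self.in_flight.clear()  # nothing speculative remains in either case


pipe = ToyPipeline(arch_regs={"r1": 10})
pipe.execute("r1", 99)                  # result on a mispredicted path
pipe.resolve(prediction_correct=False)  # flushed; r1 keeps its committed value
pipe.execute("r1", 42)                  # result on the correct path
pipe.resolve(prediction_correct=True)   # retired; r1 is updated
```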
The execution of one instruction in a pipeline may depend on the execution of one or more previously issued instructions. If data from a first instruction in a pipeline is needed by a second instruction in the pipeline, then the unavailability of the data from the first instruction causes a delay in the execution of the second instruction. To avoid the delay associated with updating the architectural register file with new data, and subsequently reading that data for use as source data of subsequently processed instructions, a bypass network may be implemented. A bypass network is used to pass speculative result data from later pipeline stages (i.e. closer to the retirement stage or “backend” of the pipeline) to an earlier stage (such as a register read stage) of the pipeline, bypassing the architectural register file. The register read stage provides source data to datapaths of later pipeline stages for use in determining the result.
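As a rough software analogy for the bypass selection described above, the following Python sketch picks a source value at the register read stage. The function `read_source` and the layout of `stage_results` are assumptions made for illustration; an actual bypass network is built from multiplexers and comparators, not a loop.

```python
def read_source(reg, stage_results, arch_regs):
    """Return the source value for `reg` at the register read stage.

    stage_results is ordered from the stage nearest register read (youngest
    result) to the stage nearest retirement (oldest result). Each entry is a
    (dest_reg, value) tuple, or None if that stage holds no result.
    """
    for entry in stage_results:
        if entry is not None and entry[0] == reg:
            return entry[1]  # bypass: take the youngest in-flight result
    return arch_regs[reg]    # no in-flight writer: read the architectural register file


arch_regs = {"r1": 1, "r2": 2}
stage_results = [("r1", 7), None, ("r1", 5)]  # two in-flight writes to r1
newest_r1 = read_source("r1", stage_results, arch_regs)  # taken from the bypass
committed_r2 = read_source("r2", stage_results, arch_regs)  # taken from the register file
```

Scanning from the youngest stage first matters: when several in-flight instructions write the same register, the reader needs the most recent value.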
Unfortunately, due to the increasing number of instructions that can be executed in parallel in a processor, and the increasing number of pipeline stages between the register read stage and the retirement stage, the bypass network is becoming increasingly complex. Large multiplexers are required to support the bypass network. For example, at least one multiplexer is required for each source data of each instruction that could require source data in a given clock cycle in the pipeline. Each of these multiplexers includes a number of legs equal to at least the number of stages between the register read stage and the retirement stage times the number of results of each instruction that could generate a result in the given clock cycle. Consequently, the size, cost, and speed of the processor may be significantly and adversely affected by this large and complex bypass network.
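The lower bounds stated above can be worked through numerically. The Python sketch below is an illustration only: the function name, its parameters, and the example figures are hypothetical, and it assumes that every instruction issued in a cycle may both read sources and produce results.

```python
def bypass_mux_lower_bound(instrs_per_cycle, sources_per_instr,
                           results_per_instr, stages_between):
    """Return (muxes, legs_per_mux, total_legs) as lower bounds.

    muxes:        one per source operand of each instruction in a cycle
    legs_per_mux: stages between register read and retirement, times the
                  number of results that could be generated per cycle
    """
    muxes = instrs_per_cycle * sources_per_instr
    legs_per_mux = stages_between * instrs_per_cycle * results_per_instr
    return muxes, legs_per_mux, muxes * legs_per_mux


# Hypothetical machine: 4 instructions per cycle, 2 sources and 1 result each,
# 5 stages between register read and retirement.
narrow = bypass_mux_lower_bound(4, 2, 1, 5)  # (8, 20, 160)

# Doubling the issue width doubles both the mux count and the legs per mux.
wide = bypass_mux_lower_bound(8, 2, 1, 5)    # (16, 40, 640)
```

Under these assumptions the total number of multiplexer legs grows with the square of the issue width and linearly with pipeline depth, which is the scaling problem the passage describes.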
For one embodiment of the present invention, a processor comprises both a speculative register file and an architectural register file. An output of the architectural register file is coupled to an input of the speculative register file to update the speculative register file when a misspeculation is detected.
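The relationship between the two register files can be sketched in Python as follows. This is a simplified model under stated assumptions, not the claimed circuit: the class `RegisterFiles`, its method names, and the policy of copying the entire architectural register file on a misspeculation are hypothetical choices made for illustration.

```python
class RegisterFiles:
    """Toy model of a speculative register file backed by an architectural one."""

    def __init__(self, num_regs):
        self.architectural = [0] * num_regs  # committed state
        self.speculative = [0] * num_regs    # may hold unverified results

    def write_speculative(self, reg, value):
        """Record a result as soon as it is computed, before retirement."""
        self.speculative[reg] = value

    def read_source(self, reg):
        """Provide source data from the speculative register file."""
        return self.speculative[reg]

    def retire(self, reg, value):
        """Commit a verified result to the architectural state."""
        self.architectural[reg] = value

    def on_misspeculation(self):
        """Update the speculative file from the architectural file's output."""
        self.speculative[:] = self.architectural


rf = RegisterFiles(num_regs=4)
rf.write_speculative(1, 5)
rf.retire(1, 5)              # the value 5 is verified and committed
rf.write_speculative(1, 9)   # a later result that turns out to be misspeculated
rf.on_misspeculation()       # the speculative file is restored to committed values
restored = rf.read_source(1)
```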
Other features and advantages of the present invention will be apparent from the accompanying figures and the detailed description that follows.