1. Technical Field
The present subject matter relates to store buffer forwarding in a pipelined computer processing system.
2. Background Information
A variety of techniques have been developed to improve the performance of microprocessor-based systems. Pipelining is one such technique that focuses on reducing latencies introduced when the processor has to wait for instructions to execute completely, one at a time. Pipelining allows processing of an instruction to be split into a series of smaller and faster execution stages. While one instruction is at one execution stage, another instruction is at another execution stage. The latency between instruction completions is thus reduced to the time duration of a single stage. But when a conditional branch instruction is encountered, a pipelined processor must predict which branch will be followed and continue executing instructions along the predicted branch. If the prediction is wrong, the wrongly executed instructions must be aborted, an operation sometimes referred to as a pipeline “flush.”
Any data stored in memory would be incorrect if it were saved by an instruction within a mispredicted branch. To avoid this, pipelined processors sometimes use one or more store buffers, which may hold data stored by instructions within the pipeline, together with the memory address of the data held. The data is not forwarded to the actual memory location until the processor confirms that the predicted branch was actually taken. If a flush occurs, the buffered data is discarded and the data in the target memory location remains uncorrupted.
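By way of illustration only, the behavior described above may be sketched in software as follows. The class name StoreBuffer, its methods, and the use of a dictionary to stand in for memory are assumptions made for this sketch and do not describe any particular hardware implementation.

```python
class StoreBuffer:
    """Toy model: holds speculative stores until the branch is resolved."""

    def __init__(self, memory):
        self.memory = memory   # dict of address -> value, standing in for memory
        self.entries = []      # pending (address, value) pairs, oldest first

    def store(self, address, value):
        # Speculative store: recorded in the buffer, memory is not touched.
        self.entries.append((address, value))

    def commit(self):
        # Branch confirmed: drain the buffered stores to memory in order.
        for address, value in self.entries:
            self.memory[address] = value
        self.entries.clear()

    def flush(self):
        # Branch mispredicted: discard the buffered stores; memory is unchanged.
        self.entries.clear()


memory = {0x100: 7}
buf = StoreBuffer(memory)

buf.store(0x100, 99)           # store issued along a predicted branch
buf.flush()                    # prediction was wrong
after_flush = memory[0x100]    # memory still holds the original value

buf.store(0x100, 42)           # store issued along a predicted branch
buf.commit()                   # prediction was right
after_commit = memory[0x100]   # memory now holds the stored value
```

In this sketch the memory location is written only by commit, so a flush cannot corrupt it.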
When a pipelined processor incorporates a store buffer, loads that follow a store to a particular memory location may need to retrieve the data from the store buffer, rather than from memory, until the store buffer forwards its contents to memory. This means that when a load takes place, the processor may need to first check if the desired data is being held within the store buffer. This can be done by comparing the address of the desired data with the address of the data held in the store buffer.
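The address check described above may be sketched, by way of example only, as follows. The function name load, the representation of the store buffer as a list of (address, value) pairs, and the youngest-first search order are assumptions made for this sketch.

```python
def load(address, store_buffer, memory):
    """Return the value a load should observe.

    store_buffer: list of (address, value) pairs, oldest first (assumed layout).
    memory: dict of address -> value, standing in for memory.
    """
    # Search youngest-first so the most recent store to the address wins.
    for buffered_address, value in reversed(store_buffer):
        if buffered_address == address:
            return value          # forwarded from the store buffer
    return memory[address]        # no match: read from memory


memory = {0x200: 1, 0x204: 2}
store_buffer = [(0x200, 10), (0x200, 11)]   # two pending stores to 0x200

forwarded = load(0x200, store_buffer, memory)     # address matches a buffered store
from_memory = load(0x204, store_buffer, memory)   # address not in the buffer
```

Here the load of 0x200 obtains the newest buffered value rather than the stale value in memory, while the load of 0x204 falls through to memory.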
But comparing the addresses can be time consuming, particularly in computer systems that utilize virtual memory addressing. In such systems the virtual address may need to be converted to a physical address before the comparison. The conversion can introduce delays that may prevent needed data from being available when required by the processor. Processor wait states, introduced to compensate for the delay, may adversely affect system performance. Speeding up the address conversion and comparison, in turn, may undesirably increase system power consumption.
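The dependency between address conversion and comparison may be sketched, for illustration only, as follows. The 4096-byte page size, the dictionary page table, and the function names translate and hits_store_buffer are assumptions made for this sketch. An actual processor would typically perform the conversion in dedicated hardware.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def translate(virtual_address, page_table):
    """Convert a virtual address to a physical address.

    page_table: dict of virtual page number -> physical frame number (assumed).
    """
    page, offset = divmod(virtual_address, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset


def hits_store_buffer(load_virtual, buffered_physical, page_table):
    # The comparison cannot start until the translation has finished,
    # so the translation delay is added to every store buffer check.
    return translate(load_virtual, page_table) == buffered_physical


page_table = {0: 5, 1: 9}
buffered_physical = 5 * PAGE_SIZE + 0x10   # physical address held in the buffer

hit = hits_store_buffer(0x10, buffered_physical, page_table)
miss = hits_store_buffer(PAGE_SIZE + 0x10, buffered_physical, page_table)
```

Because the comparison consumes the translated address, any delay in translate directly delays the decision of whether the load can be served from the store buffer.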