To increase the operating speed of microprocessors, architectures have been designed and implemented that allow instructions to be executed out of order within the microprocessor. An advantage of out-of-order execution is that it allows load miss latencies to be hidden while useful work is being performed. Traditionally, however, load and store instructions have not been executed out of order with respect to one another, because the result of a load can depend on an earlier store. For example, suppose a store instruction precedes a load instruction in program order, and both instructions refer to the same memory space. If the processor executes the load instruction before the store instruction, the load instruction is likely to load incorrect, or old, data, because the store instruction had not completed when the load instruction executed.
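The hazard described above can be illustrated with a minimal software sketch. The following Python model is illustrative only and does not represent any particular processor: the function name `run`, the address `0x100`, the register name `r1`, and the data values are all hypothetical choices made for the example. It executes one store and one load to the same address, first in program order and then with the load moved ahead of the store.

```python
def run(ops, memory):
    """Execute ops in the order given; return the value each load observed."""
    mem = dict(memory)          # private copy so each run starts from the same state
    loaded = {}
    for op in ops:
        if op[0] == "store":
            _, addr, value = op
            mem[addr] = value   # store writes the new value to memory
        else:
            _, addr, dest = op
            loaded[dest] = mem[addr]  # load reads whatever is in memory right now
    return loaded


ADDR = 0x100                    # hypothetical address shared by both instructions
initial = {ADDR: 1}             # old data present before the store executes

store = ("store", ADDR, 42)     # earlier in program order
load = ("load", ADDR, "r1")     # later in program order

in_order = run([store, load], initial)      # program order
out_of_order = run([load, store], initial)  # load executed ahead of the store

print("in order:     r1 =", in_order["r1"])
print("out of order: r1 =", out_of_order["r1"])
```

In program order the load observes the newly stored value 42. When the load is executed first, it observes the old value 1, which is the incorrect result that arises when the store has not completed before the load.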
Techniques have nevertheless been implemented that attempt to execute load and store instructions out of order, but such techniques have often required too many processor cycles. As microprocessor speeds continue to increase, there is a need in the art for the ability to execute such load and store instructions in parallel, and to correct the problems described above more efficiently and quickly.