To increase the operating speed of microprocessors, architectures have been designed and implemented that allow instructions to be executed out of order within the microprocessor. Traditionally, however, load and store instructions have not been executed out of order because of the difficulty in ensuring that data dependencies are met. For example, suppose a store instruction precedes a load instruction in program order and both instructions refer to the same memory location. If the processor executes the two instructions out of order, so that the load instruction is executed before the store instruction, the load instruction may load incorrect, or old, data because the store instruction did not complete before the load instruction executed.
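The hazard described above can be illustrated with a minimal Python sketch. The dictionary-based memory model, the address 0x100, the data values, and the helper name run are all assumptions made for illustration only; they do not represent any particular processor design.

```python
# Hypothetical model: memory is a dict, and each operation is a
# (kind, address, value) tuple. None of these names come from the text.

def run(ops, memory):
    """Execute ops in the order given and return the values seen by loads."""
    mem = dict(memory)          # work on a copy so each run starts fresh
    loaded = []
    for kind, addr, value in ops:
        if kind == "store":
            mem[addr] = value   # store writes the new value to memory
        else:
            loaded.append(mem[addr])  # load reads whatever is there now
    return loaded

ADDR = 0x100                    # assumed shared memory location
initial = {ADDR: 1}             # old data present before the store

store = ("store", ADDR, 2)      # earlier in program order
load = ("load", ADDR, None)     # later in program order

in_order = run([store, load], initial)    # load sees the stored value
reordered = run([load, store], initial)   # load sees the old value

print("program order:", in_order)
print("load executed first:", reordered)
```

In this sketch the load observes the new value only when the store completes first; when the load is executed ahead of the store, it observes the old value that was in memory beforehand.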
Furthermore, even if such store and load instructions are permitted to execute out of order, a store operation may still stall while waiting for necessary data to become available. Therefore, there is a need in the art to improve the performance of executing store instructions in a processor.