In the continuing development of faster and more powerful computer systems, a significant type of microprocessor has come into use, known as the reduced instruction set computer (RISC) processor. Advances in the field of RISC processors have led to the development of superscalar processors. Superscalar processors, as their name implies, perform functions not commonly found in traditional scalar microprocessors. Included in these functions is the ability to execute instructions out of order with respect to the program order. Although the instructions execute out of order, the results of the executions appear to have occurred in program order, so that proper data coherency is maintained.
A common bottleneck in superscalar processor performance is the number of instructions which can be outstanding within the processor at a given time. Typically, the instruction unit includes a queue which tracks the outstanding instructions. Dispatching of further instructions is typically suspended if the maximum number of outstanding instructions is reached.
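By way of illustration only, the dispatch-suspension behavior described above may be sketched in software as a bounded queue. The class name `OutstandingQueue`, its methods, and the capacity value are hypothetical and chosen for this sketch; they do not describe any particular processor's instruction unit.

```python
from collections import deque


class OutstandingQueue:
    """Toy model (an assumption, not actual hardware) of a bounded queue
    that tracks instructions dispatched but not yet completed."""

    def __init__(self, capacity):
        self.capacity = capacity  # maximum number of outstanding instructions
        self.entries = deque()    # oldest outstanding instruction at the left

    def can_dispatch(self):
        # Dispatch is allowed only while the queue has a free entry.
        return len(self.entries) < self.capacity

    def dispatch(self, instruction):
        # Returns False to model a dispatch stall when the queue is full.
        if not self.can_dispatch():
            return False
        self.entries.append(instruction)
        return True

    def complete_oldest(self):
        # Completion happens in program order, so the oldest entry leaves first.
        return self.entries.popleft() if self.entries else None


queue = OutstandingQueue(capacity=2)
first = queue.dispatch("add")      # accepted
second = queue.dispatch("store")   # accepted, queue is now full
third = queue.dispatch("load")     # refused: models a dispatch stall
completed = queue.complete_oldest()
retry = queue.dispatch("load")     # accepted once an entry has been freed
```

In this sketch the third dispatch is refused until the oldest outstanding instruction completes, which is the sense in which a full queue limits processor throughput.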
One type of instruction which can be slow to complete is the store instruction. Store instructions are slow to complete for a number of reasons, including the limit on the number of stores which can be completed per cycle and the limit on the number of stores which can update the cache each cycle. Conventional superscalar processors typically complete only one store instruction per cycle, which often causes dispatch stalls. Accordingly, a need exists for a system that efficiently and effectively addresses these problems and decreases the number of dispatch unit stalls caused by a lack of store instruction completions, thereby enhancing overall processor performance.
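By way of illustration only, the relationship between the store completion rate and dispatch stalls may be approximated by a simple cycle-by-cycle count. The function `count_dispatch_stalls`, its parameters, and the assumed dispatch width of two instructions per cycle are hypothetical; the model ignores cache update limits and all non-store instructions.

```python
def count_dispatch_stalls(num_stores, queue_capacity,
                          stores_completed_per_cycle, dispatch_width=2):
    """Count cycles in which dispatch is blocked by a full queue.

    Simplified, assumed model: every instruction is a store, each cycle
    first completes stores and then attempts to dispatch new ones.
    """
    outstanding = 0  # stores dispatched but not yet completed
    dispatched = 0   # stores dispatched so far
    stalls = 0       # cycles in which a dispatch attempt was refused

    while dispatched < num_stores:
        # Completion stage: complete up to the per-cycle limit.
        outstanding -= min(outstanding, stores_completed_per_cycle)

        # Dispatch stage: try to dispatch up to dispatch_width stores.
        for _ in range(dispatch_width):
            if dispatched == num_stores:
                break
            if outstanding < queue_capacity:
                outstanding += 1
                dispatched += 1
            else:
                stalls += 1  # queue full: dispatch stalls for this cycle
                break
    return stalls


one_per_cycle = count_dispatch_stalls(8, queue_capacity=4,
                                      stores_completed_per_cycle=1)
two_per_cycle = count_dispatch_stalls(8, queue_capacity=4,
                                      stores_completed_per_cycle=2)
```

Under these assumptions, completing only one store per cycle lets the queue fill and produces stalls, whereas completing two stores per cycle keeps pace with dispatch.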