This invention relates generally to improving computer system efficiency, and more particularly to reducing operand store compare reject penalties through early detection based on instruction text.
As computer system designers continually seek to improve processor performance, it is beneficial to develop approaches that reduce cycles per instruction (CPI). Operand store compare (OSC) penalties can be a large contributor to high CPI. An OSC occurs when an instruction that stores an operand to memory is followed by an instruction that loads the same data from the memory before the stored data has actually reached the memory.
As a stream of instructions progresses through a processor pipeline, various control units perform tasks such as fetching instructions, dispatching instructions, calculating address values, accessing registers, fetching operands, executing instructions, checking for error conditions, and retiring the instructions, including storing the results. As instructions advance deeper into the pipeline, dependency conditions, errors, incorrectly predicted branches, and the like can stall their progress while the conditions are handled. The problem with OSC is that the condition cannot be detected until a cache address is formed, which may be after the instruction has already been dispatched, requiring costly stall/reset mechanisms. For instance, when load-store (LS) logic detects an OSC, it rejects/recycles the load instruction and holds it from dispatching again until the stored data reaches a point where it can be bypassed (or read) by the load instruction. The penalty of such a reject can be many (e.g., 9) cycles of unused processing time.
Compilers that generate the instructions typically try to place enough distance between instructions that store and load the same data to minimize the OSC penalty. However, if the distance between the store and the load is not large enough, the load can still be rejected in the processor pipeline. The distance between two instructions is defined as the number of cycles between the dispatches of the two instructions. The distance is zero if the store and load are grouped and dispatched together, for instance, in a super-scalar architecture.
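The OSC condition and the dispatch distance described above can be illustrated with a minimal software model. This is only a sketch under stated assumptions, not a description of any actual hardware: the names `Instr`, `dispatch_distance`, and `osc_rejects` are hypothetical, `MIN_BYPASS_DISTANCE` is an invented threshold, `REJECT_PENALTY` simply echoes the "e.g., 9" cycles mentioned above, and addresses are compared for exact equality rather than for overlapping byte ranges. Note that the model compares fully formed addresses, which is exactly the late point of detection that the background identifies as the problem.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical threshold: the minimum dispatch distance (in cycles) at which
# stored data is assumed to be available to a later load. Not from the source.
MIN_BYPASS_DISTANCE = 5

# Illustrative reject penalty, echoing the "e.g., 9" cycles in the text.
REJECT_PENALTY = 9


@dataclass
class Instr:
    op: str              # "store", "load", or any other mnemonic
    address: int         # operand address, known only once it has been formed
    dispatch_cycle: int  # cycle in which the instruction was dispatched


def dispatch_distance(first: Instr, second: Instr) -> int:
    """Number of cycles between the dispatches of two instructions.

    Zero means the two instructions were grouped and dispatched together.
    """
    return second.dispatch_cycle - first.dispatch_cycle


def osc_rejects(stream: List[Instr]) -> List[Tuple[Instr, Instr]]:
    """Return (store, load) pairs in which the load would be rejected.

    A load is treated as rejected when it reads an address written by an
    earlier store that was dispatched fewer than MIN_BYPASS_DISTANCE cycles
    before it.
    """
    last_store: Dict[int, Instr] = {}  # address -> most recent store to it
    rejects: List[Tuple[Instr, Instr]] = []
    for instr in stream:
        if instr.op == "store":
            last_store[instr.address] = instr
        elif instr.op == "load":
            store = last_store.get(instr.address)
            if store is not None and dispatch_distance(store, instr) < MIN_BYPASS_DISTANCE:
                rejects.append((store, instr))
    return rejects


if __name__ == "__main__":
    stream = [
        Instr("store", 0x100, 0),
        Instr("load", 0x100, 0),   # grouped with the store: distance zero
        Instr("load", 0x200, 1),   # different address: no OSC
        Instr("load", 0x100, 10),  # same address, but far enough away
    ]
    rejected = osc_rejects(stream)
    print(len(rejected), "rejected load(s),", len(rejected) * REJECT_PENALTY, "penalty cycles")
```

In this model only the load grouped with the store is rejected; the load dispatched ten cycles later is assumed to find the stored data already available.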
It would be beneficial to develop an approach that identifies an OSC early in the pipeline to minimize the associated delays. Such an approach should not require additional memory for storing accumulated instruction history, but should instead take advantage of access to instruction text as it moves through the pipeline stages. Accordingly, there is a need in the art for early instruction text based OSC avoidance.