The present embodiments relate to reducing operand store compare (OSC) penalties, and more particularly to comparing address determination fields of cracked load and store units of operation (UOPs) to detect potential dependencies.
A processor accesses operands according to instruction-defined methods. The instruction may provide an immediate operand using the value of a portion of the instruction, or may provide one or more register fields explicitly pointing to either general purpose registers or special purpose registers (floating point registers, for example). The instruction may utilize implied registers, identified by an opcode field, as operands. The instruction may also utilize memory locations for operands. A memory location of an operand may be provided by a register, an immediate field, or a combination of registers and an immediate field, as exemplified by the long displacement facility of the z/Architecture™ from International Business Machines Corporation (IBM), wherein the instruction defines a base register, an index register and an immediate field (displacement field) that are added together to provide the address of the operand in memory. A value of zero in the base or index field specifies that no base or index is to be applied, and thus that no general register is designated as containing a base address or index.
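The base-plus-index-plus-displacement addressing described above can be illustrated with a minimal sketch. The function name `effective_address`, the list-based register file, and the wrap to 64 bits are assumptions made for illustration only and are not taken from any architecture definition.

```python
MASK64 = (1 << 64) - 1  # assumed 64-bit address width, for illustration only


def effective_address(regs, base, index, disp):
    """Add base, index and displacement to form an operand address.

    A value of zero in the base or index field means that no register is
    applied, so the content of general register 0 is never read here.
    """
    base_val = regs[base] if base != 0 else 0
    index_val = regs[index] if index != 0 else 0
    return (base_val + index_val + disp) & MASK64


# Hypothetical register contents; register 0 holds a value that must be ignored.
regs = [0xDEAD, 0x1000, 0x20] + [0] * 13
print(hex(effective_address(regs, 1, 2, 8)))  # base, index and displacement
print(hex(effective_address(regs, 0, 2, 8)))  # no base applied
```

Treating a zero field as "no register" rather than "register 0" is the point of the sketch: the content of general register 0 never contributes to the sum.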
Reducing cycles per instruction (CPI) improves processor performance. CPI may be increased by operand store compare (OSC) penalties, which occur when an instruction that stores an operand to memory is followed by an instruction that loads data from the same memory location before the stored data has actually reached the memory. The memory location may be identified by implied registers, memory locations, immediate fields, or combinations of registers and immediate fields specified by the instructions. One problem with the handling of an OSC is that it may not be recognized until after a memory access address is formed, which may occur after the instruction has been issued. Costly processor pipeline stall, recycle and reset mechanisms are then necessary to remedy the OSC. The stall and reset mechanisms may result in a penalty of many cycles of unused processing time.
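By way of illustration only, the following sketch shows how a potential OSC might be flagged from the address determination fields alone, before any memory access address is formed. The `MemOp` record, the function names, and the exact-match rule are hypothetical. The sketch also assumes that the registers named by the fields are not modified between the store and the load; matching fields then suggest, but do not prove, a common address, and differing fields may still resolve to the same address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemOp:
    """Hypothetical record of a memory-access instruction in program order."""
    kind: str   # "store" or "load"
    base: int   # base register field
    index: int  # index register field
    disp: int   # displacement field


def same_address_fields(a, b):
    """True when both operations name the same base, index and displacement."""
    return (a.base, a.index, a.disp) == (b.base, b.index, b.disp)


def potential_osc_pairs(ops):
    """Return (store_position, load_position) pairs for each later load whose
    address fields match those of an earlier store."""
    pairs = []
    for i, older in enumerate(ops):
        if older.kind != "store":
            continue
        for j in range(i + 1, len(ops)):
            younger = ops[j]
            if younger.kind == "load" and same_address_fields(older, younger):
                pairs.append((i, j))
    return pairs


ops = [
    MemOp("store", 1, 2, 8),
    MemOp("load", 1, 2, 8),   # same fields as the store at position 0
    MemOp("load", 1, 2, 16),  # different displacement, not flagged
    MemOp("store", 3, 0, 0),
    MemOp("load", 3, 0, 0),   # same fields as the store at position 3
]
print(potential_osc_pairs(ops))
```

Because the comparison uses only instruction fields, it could in principle be made before issue, at the cost of possible false positives and missed cases.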
In a system with processors capable of out-of-order (OOO) processing, in which instructions may be processed out of order relative to the order in which they are fetched and decoded, the OSC penalties may increase, since an incorrectly loaded value obtained by a load instruction subject to the OSC may already have been used by downstream instructions.