As computer system designers seek to continually improve processor performance, it is beneficial to develop approaches that increase instructions per cycle (IPC) by optimizing the rate of instruction processing and increasing data throughput. This is especially true for instructions with variable operand lengths, such as variable-operand-length storage-to-storage instructions (SS ops). However, conventional systems generally incur a large overhead in the startup sequences of these types of instructions, which reduces system performance. For example, some conventional systems execute SS ops within a load/store unit (LSU) using a sequence, which occupies both LSU pipelines for the duration of the op, similar to the following. In a first LSU pipeline, a destination operand starting address pretest is performed while, in a second LSU pipeline, a source operand starting address store pretest is performed. Then, in the first LSU pipeline, a destination operand ending address store pretest is performed while, in the second LSU pipeline, a source operand ending address store pretest is performed. Subsequently, operand data streaming (1 to 256 bytes) is performed in both pipelines.
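The conventional sequence described above can be sketched as a small schedule, purely for illustration. The function name `conventional_ss_sequence` and the tuple layout (first pipeline, second pipeline) are hypothetical; the sketch only restates the phases named in the text and does not model hardware timing.

```python
from typing import List, Tuple


def conventional_ss_sequence(length: int) -> List[Tuple[str, str]]:
    """Return (first pipeline, second pipeline) activity for each phase.

    Illustrative only: the phase names are taken from the description
    above; the data layout is an assumption made for this sketch.
    """
    if not 1 <= length <= 256:
        raise ValueError("operand length must be 1 to 256 bytes")
    stream = f"operand data streaming ({length} bytes)"
    return [
        # Phase 1: starting addresses are pretested, one operand per pipeline.
        ("destination operand starting address pretest",
         "source operand starting address store pretest"),
        # Phase 2: ending addresses are pretested, one operand per pipeline.
        ("destination operand ending address store pretest",
         "source operand ending address store pretest"),
        # Phase 3: both pipelines stream the operand data.
        (stream, stream),
    ]


if __name__ == "__main__":
    for first, second in conventional_ss_sequence(8):
        print(f"{first:50} | {second}")
```

In this sketch the two pretest phases are the same for a 1-byte and a 256-byte operand, which is the fixed startup cost the text refers to.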
With respect to an SS op instruction such as a move character (MVC) type instruction, two double-words of the source operand are read from the D-cache each cycle and written into the store buffer. During the data streaming phase for arithmetic SS instructions, such as an OR character (OC) instruction, AND character (NC) instruction, exclusive OR character (XC) instruction, etc., one double-word of the source operand and one double-word of the destination operand are read from the D-cache each cycle, the specified arithmetic operation is performed, and the result is written into the store buffer. This conventional processing of variable-operand-length instructions generally results in a large startup overhead (including the store pretests) for “short” sequences, an overhead that is generally much larger than the cost of the actual operand streaming. Previously, guidelines have been established for compilers and software to use separate load, store (and arithmetic) instructions for short sequences. Unfortunately, the definition of “short” varies from machine to machine.
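The streaming phase and the startup overhead can be illustrated with a minimal functional model. Everything here is an assumption made for the sketch: the function names are hypothetical, operands are taken to be non-overlapping, OC, NC and XC are modeled as bytewise OR, AND and exclusive OR, and each of the two pretest steps is charged one cycle because the text gives no pretest latency.

```python
import math
from typing import Tuple

DOUBLEWORD = 8      # bytes per double-word
PRETEST_STEPS = 2   # starting-address and ending-address pretest steps

# Bytewise operations assumed for the SS instructions named in the text.
OPS = {
    "OC": lambda d, s: d | s,   # OR
    "NC": lambda d, s: d & s,   # AND
    "XC": lambda d, s: d ^ s,   # exclusive OR
}


def stream_mvc(source: bytes) -> Tuple[bytes, int]:
    """Move: two source double-words (16 bytes) are read per cycle."""
    cycles = math.ceil(len(source) / (2 * DOUBLEWORD))
    return bytes(source), cycles


def stream_arith(op: str, dest: bytes, source: bytes) -> Tuple[bytes, int]:
    """OC/NC/XC: one source and one destination double-word per cycle."""
    if len(dest) != len(source):
        raise ValueError("operands must have the same length")
    store_buffer = bytearray()
    cycles = 0
    for off in range(0, len(dest), DOUBLEWORD):
        d = dest[off:off + DOUBLEWORD]
        s = source[off:off + DOUBLEWORD]
        store_buffer.extend(OPS[op](a, b) for a, b in zip(d, s))
        cycles += 1
    return bytes(store_buffer), cycles


def mvc_startup_share(length: int) -> float:
    """Fraction of modeled cycles spent in pretests (1 cycle per step assumed)."""
    stream_cycles = math.ceil(length / (2 * DOUBLEWORD))
    return PRETEST_STEPS / (PRETEST_STEPS + stream_cycles)
```

Under these assumptions an 8-byte MVC spends two of its three modeled cycles in pretests, while a 256-byte MVC spends two of eighteen. A real pretest latency would shift these numbers, but the fixed startup cost weighs most on short operands in any case.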