The high instruction processing rate of a large, high-performance digital computer is generally achieved by designs which employ various means of instruction pre-fetch, decoding, and execution overlap. Many such designs provide a maximum processing rate of one instruction per cycle, because only a single instruction can be decoded during a cycle. The maximum processing rate of large, general purpose digital computers also depends on the ability to pre-fetch instructions. Pre-fetching means that instructions are fetched from a main storage system prior to the time they are required for processing. These instructions are held in a high-speed local store normally referred to as an instruction buffer. When needed, instructions are gated out of the instruction buffer, one at a time, into an instruction register for decoding and main store address generation. After decoding and address generation, the instruction is gated into an operation register to be utilized by an execution unit during instruction processing.
An instruction buffer might have a capacity of up to four doublewords (one word equals 32 binary bits, so four doublewords hold 256 bits, or 32 bytes). If the high-performance data processing system is designed to operate with instructions having different lengths, such as defined in the IBM System/370 Principles of Operation, Form No. GA22-7000, an instruction buffer of four doublewords is capable of storing approximately eight instructions.
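The capacity estimate above can be illustrated with a minimal sketch. It assumes the System/370 convention that the two high-order bits of the first op-code byte give the instruction length (two, four, or six bytes), and it simplifies by counting only whole instructions that fit in the buffer. The names `instruction_length`, `count_buffered`, and `BUFFER_BYTES` are hypothetical and are not taken from the text.

```python
# Sketch only: a simplified model of how many whole instructions fit in a
# four-doubleword instruction buffer. Names here are illustrative assumptions.

BUFFER_BYTES = 4 * 8  # four doublewords of 8 bytes each = 32 bytes


def instruction_length(opcode: int) -> int:
    """Return the instruction length in bytes from the first op-code byte.

    Assumes the System/370 encoding: high-order bits 00 -> 2 bytes,
    01 or 10 -> 4 bytes, 11 -> 6 bytes.
    """
    top_bits = (opcode >> 6) & 0b11
    if top_bits == 0b00:
        return 2
    if top_bits == 0b11:
        return 6
    return 4


def count_buffered(opcodes, capacity: int = BUFFER_BYTES) -> int:
    """Count how many whole instructions, taken in order, fit in the buffer."""
    used = 0
    count = 0
    for opcode in opcodes:
        length = instruction_length(opcode)
        if used + length > capacity:
            break  # the next instruction would not fit entirely
        used += length
        count += 1
    return count
```

With a stream consisting only of four-byte instructions, `count_buffered` returns 8, which agrees with the approximate figure given above; streams of two-byte or six-byte instructions yield 16 and 5, respectively.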
Any stored program computer is designed to function with a predetermined number of previously defined program instructions which accomplish prespecified functions in the data processing system. Programs are written utilizing this instruction set. A programmer may desire that certain functions ultimately be accomplished, and he is required to create sequences of program instructions from the instruction set available to him. When the programs have been written, and particular functions are seen to occur frequently, it might be desirable to define a new program instruction which, through a single decoding operation, would accomplish the desired overall function. However, data processing systems already designed and in use would not be able to recognize the newly defined instruction, and therefore would not gain any performance improvement. Even if a new data processing system were designed to recognize a new instruction for accomplishing an overall function, the vast number of previously written programs dealing with system control or applications would not benefit from the increased performance, and would have to be rewritten to utilize the newly defined instruction.
Instructions defined in the IBM System/370 Principles of Operation include an 8-bit Op-Code field, and a number of other 4-bit fields which provide program addressable access to 16 general purpose registers (GPR's). The 16 GPR's can be programmed to temporarily store operand data, base address values, or address index values. The general register designated as a base address register or address index register is utilized during instruction decoding phases to generate, or calculate, main storage addresses. The address generation phase may include the addition of a base address value from a general register, an address index value from a general register, and a 12-bit displacement address value contained in the instruction.
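The address generation just described can be sketched as follows. The sketch assumes the System/370 RX instruction layout (8-bit op-code, then 4-bit R1, X2, and B2 fields, then a 12-bit displacement), 24-bit storage addresses, and the architectural convention that a zero in the index or base field contributes no value. These details come from the System/370 architecture rather than from the text above, and the names `decode_rx`, `effective_address`, and `ADDRESS_MASK` are hypothetical.

```python
# Sketch only: base + index + displacement address generation for an
# RX-format instruction. Field layout and 24-bit masking are assumptions
# drawn from the System/370 architecture, not from the surrounding text.

ADDRESS_MASK = 0xFFFFFF  # assumption: 24-bit storage addresses


def decode_rx(instr: int) -> dict:
    """Split a 32-bit RX-format instruction into its fields."""
    return {
        "op": (instr >> 24) & 0xFF,   # 8-bit op-code
        "r1": (instr >> 20) & 0xF,    # first operand register
        "x2": (instr >> 16) & 0xF,    # index register number
        "b2": (instr >> 12) & 0xF,    # base register number
        "d2": instr & 0xFFF,          # 12-bit displacement
    }


def effective_address(gprs, x2: int, b2: int, d2: int) -> int:
    """Add base, index, and displacement to form a storage address.

    A register number of zero is treated as "no register": it contributes
    zero instead of the contents of general register 0.
    """
    index = gprs[x2] if x2 != 0 else 0
    base = gprs[b2] if b2 != 0 else 0
    return (base + index + d2) & ADDRESS_MASK


# Usage: op-code 0x5A with R1=3, X2=4, B2=5, displacement 0x010.
gprs = [0] * 16
gprs[4] = 0x100    # index value
gprs[5] = 0x2000   # base address value
fields = decode_rx(0x5A345010)
address = effective_address(gprs, fields["x2"], fields["b2"], fields["d2"])
```

Here `address` evaluates to 0x2110, the sum of the base value 0x2000, the index value 0x100, and the displacement 0x010.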
In data processing systems which have been designed to provide instruction decoding and execution overlap, a performance degradation is realized when a first instruction, during execution, effects a change of the data stored in a general register which is to be utilized by the next instruction as address information. This condition is known as "address generate interlock". It degrades performance because the instruction following the first cannot enter the instruction decode and address generate phase until the first instruction has been executed.
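The interlock condition can be expressed as a simple check on two adjacent instructions: does the first write a general register that the second needs for address generation? The following is a minimal sketch under that reading; the `Instr` record, its field names, and the functions `has_agi` and `count_interlocks` are hypothetical illustrations and do not describe any particular hardware mechanism.

```python
# Sketch only: detecting an "address generate interlock" between adjacent
# instructions. The Instr record and its fields are illustrative assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    writes: frozenset = frozenset()     # GPR numbers changed during execution
    addr_regs: frozenset = frozenset()  # GPR numbers used as base or index


def has_agi(first: Instr, second: Instr) -> bool:
    """True if `second` needs, for addressing, a GPR that `first` changes."""
    return bool(first.writes & second.addr_regs)


def count_interlocks(program) -> int:
    """Count adjacent instruction pairs that exhibit the interlock."""
    return sum(has_agi(a, b) for a, b in zip(program, program[1:]))


# Usage: a register-to-register load into GPR 5, followed by a storage
# load that uses GPR 4 as index and GPR 5 as base.
load_reg = Instr("LR 5,7", writes=frozenset({5}))
load_mem = Instr("L 3,16(4,5)", writes=frozenset({3}), addr_regs=frozenset({4, 5}))
add_reg = Instr("AR 3,2", writes=frozenset({3}))

interlocked = has_agi(load_reg, load_mem)    # GPR 5 is written, then used as base
independent = has_agi(add_reg, load_mem)     # GPR 3 is not used for addressing
```

In this example `interlocked` is true and `independent` is false: only the first pair forces the second instruction to wait for the execution of the first before its address can be generated.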