1. Field of the Invention
The invention described in this patent application relates to the execution of instructions in digital computer systems and more specifically to the execution of conditional branch instructions in processors having pipelined instruction execution.
2. Description of Related Art: FIG. 1
The technique of pipelining has long been employed in the computer arts to speed instruction execution. In pipelining, execution of a sequence of instructions is divided into stages and execution of different parts of the same instruction or parts of different instructions proceeds in parallel. An example of such a pipelined system is the CPU of the VS 7000 family of computer systems built by Wang Laboratories, Inc. This CPU is described in detail in U.S. patent application Ser. No. 147,053, Information Processing System with Enhanced Instruction Execution and Support Control, David L. Whipple, filed Jan. 22, 1988, which in turn is a continuation of U.S. patent application Ser. No. 730,977, filed May 6, 1985. FIG. 1 is a block diagram of the VS 7000 CPU. The main components of that system for purposes of the present discussion are MEM 225, which is a cache into which instructions and data are loaded from the system's main memory and in which they are held while being processed by the CPU; instruction queue (IQ) 103, a queue into which fixed-size portions (IP) 109 of an instruction stream consisting of instructions to be executed are loaded from MEM 225 prior to execution; address generator (AGEN) 115, which generates addresses for MEM 225 from address information contained in the instructions; and instruction interpreter (IINT) 111, which controls operation of the VS 7000 CPU by executing microcode in response to control information contained in the instructions. As may be seen from FIG. 1, data is carried from MEM 225 to IQ 103 by data bus (DBUS) 227, while addresses are carried from AGEN 115 to MEM 225 by address bus (ABUS) 121.
The VS 7000 series machines employ a virtual memory system, that is, a very large logical address space is mapped onto a much smaller physical address space. In such systems, virtual addresses, which specify locations in the logical address space, must be translated into physical addresses specifying the locations at which the actual data or instructions are presently located in the physical address space. Thus, generation of an address in the VS 7000 series machines involves two steps: computing a virtual address and converting the virtual address to its corresponding physical address. VACOMP 112 is the component of AGEN 115 which computes virtual addresses; it is connected by VABUS 113 to virtual address register (VAR) 114 and to physical address translator (PATRANS) 116. VAR 114 operates under control of IINT 111 to retain virtual addresses placed by VACOMP 112 on VABUS 113. PATRANS 116 translates the virtual address to its corresponding physical address and, as determined by microcode operating in IINT 111, outputs the physical address to either instruction address generator (IAG) 119, which generates the address of the next portion of the sequence of instructions (IP) 109 to be loaded into queue tail (QT) 107 of IQ 103, or address register (AR) 117, which provides operand addresses to MEM 225.
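The two-step address generation described above may be illustrated by the following minimal sketch. It is not taken from the VS 7000 design: the page size, the dictionary used as a page table, and the function names are assumptions made for illustration only, and the sum used in the first step is the form the text gives for RX-format instructions.

```python
PAGE_SIZE = 4096  # assumed page size; the source does not state one


def compute_virtual_address(base, index, displacement):
    """Step 1 (the role played by VACOMP 112): form a virtual address."""
    return base + index + displacement


def translate(virtual_address, page_table):
    """Step 2 (the role played by PATRANS 116): virtual to physical."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table[page]  # a KeyError here stands in for a page fault
    return frame * PAGE_SIZE + offset


# Hypothetical mapping of virtual page numbers to physical frame numbers.
page_table = {0: 7, 1: 3}

va = compute_virtual_address(base=0x1000, index=0x10, displacement=0x4)
pa = translate(va, page_table)
print(hex(va), hex(pa))
```

In this sketch the virtual address 0x1014 lies in virtual page 1, which the hypothetical table maps to frame 3, so the offset 0x14 is carried over unchanged into the physical address.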
The pipeline of the VS 7000 system divides instruction execution into three stages: fetching an IP 109 from MEM 225 and placing it in QT 107 of IQ 103, generating addresses from the address information in the instruction syllables currently in queue head (QH) 105, and actually executing the instruction in IINT 111 using control information received from QH 105 and the data addressed in the address generation stage. Since the fetching, address generation, and interpretation operations proceed in parallel, the VS 7000 CPU may within a single CPU cycle interpret one instruction in IINT 111, generate an address from the instruction or portion thereof currently in QH 105, and load another IP into QT 107. As is implicit in the above, there are two kinds of addresses generated by AGEN 115: addresses for IPs 109 and addresses of operands, i.e., data used in instruction execution. Corresponding to these two kinds of addresses, there are two components which receive addresses in parallel from AGEN 115. Addresses to be used for operands are received in AR 117; addresses to be used to generate instruction addresses are loaded into IAG 119. Once loaded, IAG 119 operates independently of AGEN 115 to increment the address with which it is loaded to obtain the address of the next fixed-length instruction portion to be loaded into QT 107. The address is provided to MEM 225 when there is room in IQ 103 and MEM 225 is not providing operands.
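The overlap of the three stages may be illustrated by the following simplified sketch. It assumes, for illustration only, that every instruction spends exactly one cycle in each stage and that the pipeline never stalls; the stage labels and the function name are hypothetical.

```python
STAGES = ("fetch", "agen", "execute")  # the three stages named in the text


def pipeline_schedule(instructions):
    """Return, for each cycle, which instruction occupies each stage.

    Simplifying assumption: one cycle per stage and no stalls.
    """
    n_cycles = len(instructions) + len(STAGES) - 1
    schedule = []
    for cycle in range(n_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            i = cycle - depth  # instruction that entered `depth` cycles ago
            row[stage] = instructions[i] if 0 <= i < len(instructions) else None
        schedule.append(row)
    return schedule


schedule = pipeline_schedule(["I1", "I2", "I3", "I4"])
for cycle, row in enumerate(schedule):
    print(cycle, row)
```

Once the pipeline is full, each cycle shows three different instructions in flight, one per stage, which is the parallelism the paragraph above describes.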
Operation of the pipeline is controlled by IINT 111. The following pipeline operations are of particular interest: dispatch, advance, and the memory operations fetch and data read. IINT 111 performs the dispatch operation on the last cycle of execution of an instruction. At that point, the first portion of the next instruction to be executed is in QH 105. The operation involves the computation of a virtual address and generation of a physical address in AGEN 115, updating of program counter registers in VACOMP 112, provision of the instruction or a beginning portion thereof to IINT 111, and advancement of IQ 103. At the end of the dispatch operation, VAR 114 contains the virtual address generated from the contents of QH 105, AR 117 contains the physical address corresponding to the virtual address, the program counter registers point to the instruction whose execution begins in the next cycle and to the following instruction, and the contents of IQ 103 which follow the original contents of QH 105 are in QH 105. IINT 111 performs the advance operation when it is interpreting an instruction which is longer than the length of QH 105. At that point, the operation code, including the format information for the instruction, is in IINT 111. Advance works as does dispatch, except that the amount by which IQ 103 is advanced is under microcode control, as is whether the virtual address computed from the contents of QH 105 at the beginning of the operation is retained in VAR 114 and the corresponding physical address in AR 117. Moreover, the contents of QH 105 are not loaded into IINT 111, and the program counter registers in VACOMP 112 are not updated.
The memory operations provide physical addresses via ABUS 121 to MEM 225 and receive instructions or data from MEM 225 in the CPU. The fetch operation uses a physical address provided by IAG 119 and loads QT 107 in IQ 103. The data read operation uses a physical address provided by AR 117 and provides the data to an input register in the CPU. With both operations, the data or instruction is available to the CPU by the end of the microcycle in which the memory operation is performed.
While the pipeline in the VS 7000 type CPU of FIG. 1 is very effective when executing a continuous sequence of instructions, its advantages become disadvantages when a conditional branch instruction must be executed. A conditional branch instruction is an instruction which causes program execution to branch to a target instruction at an address defined in the conditional branch instruction if a condition specified in the instruction is satisfied. Of course, when the branch is taken, the address of the target instruction must be computed, the program counter registers and IAG 119 must be set to point to the instruction sequence beginning with the target instruction, the target instruction must be loaded into the head of IQ 103, and any other instructions in IQ 103 must be discarded.
FIG. 1A shows one of the types of conditional branch instructions executed by the VS 7000 type CPU. Branch on condition instruction 123 is 32 bits long; the first 8 bits are an operation code (OC) 127 specifying the branch instruction; of the 8 operation code bits, bits 0-1 are an instruction type code (IT) 125 which specifies the instruction's format. Instruction 123 has the RX format, which means that it is 32 bits long and that the branch address is computed by adding a base value and a displacement value to an index value. The index and base values are kept in registers in VACOMP 112 which are specified by the values of IR field 131 and BR field 133 respectively in the instruction. DISP field 135 contains the displacement value. CMASK field 129 contains a mask for selecting the condition code bits whose values will determine whether the branch is taken.
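The decoding of such an instruction and the computation of its branch address may be sketched as follows. Only the 32-bit length, the 8-bit operation code, and the position of IT 125 in bits 0-1 come from the description above. The widths and positions assumed here for the remaining fields (4-bit CMASK, IR, and BR fields followed by a 12-bit DISP field, with bit 0 the most significant bit), the 16-entry register file, and the rule that the branch is taken when any mask-selected condition bit is set are assumptions made for illustration only.

```python
from typing import NamedTuple


class BranchOnCondition(NamedTuple):
    oc: int     # operation code (OC 127), bits 0-7
    it: int     # instruction type code (IT 125), bits 0-1 of OC
    cmask: int  # condition mask (CMASK 129); 4-bit width is assumed
    ir: int     # index register number (IR 131); width assumed
    br: int     # base register number (BR 133); width assumed
    disp: int   # displacement (DISP 135); 12-bit width assumed


def decode_rx(word):
    """Split a 32-bit word into fields under the assumed layout."""
    oc = (word >> 24) & 0xFF
    return BranchOnCondition(
        oc=oc,
        it=oc >> 6,
        cmask=(word >> 20) & 0xF,
        ir=(word >> 16) & 0xF,
        br=(word >> 12) & 0xF,
        disp=word & 0xFFF,
    )


def branch_address(instr, registers):
    """Base value plus index value plus displacement, as stated in the text."""
    return registers[instr.br] + registers[instr.ir] + instr.disp


def branch_taken(instr, condition_bits):
    """Assumed rule: taken if any condition bit selected by the mask is set."""
    return bool(instr.cmask & condition_bits)


registers = [0] * 16       # hypothetical register file
registers[1] = 0x20        # index value
registers[2] = 0x3000      # base value

instr = decode_rx(0x47812100)  # arbitrary example word
target = branch_address(instr, registers)
print(instr, hex(target), branch_taken(instr, 0b1000))
```

With the example word, the index register is 1, the base register is 2, and the displacement is 0x100, so the computed branch address is 0x3000 + 0x20 + 0x100.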
Execution of conditional branch instruction 123 involves the following steps. Steps 1 and 2 are performed for every branch instruction; steps 3 and 4 are performed only if the branch is taken.
1. On the last cycle of execution of the instruction preceding the branch instruction (the syllables making up the branch instruction are in QH 105): performing a dispatch operation. At the end of the dispatch operation, the virtual address of the target instruction is in VAR 114, the physical address is in AR 117, the branch instruction is in IINT 111, and at least the beginning of the instruction following the branch instruction is in QH 105.
2. On the next cycle, which is the first cycle of execution of the branch instruction: testing the condition code using the mask specified in the instruction to determine whether the branch is to be taken. If the branch is not to be taken, performing a dispatch operation. If the branch is to be taken, providing the address saved in VAR 114 to PATRANS 116 to generate the physical address of the target instruction, updating the registers in VACOMP 112 which specify the program counter, and loading the physical address into IAG 119.
3. On the second cycle of execution of the branch instruction, setting QT 107 equal to QH 105, providing the physical address in IAG 119 to MEM 225, and loading the portion of the instruction sequence specified by IAG 119 into QT 107. The effect is to load the portion of the instruction sequence beginning with the target instruction into QH 105.
4. On the third cycle of execution of the branch instruction, performing a dispatch operation.
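The cycle counts implied by steps 2 through 4 may be tallied with the following sketch. The function names and the textual cycle descriptions are illustrative only, and the averaging function assumes a hypothetical fraction of taken branches that does not come from the source.

```python
def branch_cycles(taken):
    """List the cycles used once the branch instruction is in IINT 111."""
    if not taken:
        # Step 2 only: test the condition and dispatch the next instruction.
        return ["cycle 1: test condition, dispatch following instruction"]
    return [
        "cycle 1: test condition, translate target address, load IAG 119",
        "cycle 2: set QT 107 equal to QH 105, fetch target portion from MEM 225",
        "cycle 3: dispatch target instruction",
    ]


def average_branch_cycles(taken_fraction):
    """Expected cycles per branch for an assumed fraction of taken branches."""
    not_taken = len(branch_cycles(False))
    taken = len(branch_cycles(True))
    return (1 - taken_fraction) * not_taken + taken_fraction * taken


print(len(branch_cycles(False)), len(branch_cycles(True)))
print(average_branch_cycles(0.5))
```

The two-cycle difference between the taken and not-taken cases is the cost of reloading the instruction queue and dispatching the target instruction.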
As may be seen from the foregoing, execution of a branch which is not taken requires one cycle from the time the branch instruction is loaded into IINT 111, while execution of a branch which is taken requires three cycles, even though the branch address is available at the time the branch instruction is loaded into IINT 111. An object of the present invention is to improve on prior art pipelined systems by providing a pipelined system wherein the execution of a branch which is taken requires at most two cycles.