1. Field of the Invention
The present invention relates to processing units in digital computer systems and more particularly to processing units including instruction prefetch apparatus for fetching an instruction into a processing unit while another instruction is being executed.
2. Description of the Prior Art: FIG. 1
A typical computer instruction includes an operation code specifying an operation and operand specifiers specifying the data to be used in the operation. The execution of such a computer instruction typically involves five different kinds of operations: fetching the instruction from memory, computing the addresses of data in memory, fetching the data from memory, performing the operation specified, and storing the result in a memory location. Of these operations, only fetching the instruction, fetching data from memory and storing data in memory involve the memory. The prior art has long taken advantage of this fact to overlap the execution of the current instruction and the fetching of the next instruction. Whenever the memory is not required during the execution of the current instruction, apparatus termed a prefetcher fetches following instructions from the memory into the prefetcher, where they are retained until used in the CPU. Since the next instruction is available in the prefetcher immediately upon completion of execution of the current instruction, there is no time lost fetching the next instruction from memory and execution of instructions is considerably speeded.
The design of prefetchers is considerably simplified by two characteristics of instructions: they generally are made up of syllables with a uniform size, and they are executed in the sequence in which they are stored in memory unless one of the instructions in the sequence is a branch instruction. Thus, the prefetcher can compute the address of the next syllable to be fetched by simply incrementing the address of the last syllable fetched by the syllable size. Because the prefetcher operates in this fashion, the syllables it contains have the same order that they have in memory. However, when the execution of a branch instruction causes a program's instructions to be executed in an order other than the one the instructions have in memory, the prefetched instructions must be discarded and the prefetching must begin again with the instruction specified in the branch instruction.
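The address computation just described can be illustrated with a minimal sketch. The syllable size of 2 and the names `SYLLABLE_SIZE` and `next_prefetch_address` are assumptions made for illustration only; the text specifies neither a syllable size nor any particular implementation.

```python
SYLLABLE_SIZE = 2  # assumed uniform syllable size; not given in the text


def next_prefetch_address(last_address, branch_target=None):
    """Return the address of the next syllable to prefetch.

    Sequential case: increment the last fetched address by the syllable size.
    Branch case: prefetching begins again at the branch target.
    """
    if branch_target is not None:
        return branch_target
    return last_address + SYLLABLE_SIZE


# Sequential prefetching starting at a hypothetical address 0x100.
addresses = [0x100]
for _ in range(3):
    addresses.append(next_prefetch_address(addresses[-1]))

# A branch to a hypothetical target 0x200 restarts the sequence there.
restart = next_prefetch_address(addresses[-1], branch_target=0x200)
```

In this sketch the branch case simply replaces the address; discarding the syllables already prefetched is a separate step that a queue model would have to perform.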
FIG. 1 is a schematic block diagram of a digital computer system 101 with a prior-art prefetcher. The chief components of system 101 are memory (MEM) 105 and CPU 103. MEM 105 contains a program, PROG 107, consisting of a sequence of instructions (INST 109), and data, DATA 111. The instructions are divided into syllables, the first of which contains the operation code. CPU 103 receives data and instruction syllables from MEM 105 via DBUS 113 and provides the addresses of the data and instruction syllables to MEM 105 via ABUS 133. The principal components of CPU 103 are data registers (DREGS) 117, ALU 125, address generator (AGEN) 127, prefetcher (PREF) 115, instruction register (IREG) 121, and control (CTL) 123. All components are controlled by CTL 123. DREGS 117 are made up of a number of registers (REG) 119. Individual REGs 119 are specified in the following by means of values in parentheses, for example REG(a) 119. REGs 119 may contain DATA 111 from MEM 105 or data generated in the course of the internal operation of CPU 103. DREGS 117 is connected to DBUS 113, and consequently may receive data from or provide it to MEM 105. ALU 125 performs arithmetic and logical operations on data received from DREGS 117, addresses received from AGEN 127, and immediate values received from IREG 121. The results from ALU 125 go to DREGS 117 and AGEN 127. AGEN 127 generates addresses for fetching data and instruction syllables. Because CPU 103 contains a prefetcher, AGEN 127 has at least one register usable for addresses of data, DA 131, and a register usable for addresses of instruction syllables, IA 129. IA 129 operates under control of PREF 115, and DA 131 under control of CTL 123. Addresses from AGEN 127 are provided to MEM 105 via ABUS 133. PREF 115 is connected to DBUS 113 and receives syllables of INSTs 109 via that bus from MEM 105.
The syllables are stored in prefetch queue (PREFQ) 116 in the order in which they are fetched from PROG 107 and are read from PREFQ 116 in the same order. IREG 121 contains INST 109 currently being executed in CPU 103. In FIG. 1, that instruction is INST 109 (d). The first syllable of the next instruction, here represented as INST 109 (d+1), is at the head of PREFQ 116. IREG 121 is connected to ALU 125 and CTL 123. The operation code of the current INST 109 goes to CTL 123; immediate values and values used to calculate addresses go to ALU 125. CTL 123, finally, responds to the current INST 109 by providing control signals to the other components as required to perform the operation specified by the instruction.
Operation of PREF 115 in CPU 103 is as already described in general. During execution of an instruction, PREF 115 detects when DATA 111 is not being read from or written to MEM 105 and causes AGEN 127 to output IA 129 to ABUS 133. The syllable of PROG 107 specified by IA 129 is put at the end of PREFQ 116 and IA 129 is incremented to specify the next syllable in PROG 107. The prefetching continues as described until PREFQ 116 is full or until the execution of the current INST 109 results in a branch. In that case, the contents of PREFQ 116 are discarded and IA 129 is set to the address of the next instruction to be executed. Since PREFQ 116 is empty, CPU 103 must wait to begin execution of the next instruction until it has been loaded into PREF 115. As CPU 103 executes the next instruction, other instructions are loaded as previously described.
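The behavior of PREFQ 116 and IA 129 described above may be modeled roughly as follows. This is a sketch under stated assumptions, not a description of any actual prefetcher: the class `PrefetcherSketch`, its method names, the queue capacity, and the convention that each syllable occupies one address are all hypothetical.

```python
from collections import deque


class PrefetcherSketch:
    """Toy model of a prefetch queue; all names and sizes are assumed."""

    def __init__(self, program, capacity=4):
        self.program = program    # list of syllables; list index = address
        self.capacity = capacity  # assumed queue depth
        self.queue = deque()      # plays the role of PREFQ 116
        self.ia = 0               # plays the role of IA 129

    def memory_free_cycle(self):
        """Prefetch one syllable when memory is free; return True if fetched."""
        if len(self.queue) >= self.capacity or self.ia >= len(self.program):
            return False                       # queue full or program exhausted
        self.queue.append(self.program[self.ia])  # put syllable at end of queue
        self.ia += 1                           # point at the next syllable
        return True

    def next_syllable(self):
        """Return the syllable at the head, or None if the CPU must wait."""
        return self.queue.popleft() if self.queue else None

    def branch(self, target):
        """Discard the queue contents and restart prefetching at the target."""
        self.queue.clear()
        self.ia = target


pref = PrefetcherSketch(["i0", "i1", "i2", "i3", "i4", "i5"], capacity=3)
fetched = [pref.memory_free_cycle() for _ in range(5)]  # stops once queue is full
first = pref.next_syllable()   # syllables leave in the order they were fetched
pref.branch(5)                 # a branch empties the queue
stalled = pref.next_syllable() # None: nothing available until the next fetch
```

The `None` returned after the branch corresponds to the wait described above: execution cannot resume until the first syllable at the branch target has been loaded.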
Prior-art prefetchers have added greatly to the speed of operation of digital computer systems, but have been difficult to design and expensive to build. The design difficulties and expense have been primarily due to the fact that prior-art prefetchers have operated essentially independently of other components of CPU 103. They have consequently required complicated logic to detect when MEM 105 is free to provide INSTs 109, to detect when PREFQ 116 is full or empty, and to deal with branches in the program. A further disadvantage of prior-art prefetchers has been that they have treated all instruction syllables in the same fashion, even though the first syllable of most instructions is functionally quite distinct from the remaining syllables. Descriptions of such prior-art prefetchers may be found at Col. 203 of Bratt et al., Digital Data Processing System Utilizing a Unique Arithmetic Logical Unit . . . , U.S. Pat. No. 4,445,177, issued Apr. 24, 1984, and in Grondalski, Apparatus for Fetching and Decoding Instructions, U.S. Pat. No. 4,462,073, issued July 24, 1984.
Another disadvantage of prior-art prefetchers has been their unfavorable effect on the execution of instructions of the EXECUTE type. This type of instruction, exemplified by the EX instruction of the well-known IBM 360 instruction set, specifies that a single instruction, termed a subject instruction, whose location is specified in the EX instruction, is to be executed, and that when the subject instruction's execution is complete, the instruction following the EX instruction is to be executed. In CPUs with prior-art prefetchers, the EX instruction has been treated as a branch and the execution of the EX instruction and the subject instruction have proceeded as follows: on execution of the EX instruction, IA 129 has been reset to specify the subject instruction and the contents of PREF 115 have been discarded; the subject instruction has then been fetched into PREF 115 and executed; thereafter, IA 129 has been reset to specify the instruction following the EX instruction, the contents of PREF 115 again discarded, and the instruction following the EX instruction fetched into PREF 115. All of this has been done even though by definition, the next instruction to be executed after the subject instruction is the instruction following the EX instruction, and consequently, the prefetcher will work properly if IA 129's value is not changed during execution of the subject instruction.
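The cost of treating the EX instruction as a branch can be made concrete with a small sketch. It assumes one syllable per instruction, uses addresses to stand for the syllables themselves, and assumes no further prefetching occurs while the subject instruction executes. The function names and the example addresses are hypothetical, and the second function merely illustrates the observation that IA 129 need not change; it does not represent any particular apparatus.

```python
def ex_treated_as_branch(prefq, ia, ex_addr, subject_addr):
    """Prior-art handling as described in the text.

    Returns (prefq, ia, discarded), where discarded counts the prefetched
    syllables thrown away by the two queue flushes.
    """
    discarded = len(prefq)    # first flush: queued successors of EX are lost
    prefq = [subject_addr]    # subject instruction is fetched into the queue
    ia = subject_addr + 1
    prefq.pop(0)              # subject instruction leaves the queue to execute
    discarded += len(prefq)   # second flush (queue is empty in this model)
    prefq = [ex_addr + 1]     # instruction following EX is fetched again
    ia = ex_addr + 2
    return prefq, ia, discarded


def ex_with_ia_unchanged(prefq, ia, ex_addr, subject_addr):
    """Hypothetical alternative: the subject instruction is obtained without
    disturbing the queue or IA, so nothing is discarded."""
    return list(prefq), ia, 0


# EX at address 10, its successors 11..13 already prefetched, subject at 40.
branch_result = ex_treated_as_branch([11, 12, 13], 14, ex_addr=10, subject_addr=40)
kept_result = ex_with_ia_unchanged([11, 12, 13], 14, ex_addr=10, subject_addr=40)
```

In this example the branch-style handling discards three prefetched syllables and must fetch the instruction at address 11 a second time, whereas leaving IA unchanged keeps all three available.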