The present invention relates to a central processing unit for a computer and, more particularly, to a technique for fetching instructions or data required for the operation of the central processing unit from a macroinstruction memory which stores such instructions or data.
The processing speed of a central processing unit (to be referred to as a CPU hereinafter) is generally faster than the fetching speed. Thus, when the quantity of one processing operation of the CPU (the number of bits processed by one execution operation of the CPU) is the same as the quantity of one fetching operation (the number of bits which may be moved from the macroinstruction memory to the CPU by one fetching operation), the total computing speed is limited by the fetching speed, however fast the CPU operates. Two methods have been conventionally proposed for overcoming this problem. The first method proposes to make the quantity of one fetching operation the same as or larger than the quantity of one processing operation, for example by expanding the memory bus width. However, expansion of the memory bus width is disadvantageous in that the control becomes complex and the cost increases.
The second method proposes to add a small-capacity memory (to be referred to as a cache memory) or a register whose fetching speed corresponds to the processing speed of the CPU, and to fetch into it in advance (to be referred to as prefetching hereinafter) the macroinstruction or data which is to be processed by the CPU.
This second method may further be classified into the cache memory system and the pipeline register system. Since the cache memory is expensive, the cache memory system presents a problem of cost with small computer systems. The present invention concerns the pipeline register system within the prefetching system.
The conventional prefetching system may be classified as follows: [classification diagram not reproduced]
Since the prefetching operation of the data in the cache memory system and the pipeline register system is not directly related to the present invention, the description thereof will be omitted. The present invention thus relates to the branch tree-considering system for performing the prefetching operation of instructions in the pipeline register system. When a judge instruction is included in the instruction sequence, the instruction sequence that follows branches into a plurality of parts, each called a branch tree. When the judge instruction is executed, one branch tree is selected according to the judgement and the other branch trees become unnecessary. According to the branch tree-considering system, all the branch trees after the judge instruction are prefetched. According to the branch tree-non-considering system, only one branch tree is prefetched.
According to the branch tree-non-considering system, when the prefetched branch tree is not the one which is required, another branch tree must be fetched, which reduces the benefit of prefetching. Although the branch tree-considering system is used for large computers, the non-considering system is usually used for medium or smaller computers due to cost limitations.
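The difference between the two systems can be illustrated with a minimal sketch. The function names, the list representation of branch trees, and the `predicted` index are hypothetical and are introduced only for illustration; they do not describe any particular hardware.

```python
def prefetch(branch_trees, consider_tree, predicted=0):
    """Return the branch trees prefetched before the judge instruction resolves.

    consider_tree=True  models the branch tree-considering system (all trees).
    consider_tree=False models the non-considering system (one tree only);
    `predicted` is an assumed index of the single tree chosen for prefetching.
    """
    if consider_tree:
        return list(branch_trees)
    return [branch_trees[predicted]]


def resolve(branch_trees, taken, prefetched):
    """Model execution of the judge instruction.

    Returns (selected tree, whether another fetch is needed,
    prefetched trees that became unnecessary).
    """
    selected = branch_trees[taken]
    refetch_needed = selected not in prefetched
    unnecessary = [tree for tree in prefetched if tree != selected]
    return selected, refetch_needed, unnecessary


trees = ["sequential path", "jump path"]

# Considering system: both trees are on hand, so no further fetch is needed.
print(resolve(trees, 1, prefetch(trees, consider_tree=True)))

# Non-considering system: the wrong tree was prefetched, so a fetch is needed.
print(resolve(trees, 1, prefetch(trees, consider_tree=False, predicted=0)))
```

In this sketch the considering system never needs a further fetch but always discards the unselected trees, whereas the non-considering system needs a further fetch whenever the single prefetched tree is not the selected one.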
The general method for performing the pipeline control will be described. The term "pipeline" means a series of logic units Li (where i=1, 2, . . . , n) connected in cascade, each having a processing time t. The pipeline control is a processing system according to which data (including instructions) are sequentially supplied to the input terminal and, after being processed in n stages by the units Li, are obtained from the output terminal at each time interval t. In order to operate at a high speed, it is necessary to perform parallel processing of instructions by overlapping the processing steps as shown in FIG. 1. This is called the pipeline control system. Referring to this figure, symbol L1 denotes a reading step of an instruction; L2, a decoding step of the instruction; L3, an address computing step; L4, an operand reading step; and L5, an instruction executing step.
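The effect of overlapping the steps L1 to L5 can be illustrated with a minimal timing sketch. It assumes an idealized pipeline in which every step takes the same time t and no step ever waits; the function names are hypothetical and introduced only for illustration.

```python
STEPS = ["L1 read", "L2 decode", "L3 address", "L4 operand", "L5 execute"]


def overlapped_schedule(num_instructions, steps=STEPS, t=1):
    """Start time of each step of each instruction in an ideal pipeline.

    schedule[i][j] is the start time of step j of instruction i.
    Instruction i enters step L1 at time i*t, one interval after
    instruction i-1, so consecutive instructions overlap.
    """
    return [[(i + j) * t for j in range(len(steps))]
            for i in range(num_instructions)]


def total_time_overlapped(k, n=len(STEPS), t=1):
    """Time to finish k instructions when the n steps are overlapped."""
    return (n + k - 1) * t


def total_time_sequential(k, n=len(STEPS), t=1):
    """Time to finish k instructions when each runs all n steps alone."""
    return n * k * t


for i, row in enumerate(overlapped_schedule(3)):
    print(f"instruction {i}: step start times {row}")
print(total_time_overlapped(4), total_time_sequential(4))
```

Under these assumptions, once the pipeline is full one instruction completes at every interval t, which is the behavior described for the output terminal above.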
The arithmetic and logic unit for the pipeline control may be generally represented as shown in FIG. 2. The functional relationships of these units are shown in FIG. 3. The operation procedures in this figure are as follows:
(1) An instruction readout request is output from a unit P to a unit B.
(2) If the requested instruction is present in a buffer memory (BM), the unit B sends it to the unit P. If it is not present there, the unit B outputs a request to a unit F to obtain the requested instruction and then sends it to the unit P.
(3) The unit P decodes the instruction and outputs a request for the address computation of the operand to a unit A.
(4) The address is sent from the unit A to the unit B to read out the operand.
(5) The unit P outputs an operation instruction to the unit E.
(6) The operation result is stored in a main memory through the unit B or F according to the instruction from the unit E.
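Procedures (1) to (6) can be sketched in software as follows. This is only an illustrative model: the instruction format `(op, base, disp, dest)`, the base-plus-displacement address computation, the single "neg" operation, and the choice to store the result through the unit B (which forwards it to the unit F) are all assumptions made for the example and are not taken from the figures.

```python
class UnitF:
    """Assumed role: accesses the main memory on behalf of the unit B."""

    def __init__(self, main_memory):
        self.main_memory = main_memory

    def fetch(self, addr):
        return self.main_memory[addr]

    def store(self, addr, value):
        self.main_memory[addr] = value


class UnitB:
    """Holds the buffer memory (BM) and asks the unit F on a miss."""

    def __init__(self, unit_f):
        self.unit_f = unit_f
        self.bm = {}
        self.misses = 0

    def read(self, addr):
        if addr not in self.bm:              # not present in the BM
            self.misses += 1
            self.bm[addr] = self.unit_f.fetch(addr)
        return self.bm[addr]

    def write(self, addr, value):
        self.bm[addr] = value
        self.unit_f.store(addr, value)       # assumed: forwarded to main memory


def unit_a(base, disp):
    """Assumed address computation: base plus displacement."""
    return base + disp


def unit_e(op, operand):
    """Assumed operation set containing only a negate operation."""
    if op == "neg":
        return -operand
    raise ValueError(f"unknown operation: {op}")


def unit_p_run(pc, unit_b):
    """Unit P drives one instruction through procedures (1) to (6)."""
    op, base, disp, dest = unit_b.read(pc)   # (1), (2) instruction readout
    addr = unit_a(base, disp)                # (3) decode, address computation
    operand = unit_b.read(addr)              # (4) operand readout
    result = unit_e(op, operand)             # (5) operation
    unit_b.write(dest, result)               # (6) result stored
    return result


memory = {0: ("neg", 100, 2, 200), 102: 7}
b = UnitB(UnitF(memory))
unit_p_run(0, b)
print(memory[200], b.misses)
```

Running the same instruction a second time finds both the instruction and the operand in the BM, so the unit B issues no further requests to the unit F for reads.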
The flow of the above procedures may be represented by the detailed block diagram of FIG. 4. The operations in this figure are as follows: