1. Field of the Invention
The present invention relates to a RISC (reduced instruction set computer) type of microcomputer, and more particularly to an instruction distribution control device that controls the distribution of instructions for parallel execution in a superscalar type of parallel processor which executes two or more instructions simultaneously.
2. Description of the Related Art
The mainstream method for data processing used in conventional data processors is SISD (single instruction stream, single data stream), in which instructions are processed in sequence one at a time. The requirement for increased efficiency in data processing has been met by increasing the width of the data to be handled or the operating frequency. The requirement for further increased processing efficiency has been met by the use of a pipelining system, which divides one process into several sections and processes two or more items of data simultaneously, or by the addition of special-purpose hardware such as a floating-point arithmetic unit.
The MIMD (multiple instruction stream, multiple data stream) system, which executes two or more instructions simultaneously, is effective in still further increasing the processing efficiency of a processor. This system uses two or more instruction executing arithmetic units and operates them at the same time. MIMD processors include array processors, which have an array of arithmetic units of the same arithmetic mode, and superscalar parallel processors, which have two or more arithmetic units of different arithmetic modes and two or more pipelines.
The array processor is difficult to apply to general data processing and is thus restricted in its applications. In contrast, the superscalar parallel processor is relatively easy to apply to general data processing because its control system can be considered to be an expansion of the control system of conventional processors.
The superscalar parallel processor operates two or more instruction executing arithmetic units at the same time to execute two or more instructions in parallel during one clock cycle. In this case, the parallel processor shows instruction processing capabilities greater than those in conventional processors because two or more instructions are fetched, decoded, and then executed by the arithmetic units at the same time.
As a specific example of the superscalar parallel processor, there is a processor in which one floating-point arithmetic unit is added to two integer arithmetic units, and a total of two instructions, i.e., two integer arithmetic instructions (usual instructions in processors) or one integer arithmetic instruction and one floating-point arithmetic instruction, are executed in parallel (1991 IEEE ISSCC Digest of Technical Papers, pp. 100-101, "A 100 MIPS, 64b Superscalar Microprocessor with DSP Enhancement" by Ran Talmudi, et al.).
In the superscalar parallel processor, the number of instructions that can be executed at the same time is limited by the number of instruction executing arithmetic units that can be driven at the same time. Conventional processors process a set of instructions one by one, whereas the superscalar processor processes a set of instructions N by N. The control system of the superscalar processor corresponds to an expansion of the control system of conventional processors, and thus conventional processor programs can be used as they are without rewriting.
In other words, the superscalar processor could be realized by preparing an instruction decoder, which may be a conventional one, for each of the arithmetic units, and adding a control function for parallel processing (an instruction distribution control function, in particular). The control function includes examination of whether or not parallel processing of instructions is possible (hereinafter referred to as dependence analysis), and allocation of instructions that can be executed in parallel to instruction executing arithmetic units. Particularly important is the instruction allocation control function, which allocates instructions to appropriate instruction executing arithmetic units on the basis of the results of the dependence analysis.
Here, the necessity of the instruction allocation control function will be described. Even if a superscalar parallel processor system is configured such that as many as N (N > 1) instructions can be executed in parallel, N instructions cannot always be executed in parallel. That is, the arithmetic units provided are not necessarily each capable of processing every input instruction; thus, when the number of arithmetic units capable of processing some of the input instructions is insufficient, those instructions cannot be executed simultaneously with the other input instructions (this is called resource conflict). Further, even if the number of arithmetic units suited to the instructions is sufficient, when the data used for executing some instruction is prepared by execution of another instruction to be executed later, the instruction will not be executed until the data is prepared (this is called data conflict). The data conflict, unlike the resource conflict, is difficult to examine prior to execution of each instruction. Therefore, when a data conflict is found after instructions have been allocated, the execution of instructions is stopped.
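The resource conflict examination described above can be illustrated with a minimal sketch. The unit counts, the instruction kind labels ("int", "fp"), and the function name are assumptions introduced only for illustration; they are not taken from any particular device. Data conflicts are deliberately left out, since they are described above as difficult to examine before execution.

```python
from collections import Counter

# Hypothetical unit counts (assumption): two integer units and one
# floating-point unit, loosely following the example processor above.
UNITS = {"int": 2, "fp": 1}


def first_resource_conflict(kinds, units=UNITS):
    """Return the 0-based position of the first instruction that finds no
    free arithmetic unit of its kind, or None if all of them fit."""
    used = Counter()
    for index, kind in enumerate(kinds):
        # A kind with no matching unit at all is treated as a conflict.
        if used[kind] >= units.get(kind, 0):
            return index
        used[kind] += 1
    return None
```

For instance, under these assumed unit counts, a group of two floating-point instructions conflicts at the second instruction, while one integer and one floating-point instruction fit together.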
When an instruction cannot be executed by reason of the resource conflict or the data conflict, any instruction that follows it in order among the simultaneously distributed instructions likewise cannot be executed simultaneously.
Thus, when N instructions include some instructions that cannot be executed simultaneously, control is performed in such a way as to execute first all the instructions that can be executed and then execute the remaining instructions. That is, the dependence analysis is made of each of the N instructions. As a result, suppose that the i-th (i ≤ N) instruction cannot be executed simultaneously with other instructions. First, the first to (i-1)st instructions are executed, and then the dependence analysis is again made of the (N-i+1) remaining instructions to execute the remaining instructions, including the instruction that could not be executed. After the termination of execution of the N instructions, the N succeeding instructions are executed.
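The in-order control described above can be sketched as follows. This is an illustrative model only: the names issue_group and fits_units, the unit counts, and the rule that the leading remaining instruction is always issued (so that the loop makes progress) are assumptions. The predicate stands in for the dependence analysis and here models resource conflicts only.

```python
# Hypothetical unit counts (assumption), as in a processor with two
# integer units and one floating-point unit.
UNITS = {"int": 2, "fp": 1}


def fits_units(batch, kind):
    """Stand-in dependence analysis: True if a unit of this kind is still
    free after the instructions already placed in the batch."""
    return sum(1 for k in batch if k == kind) < UNITS.get(kind, 0)


def issue_group(instructions, can_issue_together):
    """Split one group of N instructions into in-order batches.

    Each pass analyzes the remaining instructions, issues those ahead of
    the first instruction that cannot join, and repeats the analysis on
    the rest, including the instruction that could not be executed."""
    batches = []
    remaining = list(instructions)
    while remaining:
        batch = [remaining[0]]  # assumption: the leading instruction issues
        for instr in remaining[1:]:
            if not can_issue_together(batch, instr):
                break  # later instructions must wait behind this one
            batch.append(instr)
        batches.append(batch)
        remaining = remaining[len(batch):]
    return batches
```

With these assumed unit counts, a group of three integer instructions followed by one floating-point instruction is issued as two batches: the first two integer instructions, then the third integer instruction together with the floating-point instruction.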
However, conventional devices are relatively complicated in structure and relatively low in efficiency of the instruction distribution.
Further, conventional devices are relatively low in speed of operation for storing an instruction distribution starting position.
Moreover, conventional devices are relatively low in speed of operation for generating instruction distribution enable/disable signals from the results of resource conflict examinations.