1. Field of the Invention
This invention relates to vector processors, and more particularly to a vector processor which enables easy microprogramming for vector processing.
2. Description of the Prior Art
Vector processing, that is, repetition of the same type of operation on successive data elements, as in matrix calculation, Fourier transformation and so on, can be performed at high speed by, for example, a known pipelined vector processor. Such a pipeline processor is provided with plural stages for arithmetic processing, and while one data element is processed through the stages in order, another data element is processed through the same stages, following the former data element. U.S. Pat. No. 4,075,704 to O'Leary, for example, discloses a vector processor including a pipeline adder and a pipeline multiplier. The pipeline adder includes 2 stages, each performing part of an addition in one clock cycle, and the pipeline multiplier includes 3 stages, each performing part of a multiplication in one clock cycle.
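The overlapped flow of data elements through the stages can be illustrated by the following minimal sketch. It is not taken from the cited patent; the function name `pipeline_occupancy` and the choice of four data elements are hypothetical, and the sketch assumes that one new element enters the pipeline at every clock cycle without stalls.

```python
def pipeline_occupancy(num_stages, num_elements):
    """Return one row per clock cycle listing the element index held
    in each stage (None where the stage is empty).

    Assumption: a new element enters stage 0 at every clock cycle and
    each stage takes exactly one clock cycle.
    """
    total_cycles = num_elements + num_stages - 1
    table = []
    for cycle in range(total_cycles):
        row = []
        for stage in range(num_stages):
            element = cycle - stage  # element that entered `stage` cycles ago
            row.append(element if 0 <= element < num_elements else None)
        table.append(row)
    return table


if __name__ == "__main__":
    # A 3-stage pipeline (like the multiplier described above), 4 elements.
    for cycle, row in enumerate(pipeline_occupancy(3, 4)):
        print(cycle, row)
```

Each row shows that, once the pipeline is full, every stage works on a different data element in the same clock cycle.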
When such a pipeline processor is used for vector processing, the arithmetic operations on different data elements are performed in an overlapped manner (multiple parallelism) to the extent possible, and the shortest operation loop is used, so that the arithmetic operations can be performed at high speed. Such arithmetic operations will be described with reference to, for example, simple vector processing written in FORTRAN as follows: ##EQU1##
The above example shows very simplified vector processing of a function, such as a sine function.
The program (1) is a combination of addition-multiplication-addition, and becomes a 7-stage processing if it uses a 2-stage adder and a 3-stage multiplier. Adding one output stage, one vector component is calculated in 8 stages. In a pipeline arithmetic operation, it is possible to start the operation for a new data element (index I) at each clock cycle.
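The stage count stated above can be checked with a short sketch. The constant and function names are hypothetical, and the total-cycle formula assumes the ideal case in which a new data element starts at every clock cycle and the pipeline never stalls.

```python
ADDER_STAGES = 2       # 2-stage pipeline adder, as stated in the text
MULTIPLIER_STAGES = 3  # 3-stage pipeline multiplier, as stated in the text
OUTPUT_STAGES = 1      # one output stage, as stated in the text


def stages_per_element():
    """Stages for one vector component: addition, multiplication,
    addition, then output."""
    return ADDER_STAGES + MULTIPLIER_STAGES + ADDER_STAGES + OUTPUT_STAGES


def total_cycles(num_elements):
    """Clock cycles to finish all elements, assuming one new element
    starts per clock cycle (ideal, stall-free pipelining)."""
    if num_elements <= 0:
        return 0
    return stages_per_element() + (num_elements - 1)


if __name__ == "__main__":
    print(stages_per_element())  # 2 + 3 + 2 + 1 = 8
    print(total_cycles(100))     # 8 + 99 = 107
```

Under these assumptions, 100 data elements take 107 clock cycles, compared with 800 if each element had to finish all 8 stages before the next one started.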
However, it is necessary to temporarily store an intermediate result F(I) in program (1). Before a stored intermediate result for index I has been read out, the next intermediate result for index (I+1) may have to be stored. In program (1), the result of the two-stage addition for index (I+1) is obtained during the three-stage multiplication for index I. Thus, it is necessary to temporarily store these intermediate results at two places. This makes it necessary to distinguish the memory addresses used for odd-numbered and even-numbered indices. Thus, a loop must be formed with the group of operations on the (I+1) and (I+2) data elements as a unit.
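The need for two storage places, selected by odd and even index, can be illustrated by the following sketch. The exact timing is not given in the text; the sketch assumes, purely for illustration, that the intermediate result for index I is written at clock I and read two clocks later, and that a read precedes a write within the same clock cycle. The function name `storage_is_sufficient` and its parameters are hypothetical.

```python
def storage_is_sufficient(num_elements, lifetime, num_slots):
    """Return True if no intermediate result is overwritten before it
    has been read.

    Assumed model (not from the source): the result for index I is
    written at clock I into slot I % num_slots and read at clock
    I + lifetime; within one clock the read happens before the write.
    """
    slots = [None] * num_slots
    for clock in range(num_elements + lifetime):
        read_index = clock - lifetime
        if 0 <= read_index < num_elements:
            slot = read_index % num_slots
            if slots[slot] != read_index:
                return False  # the expected value is no longer there
            slots[slot] = None  # value consumed, slot is free again
        write_index = clock
        if write_index < num_elements:
            slot = write_index % num_slots
            if slots[slot] is not None:
                return False  # would overwrite an unread value
            slots[slot] = write_index
    return True


if __name__ == "__main__":
    print(storage_is_sufficient(8, lifetime=2, num_slots=1))  # False
    print(storage_is_sufficient(8, lifetime=2, num_slots=2))  # True
```

With two slots, the slot address is `I % 2`, that is, one address for even-numbered and one for odd-numbered indices, which is why the loop body must cover two consecutive data elements.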
For this reason, in the conventional processor, the number of steps in the repetitive loop of the microprogram becomes large in order to discriminate the addresses used for temporary storage, and the contents of the steps become complex. This problem may not be so important in a very simple program, but it becomes serious in a general function operation and the like, because the repetitive loop may include several tens of steps or more. In addition, when the number of steps is doubled, tripled or more, the microprogram itself and the capacity of the control memory for storing the microprogram must be increased correspondingly, which causes a serious problem.