The present invention relates to high-speed digital processors, and more particularly to computing machines adapted for vector processing.
There are many circumstances in problem solving with computers where it is necessary to perform the same operation repetitively on each successive element of a set of data.
To solve such a problem, one prior art technique provides vector processing apparatus for a computer, which allows the processing of a plurality of elements of an ordered set of data. Cray, Jr., et al., in U.S. Pat. No. 4,128,880, describe an example of such vector processing apparatus. In this apparatus, referring to FIG. 2 of U.S. Pat. No. 4,128,880, vector processing in a computer is achieved by means of a plurality of vector registers 20 (V.sub.0 -V.sub.7), a plurality of independent fully segmented vector functional units, and means for controlling the operation of the vector registers, including fan-outs 22 and 23 for selecting a signal, a data path 21 and a memory 12. Each of the vector registers V.sub.0 -V.sub.7 has 64 individual elements, each of which can hold a 64-bit word.
When the apparatus performs partial vector processing on the element data in the vector register V.sub.0, at least a portion of the data in the register V.sub.0 must be moved to another register V.sub.1. To accomplish this movement, element data is moved either between the vector registers V.sub.0 -V.sub.7 and the memory 12 by store/load instructions, or by a shift instruction. When moving by store/load instructions, the element data in the register 20 are first stored sequentially in the memory 12 via the fan-out 22 and the data path 21 by store instructions, and a portion of the element data in the memory 12 is then loaded into the register V.sub.1 via the fan-out 22.
When moving by shift instructions, the element data in the register V.sub.0 is sent to the shift functional unit via the fan-out 23 by a shift instruction. The shift functional unit performs a shift in accordance with a shift quantity designated by the instruction. The output of the shift functional unit, shifted by one word, is moved to the vector register V.sub.1 via the fan-out 23. The desired movement of element data is accomplished by repeating this shift operation. Because the two techniques require either the memory 12 or the shift functional unit, the movement of element data is slow. In addition, when the next instruction needs the memory 12 and/or the shift functional unit, a conflict over the use of these devices occurs.
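The two prior art movement paths described above can be illustrated with a minimal software sketch. It is not taken from U.S. Pat. No. 4,128,880: the function names, the operation counters, and the modelling of one shift pass as a one-element shift toward element 0 are assumptions made purely for illustration, and only the 64-element register length comes from the description above.

```python
VLEN = 64  # elements per vector register, as stated in the description

def move_via_memory(v0, start, count):
    """Hypothetical model of the store/load path: store all of V0 to
    memory, then load `count` elements beginning at `start` into V1."""
    memory = [0] * VLEN          # stands in for memory 12
    accesses = 0
    for i in range(VLEN):        # store instructions, one element at a time
        memory[i] = v0[i]
        accesses += 1
    v1 = [0] * VLEN
    for i in range(count):       # load only the wanted portion into V1
        v1[i] = memory[start + i]
        accesses += 1
    return v1, accesses

def move_via_shift(v0, start):
    """Hypothetical model of the shift path: each pass through the shift
    functional unit moves every element one position toward element 0."""
    v1 = list(v0)
    passes = 0
    for _ in range(start):       # repeat the one-word shift
        v1 = v1[1:] + [0]
        passes += 1
    return v1, passes

v0 = list(range(100, 100 + VLEN))
via_mem, mem_accesses = move_via_memory(v0, start=8, count=4)
via_shift, shift_passes = move_via_shift(v0, start=8)
```

In this sketch both paths place elements 8 through 11 of V.sub.0 at the start of V.sub.1, but each path occupies a shared resource (the memory or the shift functional unit) for every step, which is the cost and the source of conflict noted above.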