The present invention relates to a vector processor having a plurality of data transfer circuits, a plurality of vector registers and a plurality of vector arithmetic units.
In conventional vector processors, a plurality of vector arithmetic units (hereinafter simply referred to as arithmetic units) and/or a plurality of data transfer circuits, which transfer data between a main storage and the vector registers, have been provided in order to increase the processing speed by executing a plurality of vector instructions simultaneously. However, where the vector instruction group which constitutes the actual vector processing contains only a small number of instructions, the arithmetic units and data transfer circuits are not all used simultaneously, so that the resources of the system are not used effectively.
To solve this problem, it may be considered that, in a vector processor having two arithmetic units, the two arithmetic units are regarded as one arithmetic unit from a software point of view: when a vector arithmetic instruction is executed, the operations for the even-numbered vector elements are executed by one arithmetic unit, while the operations for the odd-numbered vector elements are simultaneously executed by the other arithmetic unit, thereby improving the processing speed to about twice the normal speed. This increase in processing speed is effective for a simple vector instruction (hereinafter referred to as a simple instruction), which consists only of operations between or among data having the same element number, such as, for example, the vector addition instruction
A(i) = B(i) + C(i)
(where i = 0, 1, 2, . . . , n).
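The even/odd division described above can be illustrated by the following minimal software sketch. It is an illustration only and not a description of any actual circuit: the function name `vector_add_even_odd` and the two sequential loops standing in for the two arithmetic units are assumptions introduced here, and in hardware the two loops would proceed simultaneously.

```python
def vector_add_even_odd(b, c):
    """Compute A(i) = B(i) + C(i) with elements divided between two units.

    Hypothetical model: "unit 0" handles even-numbered elements and
    "unit 1" handles odd-numbered elements.
    """
    if len(b) != len(c):
        raise ValueError("operand vectors must have the same length")

    a = [None] * len(b)

    # Unit 0: even-numbered elements.
    for i in range(0, len(b), 2):
        a[i] = b[i] + c[i]

    # Unit 1: odd-numbered elements. No element computed here depends on
    # a result from unit 0, so the two loops could run at the same time.
    for i in range(1, len(b), 2):
        a[i] = b[i] + c[i]

    return a
```

Since each result uses only operands having the same element number, neither loop reads a value produced by the other, which is why the division can approach twice the speed for simple instructions.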
However, in addition to the above-mentioned simple instructions, there are also more complicated instructions (hereinafter referred to as macro instructions), which require operations between or among data having different element numbers, such as the following iteration instruction
A(i+1) = A(i)*B(i) + C(i)
(where i = 0, 1, 2, . . . , n).
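The dependency between different element numbers in the iteration instruction can be illustrated by the following minimal sketch. The function name `iterate_macro`, the starting value `a0`, and the rule that element i belongs to unit `i % 2` are assumptions made here for illustration and are not taken from any particular processor.

```python
def iterate_macro(a0, b, c):
    """Evaluate A(i+1) = A(i) * B(i) + C(i) sequentially.

    Also counts how often the operand A(i) would have to pass from one
    hypothetical unit to the other if element i were owned by unit
    (i % 2), as in an even/odd division of the vector elements.
    """
    if len(b) != len(c):
        raise ValueError("operand vectors must have the same length")

    a = [a0]
    cross_unit_transfers = 0
    for i in range(len(b)):
        producer_unit = i % 2        # unit assumed to hold A(i)
        consumer_unit = (i + 1) % 2  # unit assumed to compute A(i+1)
        # Adjacent element numbers always fall on different units, so
        # this condition holds at every step.
        if producer_unit != consumer_unit:
            cross_unit_transfers += 1
        a.append(a[i] * b[i] + c[i])
    return a, cross_unit_transfers
```

For example, `iterate_macro(1, [2, 3, 4], [1, 1, 1])` returns `([1, 3, 10, 41], 3)`: under the assumed even/odd ownership, every one of the three steps needs a value produced by the other unit.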
When a macro instruction is executed by a vector processor which is constituted, as described above, in such a manner that the vector processing is carried out separately for the even-numbered vector elements and the odd-numbered vector elements, a data bus is needed between the two arithmetic units, which complicates control; it is therefore practically impossible to realize such an arrangement. Consequently, conventional vector processors adopt any one of the following methods: a method wherein no macro instruction is supported; a method wherein the vector is not divided into even-numbered elements and odd-numbered elements for processing as described above; or a method wherein macro instructions are processed as scalars instead of vectors. However, none of these methods is satisfactory with respect to high-speed processing, performance, or the like.