Many scientific data processing tasks involve extensive arithmetic manipulation of ordered arrays of data. Commonly, this type of manipulation, or "vector" processing, involves performing the same operation repetitively on each successive element of a set of data. Most computers are organized with an arithmetic unit which can communicate with a memory and with input-output (I/O). To perform an arithmetic function, each of the operands must be successively brought to the arithmetic unit from memory, the function must be performed, and the result must be returned to the memory. Machines utilizing this type of organization, i.e., "scalar" machines, have been found too slow and too hardware-inefficient for practical use in large scale vector processing tasks.
In order to increase processing speed and hardware efficiency when dealing with ordered arrays of data, "vector" machines have been developed. Basically, a vector machine is one which deals with ordered arrays of data by virtue of its hardware organization, rather than by a software program and indexing, thus attaining higher speed of operation. One such vector machine is disclosed in U.S. Pat. No. 4,128,880, issued Dec. 5, 1978 to Cray. The vector processing machine of the Cray patent employs one or more registers for receiving vector data sets from a central memory and supplying the same at clock speed to segmented functional units, wherein arithmetic operations are performed. More particularly, Cray provides eight vector registers, each adapted for holding up to sixty-four vector elements. Each of these registers may be selectively connected to any one of a plurality of functional units and one or more operands may be supplied thereto on each clock period. Similarly, each of the vector registers may be selectively connected for receiving results. In a typical operation, two vector registers are employed to provide operands to a functional unit and a third vector register is employed to receive the results from the functional unit.
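The register-to-register flow described above can be illustrated with a minimal software sketch. It is not the patented hardware: the names load_vector, vector_add and add_arrays are hypothetical, and processing a long array in blocks of at most sixty-four elements is an assumption made here for illustration. Only the sixty-four element register length and the use of two operand registers plus one result register come from the description.

```python
VECTOR_LENGTH = 64  # elements per vector register, per the description above


def load_vector(memory, start, count):
    """Copy `count` consecutive elements from memory into a new register."""
    if count > VECTOR_LENGTH:
        raise ValueError("a vector register holds at most 64 elements")
    return list(memory[start:start + count])


def vector_add(reg_a, reg_b):
    """Model a functional unit: one pair of operands in, one result out."""
    return [a + b for a, b in zip(reg_a, reg_b)]


def add_arrays(x, y):
    """Add two equal-length arrays in register-sized blocks (assumed scheme)."""
    if len(x) != len(y):
        raise ValueError("operand arrays must have the same length")
    result = []
    for start in range(0, len(x), VECTOR_LENGTH):
        count = min(VECTOR_LENGTH, len(x) - start)
        v0 = load_vector(x, start, count)  # first operand register
        v1 = load_vector(y, start, count)  # second operand register
        v2 = vector_add(v0, v1)            # third register receives results
        result.extend(v2)                  # stands in for the store to memory
    return result
```

In this sketch the Python loop over blocks plays the role of the instruction stream, while each call to vector_add stands in for a single vector instruction operating on whole registers.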
Cray further provides a single port memory connected to each of the vector registers through a data bus for data transfers between the vector registers and the memory. Thus, a block of vector data may be transferred into vector registers from memory, and operations may be accomplished in the functional units using data directly from the vector registers. Where repeated computation on the same data is required, this vector processing provides a substantial reduction in memory usage, thus eliminating inherent control memory start up delays for these computations.
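The reduction in memory usage can be made concrete with a rough count of memory reads. The following is a simplified accounting sketch, not a model of the actual memory system; the function names and the assumption that every operand use costs exactly one read when no registers are available are illustrative only.

```python
def reads_without_registers(n_elements, operand_uses):
    """Memory reads if every operand use fetches its element from memory."""
    return n_elements * operand_uses


def reads_with_registers(n_elements, distinct_vectors):
    """Memory reads if each distinct input vector is loaded into a register once."""
    return n_elements * distinct_vectors


if __name__ == "__main__":
    # Three two-operand operations over two 64-element input vectors.
    print(reads_without_registers(64, operand_uses=6))
    print(reads_with_registers(64, distinct_vectors=2))
```

Under these assumptions, three two-operand operations on two sixty-four element vectors need 384 reads when every operand comes from memory, but only 128 when the two vectors are loaded once and reused from registers.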
Scalar operation is also possible in the Cray system, and scalar registers and functional units are provided therefor. The scalar registers, along with address registers and instruction buffers, are employed to minimize memory transfer operations and speed up instruction execution. Transfer intensity is further reduced by two additional buffers, one between the memory and the scalar registers and one between the memory and the address registers. Thus, memory transfers are accomplished on a block transfer basis, which minimizes the computational delays associated therewith.
Further processing concurrency may also be accomplished in the Cray system using a process called "chaining". In this process, a vector result register becomes the operand register for a succeeding functional operation. This type of chaining is restricted to a particular clock period or "chain slot" time in which all issue conditions are met. Chaining of this nature is to some extent dependent upon the order in which instructions are issued and the functional unit timing.
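The benefit of chaining can be sketched with a simple timing model. The model below is an assumption made for illustration and does not reproduce the actual issue logic: each functional unit accepts one element per clock period and delivers its result a fixed number of clock periods later, the function name finish_clock and the latency values used are hypothetical, and the chained case assumes the chain slot is always met.

```python
def finish_clock(n_elements, latencies, chained):
    """Return the clock at which the last result of the final operation is ready.

    latencies: assumed latency, in clock periods, of each successive
               functional unit (hypothetical values, not taken from the patent).
    chained:   if True, each operation starts as soon as the first result of
               the preceding one is available; if False, it waits for the
               whole preceding result vector.
    """
    # available[i] is the clock at which element i of the operand vector is ready.
    available = [0] * n_elements  # initial operands already sit in registers
    for latency in latencies:
        first_issue = available[0] if chained else max(available)
        # One element enters the unit per clock and leaves `latency` clocks later.
        available = [first_issue + i + latency for i in range(n_elements)]
    return available[-1]


if __name__ == "__main__":
    print(finish_clock(64, [6, 7], chained=False))
    print(finish_clock(64, [6, 7], chained=True))
```

With sixty-four elements and assumed latencies of 6 and 7 clock periods, this model gives clock 139 for the unchained sequence and clock 76 for the chained one, since in the chained case the second unit works through the vector while the first is still producing it.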
Thus, the system of U.S. Pat. No. 4,128,880 accomplishes a significant increase in processing speed over conventional scalar processing for the large class of problems which can be vectorized. The use of register-to-register vector instructions, the concept of chaining, and the use of the plurality of independent segmented functional units provide a large amount of concurrency of processing. Further, since the start up time for vector operations is nominal, the benefits of vector processing are obtainable even for short vectors.