The present invention is in the field of digital computers that perform scalar and vector operations. More specifically, the invention concerns the use of bi-directional data busing in a computer that superposes scalar and vector data operations.
The concepts of scalar and vector organization of digital computing machines are well understood in the art. The family of computers marketed by International Business Machines Corporation under the name System 370 is representative of scalar computing, while the Cray-1 computer available from Cray Research, Inc. exemplifies vector processing machines.
Scalar processing refers to the performance of an arithmetical or logical function on a pair of unidimensional, operand data objects to produce a single result. To perform the function, the operands must be retrieved from memory, provided to the functional unit, and the result returned to memory. Each step in the compound sequence--extraction of the operands from memory, transfer of the operands to registers, performance of the operation, transfer of the result back to memory--requires execution of one or more instructions.
The architecture and operation of a vector computer are based on the observation that an identical function can be performed on each of a succession of related data objects in response to a single instruction. A vector machine responds to a vector instruction by beginning a process of streaming a plurality of related unidimensional data objects, referred to collectively as a vector, from one point to another. Vector computers are characterized in that at least two operand vectors can be streamed to the same functional unit in response to a single instruction, such as an instruction to add two vectors. The functional unit combines the streamed vectors in some predetermined way to produce a single result vector.
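The contrast between a scalar operation and a vector instruction can be illustrated with a minimal sketch. The function names `scalar_add` and `vector_add` are hypothetical and chosen only for illustration. The Python list comprehension merely models the streaming of two operand vectors through one functional unit; it does not represent any particular machine's instruction set.

```python
def scalar_add(a, b):
    """One scalar operation: two single operands in, one result out."""
    return a + b


def vector_add(va, vb):
    """Model of a single vector instruction (hypothetical name).

    Both operand vectors are streamed element by element through the
    same functional unit, producing a single result vector.
    """
    if len(va) != len(vb):
        raise ValueError("operand vectors must have the same length")
    return [scalar_add(x, y) for x, y in zip(va, vb)]


result = vector_add([1, 2, 3, 4], [10, 20, 30, 40])
print(result)  # [11, 22, 33, 44]
```

In the scalar case, the caller would issue one add per operand pair; in the vector case, the single call stands in for the one instruction that triggers the whole stream.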
It has been recognized that highly efficient computing can be realized by an architecture containing both scalar and vector capabilities. The integration of scalar and vector structures into a single construct is referred to as superposition. Superposition results in a machine that can use software compatible with existing scalar and vector systems to achieve the speed inherent in vector processing, while enjoying the precision and flexibility of scalar processing.
The concept of superposition can be appreciated when the following Fortran programming fragment is considered:

      DO 5, I=1, 64
      A(I)=(B(I)+C(I))*D(I)
    5 CONTINUE
In this fragment, three data objects, B(I), C(I) and D(I), are combined in each of a series of 64 functionally identical steps. In a scalar computer, each of the 64 iterations would involve obtaining the indexed operands from storage, performing the indicated functions, and returning the result, A(I), to storage. A vector computer would instead treat each of the operands B, C, and D, and the result A, as respective n-unit vectors, where n is 64 in this example. The vector computer would obtain each of the n-unit operands B and C from memory in one or more streaming operations; initiate a second streaming operation to concurrently provide the B and C operands to a functional unit to be added; and store the result of the addition as an n-unit temporary result vector in a third streaming operation. During the time that the operands B and C are being added, the third operand D would be obtained from memory and held in temporary storage to be combined with the temporary result of adding B and C. The result of multiplying the n-unit D vector with the n-unit temporary result vector would be returned to storage as the n-unit A vector. A computer superposing scalar and vector operations in this example would employ concurrent scalar and vector operations, using the scalar operations to calculate the starting and ending indices for the operand and result vectors, and vector techniques to perform the memory references and functional operations on the indexed vectors.
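The two ways of evaluating the Fortran fragment can be sketched side by side. This is an illustrative model only: the names `scalar_version`, `vector_version`, and `temp` are assumptions introduced here, and Python lists stand in for memory and for the temporary vector storage described above.

```python
N = 64  # vector length, taken from the DO loop bounds in the fragment


def scalar_version(b, c, d):
    """Element-at-a-time evaluation: one add and one multiply per iteration."""
    a = [0.0] * N
    for i in range(N):
        a[i] = (b[i] + c[i]) * d[i]
    return a


def vector_version(b, c, d):
    """Whole-vector evaluation in two streamed steps.

    Step 1 streams B and C through an add unit into a temporary vector.
    Step 2 streams the temporary vector and D through a multiply unit.
    """
    temp = [x + y for x, y in zip(b, c)]     # B + C, held as an intermediate
    return [t * z for t, z in zip(temp, d)]  # (B + C) * D, stored as A


b = [float(i) for i in range(N)]
c = [2.0 * i for i in range(N)]
d = [0.5] * N
assert scalar_version(b, c, d) == vector_version(b, c, d)
```

Both functions compute the same A vector; the difference being modeled is that the vector version needs only two streamed operations and keeps the intermediate sum out of main memory.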
A computer architecture integrating scalar and vector operations characteristically operates on data elements or words of predetermined length. For example, the unit data element in the above-referenced Cray-1 computer consists of a 64-bit word; scalar data objects usually include single words, while vector data objects can consist of ordered arrays of from 1 to 64 (or more) words.
Typically, a computer which superpositions scalar and vector operations has a main memory consisting of a plurality of interleaved individual memory units in which multiple units can be concurrently accessed by simultaneous processes for storage or retrieval. Superposition also requires the provision of a plurality of functional units, each for performing a specified arithmetic or logical function. Normally the functional units have pipelined structures which permit them to receive a set of operands for one operation while still processing a set of operands for a previous operation. Such functional units are normally used for vector operations, although some are used also for performance of floating point operations on scalar operands. The superpositioning computer typically also provides dedicated arithmetic and logic units for scalar (and address) operations. Finally, the architecture of the superpositioning computer normally includes a bank of scalar and vector registers that act as buffers or caches between memory and functional units. The primary purpose of positioning buffer registers between memory and the functional units is to reduce memory access time in scalar operations and to increase memory throughput in vector operations.
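The benefit of an interleaved main memory for streaming can be illustrated with a small sketch. The text does not specify an interleaving scheme, so the sketch assumes simple low-order interleaving, in which consecutive word addresses fall in consecutive memory units. The bank count `NUM_BANKS` and the function names are hypothetical.

```python
NUM_BANKS = 8  # hypothetical number of interleaved memory units


def bank_of(address):
    """Assumed low-order interleaving: the bank is the address modulo the bank count."""
    return address % NUM_BANKS


def banks_touched(start, length):
    """Banks visited when streaming `length` consecutive words from `start`."""
    return [bank_of(start + i) for i in range(length)]


print(banks_touched(0, 8))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(banks_touched(5, 8))  # [5, 6, 7, 0, 1, 2, 3, 4]
```

Under this assumption, any run of `NUM_BANKS` consecutive words touches every memory unit exactly once, which is what allows multiple units to be accessed concurrently during a vector stream.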
A scalar buffer obtains from memory and temporarily stores scalar data objects demanded by currently-executing processes; the scalar buffer includes registers that are characteristically faster than main memory. The vector buffer has registers which reduce the access time to main memory and also increase the throughput of the memory by eliminating the need to store intermediate results of vector operations. Retention of the intermediate results of a vector operation in the buffers permits the allocation of more of the memory bandwidth to the provision of primary operands and storage of final results.
The Cray X-MP provides the most current example of a computer which superposes scalar and vector operations and which has an architecture including the elements previously discussed. In this machine, connectivity between the architectural blocks is provided on unidirectional data paths. For each operational section of the computer there is a separate set of data paths. Thus, for vector operations there is a set of data paths between the main memory and the vector registers and another set of data paths between the vector registers and each of the functional units. Further, in each set of data paths there are unidirectional paths for conducting vector data objects from the vector registers and other, separate data paths for conducting vector data objects to the vector registers. The proliferation of unidirectional data paths in the prior art superpositioning computers exacts significant physical and economic penalties. First, the physical design of such computers requires the provision of a plethora of point-to-point data paths and expands the physical resources required to interconnect the architectural blocks of the computers. Second, the cost of physical resources for such an interconnection adds significantly to the total cost of the computer.
In purely scalar computers, the primary architectural blocks are commonly interconnected by a single bi-directional databus over which all memory, operand, and result transfers are conducted. Such a data transfer structure is inapplicable to the superpositioning computer because it does not permit the scalar and vector processes to have independent, but concurrent, data paths to and from memory. Such an independent memory data connection structure is necessary because it permits the scalar section to fully perform all indexing calculations without interfering with the concurrently-operating vector processes that require the indices. Further, a traditional scalar databus, as presently used, would do away with the response time and throughput benefits provided by scalar and vector buffering.
Thus, there is an evident need in the field of superpositioning computers to reduce the amount of physical resources required to support data interconnection. This need is not fulfilled by the traditional databus structure, on which all data transfers in a computer are made.
It is therefore the primary objective of the present invention to advance a data transfer system for use in a computer which superposes scalar and vector operations, which will reduce the physical dimensions of the data interconnection resources of the computer, yet which will retain the benefits of quick response time and efficient throughput realized by the computer's architecture.