Two types of processor architectures are widely recognized in the field of computer science: “scalar” and “vector”. A scalar processor is designed to execute instructions that perform operations on a single set of data, whereas a vector processor is designed to execute instructions that perform operations on multiple sets of data. FIGS. 1A and 1B present a comparative example that demonstrates the basic difference between a scalar processor and a vector processor.
FIG. 1A shows an example of a scalar AND instruction in which a single operand set, A and B, is ANDed together to produce a singular (or “scalar”) result C (i.e., A.AND.B=C). By contrast, FIG. 1B shows an example of a vector AND instruction in which two operand sets, A/B and D/E, are respectively ANDed together in parallel to simultaneously produce a vector result C, F (i.e., A.AND.B=C and D.AND.E=F).
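The contrast between FIGS. 1A and 1B can be sketched in software as follows. This is only an illustrative model, not a description of any particular hardware: the function names `scalar_and` and `vector_and` and the sample bit patterns are assumptions introduced here, and the element-wise comprehension merely stands in for lanes that real vector hardware would process in parallel.

```python
from typing import List


def scalar_and(a: int, b: int) -> int:
    """Model of FIG. 1A: one operand set (a, b) yields one scalar result."""
    return a & b


def vector_and(x: List[int], y: List[int]) -> List[int]:
    """Model of FIG. 1B: each element of x is ANDed with the matching element of y.

    The elements are independent of one another, so hardware could compute
    them all at once; this software model simply walks the pairs in order.
    """
    if len(x) != len(y):
        raise ValueError("input vectors must have the same number of elements")
    return [xe & ye for xe, ye in zip(x, y)]


# Hypothetical sample values for the operands named in the figures.
A, B = 0b1100, 0b1010
D, E = 0b1111, 0b0101

C = scalar_and(A, B)               # scalar result C = A.AND.B
C_F = vector_and([A, D], [B, E])   # vector result (C, F)
```

In this model the two-element list `[A, D]` plays the role of one input vector and `[B, E]` the other, so a single call to `vector_and` produces both C and F.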
As is well known in the art, both the input operands and the output result are typically stored in dedicated registers. For example, many instructions have two input operands, so two distinct input registers are used to temporarily store the respective input operands. These same instructions also produce an output value, which is temporarily stored in a third (result) register. Respective input registers 101a,b and 102a,b and result registers 103a,b are observed in FIGS. 1A and 1B. Notably, the “scalar” vs. “vector” characterizations are readily discernible.
That is, input registers 101a and 102a of the scalar design of FIG. 1A are observed holding only scalar values (A and B, respectively). Likewise, the result register 103a of the scalar design of FIG. 1A is also observed holding only a scalar value (C). By contrast, the input registers 101b and 102b of the vector system of FIG. 1B are observed holding vectors (A,D in register 101b and B,E in register 102b). Likewise, the result register 103b of the vector system of FIG. 1B is also observed holding a vector value (C,F). As a matter of terminology, the contents of each of the registers 101b, 102b and 103b of the vector system of FIG. 1B can be globally referred to as a “vector”, and each of the individual scalar values within the vector can be referred to as an “element”. Thus, for example, register 101b is observed to be storing “vector” A, D, which is composed of “element” A and “element” D.
Given that a vector operation corresponds to multiple operations performed in parallel, a problem can arise when an operation on one element of an input vector has a dependency on an operation performed on another element within the same input vector.
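One way to picture such a dependency is sketched below. The text does not specify what form the dependency takes, so the recurrence used here (each element is ANDed with the already updated value of its left-hand neighbor) is purely a hypothetical illustration, as are the function names `sequential_update` and `naive_parallel_update` and the sample values.

```python
from typing import List


def sequential_update(a: List[int], b: List[int]) -> List[int]:
    """Element i uses the UPDATED element i-1, as a scalar loop would."""
    out = list(a)
    for i in range(1, len(out)):
        out[i] = out[i - 1] & b[i]
    return out


def naive_parallel_update(a: List[int], b: List[int]) -> List[int]:
    """Every lane reads the ORIGINAL input vector before any lane writes.

    This models performing all element operations at once, ignoring the
    dependency between neighboring elements.
    """
    return a[:1] + [a[i - 1] & b[i] for i in range(1, len(a))]


# Hypothetical three-element input vectors.
a = [0b1111, 0b1111, 0b1111]
b = [0b0000, 0b0011, 0b1111]

expected = sequential_update(a, b)      # honors the dependency
observed = naive_parallel_update(a, b)  # ignores the dependency
mismatch = expected != observed
```

In this sketch the last element differs between the two results, because the parallel model reads the neighbor's original value rather than the value the neighbor was supposed to have been updated to first.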