1. Field of the Invention
This invention relates to the field of data processing. More particularly, this invention relates to data processing systems having vector and scalar data processing registers.
2. Description of the Prior Art
A data processing instruction typically includes within it an opcode portion and one or more register specifying fields. In some systems a register may be treated as a vector register or a scalar register. A vector register specifies a sequence of registers each storing its own data value which is separately operated upon as the data processing instruction repeats its operation upon each data value in the sequence. Conversely a scalar register is a single register storing a single value that operates independently of other registers.
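The distinction between scalar and vector registers can be illustrated with a minimal sketch. It is not drawn from any particular processor: the 16-entry register bank, the `execute` helper and the rule that consecutive register numbers form the sequence are all assumptions made for illustration.

```python
import operator

def execute(bank, op, dest, src1, src2, length=1):
    """Repeat `op` once per element of the sequence.

    length == 1 models a scalar instruction; length > 1 models a vector
    instruction whose register numbers name the first register of a sequence
    (assumed here to be consecutive registers).
    """
    for i in range(length):
        bank[dest + i] = op(bank[src1 + i], bank[src2 + i])
    return bank

# Hypothetical 16-entry register bank; register n initially holds the value n.
bank = list(range(16))

# Scalar add: r0 = r1 + r2, writing a single destination register.
execute(bank, operator.add, dest=0, src1=1, src2=2)

# Vector add of length 4: r12..r15 = r4..r7 + r8..r11.
# One instruction specifies four separate operations.
execute(bank, operator.add, dest=12, src1=4, src2=8, length=4)
```

In this sketch the scalar instruction changes only `bank[0]`, while the single vector instruction writes four destination registers, each from its own pair of source values.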
Data processing instructions using vector registers have a number of advantages over purely scalar operations. The instruction bandwidth required may be reduced since only a single data processing instruction is required to specify a plurality of similar data processing operations to be performed (common in DSP functions such as FIR filters). In the case of a single-issue machine (i.e. one instruction is fetched and decoded each cycle), which is desirable because of its simplicity, higher performance can be achieved with multiple functional units that execute in parallel on different vector instructions.
FIGS. 16 and 17 of the accompanying drawings respectively illustrate a Cray 1 processor register bank and a Digital Equipment Corporation MultiTitan processor register bank. Both of these prior art processors provide vector and scalar registers.
In the case of the Cray 1, separate vector and scalar register banks 10, 12 are provided. A 16-bit instruction provides individual opcodes that correspond to different combinations of the registers specified in the instructions being treated as vectors or scalars. This has the disadvantage that an increased number of opcodes need to be provided to represent these various combinations. Furthermore, as the scalar and vector registers are provided in separate register banks 10, 12, the opcode needs to be at least partially decoded in order to determine which of the register banks 10, 12 is to be used for a particular register specified. This additional decode requirement makes it difficult to read the data values stored in the registers as early as possible.
The Cray 1 processor uses 3-bit register specifying fields R1, R2, R3 allowing 8 scalar registers and 8 vector registers to be addressed. In practice, each vector register comprises a stack of registers that can each store a different data value and be accessed in turn in dependence upon a vector length value stored within a length register 16 and mask bits stored within a mask register 18. However, the limitation of only 8 scalar registers being allowed by the 3-bit register fields is a significant disadvantage for modern compilers that are able to produce faster code if able to target a higher number of registers.
The MultiTitan processor provides a single register bank 20 in which each register may operate as a scalar or as part of a vector register. The MultiTitan processor uses a 32-bit instruction to specify its data processing operations. This large amount of instruction bit space allows the instructions themselves to include fields VS2, VS3 that specify whether the registers are vectors or scalars and to include the length of the vectors (Len). Whilst this approach allows a great deal of flexibility, it suffers from the disadvantage that in many circumstances sufficient instruction bit space is not available to enable vector/scalar fields to be included within the instruction without limiting the opcode space available to allow provision of a rich instruction set. Furthermore, the provision of the vector length within the instruction itself makes it difficult to make global changes to the vector length without having to resort to self-modifying code. The MultiTitan technique also uses its instruction bit space rather inefficiently, as it devotes equal instruction bit space resources to combinations of vector and scalar registers that are in practice very unlikely to be used (e.g. V=S op S, in which a sequence of vector registers is filled with the results of an operation performed upon two scalar registers).
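The decode dependency and the register limit described above can be sketched as follows. This is an illustrative model only: the split of the 16-bit instruction into a 7-bit opcode and three 3-bit fields, the opcode values in `BANKS_BY_OPCODE`, and the reading of the mask as a per-element enable are assumptions rather than a description of the actual Cray 1 encoding or behaviour.

```python
FIELD_BITS = 3                        # width of each register specifying field
FIELD_MASK = (1 << FIELD_BITS) - 1    # 0b111, so only registers 0..7 are addressable

# Hypothetical opcode table: each opcode fixes which bank every field refers to.
# "V" = vector bank, "S" = scalar bank; order is (R1, R2, R3).
BANKS_BY_OPCODE = {
    0x10: ("S", "S", "S"),   # S = S op S
    0x11: ("V", "S", "V"),   # V = S op V
    0x12: ("V", "V", "V"),   # V = V op V
}

def decode_cray_like(instruction):
    """Split an assumed 7 + 3 + 3 + 3 bit instruction word into its fields."""
    opcode = (instruction >> 9) & 0x7F
    r1 = (instruction >> 6) & FIELD_MASK
    r2 = (instruction >> 3) & FIELD_MASK
    r3 = instruction & FIELD_MASK
    # The bank for each field is only known after the opcode lookup, so the
    # register read cannot begin until this step has completed.
    banks = BANKS_BY_OPCODE[opcode]
    return opcode, list(zip(banks, (r1, r2, r3)))

def active_elements(length, mask):
    """Element indices visited for a given length value and mask-bit pattern
    (a simplified interpretation of the length and mask registers)."""
    return [i for i in range(length) if (mask >> i) & 1]

word = (0x11 << 9) | (2 << 6) | (5 << 3) | 7
decoded = decode_cray_like(word)
# decoded == (0x11, [("V", 2), ("S", 5), ("V", 7)])
```

Because a separate opcode is needed for each combination of vector and scalar operands, the table grows with every combination supported, and each 3-bit field can name at most `1 << FIELD_BITS` (that is, 8) registers in whichever bank the opcode selects.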
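The cost of carrying the vector/scalar flags and the vector length inside each instruction can be sketched as below. The field positions and widths (8-bit opcode, three 6-bit register fields, 1-bit VS2 and VS3 flags with 1 assumed to mean "vector", and a 4-bit Len) are hypothetical choices that happen to fill 32 bits; they are not taken from the MultiTitan documentation.

```python
LEN_MASK = 0xF  # assumed 4-bit Len field in the low bits of the word

def encode_multititan_like(opcode, r1, r2, r3, vs2, vs3, length):
    """Pack the fields into one 32-bit word using the assumed layout."""
    return ((opcode << 24) | (r1 << 18) | (r2 << 12) | (r3 << 6)
            | (vs2 << 5) | (vs3 << 4) | length)

def decode_multititan_like(word):
    """Recover every field directly from the word, with no opcode lookup."""
    return {
        "opcode": (word >> 24) & 0xFF,
        "r1": (word >> 18) & 0x3F,
        "r2": (word >> 12) & 0x3F,
        "r3": (word >> 6) & 0x3F,
        "vs2": (word >> 5) & 1,
        "vs3": (word >> 4) & 1,
        "len": word & LEN_MASK,
    }

def set_length(program, new_length):
    """Changing the vector length globally means rewriting every instruction
    word, which at run time amounts to self-modifying code."""
    return [(word & ~LEN_MASK) | new_length for word in program]

word = encode_multititan_like(0x2A, r1=1, r2=8, r3=16, vs2=1, vs3=0, length=4)
fields = decode_multititan_like(word)
patched = set_length([word], 8)
```

In this layout, 6 of the 32 bits are spent on VS2, VS3 and Len in every instruction, and all four VS2/VS3 combinations receive equal encoding space regardless of how often each is used in practice.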