The present invention relates generally to vector processors, and more specifically to a vector processor having improved data loading circuitry.
Conventional vector processors include a main memory in which instructions and vector data are stored, a memory controller that accesses the memory, a calculation circuit that performs vector calculations on data read out of the memory, and an execution controller that interprets instructions from the memory and monitors available resources, such as vector registers, to control the pipelined components of the processor so that vector calculations are performed at high speed. The operating performance of a conventional vector processor can be evaluated by executing the following vector instructions:
    VLD   V1               (1)
    VADD  V2 ← S + V1      (2)
    VLD   V3               (3)
    VMPY  V4 ← V2 × V3     (4)
Instruction (1) is a "vector load" instruction that directs the loading of vector data from the main memory into a vector register V1. Instruction (2) is an addition instruction that directs the summing of the vector data in register V1 with a scalar value S and the loading of the result into a vector register V2. Instruction (3) is a second vector load instruction that directs the loading of vector data from the memory into a register V3. Instruction (4) is a multiplication instruction that directs the multiplication of the data in register V2 by the data in register V3 and the storing of the result into a register V4.
FIG. 1 depicts the series of events involved in the execution of instructions (1) through (4) by the prior art vector processor. The execution controller of the processor first analyzes instruction (1) as it is loaded into the instruction register and directs the memory controller to access the memory to load vector data into register V1. As vector register V1 is being loaded, the execution controller directs the calculation circuit to execute instruction (2) by adding the scalar value S to the data loaded into register V1 and loading the result into register V2. In response to instruction (3), the execution controller directs the memory controller to access the memory to load vector data into register V3 and waits until the result of the addition is loaded into register V2, whereupon it directs the calculation circuit to execute instruction (4).
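For illustration only, the data flow of instructions (1) through (4) may be sketched in Python as follows. The function name `run_example` and the arguments `memory_a`, `memory_b` and `s` are hypothetical and do not appear in the specification. The sketch models only the values held in registers V1 through V4 and does not model the pipelining or timing described above.

```python
def run_example(memory_a, memory_b, s):
    """Return the contents of V4 after instructions (1)-(4).

    memory_a, memory_b: sequences standing in for vector data in main memory
    s: the scalar value S
    All names here are illustrative assumptions, not part of the specification.
    """
    v1 = list(memory_a)                      # (1) VLD  V1
    v2 = [s + x for x in v1]                 # (2) VADD V2 <- S + V1
    v3 = list(memory_b)                      # (3) VLD  V3
    v4 = [a * b for a, b in zip(v2, v3)]     # (4) VMPY V4 <- V2 x V3
    return v4
```

For example, calling `run_example([1, 2, 3], [4, 5, 6], 10)` forms V2 as `[11, 12, 13]` and multiplies it element by element with V3.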
However, if the memory controller takes a substantial amount of time to access vector data in the memory, the calculation circuit must wait until the data is ready for calculation, because a vector load instruction is not issued to the memory controller until the next instruction is supplied to the calculation circuit. Memory access delays can therefore slow down the system as a whole.
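The effect of access delay on this instruction sequence can be illustrated with a simplified timing model. The sketch below is an assumption-laden toy, not a description of FIG. 1: it assumes one vector element is transferred or computed per cycle, that the addition begins as soon as the first element of V1 arrives, that the second load is issued only when the addition is supplied to the calculation circuit, and that the multiplication begins only after V2 is completely loaded and the first element of V3 has arrived. The function name `conventional_timeline` and the parameters `latency` and `n` are hypothetical.

```python
def conventional_timeline(latency, n):
    """Toy cycle counts for instructions (1)-(4) on the conventional processor.

    latency: assumed cycles between issuing a load and its first element arriving
    n:       assumed vector length (one element per cycle)
    """
    load1_issue = 0
    v1_first = load1_issue + latency      # first element of V1 arrives
    add_start = v1_first                  # VADD starts as V1 is being loaded
    add_end = add_start + n               # V2 completely loaded

    load3_issue = add_start               # assumption: second VLD issued only now
    v3_first = load3_issue + latency      # first element of V3 arrives

    mpy_start = max(add_end, v3_first)    # VMPY waits for V2 and for V3 data
    mpy_end = mpy_start + n

    return {
        "add_end": add_end,
        "v3_first": v3_first,
        "mpy_start": mpy_start,
        "mpy_end": mpy_end,
        "idle": mpy_start - add_end,      # cycles the calculation circuit waits
    }
```

Under these assumptions, the calculation circuit sits idle whenever the access latency exceeds the vector length, since the data for register V3 does not begin to arrive until two access latencies have elapsed.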