Vector processors are designed to operate on arrays of data at high speed. A vector processor typically includes a vector register file, which serves as a data store and is coupled to arithmetic units that operate on the data held in it. Data is loaded into the vector register file from a memory, such as a cache memory, and the cache memory can in turn receive and store data from the vector register file. The cache memory itself receives data from a large main memory, which is typically coupled to the vector processor by a bus. When data is to be loaded into the vector register file, a load/store pipeline accesses the cache memory for the data elements and sends them over a bus to the vector register file.
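The data path described above (main memory to cache memory to vector register file to arithmetic units) can be illustrated with a minimal software model. This is only a sketch under stated assumptions, not a description of any particular hardware: the names `Cache`, `vector_load`, and `vector_add`, the line size of four elements, and the unbounded cache capacity are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

LINE_SIZE = 4  # elements per cache line (illustrative value, an assumption)


@dataclass
class Cache:
    """Toy cache in front of main memory; capacity is unbounded for simplicity."""
    main_memory: list
    lines: dict = field(default_factory=dict)
    misses: int = 0

    def read(self, addr):
        tag = addr // LINE_SIZE
        if tag not in self.lines:
            # Miss: fill the whole line from main memory before returning data.
            self.misses += 1
            base = tag * LINE_SIZE
            self.lines[tag] = self.main_memory[base:base + LINE_SIZE]
        return self.lines[tag][addr % LINE_SIZE]


def vector_load(cache, base, length, stride=1):
    """Model of the load/store pipeline: fetch each element from the cache
    and return the elements as the contents of one vector register."""
    return [cache.read(base + i * stride) for i in range(length)]


def vector_add(va, vb):
    """Model of an arithmetic unit operating on two vector registers."""
    return [a + b for a, b in zip(va, vb)]


main_memory = list(range(64))
cache = Cache(main_memory)

v0 = vector_load(cache, 0, 8)   # touches cache lines 0 and 1
v1 = vector_load(cache, 8, 8)   # touches cache lines 2 and 3
v2 = vector_add(v0, v1)

# A strided load on a cold cache touches a different line for every element.
cold_cache = Cache(main_memory)
v3 = vector_load(cold_cache, 0, 8, stride=LINE_SIZE)
```

In this model the two unit-stride loads incur four line fills for sixteen elements, while the strided load incurs one line fill per element, which shows how the access pattern of a vector load determines how often the load/store pipeline must wait on main memory.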
In scalar processors, multiple cache misses can also occur, causing the pipeline to be dominated by the servicing of those misses.