This invention relates to digital computers capable of parallel processing. There has been little progress in applying parallel processing to the computationally intensive workloads typically found in engineering and scientific applications, particularly to parallel processing of the same job (i.e., the same instructions and data).
In such applications, it is typical to find repetitive accesses to memory at fixed address intervals known as strides. Each new access initiated by a processor is for a memory location separated from the last access by the length of the stride. A stride of one means that the processor accesses every word (whose length may vary) in sequence. A stride of two means that every other word is accessed. If interleaved memory elements are accessed by the processors, the stride determines a unique sequence of memory accesses known as the access pattern (e.g., for four memory elements, the access pattern might be ABCD).
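The relationship between stride and access pattern can be illustrated with a minimal sketch. It assumes, purely for illustration, that consecutive words are interleaved across the memory elements in round-robin order (word address modulo the number of elements) and that the elements are labeled A, B, C, and so on. The function name `access_pattern` and its parameters are hypothetical and do not come from the text above.

```python
from math import gcd
from string import ascii_uppercase


def access_pattern(stride, num_elements=4, start=0):
    """Return one period of the memory-element sequence for a strided access.

    Assumption (not from the source): word address w resides in memory
    element w % num_elements, i.e. simple low-order interleaving.
    """
    # The sequence of elements repeats after this many accesses.
    period = num_elements // gcd(stride, num_elements)
    return "".join(
        ascii_uppercase[(start + k * stride) % num_elements]
        for k in range(period)
    )


if __name__ == "__main__":
    for s in (1, 2, 3, 4):
        print(f"stride {s}: {access_pattern(s)}")
```

Under these assumptions, a stride of one visits all four elements in order (ABCD), a stride of two alternates between two of them (AC), and a stride of four returns to the same element on every access (A).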
Caches have long been used in digital computers, and have been applied to parallel processing, with one cache assigned to each processor. A cache is a high-speed memory containing copies of selected data from the main memory. Memory accesses from a processor come to the cache, which determines whether it currently has a copy of the accessed memory location. If not, a cache "miss" has occurred, and the cache customarily stops accepting new accesses while it performs a main memory access for the data needed by the processor.
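The customary miss behavior described above can be sketched as a small simulation. This is only an illustrative model under stated assumptions: the cache is direct-mapped with one word per line, and a fixed `miss_penalty` stands in for the time during which no new accesses are accepted. The class name `BlockingCache`, its fields, and the penalty value are hypothetical and are not taken from the text.

```python
class BlockingCache:
    """Toy model of a cache that stalls while servicing a miss.

    Assumptions (not from the source): direct-mapped placement, one word
    per line, and a fixed stall of miss_penalty cycles per miss.
    """

    def __init__(self, main_memory, num_lines=4, miss_penalty=10):
        self.main_memory = main_memory
        self.num_lines = num_lines
        self.miss_penalty = miss_penalty
        self.lines = [None] * num_lines  # each entry: (address, data) or None
        self.hits = 0
        self.misses = 0
        self.stall_cycles = 0

    def read(self, address):
        index = address % self.num_lines
        line = self.lines[index]
        if line is not None and line[0] == address:
            # Hit: the cache already holds a copy of this location.
            self.hits += 1
            return line[1]
        # Miss: no new accesses are accepted while main memory is read.
        self.misses += 1
        self.stall_cycles += self.miss_penalty
        data = self.main_memory[address]
        self.lines[index] = (address, data)
        return data


if __name__ == "__main__":
    memory = list(range(100, 116))  # 16 words of hypothetical data
    cache = BlockingCache(memory)
    for addr in (0, 1, 0, 4, 0):
        cache.read(addr)
    print(cache.hits, cache.misses, cache.stall_cycles)
```

In this model every miss adds its full penalty to the stall count, which is the cost of a cache that stops accepting accesses while the main memory access completes.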