Parallel processing, such as that implemented by a digital signal processor (DSP) to optimize digital signal processing applications, tends to be intensive in memory access operations. For example, a digital signal processor can operate as a single instruction, multiple data (SIMD), or data parallel, processor. In SIMD operations, a single instruction is sent to a number of processing elements of the digital signal processor, where each processing element can perform the same operation on different data. To achieve high data throughput, the memory organization of DSPs having SIMD architectures (or of other processors supporting parallel processing) supports multiple, synchronous data accesses. In an example, a processor architecture may include a multi-banked memory interconnected by a memory interconnect network architecture to the processing elements, such that more than one data operand can be loaded for (accessed by) the processing elements during a given cycle.
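The SIMD access pattern described above can be illustrated with a minimal software sketch. It is not a description of any particular processor: the element count, the bank count, the word-interleaved address mapping, and the names `NUM_PES`, `NUM_BANKS`, `make_banked_memory`, `load`, and `simd_add` are all assumptions introduced for illustration. The loop over processing elements stands in for work that hardware would perform in parallel during a single cycle.

```python
NUM_PES = 4    # hypothetical number of processing elements
NUM_BANKS = 4  # hypothetical number of memory banks


def make_banked_memory(values):
    """Spread a flat list of words across NUM_BANKS banks (word-interleaved, an assumption)."""
    banks = [[] for _ in range(NUM_BANKS)]
    for address, word in enumerate(values):
        banks[address % NUM_BANKS].append(word)
    return banks


def load(banks, address):
    """Read one word: bank index is address mod NUM_BANKS, row is address div NUM_BANKS."""
    return banks[address % NUM_BANKS][address // NUM_BANKS]


def simd_add(banks, base_a, base_b):
    """Model one SIMD instruction: every element performs the same add on its own operands."""
    results = []
    for pe in range(NUM_PES):
        a = load(banks, base_a + pe)  # first operand for this element
        b = load(banks, base_b + pe)  # second operand for this element
        results.append(a + b)
    return results


memory = make_banked_memory(list(range(16)))  # the word at address n holds the value n
print(simd_add(memory, 0, 8))  # [8, 10, 12, 14]
```

With this mapping, the four first operands fall in four different banks, as do the four second operands, which is what would allow each operand set to be fetched in one access per bank.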
The memory interconnect network architecture typically includes a separate interconnection network for each parallel data transfer. For example, if two parallel data transfers from the memory to the processing elements are needed to perform an operation, the memory interconnect network architecture implements one interconnection network for transferring a first data set from the memory to the processing elements and another interconnection network for transferring a second data set from the memory to the processing elements. Although existing memory interconnect network architectures for parallel processing have been generally adequate for their intended purposes, they have not been entirely satisfactory in all respects.
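The one-network-per-transfer arrangement described above can be sketched in software as follows. This is a simplified model under stated assumptions, not an implementation of any actual interconnect: the class `InterconnectNetwork`, the functions `build_networks` and `run_cycle`, and the flat list used as memory are all hypothetical. The point it shows is that the number of networks grows with the number of data sets that must be transferred in parallel.

```python
class InterconnectNetwork:
    """Hypothetical model of one network carrying one data set from memory to the elements."""

    def __init__(self, name):
        self.name = name
        self.transfers = 0  # number of data sets this network has carried

    def transfer(self, memory, addresses):
        """Deliver one word per processing element from the given addresses."""
        self.transfers += 1
        return [memory[a] for a in addresses]


def build_networks(num_parallel_transfers):
    """Create one interconnection network for each parallel data transfer."""
    return [InterconnectNetwork(f"net{i}") for i in range(num_parallel_transfers)]


def run_cycle(memory, networks, address_sets):
    """Move each data set over its own network within a single modeled cycle."""
    if len(address_sets) > len(networks):
        raise ValueError("more parallel transfers than interconnection networks")
    return [net.transfer(memory, addrs) for net, addrs in zip(networks, address_sets)]


memory = list(range(100, 116))  # flat memory used only for illustration
networks = build_networks(2)    # two parallel transfers, so two networks
first, second = run_cycle(memory, networks, [[0, 1, 2, 3], [8, 9, 10, 11]])
print(first)   # [100, 101, 102, 103]
print(second)  # [108, 109, 110, 111]
```

In this model, an operation needing a third simultaneous operand set would require a third network, which reflects the scaling behavior of the arrangement described above.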