In a single instruction multiple data (SIMD) processing environment, providing parallel data streams to multiple processors requires effective coordination between memory storage devices and the processing units. A common data cache, which is a memory cache shared by all processing elements, may suffer performance degradation if the multiple data streams for the SIMD processors are not well localized. Data that is not well localized may be scattered across various locations within the cache memory or may be inefficiently allocated within the cache memory. The degradation occurs when the cache has a high miss rate, such that significant amounts of data are unnecessarily read multiple times. This, in turn, reduces overall processing performance.
The current approach to data caching with a SIMD processor is to serialize the multiple data streams. This approach retrieves one data stream at a time by accessing the memory cache. Serializing the data avoids the performance degradation associated with poorly localized data, but incurs the added computational expense of serializing the data access operations.
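The contrast between poorly localized parallel streams and serialized streams can be illustrated with a toy model. The following is a minimal sketch only, assuming a small direct-mapped cache and stream base addresses chosen so that the streams collide in the cache. The function names, the cache geometry (4 lines of 4 words) and the stride of 64 words are hypothetical values picked for illustration and do not describe any particular SIMD hardware.

```python
def count_misses(addresses, num_lines=4, words_per_line=4):
    """Count misses in a toy direct-mapped cache for a sequence of word addresses."""
    tags = [None] * num_lines          # one stored line address per cache line
    misses = 0
    for addr in addresses:
        line_addr = addr // words_per_line
        index = line_addr % num_lines  # direct-mapped placement
        if tags[index] != line_addr:   # line not present: fetch it again
            misses += 1
            tags[index] = line_addr
    return misses


def make_streams(num_streams=4, length=16, stride=64):
    """Build hypothetical streams whose base addresses map to the same cache lines."""
    return [[s * stride + i for i in range(length)] for s in range(num_streams)]


streams = make_streams()

# Parallel access: one word from every stream per step, as a SIMD unit would request.
interleaved = [addr for step in zip(*streams) for addr in step]

# Serialized access: one complete stream at a time.
serialized = [addr for stream in streams for addr in stream]

print("interleaved misses:", count_misses(interleaved))
print("serialized misses:", count_misses(serialized))
```

In this toy configuration every interleaved access evicts a line that another stream still needs, so each line is fetched repeatedly, whereas the serialized order fetches each line once. The serialized order, however, handles only one stream at a time, which corresponds to the added expense of serialization noted above.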
Therefore, a need exists for a method and apparatus that allows for efficient memory accesses in conjunction with a SIMD processor.