Computing systems include one or more central processing units (CPUs) and memory (e.g., dynamic random access memory (DRAM), static random access memory (SRAM), etc.) to perform computational processes. A CPU may include generalized computational units, specialized computational units, or both, which perform operations on data from the memory. A bus between the CPU and the memory allows the CPU to retrieve data for computational processes from the memory and to store computational results back into the memory. A memory controller is typically included in the memory and communicatively coupled to the bus to receive data access requests from the CPU and to return data responses on the bus to the CPU. In some CPUs, the memory controller is instead integrated into the same package or chip as the CPU. In either scenario, the bandwidth of the bus governs the overall bandwidth of data transfers between the CPU and the memory.
For high-value algorithms operating on large sets of data, the overall performance of these CPU-based processing systems is determined in large part by their memory bandwidth and memory access capabilities. However, there are limitations and costs associated with the memory bandwidth and memory access capabilities of a given traditional processing system. For example, algorithms that operate on large sets of sparse data typically exhibit irregular memory access patterns and a low ratio of floating-point operations (flops) to memory accesses, both of which CPU-based processing systems handle poorly. As a result, traditional processing systems can suffer from poor locality of reference and degraded memory access performance. Furthermore, in such cases a traditional processing system spends most of its resources on moving data rather than on performing computations. This not only hampers the performance of the processing system, but also causes it to consume large amounts of energy on ancillary memory administration. Given the tight energy budgets of modern electronic devices, the inefficiencies of traditional processing systems present significant design problems.
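By way of illustration only, the low ratio of flops to memory accesses described above can be seen in a sparse matrix-vector multiplication. The following is a minimal sketch, assuming a compressed sparse row (CSR) layout; the function name csr_spmv, the flop and load counters, and the example data are hypothetical and are not taken from this disclosure. The counters tally operations in the abstract and do not model caches or any particular hardware.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Multiply a CSR-format sparse matrix by a dense vector x.

    Returns the result vector together with illustrative counts of
    floating-point operations and memory reads (hypothetical bookkeeping,
    not a hardware measurement).
    """
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    flops = 0
    loads = 0
    for row in range(num_rows):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # values[k] and col_idx[k] are read sequentially, but
            # x[col_idx[k]] is an indirect read whose address depends on
            # the data, which is the source of the irregular access pattern.
            acc += values[k] * x[col_idx[k]]
            flops += 2  # one multiply and one add
            loads += 3  # values[k], col_idx[k], x[col_idx[k]]
        y[row] = acc
    return y, flops, loads


# Example matrix (3 x 3, four nonzeros):
#   [4 0 0]
#   [0 0 5]
#   [1 0 2]
values = [4.0, 5.0, 1.0, 2.0]
col_idx = [0, 2, 0, 2]
row_ptr = [0, 1, 2, 4]
x = [1.0, 2.0, 3.0]

y, flops, loads = csr_spmv(values, col_idx, row_ptr, x)
print(y, flops, loads)
```

In this sketch each nonzero element contributes two flops but requires three reads, one of them at a data-dependent address, so the work is dominated by data movement rather than by arithmetic.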