Sparse data sets are processed in a variety of computational applications. For example, machine learning, computer modeling, and data mining, among others, rely on the generation and efficient processing of sparse data sets. Sparse data sets are often stored and/or processed as a two-dimensional array of allocated memory in a computer.
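By way of illustration only, the following minimal Python sketch shows a small data set held as a two-dimensional array and measures how few of its entries are populated. The array contents, the name `density`, and the treatment of zero as "empty" are assumptions made for this example and are not taken from any particular application.

```python
# Hypothetical 4x6 data set stored as a two-dimensional array.
# The values are illustrative only; zero is assumed to mean "no data".
dense = [
    [0, 0, 3, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [7, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 2, 0],
]


def density(matrix):
    """Return the fraction of entries in the array that are nonzero."""
    total = sum(len(row) for row in matrix)
    nonzero = sum(1 for row in matrix for value in row if value != 0)
    return nonzero / total


# Only 4 of the 24 allocated entries carry data in this example.
print(density(dense))
```

In an array such as this one, most of the allocated memory holds no useful data, which is one reason the efficient processing of such data sets is of interest.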
While graphics processing units (GPUs) were originally designed to accelerate the rendering of graphical images for output to a display device, GPUs are increasingly being used for non-graphical calculations. Because many matrix calculations are parallel in nature, modern GPUs, or other types of stream processors, are oftentimes particularly well suited to performing such calculations on matrices.
Processors, including GPUs, oftentimes include one or more layers of cache memory. Cache memory is intended to speed up processor memory access by moving data that is likely to be accessed into faster memory hardware. One example of such faster memory hardware is static random-access memory (SRAM). Where data requested by the processor is not in a particular cache memory level, the request is said to be a cache miss. Cache misses may slow down the overall computation being performed by a processor. Because the populated entries of a sparse data set tend to be scattered across memory, the processing of sparse data sets on GPUs or other types of stream processors may be particularly prone to cache misses in certain instances.
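The effect described above can be sketched with a toy model. The following Python example simulates a small direct-mapped cache and counts misses for two access patterns. The cache geometry (`num_lines`, `line_size`), the function name `count_misses`, and the stride of 37 are hypothetical values chosen for illustration; real GPU caches differ in organization and size.

```python
def count_misses(addresses, num_lines=4, line_size=8):
    """Count misses in a toy direct-mapped cache.

    addresses: iterable of integer element addresses.
    num_lines, line_size: hypothetical cache geometry (assumed values).
    """
    cache = [None] * num_lines  # each slot records which memory block it holds
    misses = 0
    for addr in addresses:
        block = addr // line_size   # memory block containing this address
        slot = block % num_lines    # cache slot that block maps to
        if cache[slot] != block:    # requested block is not cached: a miss
            misses += 1
            cache[slot] = block     # load the block into the cache
    return misses


# Sixteen neighbouring elements, as when reading densely packed data.
contiguous = list(range(16))
# Sixteen elements spaced far apart, as when visiting scattered entries.
scattered = [i * 37 for i in range(16)]

print(count_misses(contiguous))  # 2
print(count_misses(scattered))   # 16
```

Under these assumptions, the contiguous pattern misses only once per cache line, while every scattered access falls in a different memory block and therefore misses.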
It would be desirable to address these issues.