The present disclosure relates to sparse matrix multiplication, and more specifically, to sparse matrix multiplication using a single field programmable gate array.
The pursuit of energy-efficient computer systems has spurred a trend toward heterogeneous computing and massively parallel architectures. Heterogeneous systems use graphics processing units (GPUs) and other types of co-processors to accelerate application hotspots.
One such hotspot is sparse matrix-dense matrix multiplication (SpMM), an arithmetic operation common to many computing applications and algorithms. SpMM is used in stochastic matrix estimator (SME) algorithms that are applied to a diverse set of problems, such as materials science problems concerned with finding the electronic density in Density Functional Theory (DFT), or cognitive computing problems concerned with finding the importance of a node in a knowledge graph.
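By way of illustration only, the SpMM operation described above may be sketched in software as follows. The sketch assumes that the sparse matrix is stored in compressed sparse row (CSR) form, a storage format that the present disclosure does not itself specify. The function name `spmm_csr` and the array names `values`, `col_idx`, and `row_ptr` are hypothetical and are introduced solely for this example.

```python
def spmm_csr(values, col_idx, row_ptr, dense):
    """Multiply a sparse matrix A (CSR form, assumed) by a dense matrix B.

    values  : nonzero entries of A, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the entries of row i
    dense   : B as a list of rows (each row a list of floats)
    """
    n_rows = len(row_ptr) - 1
    n_cols_out = len(dense[0]) if dense else 0
    result = [[0.0] * n_cols_out for _ in range(n_rows)]
    for i in range(n_rows):
        out_row = result[i]
        # Visit only the stored nonzeros of row i; zeros are never touched.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a = values[k]
            b_row = dense[col_idx[k]]  # row of B selected by the column index
            for j in range(n_cols_out):
                out_row[j] += a * b_row[j]
    return result


# Illustrative 3x3 sparse matrix:
# A = [[1, 0, 2],
#      [0, 0, 3],
#      [4, 5, 0]]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 2, 0, 1]
row_ptr = [0, 2, 3, 5]

# Illustrative 3x2 dense matrix.
B = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]

C = spmm_csr(values, col_idx, row_ptr, B)
print(C)  # [[11.0, 14.0], [15.0, 18.0], [19.0, 28.0]]
```

In this sketch, arithmetic is performed only for the stored nonzero entries of the sparse matrix, and each such entry selects a row of the dense matrix through its column index.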
In conventional data center systems, the SpMM operation may not perform optimally on current state-of-the-art hardware, due in part to the low rate of floating point operations per second that is actually achieved. For example, floating point operations may be performed at a low single-digit percentage of the peak performance of a state-of-the-art computing device. Meanwhile, the energy consumption of the device is typically around 50-75% of its peak consumption. Consequently, the SpMM operation may be extremely energy inefficient and thus may pose a significant financial challenge for large data centers solving a large number of cognitive problems on expansive graphs.