A matrix is a rectangular array of numbers, symbols, or expressions arranged in rows and columns; the individual items in a matrix are commonly referred to as elements or entries. Matrices are often used to represent linear transformations, that is, generalizations of linear functions such as f(x)=ax. As such, matrices can be used, for example, to project three-dimensional (3D) images onto a two-dimensional (2D) screen or to perform the calculations used to create realistic-seeming motion. A sparse matrix is a matrix populated primarily with zeros, whereas a dense matrix is a matrix in which a significant number of elements (e.g., a majority) are non-zero. Sparse matrices are useful in various application areas such as network theory, where it is common to have a low density of significant data or connections, represented by non-zero values, interspersed throughout a far greater number of zero values.
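The distinction between sparse and dense matrices can be illustrated with a minimal Python sketch. The function names `density` and `is_sparse`, the 0.5 threshold, and the small path-graph adjacency matrix are assumptions chosen for illustration only; the text above does not fix a numeric cutoff for sparsity.

```python
def density(matrix):
    """Return the fraction of entries in `matrix` that are non-zero."""
    total = sum(len(row) for row in matrix)
    nonzero = sum(1 for row in matrix for value in row if value != 0)
    return nonzero / total if total else 0.0


def is_sparse(matrix, threshold=0.5):
    """Treat a matrix as sparse when fewer than `threshold` of its entries
    are non-zero. The 0.5 default is an illustrative assumption."""
    return density(matrix) < threshold


# Adjacency matrix of a hypothetical 4-node path graph 0-1-2-3:
# a 1 marks a connection between two nodes, a 0 marks no connection.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(density(adjacency))    # 6 non-zero entries out of 16
print(is_sparse(adjacency))
```

In realistic network matrices the fraction of non-zero entries is typically far lower than in this toy example, since each node connects to only a small share of all other nodes.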
Over the past forty years, the sizes of sparse matrices have grown exponentially, by nearly four orders of magnitude, a rate that far outpaces the growth in DRAM capacity associated with commodity central processing units (CPUs) and graphics processing units (GPUs), creating substantial challenges for storing, communicating, and processing these matrices. In response to this uneven growth, several compressed formats for representing sparse matrices have been proposed over the years, and several are commonly used today. However, these formats tend to be CPU-centric and operate on word (e.g., 32-bit) boundaries. Moreover, different structured and unstructured matrices may be expressed in different formats, thus requiring additional CPU (i.e., processing) resources for translation between these different sparse matrix formats.
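As an illustration of what a compressed sparse format looks like, the following Python sketch uses compressed sparse row (CSR), one widely used format; the text above does not name any specific format, so CSR is an assumption chosen here as a representative example. The function names `to_csr`, `csr_matvec`, and `csr_words`, and the 6x6 sample matrix, are likewise hypothetical. The word count assumes each value, column index, and row pointer occupies one 32-bit word.

```python
def to_csr(matrix):
    """Convert a dense list-of-lists matrix to CSR arrays.

    Returns (values, col_indices, row_ptr), where the non-zeros of row i
    are values[row_ptr[i]:row_ptr[i + 1]].
    """
    values, col_indices, row_ptr = [], [], [0]
    for row in matrix:
        for col, value in enumerate(row):
            if value != 0:
                values.append(value)
                col_indices.append(col)
        row_ptr.append(len(values))
    return values, col_indices, row_ptr


def csr_matvec(values, col_indices, row_ptr, x):
    """Multiply a CSR matrix by the dense vector x, touching only non-zeros."""
    result = []
    for i in range(len(row_ptr) - 1):
        start, end = row_ptr[i], row_ptr[i + 1]
        result.append(sum(values[k] * x[col_indices[k]] for k in range(start, end)))
    return result


def csr_words(values, row_ptr):
    """Storage in words, assuming one 32-bit word per value, index, and pointer."""
    return 2 * len(values) + len(row_ptr)


# Hypothetical 6x6 matrix with only four non-zero entries.
matrix = [
    [5, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 3, 0, 0, 7],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 2, 0, 0, 0, 0],
]

values, col_indices, row_ptr = to_csr(matrix)
product = csr_matvec(values, col_indices, row_ptr, [1, 1, 1, 1, 1, 1])
dense_words = len(matrix) * len(matrix[0])
compressed_words = csr_words(values, row_ptr)
```

Under these assumptions the dense layout takes 36 words while the CSR arrays take 15. The savings depend on the fraction of non-zero entries: for a matrix that is not sufficiently sparse, the index overhead can make CSR larger than the dense layout.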
Meanwhile, the increased capacity and improved performance of field-programmable gate arrays (FPGAs) have opened the door to customizable, reconfigurable processing capabilities for mission-critical applications in high-performance computing (HPC). Characterized by flexibility and performance, FPGAs provide a potent option for hardware approaches that can be implemented and operated as flexible software libraries. To facilitate adoption and maximize performance gains, a properly designed FPGA library for HPC would ideally provide high performance, support arbitrarily large data sets, and require minimal or no reconfiguration for different problem parameters or input formats. In addition, application-specific integrated circuits (ASICs) provide solutions for higher-volume implementations, whereas FPGAs might be utilized in lower-volume deployments.