Matrix multiplication or matrix product is a binary operation that produces a matrix from two matrices with entries in a field, or, more generally, in a ring or even a semi-ring. The matrix product is designed for representing the composition of linear maps that are represented by matrices. Matrix multiplication is thus a basic tool of linear algebra, and as such has numerous applications in many areas of mathematics, as well as in applied mathematics, statistics, physics, economics, and engineering. In more detail, if A is an n×m matrix and B is an m×p matrix, their matrix product AB is an n×p matrix, in which the m entries across a row of A are multiplied with the m entries down a column of B and summed to produce an entry of AB. When two linear maps are represented by matrices, then the matrix product represents the composition of the two maps.
Computing matrix products is a central operation in all computational applications of linear algebra. Its computational complexity is O(n3) (for n×n matrices) for the basic algorithm (this complexity is O(n2.373) for the asymptotically fastest known algorithm). This nonlinear complexity means that matrix product is often the critical part of many algorithms. This is enforced by the fact that many operations on matrices, such as matrix inversion, determinant, solving systems of linear equations, have the same complexity. Therefore various algorithms have been devised for computing products of large matrices, taking into account the architecture of computers.
Matrix multiplication is at the heart of all machine learning algorithms and is the most computationally expensive task in these applications. Most machine learning implementations use general-purpose CPUs and perform matrix multiplications in serial fashion. The serial computations in the digital domain together with limited memory bandwidth sets a limit on maximum throughput and power efficiency of the computing system.