Conceptually, a two-dimensional matrix multiplier circuit operates by means of a multiplier and an accumulator. More recently, distributed arithmetic has been used to simplify the design of the multiplier circuit and to increase the speed of computation. A multiplier circuit based on distributed arithmetic takes advantage of both pipeline processing and parallel processing. When the pipeline processing and the parallel processing are appropriately matched, throughput is increased. It is an object of the present invention to provide a matrix multiplier circuit based on distributed arithmetic, in which throughput is increased by providing an appropriate match between pipeline processing and parallel processing.
In the field of digital signal processing, matrix multiplication is used extensively to transform a set of data into various domains. For example, the discrete cosine transform, the inverse discrete cosine transform, and the discrete Fourier transform utilize matrix multiplication.
Consider, for example, the matrix multiplication

$$X \times B = C \tag{1}$$

where

X is an input data matrix,
B is a transformation matrix whose elements are constant transform coefficients, and
C is the output data matrix.
The data matrix X has I rows and J columns. The transformation matrix B has J rows and K columns. The output matrix C has I rows and K columns. Thus, the matrix X is composed of the elements $x_{ij}$, $i = 1, 2, \ldots, I$, $j = 1, 2, \ldots, J$. The matrix B is composed of the elements $b_{jk}$, $j = 1, 2, \ldots, J$, $k = 1, 2, \ldots, K$. The matrix C is composed of the elements $c_{ik}$.
Equation (1) can be rewritten as

$$c_{ik} = \sum_{j=1}^{J} x_{ij}\, b_{jk}, \qquad i = 1, 2, \ldots, I, \quad k = 1, 2, \ldots, K$$
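The element-wise form of the product, in which each output element c_ik is the sum over j of x_ij times b_jk, can be illustrated with a short multiply-accumulate sketch. This is only an illustration of the arithmetic, not of any circuit described here; the function name `matmul` and the use of nested Python lists are assumptions made for the example.

```python
def matmul(X, B):
    """Multiply an I x J matrix X by a J x K matrix B, element by element."""
    I, J = len(X), len(X[0])
    if len(B) != J:
        raise ValueError("inner dimensions of X and B must agree")
    K = len(B[0])

    C = [[0] * K for _ in range(I)]
    for i in range(I):          # one row of X at a time
        for k in range(K):      # one column of B at a time
            acc = 0             # plays the role of the accumulator
            for j in range(J):
                acc += X[i][j] * B[j][k]   # one multiply, one accumulate
            C[i][k] = acc
    return C


if __name__ == "__main__":
    print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Each output element costs J multiplications in this form, which is the cost that distributed arithmetic is meant to avoid.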
The conventional matrix multiplier circuit which utilizes distributed arithmetic multiplies one row of J elements $x_{ij}$ with K columns of elements $b_{jk}$ to obtain one row of elements $c_{ik}$. In the conventional matrix multiplier, pipelining is controlled by the precision M (i.e., the number of bits) of the elements $x_{ij}$, whereas parallel processing is determined by the dimension J of the X and B matrices. When M is larger or smaller than J, the pipeline processing and the parallel processing in the multiplier circuit are not well matched. It is an object of the present invention to provide a matrix multiplier circuit which utilizes distributed arithmetic and which takes more complete advantage of pipeline and parallel processing to achieve a higher throughput.
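The roles of M and J can be made concrete with a software sketch of textbook distributed arithmetic. It is not the conventional circuit referred to above, nor the circuit of the invention. The sketch assumes that the elements of one row of X are signed M-bit two's complement integers and that the coefficients of B are integers; the names `build_lut` and `da_row_times_matrix` are hypothetical. For each column of B, a lookup table with 2^J entries holds every partial sum of the J coefficients. The row is then processed one bit plane at a time, so the loop runs M times while each lookup address is J bits wide.

```python
def build_lut(column):
    """Table for one column of B: entry `addr` is the sum of column[j]
    over every j whose bit is set in `addr`."""
    J = len(column)
    return [sum(column[j] for j in range(J) if (addr >> j) & 1)
            for addr in range(1 << J)]


def da_row_times_matrix(x_row, B, M):
    """Multiply one row of J signed M-bit integers by the J x K matrix B
    using table lookups and shift-accumulate steps instead of multipliers."""
    J, K = len(x_row), len(B[0])
    lo, hi = -(1 << (M - 1)), (1 << (M - 1)) - 1
    if any(not lo <= x <= hi for x in x_row):
        raise ValueError("row element does not fit in M bits")

    luts = [build_lut([B[j][k] for j in range(J)]) for k in range(K)]
    words = [x & ((1 << M) - 1) for x in x_row]   # two's complement bit patterns

    acc = [0] * K
    for m in range(M):                 # M serial steps, one per bit plane
        addr = 0
        for j in range(J):             # J bits gathered together per step
            addr |= ((words[j] >> m) & 1) << j
        # The most significant bit carries negative weight in two's complement.
        weight = -(1 << m) if m == M - 1 else (1 << m)
        for k in range(K):
            acc[k] += weight * luts[k][addr]
    return acc


if __name__ == "__main__":
    B = [[1, 2], [3, 4], [5, 6]]
    print(da_row_times_matrix([3, -2, 5], B, M=4))
```

In this sketch the number of serial steps depends only on M and the width of each lookup depends only on J, which is why a mismatch between M and J leaves one of the two forms of processing underused.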