1. Field of the Invention
The present invention generally relates to a method of performing matrix multiplication and, in particular, to an optimized procedure for performing matrix by matrix multiplication in a highly parallel architecture (digital or optical) computer.
2. Description of the Prior Art
Matrix multiplication arises frequently as a necessary computational step when performing multi-dimensional image, signal, and data processing. Where real-time response is required, such as in high frame rate image processing and high frequency signal processing, it is essential that the computationally intensive matrix multiplication be performed efficiently. When real-time constraints are present, the typical resort is to highly specialized computers capable of processing two-dimensionally structured data sets, such as matrices, that are often referred to as images. These computers include digital cellular array processors (CAP) and optical processors.
In the field of digital image and data processing (referred to more generally as simply image processing), the Cellular Array Processor is well known as a type of computer system whose architecture is particularly suited to the task of image processing. Although the specific design may differ substantially between implementations, the general architecture of the Cellular Array Processor is quite distinctive. Typically, a system will include a highly specialized Array Processor that is controlled by a Control Processor of conventional design. The Array Processor, in turn, is formed from a large number of Elemental Processors that are distributed as individual cells within a regular matrix. (This gives rise to the descriptive name "Cellular Array Processor".) The Elemental Processors are essentially identical and generally contain a function-programmable logic circuit and a memory register. The programmable logic circuit is typically capable of selectively performing a limited number of primitive logic and arithmetic functions, such as "and", "or", "invert", and "rotate", on the data stored in its respective memory register in conjunction with data provided by the Control Processor. The Control Processor is linked to the Elemental Processors via a common instruction bus. Thus, all of the Elemental Processors operate separately, yet synchronously, in the performance of a common logical function on the data contained in their respective memory registers. (This is commonly referred to as Single Instruction, Multiple Data, or SIMD operation.)
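The SIMD behavior described above can be illustrated with a minimal software sketch. It is not a model of any particular Cellular Array Processor: the 8-bit register width, the class name `CellularArray`, the method name `broadcast`, and the opcode strings are all assumptions introduced here for illustration only. Each cell holds one memory register, and a single instruction, together with an operand supplied by the Control Processor, is applied to every cell.

```python
from typing import Callable, Dict, List

WORD_BITS = 8                      # assumed register width (not from the text)
WORD_MASK = (1 << WORD_BITS) - 1   # 0xFF for 8-bit registers


def _rotate_left(reg: int, operand: int) -> int:
    """Rotate the register left by `operand` bit positions."""
    shift = operand % WORD_BITS
    return ((reg << shift) | (reg >> (WORD_BITS - shift))) & WORD_MASK


# Primitive functions named in the text; each takes the cell's register
# value and the operand broadcast by the Control Processor.
INSTRUCTIONS: Dict[str, Callable[[int, int], int]] = {
    "and": lambda reg, operand: reg & operand & WORD_MASK,
    "or": lambda reg, operand: (reg | operand) & WORD_MASK,
    "invert": lambda reg, operand: ~reg & WORD_MASK,
    "rotate": _rotate_left,
}


class CellularArray:
    """Hypothetical array of elemental processors, one register per cell."""

    def __init__(self, registers: List[List[int]]) -> None:
        self.registers = [[value & WORD_MASK for value in row] for row in registers]

    def broadcast(self, opcode: str, operand: int = 0) -> None:
        """Apply one instruction to every cell (single instruction, multiple data)."""
        func = INSTRUCTIONS[opcode]
        self.registers = [[func(reg, operand) for reg in row] for row in self.registers]


array = CellularArray([[0x0F, 0xF0], [0x01, 0x80]])
array.broadcast("invert")        # every cell inverts its own register
array.broadcast("rotate", 1)     # every cell rotates its own register by one bit
```

In hardware the cells would execute the instruction simultaneously; the nested loops here only emulate that behavior sequentially.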
Cellular Array Processor systems are particularly well suited for image processing applications, since the memory registers present in the cellular array permit the digital representation of the image to be mapped directly into the processor. Thus, the spatial interrelationship of the data within the two-dimensionally structured data set is intrinsically preserved. By directing the Array Processor to perform a selected sequence of SIMD logical operations corresponding to a desired image processing algorithm, the data at every point in the image can be processed essentially in parallel. Naturally, both the effective processing speed (the product of the number of instructions per second executed by an Elemental Processor and the number of Elemental Processors operating simultaneously) and the resolution of the image being processed can be increased directly by the use of additional Elemental Processors.
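The effective processing speed defined above is a simple product, which the following sketch makes concrete. The function name `effective_speed`, the array dimensions, and the per-processor instruction rate are hypothetical values chosen only to illustrate the arithmetic; they are not figures for any actual system.

```python
def effective_speed(instructions_per_second: int, rows: int, cols: int) -> int:
    """Instruction rate of one Elemental Processor times the number of
    Elemental Processors (rows * cols) operating simultaneously."""
    return instructions_per_second * rows * cols


# Assumed example: 128 x 128 cells, each executing 1,000,000 instructions/s.
base = effective_speed(1_000_000, 128, 128)

# Doubling the array in each dimension quadruples both the number of
# image points held in the array and the effective processing speed.
larger = effective_speed(1_000_000, 256, 256)
```

With these assumed values, `larger` is four times `base`, reflecting the direct scaling with the number of Elemental Processors.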
The field of optical image processing is a relatively new and fast developing field. Optical processors utilize light intensity and frequency to represent an entire spectrum of possible data values. Physical phenomena such as birefringence, refraction, photon generation, and scattering effects are used to manipulate or modify the optical data values. Consequently, optical data processors are eminently well qualified for performing a diverse set of operations on optically represented image data sets.