A typical single-instruction-multiple-data (SIMD) processor has multiple processor units, each having its own associated memory space. The processor units are simple processors that cannot fetch or interpret instructions; they are controlled by a single control unit and act as slaves, performing arithmetic-logic operations at its request. One advantage of this architecture is that more memory and processor units can be easily added to the computer.
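The control relationship described above can be illustrated with a minimal sketch. The class names `ProcessorUnit` and `ControlUnit`, the three-word memory layout, and the `broadcast` method are hypothetical and chosen only for illustration; they do not come from any cited patent.

```python
from typing import Callable, List


class ProcessorUnit:
    """Simple unit with private memory; it cannot fetch instructions itself."""

    def __init__(self, memory: List[int]) -> None:
        self.memory = memory

    def execute(self, op: Callable[[int, int], int], dst: int, a: int, b: int) -> None:
        # Apply the operation chosen by the control unit to local operands.
        self.memory[dst] = op(self.memory[a], self.memory[b])


class ControlUnit:
    """Single controller that issues one instruction to every processor unit."""

    def __init__(self, units: List[ProcessorUnit]) -> None:
        self.units = units

    def broadcast(self, op: Callable[[int, int], int], dst: int, a: int, b: int) -> None:
        # Same instruction, different data: each unit works on its own memory.
        for unit in self.units:
            unit.execute(op, dst, a, b)


def demo() -> List[int]:
    # Four units; unit i holds [i, 10 * i, 0] in its private memory (assumed layout).
    units = [ProcessorUnit([i, 10 * i, 0]) for i in range(4)]
    control = ControlUnit(units)
    control.broadcast(lambda x, y: x + y, dst=2, a=0, b=1)
    return [unit.memory[2] for unit in units]
```

In this sketch, adding capacity amounts to appending more `ProcessorUnit` objects to the list, which mirrors the scalability advantage noted above.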
An example of a SIMD processor is described in U.S. Pat. No. 5,956,274 ('274 patent), issued on Sep. 21, 1999 to Duncan G. Elliott, et al. In this architecture the processing units are placed within the memory, with one processor unit per column of storage elements; each processor unit is directly coupled to the sense amplifier of its column, and its output is coupled to the memory column decoder. Each processor element is a single-bit processor element capable of processing serial data output from the memory column with which it is coupled and associated. The disclosed structure allows for higher-bandwidth communications between the memory and processing elements, allowing for a much higher processing throughput, as processing is not limited by the ability to provide data to the individual processing elements.
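As a rough illustration of single-bit, per-column processing, the sketch below performs bit-serial addition with one one-bit adder per memory column. It is not the circuit of the '274 patent: the row-major `memory` layout, the least-significant-bit-first storage order, and the helper names `store`, `load`, and `bit_serial_add` are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def store(memory: List[List[int]], rows: Sequence[int], col: int, value: int) -> None:
    # Write value down one column, least significant bit in the first row.
    for i, r in enumerate(rows):
        memory[r][col] = (value >> i) & 1


def load(memory: List[List[int]], rows: Sequence[int], col: int) -> int:
    # Read a column back into an integer.
    return sum(memory[r][col] << i for i, r in enumerate(rows))


def bit_serial_add(memory: List[List[int]], a_rows: Sequence[int],
                   b_rows: Sequence[int], out_rows: Sequence[int]) -> List[int]:
    """Add column-wise operands one bit (one row pair) at a time; return carries."""
    n_cols = len(memory[0])
    carry = [0] * n_cols  # one carry latch per single-bit processing element
    for ra, rb, ro in zip(a_rows, b_rows, out_rows):
        for col in range(n_cols):  # conceptually, all columns operate in parallel
            a, b, c = memory[ra][col], memory[rb][col], carry[col]
            memory[ro][col] = a ^ b ^ c          # full-adder sum bit
            carry[col] = (a & b) | (c & (a ^ b))  # full-adder carry out
    return carry


def demo() -> Tuple[List[int], List[int]]:
    width, n_cols = 4, 3
    memory = [[0] * n_cols for _ in range(3 * width)]
    a_rows, b_rows, out_rows = range(0, 4), range(4, 8), range(8, 12)
    for col, (a, b) in enumerate([(3, 5), (7, 1), (9, 8)]):
        store(memory, a_rows, col, a)
        store(memory, b_rows, col, b)
    carry = bit_serial_add(memory, a_rows, b_rows, out_rows)
    return [load(memory, out_rows, col) for col in range(n_cols)], carry
```

Each pass of the outer loop corresponds to reading one row per operand, so every column contributes a bit at once; this is the sense in which throughput is not bounded by a narrow path between memory and processor.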
There are, however, aspects of the disclosed architecture that hamper its ability to be widely implemented. First, the structure disclosed in the '274 patent implements a single row, i.e. a 1-D layout, of processing elements. Second, each processing element is coupled to and associated with a single column of memory, such that the processing elements in the '274 patent are only able to communicate with the column or columns of memory with which they are coupled and associated.
In applications such as the processing of images, including video, it is desirable to have a high bandwidth of data from the memory. It is further desirable to have access to numerous portions of memory, including those with which a given processing element is not associated. It is also advantageous to implement an array of processing elements, i.e. a 2-D structure.
The tight integration of processing elements and memory as outlined in the '274 patent generally makes it difficult to provide for communications among a two-dimensional array of processing elements. It is further difficult to provide for communications between a given processing element and the portions of memory with which it is not associated. A communication network that implements 1 to 1 communication links between a given processing element and all other processing elements and all portions of memory is not practical, even with the multi-layer metallization technology found in current semiconductor processing. Therefore there is a need for a means of communication between processing elements and memory that does not require 1 to 1 links between elements and that is implementable within a structure where processing elements are integrated in memory.
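To give a sense of scale for the impracticality of 1 to 1 links, the following sketch counts the dedicated links a fully connected network would need. The function name `full_link_count` and the element counts used in the example are illustrative assumptions, not figures from the cited patents.

```python
def full_link_count(n_elements: int, n_memory_portions: int) -> int:
    """Count dedicated links when every processing element connects directly
    to every other processing element and to every portion of memory."""
    element_to_element = n_elements * (n_elements - 1) // 2
    element_to_memory = n_elements * n_memory_portions
    return element_to_element + element_to_memory


def demo() -> int:
    # Assumed sizes: 1024 processing elements and 1024 memory portions.
    return full_link_count(1024, 1024)
```

Under these assumed sizes the count is 1,572,352 links, and it grows quadratically with the number of processing elements.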
U.S. Pat. No. 5,056,000 (Chang) discloses a Multi-Instruction/Multi-Data Stream (MIMD) computer comprising a number of processors each having a local memory and connected to a global bus. One processor acts as a master processor and the others as slaves. Each of the slave processors is connected to a shared Multi-Access Memory (MAM) through an interconnection switch. The interconnection switch has a grid configuration and can be controlled so that any one processor can access any number of memory modules simultaneously, but each memory module can only be connected to one processor at a time. This configuration allows a single processor to simultaneously write data to multiple memory modules of the MAM.
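The access rule of such an interconnection switch (one processor may hold many memory modules, but each module serves one processor at a time) can be modelled with a short sketch. The class `InterconnectSwitch` and its methods are hypothetical and model only the constraint as stated above, not the actual design disclosed by Chang.

```python
from typing import Dict, List, Optional, Tuple


class InterconnectSwitch:
    """Grid-style switch: a processor may hold many modules, a module one processor."""

    def __init__(self, n_modules: int) -> None:
        self.owner: List[Optional[int]] = [None] * n_modules
        self.modules: List[Dict[int, int]] = [dict() for _ in range(n_modules)]

    def connect(self, proc: int, module: int) -> None:
        current = self.owner[module]
        if current is not None and current != proc:
            raise RuntimeError(f"module {module} is held by processor {current}")
        self.owner[module] = proc

    def disconnect(self, proc: int, module: int) -> None:
        # Release the module only if this processor actually holds it.
        if self.owner[module] == proc:
            self.owner[module] = None

    def write(self, proc: int, address: int, value: int) -> List[int]:
        # One write reaches every module currently connected to this processor.
        targets = [m for m, o in enumerate(self.owner) if o == proc]
        for m in targets:
            self.modules[m][address] = value
        return targets


def demo() -> Tuple[List[int], bool]:
    switch = InterconnectSwitch(n_modules=4)
    for module in (0, 1, 2):
        switch.connect(proc=0, module=module)
    written = switch.write(proc=0, address=0x10, value=99)
    try:
        switch.connect(proc=1, module=1)  # module 1 is already held by processor 0
        blocked = False
    except RuntimeError:
        blocked = True
    return written, blocked
```

In the demo, processor 0 writes the same value to three modules in a single operation, while processor 1 is refused access to a module that is already in use.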