1. Field of the Invention
The present invention relates to a high-speed parallel data processing system and, more particularly, to a parallel data processing system comprising an array of identical interconnected cells operating under the control of a master controller to operate on single bits of data. Even more particularly, the invention pertains to the architecture of the cells and to specific patterns of interconnection.
2. Background Art
Certain data processing tasks require that substantially identical logical or arithmetic operations be performed on large amounts of data. One approach to such tasks that is drawing increasing attention is parallel processing. In parallel processing, each element, or cell, of an array processor processes its own bit of data at the same time as every other cell of the array performs the same operation on its own bit of data. Such machines are referred to by several names, including Single Instruction-Multiple Data (SIMD) machines.
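The lockstep behavior described above can be illustrated with a minimal Python sketch. It is a sequential simulation only, and the names `simd_step` and `invert` are hypothetical; nothing here is taken from a particular machine.

```python
from typing import Callable, List


def simd_step(bits: List[int], op: Callable[[int], int]) -> List[int]:
    """Apply one broadcast instruction to the single bit held by every cell.

    In hardware all cells would act at the same time; this loop merely
    simulates that by visiting each cell in turn with the same operation.
    """
    return [op(b) & 1 for b in bits]


def invert(bit: int) -> int:
    """Example instruction (an assumption for illustration): flip the bit."""
    return bit ^ 1


cells = [1, 0, 0, 1, 1]            # one bit per cell
result = simd_step(cells, invert)  # every cell executes the same instruction
```

The point of the sketch is that a single instruction is issued once and applied to every cell's data, rather than each cell running its own program.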
A common arrangement for such a machine is as a rectangular array of cells, with each interior cell connected to its four nearest neighboring cells and each edge cell connected to a data input/output device. Each cell is connected as well to a master controller which coordinates the movement of data through the array by providing appropriate instructions to the processing elements. Such an array proves useful, for example, in high resolution image processing. The image pixels comprise a data matrix which can be loaded into and processed quickly and efficiently by the processor array.
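As a rough illustration of the rectangular arrangement just described, the following Python sketch models one step in which every cell reads the bit held by its neighbor in a given direction. The function name `read_neighbor`, the direction labels, and the `edge_value` parameter (a stand-in for whatever the input/output device would supply at the array boundary) are assumptions made for this example.

```python
from typing import List

Grid = List[List[int]]

# Row/column offsets of the four nearest neighbors.
OFFSETS = {"north": (-1, 0), "south": (1, 0), "west": (0, -1), "east": (0, 1)}


def read_neighbor(grid: Grid, direction: str, edge_value: int = 0) -> Grid:
    """Return a new grid in which each cell holds its neighbor's bit.

    Cells with no neighbor in the requested direction (edge cells) receive
    `edge_value`, a placeholder for data arriving from an I/O device.
    """
    dr, dc = OFFSETS[direction]
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                row.append(grid[nr][nc])
            else:
                row.append(edge_value)
        out.append(row)
    return out


image = [[1, 0],
         [0, 1]]
shifted = read_neighbor(image, "north")
```

Repeating such steps moves a data matrix, for example the pixels of an image, through the array one row or column at a time.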
Although all may be based upon the same generic concept of an array of cells all performing the same function in unison, parallel processors vary in details of cell design. For example, U.S. Pat. No. 4,215,401 to Holsztynski et al. discloses a cell which includes a random access memory (RAM), a single bit accumulator, and a simple logic gate. The disclosed cell is extremely simple and, hence, inexpensive and easily fabricated. A negative consequence of this simplicity, however, is that some computational algorithms become quite cumbersome, so that many instructions may be required to perform a simple and often repeated task.
U.S. Pat. No. 4,739,474 to Holsztynski et al. represents a higher level of complexity, in which the logic gate is replaced by a full adder capable of performing both arithmetic and logical functions. Pressing the full adder into dual service creates an efficiency that more than offsets the added complexity and cost incurred by including a full adder in each cell.
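The dual role of a full adder follows from its truth table and can be sketched in Python as below. This illustrates the general property only; how the cited patent actually routes operands to the adder is not described here, and the function names are hypothetical.

```python
from typing import List, Tuple


def full_adder(a: int, b: int, c: int) -> Tuple[int, int]:
    """One-bit full adder: returns (sum, carry) for inputs a, b and carry-in c."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


# Logical use: fixing the third input selects a Boolean function of a and b.
#   c = 0 -> sum = a XOR b,  carry = a AND b
#   c = 1 -> sum = a XNOR b, carry = a OR b


def add_bit_serial(x_bits: List[int], y_bits: List[int]) -> List[int]:
    """Arithmetic use: add two equal-length numbers one bit at a time.

    Bits are given least significant first; the final carry is appended.
    """
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    out.append(carry)
    return out


total = add_bit_serial([1, 1, 0], [1, 0, 1])  # 3 + 5, least significant bit first
```

With the third input held constant, the same hardware yields AND, OR, XOR and XNOR; with the third input fed by the previous carry, it performs multi-bit addition over successive steps.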
It is important to note that the substitution of a full adder for a logic gate, while superficially simple, is in reality a change of major consequence. The cell structure cannot be allowed to become too complex, because in a typical array the cell will be repeated dozens if not hundreds of times. The cost of each additional element, in terms of money and space on a VLSI chip, is therefore multiplied many times. It is thus no simple matter to identify those functions which are sufficiently useful to justify their incorporation into the cell. It is similarly no simple matter to implement those functions so that their incorporation is not realized at too high a cost.
Parallel processors may also vary in the manner of cell interconnection. As mentioned above, cells are typically connected to their nearest physical neighbors. All cells except those at the edge of the entire array are connected to four neighbors. It has not heretofore been completely appreciated, however, that significant benefits may flow from providing alternate paths of interconnection and, specifically, from providing programmable, flexible interconnection between cells.