1. Field of the Invention
This invention pertains to parallel or systolic data processors comprised of individual bit serial processors. It pertains more particularly to the internal architecture of the individual bit serial processors and reconfigurable patterns of connection of the individual bit serial processors to each other as well as to external devices.
2. Background Art
The traditional way to perform repetitive logical and arithmetic operations on large quantities of data involves the use of a very fast computer operating on the data one piece at a time in series until a final result is obtained. This approach has the considerable merit that such a computer can be as versatile as its programming permits. It has the drawbacks, however, that very fast computers tend to be expensive, and even the fastest can take an appreciable amount of time to complete calculations for very large quantities of data.
If the operations to be performed are sufficiently repetitive, however, it may be possible to perform them more efficiently and quickly by dividing the data into bits of equal significance and using a large number of interconnected bit serial processors each operating on its own bit in parallel with all other bit serial processors. Machines based on this approach are known by various names, such as single instruction, multiple data or SIMD machines, parallel processors, and systolic processors. They are typically comprised of an array of identical bit serial processors or "cells", each connected to its nearest neighbors. The array may, for example, be a rectangular matrix of n columns and m rows, so that all cells except those on the edge have four physically nearest neighbors. The cells all operate under the control of a master controller to execute the same instruction concurrently on respective bits of equal significance.
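The arrangement described above may be illustrated by a minimal software sketch. It is not taken from any particular machine. The names `Cell`, `add_step`, `neighbors`, and `simd_add` are hypothetical, the choice of addition as the broadcast instruction is an assumption made for illustration, and the concurrent operation of the cells is simulated by an ordinary sequential loop.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical bit serial cell: two operand words, a carry flag, a result."""
    a: int
    b: int
    carry: int = 0
    result: int = 0

    def add_step(self, k: int) -> None:
        # One-bit full adder applied to bit k of each operand.
        x = (self.a >> k) & 1
        y = (self.b >> k) & 1
        s = x ^ y ^ self.carry
        self.carry = (x & y) | (self.carry & (x ^ y))
        self.result |= s << k


def neighbors(row, col, rows, cols):
    """Physically nearest neighbors (north, south, west, east) inside the array."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def simd_add(grid, width):
    """Controller: broadcast the same add step to every cell, one bit per cycle."""
    for k in range(width):        # least significant bit first
        for row in grid:
            for cell in row:      # conceptually concurrent; simulated in sequence
                cell.add_step(k)
    return [[cell.result for cell in row] for row in grid]


# A 2 x 3 array in which every cell adds 5 to its own operand.
grid = [[Cell(a=r * 3 + c, b=5) for c in range(3)] for r in range(2)]
sums = simd_add(grid, width=4)
```

In each cycle every cell handles the bit of the same significance `k`, which is the sense in which the cells execute one instruction concurrently on respective bits. An interior cell of a larger array has four entries in `neighbors`, while an edge or corner cell has fewer.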
A parallel processor's advantage in speed is directly proportional to the number of cells comprising the array, because each additional cell operates on its own data concurrently with all the others. The greatest advantage thus theoretically results from having a very large number of cells. This is almost always impracticable, however, because cost is also directly proportional to the number of cells. Nevertheless, arrays of scores and even hundreds of cells have been proposed and constructed. There is therefore an incentive to limit the cost per cell, and a similar incentive to limit the area each cell occupies on a semiconductor chip. These goals have been met in the past by keeping cell architecture as rudimentary as possible, implementing only basic functions and perhaps such additional functions as would prove especially useful in a given application. As a consequence, architectures of cells and arrays have been either dedicated to a particular application and unsuitable for any other, or so general purpose as to be inherently inefficient. Expansion and efficiency have been limited by the cell's computational power, bandwidth, and memory, and by fixed patterns of interconnection.