This invention relates to systems and methods for the processing and analysis of spatially related data arrays, such as images, by means of a large array of programmable computing elements.
A number of systems have been developed which employ a large array of simple bit-serial processors, each receiving the same instruction at any given time from a central controller. Systems of this type are called "Single Instruction Multiple Data" (SIMD) parallel processors. There are various methods for communicating data from one processor to another. For example, the massively parallel processor described in K. E. Batcher, "Design of a Massively Parallel Processor," IEEE Transactions on Computers, Sept. 1980, pp. 836-840, contains an array of 128×128 processors, for which image processing is an important application. Data is communicated between neighboring processing elements when an instruction that requires a neighborhood operation is performed. Image data arrays with dimensions larger than 1024×1024 are not uncommon. Since processor arrays this large are not economically feasible, the data array must be broken into smaller subarrays with dimensions equal to those of the processor array. There are other types of SIMD processors, but they also generally face the problem of data arrays larger than the processor array. Generally, for all these systems, the total memory associated with the processors is not large enough to hold the entire image along with the extra capacity needed for intermediate computational results.
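The partitioning described above can be illustrated with a minimal sketch. It assumes the image is a plain 2-D list and that the processor array is square with side `tile`; the function name `split_into_subarrays` is hypothetical and is not taken from any of the cited systems. For a 1024×1024 image and a 128×128 processor array, the same scheme would produce (1024/128)² = 64 subarrays.

```python
def split_into_subarrays(image, tile):
    """Yield (row0, col0, block) for each tile-by-tile block of a 2-D list.

    `tile` stands for the side length of the processor array (an assumption
    made for this illustration only).
    """
    rows, cols = len(image), len(image[0])
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            # Copy the rows r0..r0+tile-1, keeping columns c0..c0+tile-1.
            block = [row[c0:c0 + tile] for row in image[r0:r0 + tile]]
            yield r0, c0, block


# Small stand-in for a large image: 8x8 data on a 4x4 "processor array".
image = [[r * 8 + c for c in range(8)] for r in range(8)]
blocks = list(split_into_subarrays(image, 4))
```

Each block would have to be loaded into the processor memories, processed, and written back in turn, which is the input and output burden the background discussion refers to.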
Thus, a large external memory is necessary, and mechanisms must be provided to handle the input and output of small subarray segments at high speed to preserve computing efficiency. Even if enough memory were supplied to each processor, so that the total memory associated with the ensemble of processors could contain the entire large array of image data, there would still remain the problem of communicating data between the various subarrays when neighborhood operations are performed. During an instruction clock cycle, every processor receives the output of its associated memory, so that processors on the edges of the array cannot receive data from neighboring subarrays because all memories are already engaged in reading an entire subarray. Thus, multiple clock cycles would be needed to read the data when a computation requires both a subarray and its neighboring subarrays. Generally, SIMD processors are less efficient in handling global processes where large areas of the data matrix must be analyzed, such as histograms, feature extraction, and spatial transforms such as the Hough transform and Fourier analysis.
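The subarray boundary problem can be made concrete with a small sketch. It uses a deliberately simple neighborhood operation (each element plus its west neighbor) and processes each subarray in isolation, as if the edge processors could not see data held for the adjacent subarray. The function names and the choice of operation are assumptions for illustration only.

```python
def west_neighbor_sum(image):
    """out[r][c] = image[r][c] + image[r][c-1], using 0 beyond the left border."""
    return [[row[c] + (row[c - 1] if c > 0 else 0) for c in range(len(row))]
            for row in image]


def tiled_without_neighbors(image, tile):
    """Apply the operation to each subarray in isolation (no neighbor data)."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            block = [row[c0:c0 + tile] for row in image[r0:r0 + tile]]
            result = west_neighbor_sum(block)
            for dr, res_row in enumerate(result):
                out[r0 + dr][c0:c0 + tile] = res_row
    return out


image = [[r * 4 + c for c in range(4)] for r in range(4)]
full = west_neighbor_sum(image)             # whole image processed at once
tiled = tiled_without_neighbors(image, 2)   # 2x2 subarrays, no halo data
# Positions where the isolated subarray result disagrees with the full result.
mismatches = [(r, c) for r in range(4) for c in range(4)
              if full[r][c] != tiled[r][c]]
```

In this sketch the disagreements fall exactly on the column where one subarray meets the next, which is where an edge processor would need a value stored for the neighboring subarray.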
Indirect addressing is an important processing concept, but the difficulties with implementing it in a parallel processing environment have been recognized in the literature. See for example: A. L. Fisher and P. T. Highnam, "Real Time Image Processing on Scan Line Array Processors," IEEE Workshop on Pattern Analysis and Image Database Management, Nov. 18-20, 1985, pp. 484-489; and P. E. Danielson and T. S. Ericsson, "LIPP-Proposals for the Design of an Image Processor Array", chap. 11, pp. 157-178, COMPUTING STRUCTURES FOR IMAGE PROCESSING (Ed. M. J. B. Duff, Academic Press 1983). Large amounts of memory are required for indirect addressing to be useful, because the applications that benefit from it, such as look-up tables and histograms, themselves require a large amount of memory. In SIMD processors the memory is generally integrated on the same chip as the processor, but technology limits how much processor logic and memory can be integrated on one chip, so the on-chip memory is too small for indirect addressing to solve any useful problems. However, if the memory is located off the chip, then with a large number of processors integrated on a chip there are too many address lines for the chip to handle, so that the number of signal paths becomes a strong limiting factor.
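The difference between a broadcast address and indirect addressing can be sketched as follows. Each processor is modeled as owning one list that stands for its memory; the function names and the look-up table of squares are hypothetical and serve only to illustrate the concept.

```python
def broadcast_read(memories, address):
    """Conventional SIMD read: every processor reads the same address."""
    return [mem[address] for mem in memories]


def indirect_read(memories, local_addresses):
    """Indirect read: processor i uses its own value as the address."""
    return [mem[addr] for mem, addr in zip(memories, local_addresses)]


# Each processor memory holds the same look-up table (squares of 0..7).
table = [v * v for v in range(8)]
memories = [list(table) for _ in range(4)]

pixels = [3, 0, 7, 2]                        # one data value per processor
same = broadcast_read(memories, 3)           # one address for all processors
looked_up = indirect_read(memories, pixels)  # a separate address per processor
```

With a broadcast address all four processors fetch the same table entry, whereas the indirect read lets each pixel value select its own entry. The table must be as large as the range of data values, which is why the memory per processor matters.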
Since all processors in a SIMD processor array simultaneously perform the same instructions, it has been recognized that a method is required to prevent selected processors from performing an instruction, according to data values within the associated memory. Usually, a memory write inhibit function is used, in which a programmable flip-flop controls the write function for each memory in the array. However, the write inhibit function requires an extra line from the processor chip to the associated memory chip. Because of output pin limitations on the circuit chips, only a limited number of processors can be integrated on a single chip. Also, cost-effective byte-wide memories cannot be utilized because the eight separate data lines cannot be separately inhibited.
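A write inhibit of the kind described above can be modeled in a few lines. In this sketch one Boolean per processor plays the role of the programmable flip-flop, and it is set from the processor's own data; the function name `masked_write` and the threshold of 4 are assumptions chosen for illustration.

```python
def masked_write(memories, address, values, inhibit):
    """Every processor executes the same write, but those whose inhibit
    flag is set leave their memory unchanged."""
    for mem, value, inhibited in zip(memories, values, inhibit):
        if not inhibited:
            mem[address] = value


memories = [[0, 0] for _ in range(4)]  # two words of memory per processor
data = [5, 2, 8, 1]                    # one data value per processor
inhibit = [d <= 4 for d in data]       # flag derived from local data
masked_write(memories, 1, [1, 1, 1, 1], inhibit)
```

After the write, only the processors whose data exceeds the threshold hold the new value, even though all of them received the same instruction.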
Therefore, a primary object of the present invention is to provide a simple method to allow a fixed array of processors to handle a large array of data while performing operations which require neighborhood and global processing of data.
Another object of the invention is to provide an effective method of indirect addressing of memory which operates independently for each SIMD processor in the array.
A further object of the invention is to provide a means of handling large arrays of data without resorting to memories and associated input and output mechanisms remote from the processing array.