1. Technical Field
The present invention generally relates to the field of computer-aided data analysis and, in particular, to highly specialized computers, known as Cellular Array Processors (CAP), that are capable of processing two-dimensionally structured data sets, generally referred to as images.
2. Discussion
The Cellular Array Processor is generally well known as a type of computer system whose architecture is particularly suited to the task of image processing. Although the specific design may differ substantially between different implementations, the general architecture of the Cellular Array Processor is quite distinctive. Typically, a system will include a highly specialized array processor that is controlled by a control processor of conventional design. The array processor, in turn, is formed from a large number of elemental processors that are distributed as individual cells within a regular matrix. (This gives rise to the descriptive name "Cellular Array Processor".) The elemental processors are essentially identical and generally contain a function-programmable logic circuit and a memory register. The programmable logic circuit is typically capable of selectively performing a limited number of primitive logic and arithmetic functions, such as "and", "or", "invert", and "rotate", on the data stored in its respective memory register in conjunction with data provided by the control processor. The control processor is linked to the elemental processors via a common instruction bus. Thus, all of the elemental processors operate separately, yet synchronously, in the performance of a common logical function on the data contained in their respective memory registers. (This is commonly referred to as Single Instruction, Multiple Data, or SIMD operation.)
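The SIMD behavior described above can be sketched roughly as follows. This is only an illustrative model: the 8-bit word width, the function names, and the list-of-lists layout are assumptions made for the example and are not taken from any of the systems discussed.

```python
# Illustrative sketch only: the word width and all names are assumptions.
WORD_BITS = 8
MASK = (1 << WORD_BITS) - 1


def rotate_left(word, count):
    """Rotate a WORD_BITS-wide word left by count bit positions."""
    count %= WORD_BITS
    return ((word << count) | (word >> (WORD_BITS - count))) & MASK


# Primitive functions each elemental processor can apply to its own register,
# combined with an operand supplied by the control processor.
PRIMITIVES = {
    "and": lambda reg, operand: reg & operand,
    "or": lambda reg, operand: reg | operand,
    "invert": lambda reg, operand: ~reg & MASK,  # operand is ignored
    "rotate": rotate_left,
}


def broadcast(array, instruction, operand=0):
    """One SIMD step: every cell performs the same instruction on its register."""
    op = PRIMITIVES[instruction]
    return [[op(reg, operand) for reg in row] for row in array]


# A 2 x 2 "image", one word per elemental processor.
image = [[0x0F, 0xF0], [0x81, 0x00]]
masked = broadcast(image, "and", 0x3C)
inverted = broadcast(image, "invert")
rotated = broadcast(image, "rotate", 1)
```

The Python loops stand in for hardware that would update all cells at once; the point is only that a single instruction name and operand are shared by every cell, while each cell works on its own register.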
Cellular Array Processor systems are particularly well suited for image processing applications, since the memory registers present in the cellular array permit the digital representation of the image to be mapped directly into the processor. Thus, the spatial interrelationship of the data within the two-dimensionally structured data set is intrinsically preserved. By directing the array processor to perform a selected sequence of SIMD logical operations corresponding to the performance of a desired image processing algorithm, the data at every point in the image can be processed essentially in parallel. Naturally, both the effective processing speed (the product of the number of instructions per second executed by an elemental processor and the number of elemental processors operating simultaneously) and the resolution of the image being processed can be increased directly by the use of additional elemental processors. In addition to image processing, cellular array processors are also well suited for matrix calculations.
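As a small worked illustration of the effective processing speed defined above, the product can be written out directly. The instruction rate and array sizes below are hypothetical figures chosen only for the arithmetic.

```python
def effective_rate(instructions_per_second, rows, cols):
    """Effective speed = per-cell instruction rate x number of cells."""
    return instructions_per_second * rows * cols


# Hypothetical figures: one million instructions per second per cell.
small = effective_rate(1_000_000, 64, 64)
large = effective_rate(1_000_000, 128, 128)
```

Doubling both dimensions of the array quadruples the number of elemental processors, and with it both the effective speed and the number of image points that can be held at once.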
Although the Cellular Array Processor architecture is a relatively recent development within the more general field of computer-aided data analysis, a substantial number of systems utilizing the architecture have been developed. While many of the systems were specifically designed for general application purposes, quite a number have been designed for considerably more specialized applications. Descriptions of a number of the general application systems can be found in S. F. Reddaway, DAP-A Distributed Processor, IEEE, Proceedings of the First Symposium on Computer Architecture, pp. 61-65 (1973); General Purpose Array Processor, U.S. Pat. No. 3,815,095 issued to Aaron H. Wester on June 4, 1974; Array Processor, U.S. Pat. No. 3,979,728 issued to Stewart Reddaway on Sept. 7, 1976; K. E. Batcher, The Massively Parallel Processor (MPP) System, AIAA, Proceedings of The Computers in Aerospace Conference 2, pp. 93-97 (1979); and Parallel Type Processor with a Stacked Auxiliary Fast Memories, U.S. Pat. No. 4,144,566 issued to Claude Timsit on Mar. 13, 1979. A number of the more specialized systems are described in Floating Point Arithmetic Unit for a Parallel Processing Computer, U.S. Pat. Nos. 3,701,976 issued to Richard Shivety on Oct. 31, 1972; Network Computer System, 4,065,808 issued to Herman Schomberg et al. on Dec. 27, 1977; and Scientific Processor, 4,101,960 issued to Richard Stokes et al. on July 18, 1978.
In each of these system implementations, a significantly different elemental processor design is used in order to tailor the array processors for their anticipated applications. This is principally due to the extremely wide variety of their possible applications and equally wide variety of subcomponents that can be utilized. However, a common feature of these elemental processors is that a high degree of component interconnection is used in order to optimize the elemental processor processing speed.
The particular disadvantage of using highly optimized elemental processor designs is that any significant change in the anticipated data processing application will require the elemental processors to be substantially redesigned in order to preserve the system's overall data processing capability and efficiency. This is a practical consequence of the fact that the subcomponents are too highly specialized and interconnected to allow any significant alteration or extension of the elemental processors' component composition.
The general purpose of the inventions disclosed in those patents and applications incorporated by reference and cross referenced above is to provide an Array Processor composed of Elemental Processors of a distinctly modular architecture design that can be particularly configured for a wide variety of data processing applications.
The Array Processor disclosed therein is comprised of a plurality of modular Elemental Processors, the modules being of a number of different functional types. These modules may be of such general functional types as memory and accumulator, with each type nominally including an input programmable logic circuit and a closely associated memory register. The modules of the Array Processor are associated so that the Elemental Processors are architecturally parallel to one another. The principal flow of data within the Array Processor, based on the simultaneous transfer of data words within the Elemental Processors, is thereby correspondingly parallel. The modules are also architecturally associated as functional planes that lie transverse to the Elemental Processors. Each functional plane is thereby comprised of an array of modules that are each otherwise associated with a separate Elemental Processor. Further, the modules of a functional plane are of a single functional type. This allows the data of a two-dimensionally structured data set, present within the Array Processor, to be processed identically and in parallel by a common logical operation as provided and performed by a functional plane.
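One way to picture the arrangement described above is as a stack of planes, where a plane holds one module of a single type per Elemental Processor and an Elemental Processor is the column of modules at a given array position. The following is a minimal sketch under that reading; the dictionary layout and all function names are hypothetical and are not drawn from the referenced documents.

```python
# Hypothetical model: each plane keeps one register per array position.
ROWS, COLS = 2, 3


def make_plane(kind, fill=0):
    """A functional plane: modules of one functional type, one per position."""
    return {"kind": kind, "regs": [[fill] * COLS for _ in range(ROWS)]}


def elemental_processor(planes, row, col):
    """The stack of modules at (row, col), one taken from each plane."""
    return [(p["kind"], p["regs"][row][col]) for p in planes]


def transfer(src, dst):
    """Move every data word from one plane to another in a single step."""
    dst["regs"] = [row[:] for row in src["regs"]]


def plane_op(plane, fn):
    """Apply one common operation to every module of a plane."""
    plane["regs"] = [[fn(word) for word in row] for row in plane["regs"]]


planes = [make_plane("memory"), make_plane("memory"), make_plane("accumulator")]
planes[0]["regs"] = [[1, 2, 3], [4, 5, 6]]   # load a small data set
transfer(planes[0], planes[2])               # parallel flow along the processors
plane_op(planes[2], lambda word: word | 0x8)  # common operation on one plane
column = elemental_processor(planes, 1, 2)
```

In this picture, data moves along the Elemental Processors (between planes) while a given operation is applied across a plane, which is the transverse relationship the paragraph describes.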
The Array Processor is operatively connected to a Control Processor by an Array/Control Processor interface. This interface allows the control processor to direct the operation of and exchange data with the Array Processor.
A particular advantage of this approach is the high degree of design flexibility that is inherent in the modular Elemental Processor. Its design may be optimized for any particular data processing application through the selection of an appropriate number of each functional type of module. Since practically any image processing function can be reduced to a small number of basic data manipulation functions which may, in turn, be implemented in modules, the Array Processor can be optimized for almost any application.
Another advantage is that the modularity of the Elemental Processors allows an Array Processor to be made fault tolerant. This is accomplished by providing an appropriate number and type of spare modules in each of the Elemental Processors.
A further advantage is that there is a uniform array of memory registers at each level within the Array Processor. This allows a number of unique image and image analysis-related data sets to be simultaneously present within the Array Processor. Therefore, they are immediately present for use during the processing of an image.
Still another advantage is that modules having nearest neighbor data interconnections to other modules on the same array level, and therefore between neighboring Elemental Processors, can be placed on a number of levels within the Array Processor. This allows the data sets present in the modules on those levels to be independently transferred across the array in either similar or different directions.
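The independent transfers described above might be sketched as a one-step shift applied separately to each level. The treatment of the array edge (words shifted off the edge are dropped and vacated cells take a fill value) is an assumption made for the example, as are the names used.

```python
def shift(plane, d_row, d_col, fill=0):
    """Move every word one step to its nearest neighbor in one direction.

    Assumed edge behavior: words leaving the array are dropped and
    vacated cells receive `fill`.
    """
    rows, cols = len(plane), len(plane[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + d_row, c + d_col
            if 0 <= nr < rows and 0 <= nc < cols:
                out[nr][nc] = plane[r][c]
    return out


EAST, SOUTH = (0, 1), (1, 0)

# Two levels holding different data sets, moved in different directions.
level_a = [[1, 2], [3, 4]]
level_b = [[5, 6], [7, 8]]
moved_a = shift(level_a, *EAST)
moved_b = shift(level_b, *SOUTH)
```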
Despite all of its advantages, the computer architecture of the documents incorporated by reference can still be improved. For example, the SIMD nature of the operation of this and other cellular arrays does not lend itself to performing data-dependent processing. In other words, the earlier cellular arrays are primarily designed to simultaneously perform a given operation on the data that may be stored in each elemental processor, and that operation is usually carried out regardless of the value of the data in each module. Very little control is provided to enable each elemental processor to process its data as a function of the data itself; under the SIMD approach, the processing in each elemental processor is a function of a single common instruction rather than of the data.
There exists a need for a cellular array architecture that permits data-dependent processing. For example, there are many occasions when it would be beneficial for the cellular array to be capable of performing floating point arithmetic functions. As will appear, a computer must generally have a high degree of data-dependent branching capability in order to accomplish floating point arithmetic. Doing calculations on arrays of data will, in general, result in overflow at some data points, underflow at others, and a result of the right order of magnitude at the rest. Also, to add or subtract two numbers, one first has to adjust them to the same exponent. All of these arithmetic functions require dealing with the data on a pixel-by-pixel basis. Many of the known massively parallel cellular arrays do not readily lend themselves to performing this type of processing efficiently.
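To illustrate why exponent adjustment is data dependent, the following sketch adds two arrays of simple floating point values, each held as a (mantissa, exponent) pair with value mantissa * 2**exponent. The representation, the non-negative integer mantissas, the truncation of shifted-out bits, and the function name are all assumptions made for the example; normalization, overflow, and underflow handling are omitted.

```python
def align_and_add(a, b):
    """Add two flat arrays of (mantissa, exponent) cells.

    Each pass is one common step, but a cell takes part in it only if its
    own exponents still differ, so the work done depends on the data.
    """
    a = [list(cell) for cell in a]
    b = [list(cell) for cell in b]
    steps = 0
    while True:
        # Every cell decides from its own data which operand, if any, to shift.
        shift_a = [x[1] < y[1] for x, y in zip(a, b)]
        shift_b = [y[1] < x[1] for x, y in zip(a, b)]
        if not any(shift_a) and not any(shift_b):
            break
        for x, y, sa, sb in zip(a, b, shift_a, shift_b):
            if sa:
                x[0] >>= 1  # low-order bit is discarded
                x[1] += 1
            if sb:
                y[0] >>= 1
                y[1] += 1
        steps += 1
    sums = [(x[0] + y[0], x[1]) for x, y in zip(a, b)]
    return sums, steps


a = [(12, 0), (5, 3), (8, 1)]
b = [(3, 2), (16, 0), (1, 1)]
sums, steps = align_and_add(a, b)
```

Here the first cell needs two alignment shifts, the second needs three, and the third needs none, so the array as a whole is occupied for three passes even though each cell needs a different amount of work.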
There are still other applications, such as skewing and remote neighbor communications, where it would be desirable to control the processing in one module as a function of the detection of a preselected data value in another module associated with the same elemental processor.
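The kind of control described above can be sketched as a detection step in one module that gates an operation in another module of the same elemental processor. This is only a conceptual illustration; the function names and the use of a Boolean enable flag are hypothetical and do not describe a particular circuit.

```python
def detect(plane, preselected):
    """Flag each elemental processor whose module holds the preselected value."""
    return [[word == preselected for word in row] for row in plane]


def conditional_op(plane, enable, fn):
    """Apply fn only where the enable flag is set; other cells keep their data."""
    return [
        [fn(word) if flag else word for word, flag in zip(row, flags)]
        for row, flags in zip(plane, enable)
    ]


# Two modules per elemental processor: one holds tags, the other holds data.
tags = [[0, 7], [7, 1]]
data = [[10, 20], [30, 40]]
enable = detect(tags, 7)
result = conditional_op(data, enable, lambda word: word * 2)
```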