1. Field of the Invention
This invention relates in general to computer vision or image understanding machines and, more particularly, to computer architectures and methods capable of both arithmetic (iconic) and symbolic processing of image data.
2. Description of the Related Art
There exists a need for a computer system, designed specifically for computer vision research and analysis efforts, that is capable of both arithmetic (iconic) and symbolic processing of image data. Such a system can be used in a variety of applications, such as real-time processing of data from radar, infrared and visible sensors in areas such as aircraft navigation and reconnaissance. Another application of such a machine would be as a development system for use in vision laboratories in the implementation and simulation of many computationally intensive algorithms.
Machines capable of operating on image data (as opposed to merely arithmetic data) are referred to alternatively as image processors, vision computers, image understanding machines and the like. An image understanding machine is sometimes regarded as a higher level machine than an image processor: an image processor typically enhances and classifies images, whereas an image understanding machine automatically transforms the image to symbolic form, effectively providing a high level description of the image in terms of objects (i.e., connected sets of pixels containing information), their attributes, and their relationships to other objects in the image. The present invention is directed to this latter type of machine (although it can perform the lower level tasks as well), which shall be referred to as an image understanding machine.
It is generally recognized that a high level image understanding machine must be capable of performing two basic types of computations: arithmetic or iconic processing and symbolic manipulation. Thus, it would be desirable to provide an image understanding machine that is capable of performing a number of visual information processing algorithms.
It should be noted that future algorithmic developments will be a continually and rapidly evolving activity resulting from changing applications, advances in sensor and solid state technologies, and the need for added intelligence to deal even more rapidly and effectively with ever increasing amounts of raw data.
Many of the known concurrent or parallel processing computer architectures are not specifically intended for image understanding purposes. Other image processing systems suffer from an inability to perform both numeric and symbolic computations efficiently. For example, some of the prior architectures do not lend themselves to efficient execution of various artificial intelligence techniques such as frames, rules and evidential reasoning while at the same time efficiently executing the more iconic image processing algorithms. One of the major drawbacks of the prior computer architectures is that their designs generally necessitated the transfer of large amounts of data between a host computer and the special purpose vision computer and, in parallel processing environments using a plurality of processing levels, between lower and higher levels of processing elements.
Unfortunately, the transfer of data and instructions in the known architectures resulted in relatively slow operational speed. It is, of course, one of the ultimate objectives in any computer system to increase the speed of operation without unduly increasing costs or complexity of operation.
As noted above and by way of background, an architecture for an image understanding machine that performs both iconic and symbolic operations on image data in the form of a matrix of pixels is disclosed in the aforementioned U.S. Patents. Such machines include a first level of image processing elements for operating on the image matrix on a pixel-per-processing-element basis. The processing elements of the first level are adapted to communicate with each other. A second level of processing elements is provided for operating on a plurality of pixels associated with a given array of the processing elements of the first level. Each second level processing element is associated with a group of first level processing elements and communicates with them as well as with other second level processing elements. A third level of processing elements is provided for performing such functions as instructing the first and second levels of processing elements; it also operates on a larger segment of the matrix than the second level processing elements. Each third level processing element is associated with a given number of second level processing elements and communicates with them as well as with other third level processing elements. A host computer, communicating with at least each third level processing element, is provided for performing such functions as instructing the third level processing elements.
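The three-level grouping described above can be illustrated with a minimal sketch. The block sizes g and G below are illustrative assumptions, not values taken from the disclosure:

```python
def controlling_pes(x, y, g=8, G=8):
    """Map a pixel (x, y) to the grid coordinates of the second- and
    third-level processing elements responsible for it.

    Assumes (hypothetically) that each second-level PE serves a g-by-g
    block of first-level (per-pixel) PEs, and that each third-level PE
    serves a G-by-G block of second-level PEs.
    """
    second = (x // g, y // g)                  # which g-by-g pixel block
    third = (second[0] // G, second[1] // G)   # which G-by-G block of level-2 PEs
    return second, third
```

Under these assumed block sizes, pixel (9, 17) would be served by second-level PE (1, 2) and third-level PE (0, 0).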
This computer architecture is designed to solve the problem of the disparities in granularity between iconic processing and symbolic processing. By the term "granularity" it is meant that the processing power of each processing element at a given level is comparable to the area (i.e., grain size) of the image segment associated with it. The larger the grain size, the more powerful the processing element.
This architecture in general provides an efficient implementation match at each level of granularity. Thus, for iconic processing which requires the smallest granularity, a processor per pixel approach is provided (i.e., the first level processing elements) to efficiently perform these tasks. On the other hand, for higher level or more sophisticated operations, the third level processing elements are provided which can be implemented in the form of general purpose microprocessors.
The computer architecture provides parallelism at substantially all levels of computation. Thus, bottlenecks which are often associated with serial computations or communications are avoided.
Computation of the moments of an object is difficult because it involves a large number of pixels arranged in irregular patterns. Most parallel computers can perform the calculations internal to the summation steps simultaneously, in parallel O(1) time. However, the summation process then requires that the data values be collected from over the object area, which is a more difficult process and cannot be done in O(1) time. The summation on parallel mesh machines usually involves accumulation of data values using shifting operations to bring data together, an O(L×N) operation for L×N pieces of data.
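The moment computation itself is a simple double summation. The following Python sketch is a serial illustration, not the mesh implementation; it shows the per-pixel terms whose global accumulation is the O(L×N) step discussed above:

```python
def object_moment(mask, p, q):
    """Compute the (p, q) moment of a binary object mask.

    mask is a list of rows of 0/1 values.  On a mesh-connected parallel
    machine, each processing element could form its x**p * y**q term
    locally in O(1) time; the global summation performed serially here
    is the step that costs O(L*N) neighbour-to-neighbour shifts on a mesh.
    """
    total = 0
    for y, row in enumerate(mask):
        for x, val in enumerate(row):
            if val:
                total += (x ** p) * (y ** q)
    return total
```

For example, the (0, 0) moment of an object is its area, and its centroid is (m10/m00, m01/m00).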
The existing technique requires the higher level processing elements to scan through the entire two-dimensional array of data to extract the symbolic information from the lower level processing elements.
Assuming that there are M objects in an L by N image, in a parallel architecture as described herein, all objects in the image can be processed in parallel. The scan time for this operation is proportional to the image size and is therefore very large for even moderately sized image planes; i.e., it takes O(L×N) (i.e., L times N) time for the higher level processing elements to scan the results and detect the M objects in the image. However, the time actually needed to extract the symbolic information is proportional to the number of objects in the image, and this number is typically very small compared to the image size. Since only the M objects are of interest, it is a disadvantage that the higher level processing elements must step over the non-object-containing areas to locate the objects embedded in the iconic data.
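The cost of the full scan can be seen in a minimal sketch: the loop below visits every one of the L×N pixels of a label image even though only the M labelled objects are of interest. The label image and helper are hypothetical illustrations, not the disclosed implementation:

```python
def find_objects_by_scan(labels):
    """Full-image scan for labelled objects.

    labels is a list of rows; a nonzero entry marks an object pixel.
    The scan visits all L*N pixels -- the O(L x N) cost described in
    the text -- even when the number of objects M is very small.
    """
    objects = {}
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab:  # object pixel: record it under its label
                objects.setdefault(lab, []).append((x, y))
    return objects
```

For an image with two labelled objects, the scan still touches every background pixel before returning the two object pixel lists.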
The disadvantage of the prior art approach to computing the moments of an object in an image is that the summation on parallel mesh machines usually involves accumulation of data values using shifting operations to bring data together, an O(L×N) operation, where L and N are the dimensions of the object area in pixels, independent of the number of objects in the image and of the shape of the object. In the worst case, L×N could be the size of the entire image.