This invention relates to a video image processor, and more particularly to a programmable pipelined image processor for feature extraction (PIFEX).
Computerized vision systems require an enormous amount of data processing. This seems to be especially true for the low-level tasks in which the data are still in the form of an image. Often thousands of fundamental operations must be performed for each pixel (picture element), and typically there are around a hundred thousand pixels per image frame. Real-time processing at a rate of 30 or 60 images per second therefore may require a processing speed of around 10^10 operations per second. Conventional computer architectures (of the Von Neumann type) are not currently capable of approaching these speeds. The fastest Von Neumann computers are two or more orders of magnitude too slow for typical problems in real-time computer vision. There is therefore a need for a real-time image processor capable of feature extraction and other low-level vision analysis.
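The order-of-magnitude estimate above can be reproduced with a short calculation. This is only an illustrative sketch: the function name and the specific figure of 1,000 to 3,000 operations per pixel are assumptions chosen to stand in for "thousands of fundamental operations"; the pixel count and frame rates are the ones stated in the text.

```python
def required_ops_per_second(ops_per_pixel, pixels_per_frame, frames_per_second):
    """Operations per second needed to keep up with the incoming video."""
    return ops_per_pixel * pixels_per_frame * frames_per_second


PIXELS_PER_FRAME = 100_000  # "around a hundred thousand pixels per image frame"

# Assumed lower case: 1,000 operations per pixel at 30 frames per second.
low = required_ops_per_second(1_000, PIXELS_PER_FRAME, 30)

# Assumed upper case: 3,000 operations per pixel at 60 frames per second.
high = required_ops_per_second(3_000, PIXELS_PER_FRAME, 60)

print(f"low estimate:  {low:.1e} operations per second")
print(f"high estimate: {high:.1e} operations per second")
```

Under these assumed inputs the requirement falls between roughly 3 billion and 18 billion operations per second, which brackets the figure quoted above.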
The solution to this large data processing need is generally thought to be some form of parallel processing, so that a large number of computational elements operating simultaneously can achieve the necessary rates. (For reviews of parallel processors see D. Etchells, "A Study of Parallel Architectures for Image Understanding Algorithms," in ISG Report 104 (R. Nevatia, Editor), pp. 133-176, University of Southern California, Oct. 19, 1983, and A. P. Reeves, "Parallel Computer Architectures for Image Processing," Computer Vision, Graphics, and Image Processing 25 (1984), pp. 68-88.) There are several ways in which the necessary parallelism can be achieved.
One way is to use a multiple-instruction-stream, multiple-data-stream (MIMD) system, which consists of many Von Neumann machines operating on different parts of the same problem and communicating their results to each other. Such a multiprocessor system may be appropriate for the high-level portion of powerful future vision programs. However, for the low-level portions of the vision task, such a system is not cost-effective. This is because low-level vision tasks contain computations that are performed almost identically over the entire image, and it is wasteful to use the full power of general-purpose processors to do these repetitive tasks.
Another type of parallel computer is the single-instruction-stream, multiple-data-stream (SIMD) system. In such a system, arithmetic units for each portion of the image frame (perhaps each pixel) perform the same operations simultaneously under the control of a master processor. If there is an arithmetic unit for each pixel, such a system is fairly convenient to use and is very fast. However, the cost is high. For example, a parallel processor described by J. Strong, "NASA End to End Data System, Massively Parallel Processor," Goddard Space Flight Center, May 30, 1980, was built for NASA by Goodyear Aerospace and is possibly the most powerful computer of this type so far. It contains 16,384 arithmetic units, which occupy 2048 chips, and costs several million dollars. It is arranged as a 128-by-128 array, and can add the elements of one 12-bit array to those of another 12-bit array in 3.7 microseconds, which corresponds to 4.4 × 10^9 operations per second.
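The throughput quoted for this SIMD machine follows directly from the array size and the array-add time. The following sketch checks the arithmetic using only the figures stated above; the variable names are illustrative.

```python
ARRAY_SIDE = 128          # the machine is arranged as a 128-by-128 array
CHIPS = 2048              # number of chips holding the arithmetic units
ARRAY_ADD_TIME_S = 3.7e-6 # time to add two 12-bit arrays, in seconds

# One arithmetic unit per array element.
arithmetic_units = ARRAY_SIDE * ARRAY_SIDE

# Every unit completes one 12-bit addition per array add.
ops_per_second = arithmetic_units / ARRAY_ADD_TIME_S

# Packaging density implied by the stated unit and chip counts.
units_per_chip = arithmetic_units // CHIPS

print(f"arithmetic units: {arithmetic_units}")
print(f"units per chip:   {units_per_chip}")
print(f"additions/second: {ops_per_second:.2e}")
```

The result agrees with the stated rate to two significant figures, and the stated counts imply eight arithmetic units per chip.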
Another approach is a pipelined image processor, which processes the pixels sequentially as they are scanned, usually but not necessarily at the normal video rate. The parallelism can then be built into the device so that it performs more than one arithmetic operation for each pixel. Some of these operations can be done simultaneously on corresponding pixels in parallel data paths, and some can be done in a pipelined fashion in which one operation is being done on one pixel while the next operation is being done on the previous pixel, which already has had the first operation performed on it. Also, no time is spent decoding instructions while this processing is going on, because the same operations are performed over and over, at least for one frame interval, and no access time for the data is needed.
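The pipelined behavior described above can be illustrated with a small software simulation. This is a minimal sketch under stated assumptions, not a description of any particular hardware: the function `run_pipeline`, the one-register-per-stage model, and the three example operations (offset, gain, threshold) are all hypothetical and chosen only to show one stage working on a pixel while the next stage works on the previous pixel.

```python
def run_pipeline(pixels, stages):
    """Simulate a clocked pipeline with one output register per stage.

    On every clock tick each stage applies its fixed operation to the value
    latched by the preceding stage on the previous tick, so different stages
    work on different pixels at the same time.
    """
    registers = [None] * len(stages)  # None marks an empty slot (a "bubble")
    outputs = []
    # Append bubbles so the last real pixels are flushed out of the pipeline.
    stream = list(pixels) + [None] * len(stages)
    for incoming in stream:
        # The value latched by the final stage leaves the pipeline.
        if registers[-1] is not None:
            outputs.append(registers[-1])
        # Update from the last stage backwards so that each stage reads the
        # value its predecessor held on the previous tick.
        for i in range(len(stages) - 1, 0, -1):
            previous = registers[i - 1]
            registers[i] = stages[i](previous) if previous is not None else None
        registers[0] = stages[0](incoming) if incoming is not None else None
    return outputs


# Hypothetical three-step algorithm applied identically to every pixel.
stages = [
    lambda p: p - 10,               # remove a fixed offset
    lambda p: p * 2,                # apply a fixed gain
    lambda p: 1 if p > 100 else 0,  # threshold to a binary feature
]

scan_line = [10, 60, 70, 20]
result = run_pipeline(scan_line, stages)
print(result)
```

Once the pipeline is full, one finished pixel emerges per tick regardless of how many stages there are; the number of stages adds only a fixed latency at the start of the scan.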
This pipelined type of system can be far less expensive than an SIMD system, because it requires a number of processing elements depending on the number of steps in the algorithm instead of depending on the size of the image, and the former is usually a few orders of magnitude less than the latter. It usually is not as fast as an SIMD system, but it can process an entire image in one frame interval (normally 1/30 second or 1/60 second), and is thus suitable for real-time applications. If the number of steps in the algorithm exceeds the number of processing elements, separate passes can be made to complete the algorithm. This requires extra frame intervals and perhaps additional time for reprogramming the device.
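The trade-off between the number of processing elements and the number of passes can be sketched as follows. The function names, the example step and element counts, and the assumption that reprogramming occurs once between consecutive passes are all illustrative; the text states only that extra frame intervals, and perhaps additional reprogramming time, are required.

```python
import math


def passes_needed(algorithm_steps, processing_elements):
    """Number of passes through the pipeline needed to run every step."""
    return math.ceil(algorithm_steps / processing_elements)


def total_processing_time(algorithm_steps, processing_elements,
                          frame_interval_s, reprogram_time_s=0.0):
    """Total time for one image, assuming one frame interval per pass and
    (as an assumption) one reprogramming delay between consecutive passes."""
    passes = passes_needed(algorithm_steps, processing_elements)
    return passes * frame_interval_s + (passes - 1) * reprogram_time_s


FRAME_INTERVAL_S = 1 / 30  # normal video frame interval

# Hypothetical device with 120 processing elements.
single_pass = passes_needed(100, 120)  # algorithm fits in the pipeline
multi_pass = passes_needed(300, 120)   # algorithm exceeds the pipeline

print(f"100 steps on 120 elements: {single_pass} pass")
print(f"300 steps on 120 elements: {multi_pass} passes, "
      f"{total_processing_time(300, 120, FRAME_INTERVAL_S):.3f} s")
```

The element count here scales with the length of the algorithm, whereas an SIMD array with one arithmetic unit per pixel scales with the image size, which is the source of the cost difference noted above.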
Pipelined image processors have been built in the past. However, they are very restricted in the kind of computations that they can do. They do not include the full range of desired computations, and what they do include often is not fully programmable. Furthermore, their computational power falls short of what is needed for many tasks. What is desired is a programmable system that will perform elaborate computations whose exact nature is not fixed in the hardware and that can handle multiple images in parallel.
The main problems in designing such a programmable pipelined system are choosing a set of fundamental operations that are sufficiently general and that can be implemented in the desired quantity at a reasonable cost, and finding a practical way of interconnecting these operators that allows sufficiently general programmability.