Many data processing applications must process enormous amounts of data. In many of these applications, the data to be processed is received as a sequential stream of logically adjacent data samples. One example of this type of data is machine vision data.
Machine vision is a specific type of industrial automation technology which extracts data from video images and makes this information available for process control and/or quality control.
Much of the underlying technology used for machine vision is shared with other fields. CCD video camera technology is used in camcorders and surveillance cameras. Digital image capture hardware is used for desktop publishing and multimedia applications. Image analysis software, such as the public-domain software application program "Image" available from the National Institutes of Health (NIH) in Bethesda, Md., is used for scientific image analysis. Image analysis algorithms have received extensive attention in academic research. Many companies, including the assignee of this invention, have built machine vision systems by combining these inexpensive and readily available components.
Machine vision technology is utilized in a number of applications. There are four (4) major machine vision applications, namely: 1) inspection; 2) dimensional measurement; 3) object location (for guidance or parts placement); and 4) part identification. These applications are useful in, for example, industrial assembly (robotics) and product inspection. For example, machine vision may be utilized to determine the position of a base part relative to a reference point.
Similarly, other machine vision applications include determining whether product packaging is intact, verifying that integrated circuit component leads are properly located and properly shaped, checking the registration and correctness of product logos or labels, and inspecting the finished product itself.
A vision system camera gathers a significant amount of information which must be accurately and rapidly processed in order to render a decision regarding the inspection performed. Machine vision data is generally presented as a sequential stream of logically adjacent and related data samples. Both the sheer volume of this data and the relatedness of the data samples present a significant hurdle to rapid processing.
Software implementations and most hardware implementations of prior art machine vision data processing techniques involve processing one pixel of machine vision data (one data sample) at a time. Machine vision image pixels are typically represented by fixed-length digital codes representing unsigned integers (or bytes or binary vectors), which are packed into words in a computer system's memory. In many prior art systems, 1 to 4 pixels of data are packed into each memory word, but these pixels (data samples) are processed one at a time by a program running on the computer. Present image processing systems also benefit from the computer's memory cache; that is, the computer typically fetches entire cache lines into its fast internal cache memory, thereby minimizing the time spent accessing external memory. Once a cache line of image data is located in the computer's internal cache memory, the computer may quickly access the individual pixels for processing.
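The one-pixel-at-a-time processing described above can be sketched as follows. This is a minimal illustration, not a description of any particular prior art system: it assumes four 8-bit pixels packed into each 32-bit word with the first pixel in the lowest-order byte, and the function names `unpack_pixels` and `threshold_sequential`, as well as the thresholding operation itself, are hypothetical choices made only for the example.

```python
def unpack_pixels(word, pixels_per_word=4, bits_per_pixel=8):
    """Split one memory word into its packed pixel values.

    Assumption: the first pixel occupies the lowest-order bits of the word.
    """
    mask = (1 << bits_per_pixel) - 1
    return [(word >> (i * bits_per_pixel)) & mask for i in range(pixels_per_word)]


def threshold_sequential(words, level=128):
    """Binarize an image stored as packed words, handling one pixel per step."""
    result = []
    for word in words:                     # one memory fetch yields several pixels
        for pixel in unpack_pixels(word):  # but each pixel is still processed alone
            result.append(1 if pixel >= level else 0)
    return result
```

Although each word fetched from memory carries four pixels in this sketch, the inner loop still performs one comparison per pixel, so the total work grows with the pixel count rather than with the word count.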
The primary advantage of these prior art systems is that they are very flexible and may be easily adapted to different applications. The primary disadvantage, however, is that these computer systems perform operations sequentially, so that the overall speed of the system is directly proportional to the speed of the computer. For many present machine-vision applications, the general-purpose computers available today cannot run fast enough to meet the application requirements.
Accordingly, one of the primary challenges presented by machine vision is integrating adequate computation hardware. General-purpose CPUs and image processing hardware designed for other applications cannot achieve the price/performance ratio required for many machine vision applications.
A number of devices have been developed to accelerate computations on digital image data. These specially designed processors perform specific computations on an image data stream much faster than a general-purpose CPU of a similar size and cost. One example of such devices is a pipeline processor. Previous image pipeline processing architectures process image pixels in raster order at high speed, receiving and, for many operations, generating one new pixel on each clock pulse. These devices are often used in applications where they are connected directly into the stream of video data generated by the camera.
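The raster-order behavior of such a pixel pipeline can be modeled in software as a chain of stages, each receiving one pixel and producing one pixel per step. The following is only a conceptual sketch using Python generators; the names `raster_scan`, `stage`, and `pipeline`, and the example operations, are hypothetical, and a software model of this kind says nothing about the clock rate of real pipeline hardware.

```python
def raster_scan(image):
    """Yield pixels row by row, left to right (raster order)."""
    for row in image:
        for pixel in row:
            yield pixel


def stage(operation, stream):
    """One pipeline stage: consume one pixel, emit one pixel."""
    for pixel in stream:
        yield operation(pixel)


def pipeline(image, operations):
    """Chain the stages so that each pixel flows through every operation in turn."""
    stream = raster_scan(image)
    for operation in operations:
        stream = stage(operation, stream)
    return stream


def invert(pixel):
    """Example stage operation: photographic negative of an 8-bit pixel."""
    return 255 - pixel


def binarize(pixel):
    """Example stage operation: threshold an 8-bit pixel at 128."""
    return 255 if pixel >= 128 else 0
```

For example, `list(pipeline([[0, 100], [200, 255]], [invert, binarize]))` evaluates to `[255, 255, 0, 0]`. Changing the processing operation means rebuilding the chain of stages, which loosely mirrors the re-configuration effort associated with pipeline hardware.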
These pixel-pipeline processors can be useful for machine vision, but they are not optimal for many machine vision applications. For a machine vision system, the required speed is a function of the application, not the video data rate. Many machine vision applications require data rates that are substantially faster or slower than the data rate of a standard video camera or other source of image data, either analog or digital. Other issues limiting the usefulness of pixel-pipeline processors for machine vision are a long processing latency and the fact that such pixel-pipeline processors often require substantial re-configuration effort to switch processing operations.
Several devices have been invented specifically to perform machine vision computations in a cost-effective manner. Some systems use a vision coprocessor which operates directly on the image data stored in the memory banks controlled by the primary CPU, much the same way an Intel 8087 or a Motorola MC68881 performs floating-point computations on data stored in the memory banks of the Intel 8088 or the Motorola 68000 respectively. These systems achieve a cost advantage by sharing a single memory controller between the CPU and the coprocessor, but pay a performance penalty because the CPU is intimately and extensively involved in controlling the coprocessor.
Another existing machine-vision system architecture uses a large number of simple vision processors which operate on individual pixels or multiple rows or columns of image data in parallel. The number of processors may be expanded to build a very fast system, but the parallel memory and interconnection circuitry is relatively complex and expensive.
Another existing machine-vision system architecture uses digital signal processing devices (DSPs). These devices may be programmed to perform operations required for machine vision, but they are better suited to the one-dimensional signal processing required for audio or modem applications. Another disadvantage of DSP-based systems, both in vision and non-vision applications, is that although they may be programmed to perform many different operations, careful assembly code optimization and detailed knowledge of the DSP instructions are required to achieve optimal performance. In addition, the DSP engines typically require expensive, high-speed memory, resulting in an expensive data or vision processing system.
Another technique used in existing machine-vision system architectures is to use look-up tables indexed by pixel values or simple functions of pixel values to implement normalized correlation and other computations required for machine vision. The advantages of using look-up tables are that they may be reprogrammed to implement different operations, they may perform non-linear operations, and they may be implemented using readily available memory components. The disadvantages of look-up table implementations of computations include the performance impact of re-loading one or more look-up tables to change operations and the limited range of computations that may be performed in this fashion.
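The look-up table technique can be illustrated with a short sketch. It assumes 8-bit pixels and a 256-entry table indexed directly by pixel value; the helper names `build_lut` and `apply_lut` and the two example functions are hypothetical and are not drawn from any particular prior art system.

```python
def build_lut(function):
    """Precompute function(v) for every 8-bit pixel value, clamped to 0..255."""
    return [max(0, min(255, int(function(value)))) for value in range(256)]


def apply_lut(pixels, lut):
    """Replace each pixel with its table entry: one memory read per pixel."""
    return [lut[pixel] for pixel in pixels]


# A non-linear operation (thresholding) expressed as a table.
threshold_lut = build_lut(lambda v: 255 if v >= 128 else 0)

# Switching to a different operation requires rebuilding all 256 entries.
square_lut = build_lut(lambda v: v * v / 255)
```

Here `apply_lut([0, 127, 128, 255], threshold_lut)` evaluates to `[0, 0, 255, 255]`. The sketch also shows both disadvantages noted above: changing the operation means re-loading the whole table, and only functions of a single pixel value (or of a simple function of pixel values) can be expressed this way.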