1. Field of the Invention
The present invention relates generally to pattern recognition and associative networks, and more particularly to automatic visual recognition. The invention specifically relates to image recognition of a predefined object by analyzing image data into occurrences of predefined elementary features, and examining the occurrences for a predefined combination that is characteristic of the predefined object.
2. Description of the Background Art
Pattern recognition of image data has used a variety of techniques. At a basic level, pattern recognition is a process of comparison or correlation with predefined patterns. This process of correlation or comparison has been carried out directly for tasks such as character recognition. Parallel associative memories have been developed for performing, in parallel, comparison operations with a multiplicity of stored patterns. It is also known that Fourier transform techniques can sometimes be used to perform correlations and convolutions with increased computational efficiency.
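By way of illustration only, the relationship between direct correlation and Fourier transform correlation may be sketched as follows. The sketch assumes one-dimensional real-valued data, circular (wrap-around) correlation, and a sequence length that is a power of two; the function names `fft`, `circular_correlation_fft`, and `circular_correlation_direct` are hypothetical and do not appear in the prior art discussed above.

```python
import cmath

def fft(a, inverse=False):
    """Recursive radix-2 transform; len(a) must be a power of two (assumption)."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out  # the inverse transform is left unscaled; the caller divides by n

def circular_correlation_fft(x, y):
    """c[k] = sum_i x[(i + k) mod n] * y[i], computed in the frequency domain."""
    n = len(x)
    X = fft([complex(v) for v in x])
    Y = fft([complex(v) for v in y])
    # For real y, correlation corresponds to multiplying by the conjugate spectrum.
    prod = [a * b.conjugate() for a, b in zip(X, Y)]
    return [v.real / n for v in fft(prod, inverse=True)]

def circular_correlation_direct(x, y):
    """The same quantity computed by explicit comparison at every shift."""
    n = len(x)
    return [sum(x[(i + k) % n] * y[i] for i in range(n)) for k in range(n)]

# Illustrative data: the stored pattern appears in the signal at a shift of 2.
signal = [0, 0, 1, 2, 1, 0, 0, 0]
pattern = [1, 2, 1, 0, 0, 0, 0, 0]
scores = circular_correlation_fft(signal, pattern)
best_shift = max(range(len(scores)), key=lambda k: scores[k])
```

The direct form performs on the order of n² multiplications, whereas the transform form performs on the order of n log n, which is the source of the efficiency gain in cases where the transform approach applies.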
In a known method, video data are processed to emphasize a particular feature of an object, such as the edge of the object, before attempting to match the video data to the predefined patterns. This technique is successful for recognizing simple objects, but the computation requirements preclude real-time pattern recognition for many applications. Moreover, this technique is quite sensitive to noise and extraneous background objects.
Pattern recognition has also been performed using neural and associative networks. An associative network, for example, consists of simple processors known as "nodes" that are interconnected by communication paths called "links." Nodes perform a number of primitive operations on information reaching them via links from other nodes. Specifically, nodes form a thresholded and range-limited weighted summation of the current values of all source nodes. The weights used are associated with the source-to-sink links for each sink node. Performance of the preceding node operations produces a numeric "current node value" for each node at each instant of time. A node which currently has a non-zero node value is said to be "firing." It is these node values which are transmitted between nodes, with one time step required to send a value from a node to a directly connected neighboring node.
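The node operation described above may be sketched, purely for illustration, as follows. The threshold of 0.5, the output range of 0 to 1, the rule that a node without incoming links retains its value as an input, and the names `node_value` and `step` are all assumptions made for this sketch; the description above does not fix any of them.

```python
def node_value(source_values, weights, threshold=0.5, lo=0.0, hi=1.0):
    """Thresholded, range-limited weighted sum of the source node values."""
    total = sum(w * v for w, v in zip(weights, source_values))
    if total < threshold:          # below threshold: the node does not fire
        return 0.0
    return max(lo, min(hi, total))  # limit the value to the range [lo, hi]

def step(values, links, threshold=0.5):
    """Advance every node by one time step.

    values: dict mapping node name to its current node value.
    links:  dict mapping each sink node to a list of (source, weight) pairs.
    All new values are computed from the old values, so a value moves across
    exactly one link per time step.
    """
    new_values = {}
    for node, current in values.items():
        incoming = links.get(node, [])
        if not incoming:
            new_values[node] = current  # assumption: input nodes hold their value
            continue
        sources = [values[src] for src, _ in incoming]
        weights = [w for _, w in incoming]
        new_values[node] = node_value(sources, weights, threshold)
    return new_values

# Illustrative chain a -> b -> c, with only the input node "a" firing at first.
links = {"b": [("a", 1.0)], "c": [("b", 1.0)]}
v0 = {"a": 1.0, "b": 0.0, "c": 0.0}
v1 = step(v0, links)  # "b" begins firing; "c" still sees the old value of "b"
v2 = step(v1, links)  # "c" begins firing one time step later
```

In this chain, the value from node "a" requires two time steps to reach node "c", consistent with one time step per link.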
It is known that biological visual systems are rather efficient for pattern recognition of image data. In a biological system, light forming an image is focused on a retina, where it is absorbed by receptor cells (rods and cones). These cells transform the light information into neural signals that are sent to the brain. But while still in the retina, the receptor information is processed by ganglion cells, which perform a two-dimensional convolution. There is approximately one ganglion cell for every receptor location, but the ganglion's response is determined by a fuzzy spatial average of many nearby receptors. The ganglion responses are sent to the brain, where they are transformed by simple cells, which respond only to a large number of excited ganglion responses in a straight line, of a particular orientation, and at a particular retinal location. Complex cells add to the simple cell exclusivity the notion of being end-stopped; they respond only to a line segment, and not to an infinitely long line. The complex responses proceed to the rest of the brain, where perception occurs in a manner that is not yet understood.
Associative networks have been proposed for implementing low-level vision. Low-level vision has been defined as the domain that deals with the steps of the visual processing problem leading up to the construction of "symbolic" models of the content of a visual scene. Thus, low-level vision has been said to usually involve spatial filtering of the raw image data, intensity edge detection, motion detection across the visual field, image sub-pattern grouping into hypothesized objects, extraction of object distance through the use of binocular images, etc. It has been recognized that an operator such as a two-dimensional Gaussian filter may be viewed as a function which is convolved with the image data, or the filter function may also be considered to be applied locally to each pixel of the image in a "step and repeat" fashion that lends itself to associative network implementation. It has been proposed to carry the data representation process in the associative processor to the full "primal sketch" for a usefully large retinal array. At this level, the intensity values of an image are decomposed into a set of primitive features such as zero crossings, blobs, terminations, and discontinuities.
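The "step and repeat" view of a two-dimensional Gaussian filter may be sketched, for illustration only, as follows. The kernel radius, the value of sigma, the clamping of coordinates at the image border, and the names `gaussian_kernel`, `filter_pixel`, and `step_and_repeat` are assumptions of this sketch rather than features of the proposals discussed above.

```python
import math

def gaussian_kernel(radius, sigma):
    """Square Gaussian kernel of side 2*radius + 1, normalized to sum to 1."""
    k = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
          for dx in range(-radius, radius + 1)]
         for dy in range(-radius, radius + 1)]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]

def filter_pixel(image, kernel, r, c):
    """Apply the kernel locally at one pixel: a weighted sum of its neighbors."""
    radius = len(kernel) // 2
    h, w = len(image), len(image[0])
    acc = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rr = min(max(r + dy, 0), h - 1)  # assumption: clamp at the borders
            cc = min(max(c + dx, 0), w - 1)
            acc += kernel[dy + radius][dx + radius] * image[rr][cc]
    return acc

def step_and_repeat(image, kernel):
    """Repeat the same local operation at every pixel location."""
    return [[filter_pixel(image, kernel, r, c) for c in range(len(image[0]))]
            for r in range(len(image))]

# Illustrative input: a single bright pixel at the center of a 5 x 5 image.
impulse = [[0.0] * 5 for _ in range(5)]
impulse[2][2] = 1.0
kernel = gaussian_kernel(radius=1, sigma=1.0)
blurred = step_and_repeat(impulse, kernel)
```

Because the same weighted sum is evaluated independently at each pixel, each output location corresponds naturally to one node whose links carry the kernel weights. Since the Gaussian kernel is symmetric, this local application yields the same result as convolution of the kernel with the image.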