In the field of robotics, and the like, it is often desirable to provide the equipment with an optical input device which will allow the equipment to "see" what it is doing and make adaptive control decisions based thereon. Such "machine vision" applications can generally be classified into one of two categories: vision to control movement of a portion of the device with respect to a target area, and vision to control movement of the device itself with respect to its surroundings. A robotic assembly device moving an arm with respect to an article being assembled thereby is an example of the former, while a robotic vehicle moving across an area is an example of the latter. Determining the direction an object moves relative to the observer is used to disambiguate objects from the background frame of reference. To navigate effectively, both human and computer vision systems use the optical flow of objects in the environment, relative to the observer.
In machine vision it is frequently necessary to determine the direction of motion of an object in the field of view with reference to the background. This is especially necessary for machines that are themselves moving, such as planetary rovers, and the like. This capability is required because such movement is used to infer the 3-dimensional structure and motion of object surfaces. Apparent movement is perceived when an object appears at one spatial position and then reappears at a second nearby spatial position a short time later. For example, when two similar lights flash alternately against a dark background at night, the observer "sees" an apparent movement of one light. The shift in the spatial position of contrast differences (i.e. light against a dark background) over a short time interval induces the perception of motion. The direction an object is perceived to move is judged relative to a background frame of reference. Figure-ground segmentation precedes the determination of the direction of movement. When navigating through the environment, objects are perceived to move relative to a textured stationary background. Both the spatiotemporal characteristics of the object and the background frame of reference are used to determine the perceived direction of movement.
Prior art machine vision systems have been designed in a machine-like fashion; that is, they take what amounts to a "snapshot" of the scene, delay, take a second snapshot, and then compare the two to see what changes have taken place. From those changes, the appropriate movement calculations are made. A human operator performing the same control functions, on the other hand, takes a different approach. The human's predictive approach is one wherein the scene is viewed in real-time and the movement is divided into relevant and non-relevant areas. For example, when driving an automobile, the driver sees things immediately in front of the vehicle (the foreground), at a median distance (a middle active region), and in the far distance (the background). When maneuvering the vehicle along the streets, the driver is interested only in the median distance, as it provides the information which is relevant to the movement of the vehicle through the streets. The background is irrelevant except as it relates to movement toward an ultimate goal. Likewise, the foreground area immediately in front of the vehicle relates only to the avoidance of sudden obstacles. Thus, the driver rejects data from the field of view that does not relate to the immediate problem being solved, i.e. steering guidance. There is also a constant prediction of future movement and correction for deviations from that prediction. In this way, the driver is able to perform the necessary steering function quickly and accurately from the visual data as it is received and processed. At present, there is no machine vision system which operates in the same manner as a human operator.
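The "snapshot" comparison described above amounts to simple frame differencing, which flags where the scene has changed but not, by itself, which way anything has moved. The following is a minimal NumPy sketch of that prior-art style of comparison; the array sizes, pixel values, and threshold are illustrative assumptions, not taken from any particular system.

```python
import numpy as np

def changed_regions(frame1, frame2, threshold=0.1):
    """Flag pixels whose intensity changed between two snapshots."""
    diff = np.abs(frame2.astype(float) - frame1.astype(float))
    return diff > threshold

# Two 4x4 "snapshots": a bright spot moves one pixel to the right.
a = np.zeros((4, 4)); a[1, 1] = 1.0
b = np.zeros((4, 4)); b[1, 2] = 1.0
mask = changed_regions(a, b)
# Both the vacated pixel (1, 1) and the newly occupied pixel (1, 2)
# register as change; recovering the direction of movement requires
# a further computation on top of this difference mask.
```

Such differencing is inherently two-snapshot and retrospective, in contrast with the real-time, predictive processing attributed to the human operator.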
Visual psychophysics research has uncovered several important properties that determine the direction a simple object is seen to move when viewed by a human operator. The contrast, position or spatial phase, the spatial frequencies, and the temporal duration that characterize the test object and its background all affect the direction the object is perceived to move relative to its background. In experiments where a test sinewave grating is shifted in position relative to a stationary textured background composed of single and multiple spatial-frequency components, the visibility of left-right movement was found to be predicted by spatially localized paired Gabor filters (paired even- and odd-symmetric filters optimally tuned to a 90° phase difference) summed across the background reference frame. The problem is to apply these discoveries so as to provide a similar ability to determine the direction of object movement in machine vision. The solution is to employ a computer-based, real-time system to process the image signals through paired Gabor filters, using the sums and differences to determine direction and, thereby, emulate the human response in a machine vision environment.
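The paired-filter computation can be sketched in code. The following Python/NumPy example builds an even- and odd-symmetric (quadrature) Gabor pair and uses the change in local phase between two frames of a shifted grating to recover the direction of movement. All names, the grating frequency, the shift amount, and the envelope width are illustrative assumptions, and a single filter pair at one position stands in for the full array of filters summed across the background reference frame.

```python
import numpy as np

def gabor_pair(x, freq, sigma):
    """Even- and odd-symmetric Gabor filters: a quadrature pair
    whose carriers differ by a 90-degree phase offset."""
    envelope = np.exp(-x**2 / (2.0 * sigma**2))
    even = envelope * np.cos(2.0 * np.pi * freq * x)
    odd = envelope * np.sin(2.0 * np.pi * freq * x)
    return even, odd

def local_phase(signal, x, freq, sigma):
    """Local phase of the signal as seen through the paired filters."""
    even, odd = gabor_pair(x, freq, sigma)
    return np.arctan2(np.dot(signal, odd), np.dot(signal, even))

# Two "frames" of a sinewave grating; between frames the grating
# shifts 0.01 units to the right (assumed test values).
x = np.linspace(-1.0, 1.0, 512)
freq = 4.0
frame1 = np.cos(2.0 * np.pi * freq * x)
frame2 = np.cos(2.0 * np.pi * freq * (x - 0.01))

p1 = local_phase(frame1, x, freq, sigma=0.25)
p2 = local_phase(frame2, x, freq, sigma=0.25)
dphi = np.angle(np.exp(1j * (p2 - p1)))  # wrap difference to (-pi, pi]
# Under this sign convention, a rightward shift increases the
# measured local phase.
direction = "right" if dphi > 0 else "left"
```

In a full system, such pairs would be applied at many spatial positions and spatial frequencies, with their responses combined across the background reference frame as the passage describes.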